> IMO I don't think the DirectRunner should depend directly on any specific
> logging backend (at least, not in the compile or runtime scopes). I think
> it should depend on JUL in the test scope, so that there are logs when
> executing DirectRunner tests.
>
> My reasoning: I can see in any binary version of Beam that the SDK, the
> DirectRunner, and 1 or more other runners will all be on the classpath.
> Ideally this should work regardless of whatever other runner is used;
> presumably the DirectRunner would "automagically" pick up the logging
> config of the other runner.

That sounds like a very plausible scenario, and it would "protect" the other runner's binding from an intruding direct runner binding, since the direct runner would have none.

However, there is also the scenario where a user runs the examples with the direct runner as their first interaction with Beam and sees no logs whatsoever until they add a binding themselves. We could solve this by adding a binding to the 'direct-runner' profile in the examples module and the Maven archetypes (and allowing only one runner profile to be specified at a time, in case their logger bindings clash).
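To make the clash concern concrete: as far as I know, SLF4J 1.7.x locates its binding by looking up org/slf4j/impl/StaticLoggerBinder.class on the classpath, and when several copies are present it warns and uses whichever one the class loader returns first. Below is a rough sketch, using only the JDK, of how a user could list the bindings visible to their pipeline. The names Slf4jBindingCheck, findBindings and describe are made up for illustration and are not part of Beam or SLF4J.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: lists the SLF4J 1.x bindings visible to a class loader. */
public class Slf4jBindingCheck {

    // Assumption: SLF4J 1.6/1.7 style binding lookup via this class file.
    static final String BINDER_PATH = "org/slf4j/impl/StaticLoggerBinder.class";

    /** Returns the location of every binder class file the loader can see. */
    static List<URL> findBindings(ClassLoader loader) throws IOException {
        return Collections.list(loader.getResources(BINDER_PATH));
    }

    /** Summarizes what a given number of visible bindings means for the user. */
    static String describe(int bindingCount) {
        if (bindingCount == 0) {
            return "no binding: SLF4J falls back to a no-op logger";
        }
        if (bindingCount == 1) {
            return "one binding: the logging backend is unambiguous";
        }
        return "multiple bindings: SLF4J warns and uses the first one found";
    }

    public static void main(String[] args) throws IOException {
        List<URL> bindings = findBindings(ClassLoader.getSystemClassLoader());
        for (URL url : bindings) {
            System.out.println("binding at " + url);
        }
        System.out.println(describe(bindings.size()));
    }
}
```

Run with the direct runner plus one other runner on the classpath, this would show whether a binding added to the 'direct-runner' profile ends up competing with the other runner's binding.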
> I like the use of slf4j as it enables lots of publishers of logs, but I
> don't want to supply a default/required consumer of logs because that will
> restrict use cases in the future...

I agree. Forcing a log4j binding might give users the false impression that all runners use log4j. That isn't true today (for the Dataflow runner), and we can't guarantee that future runners will support it either.

So it seems we're left with:

1) Add documentation around logging in each runner.
2) Consider enabling a binding (JUL) for the direct runner profile in the examples module and the Maven archetypes.
3) Allow only one runner profile to be active at a time in the examples module and the Maven archetypes, as their logger bindings might clash.

Thoughts?

On Tue, Apr 4, 2017 at 8:51 AM Dan Halperin <dhalp...@google.com.invalid> wrote:

> At this point, I'm a little unclear on what is the proposal. Can you
> refresh a simplified/aggregated view after this conversation?
>
> IMO I don't think the DirectRunner should depend directly on any specific
> logging backend (at least, not in the compile or runtime scopes). I think
> it should depend on JUL in the test scope, so that there are logs when
> executing DirectRunner tests.
>
> My reasoning: I can see in any binary version of Beam that the SDK, the
> DirectRunner, and 1 or more other runners will all be on the classpath.
> Ideally this should work regardless of whatever other runner is used;
> presumably the DirectRunner would "automagically" pick up the logging
> config of the other runner.
>
> I like the use of slf4j as it enables lots of publishers of logs, but I
> don't want to supply a default/required consumer of logs because that will
> restrict use cases in the future...
>
> On Mon, Apr 3, 2017 at 8:14 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Fair enough. +1 especially for the documentation.
> >
> > Regards
> > JB
> >
> > On 04/03/2017 08:48 PM, Aviem Zur wrote:
> >
> >> Upon further inspection there seems to be an issue we may have overlooked:
> >> In cluster mode, some of the runners will have dependencies added directly
> >> to the classpath by the cluster, and since SLF4J can only work with one
> >> binding, the first one in the classpath will be used.
> >>
> >> So while what we suggested would work in local mode, the user's chosen
> >> binding and configuration might be ignored in cluster mode, which is
> >> detrimental to what we wanted to accomplish.
> >>
> >> So I believe what we should do instead is:
> >>
> >> 1. Add better documentation regarding logging in each runner, which
> >>    binding is used, perhaps examples of how to configure logging for that
> >>    runner.
> >> 2. Have direct runner use the most common binding among runners (this
> >>    appears to be log4j which is used by Spark runner, Flink runner and
> >>    Apex runner).
> >>
> >> On Mon, Apr 3, 2017 at 7:02 PM Aljoscha Krettek <aljos...@apache.org> wrote:
> >>
> >>> Yes, I think we can exclude log4j from the Flink dependencies. It’s
> >>> somewhat annoying that they are there in the first place.
> >>>
> >>> The Flink doc has this to say about the topic:
> >>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/logging.html
> >>>
> >>>> On 3. Apr 2017, at 17:56, Aviem Zur <aviem...@gmail.com> wrote:
> >>>>
> >>>>> * java.util.logging could be a good choice for the Direct Runner
> >>>>
> >>>> Yes, this will be great for users (Instead of having no logging when using
> >>>> direct runner).
> >>>>
> >>>>> * Logging backend could be runner-specific, particularly if it needs to
> >>>>> integrate into some other experience
> >>>>
> >>>> Good point, let's take a look at the current state of runners:
> >>>> Direct runner - will use JUL as suggested.
> >>>> Dataflow runner - looks like there is already no binding (There is a
> >>>> binding in tests only).
> >>>> Spark runner - currently uses slf4j-log4j12. does not require any specific
> >>>> logger, we can change this to no binding.
> >>>> Flink runner - uses slf4j-log4j12 transitively from Flink dependencies. I'm
> >>>> assuming this is not a must and we can default to no binding here.
> >>>> @aljoscha please confirm.
> >>>> Apex runner - uses slf4j-log4j12 transitively from Apex dependencies. I'm
> >>>> assuming this is not a must and we can default to no binding here. @thw
> >>>> please confirm.
> >>>>
> >>>> It might be a good idea to use a consistent binding in tests (Since we'll
> >>>> use JUL for direct runner, let this be JUL).
> >>>>
> >>>> On Wed, Mar 29, 2017 at 7:23 PM Davor Bonaci <da...@apache.org> wrote:
> >>>>
> >>>> +1 on consistency across Beam modules on the logging facade
> >>>> +1 on enforcing consistency
> >>>> +1 on clearly documenting how to do logging
> >>>>
> >>>> Mixed feelings:
> >>>> * Logging backend could be runner-specific, particularly if it needs to
> >>>> integrate into some other experience
> >>>> * java.util.logging could be a good choice for the Direct Runner
> >>>>
> >>>> On Tue, Mar 28, 2017 at 6:50 PM, Ahmet Altay <al...@google.com.invalid> wrote:
> >>>>
> >>>>> On Wed, Mar 22, 2017 at 10:38 AM, Tibor Kiss <tk...@hortonworks.com> wrote:
> >>>>>
> >>>>>> This is a great idea!
> >>>>>>
> >>>>>> I believe Python-SDK's logging could also be enhanced (a bit differently):
> >>>>>> Currently we are not instantiating the logger, just using the class what
> >>>>>> logging package provides.
> >>>>>> Shortcoming of this approach is that the user cannot set the log level on
> >>>>>> a per module basis as all log messages
> >>>>>> end up in the root level.
> >>>>>>
> >>>>> +1 to this. Python SDK needs to expands its logging capabilities. Filed [1]
> >>>>> for this.
> >>>>>
> >>>>> Ahmet
> >>>>>
> >>>>> [1] https://issues.apache.org/jira/browse/BEAM-1825
> >>>>>
> >>>>>> On 3/22/17, 5:46 AM, "Aviem Zur" <aviem...@gmail.com> wrote:
> >>>>>>
> >>>>>> +1 to what JB said.
> >>>>>>
> >>>>>> Will just have to be documented well as if we provide no binding there will
> >>>>>> be no logging out of the box unless the user adds a binding.
> >>>>>>
> >>>>>> On Wed, Mar 22, 2017 at 6:24 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >>>>>>
> >>>>>>> Hi Aviem,
> >>>>>>>
> >>>>>>> Good point.
> >>>>>>>
> >>>>>>> I think, in our dependencies set, we should just depend to slf4j-api and
> >>>>>>> let the
> >>>>>>> user provides the binding he wants (slf4j-log4j12, slf4j-simple, whatever).
> >>>>>>>
> >>>>>>> We define a binding only with test scope in our modules.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> JB
> >>>>>>>
> >>>>>>> On 03/22/2017 04:58 AM, Aviem Zur wrote:
> >>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> There have been a few reports lately (On JIRA [1] and on Slack) from users
> >>>>>>>> regarding inconsistent loggers used across Beam's modules.
> >>>>>>>>
> >>>>>>>> While we use SLF4J, different modules use a different logger behind it
> >>>>>>>> (JUL, log4j, etc)
> >>>>>>>> So when people add a log4j.properties file to their classpath for instance,
> >>>>>>>> they expect this to affect all of their dependencies on Beam modules, but
> >>>>>>>> it doesn’t and they miss out on some logs they thought they would see.
> >>>>>>>>
> >>>>>>>> I think we should strive for consistency in which logger is used behind
> >>>>>>>> SLF4J, and try to enforce this in our modules.
> >>>>>>>> I for one think it should be slf4j-log4j. However, if performance of
> >>>>>>>> logging is critical we might want to consider logback.
> >>>>>>>>
> >>>>>>>> Note: SLF4J will still be the facade for logging across the project. The
> >>>>>>>> only change would be the logger SLF4J delegates to.
> >>>>>>>>
> >>>>>>>> Once we have something like this it would also be useful to add
> >>>>>>>> documentation on logging in Beam to the website.
> >>>>>>>>
> >>>>>>>> [1] https://issues.apache.org/jira/browse/BEAM-1757
> >>>>>>>>
> >>>>>>> --
> >>>>>>> Jean-Baptiste Onofré
> >>>>>>> jbono...@apache.org
> >>>>>>> http://blog.nanthrax.net
> >>>>>>> Talend - http://www.talend.com
> >>>>>>>
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >