Hello,

I agree with you, JB: Log is a more appropriate name for the 'print' case. We can definitely create a richer transform with your ideas, and we can discuss the details later on when we start to work together.
The more abstract case, which I call Debug since I didn't find a better name, is a general transform that can be the base of many others that produce side effects but don't change the data passing through the PTransform. That's why I consider it a different (more abstract) transform per se. I implemented the general predicate + function application just to prove my point, and the Log/print case was just a test of a specific case.

Since I am new to the Dataflow model, I don't know which unintended consequences this transform can have (or which good practices a side-effecting transform must take care of). Additionally, I have not thought about how to support more advanced features of the model (e.g. side inputs/outputs). Any ideas?

But well, this is my hello world in the Dataflow model, so we'll see what's to come :)

-Ismaël

On Sun, Mar 20, 2016 at 4:18 PM, Jean-Baptiste Onofré <[email protected]> wrote:

> Hi,
>
> thanks for the update.
>
> IMHO, I would name Debug transform as Log:
>
> .apply(Log.withLevel("DEBUG"))
> .apply(Log.withLevel("INFO").withPattern("%d %m ..."))
> .apply(Log.withLevel("WARN").withMessage("Foo").withStream("System.out"))
>
> It would be more flexible and related to the actual behavior.
>
> I would mimic a bit the Camel log component for instance.
>
> If you don't mind, I will do it with you.
>
> Thanks
> Regards
> JB
>
> On 03/20/2016 12:07 PM, Ismaël Mejía wrote:
>
>> Hi,
>>
>> The code of the transform is here in a playground for Beam experiments I
>> created (it is a bit alpha for the moment, and it does not have comments):
>>
>> https://github.com/iemejia/beam-playground/blob/master/src/main/java/org/apache/beam/transforms/Debug.java
>>
>> Since my initial goal was more of a test scenario in the
>> DirectPipelineRunner, I haven't yet considered more advanced logging
>> capabilities and the possible issues of distribution (serialization, in
>> particular of dependencies, as well as exceptions, etc.), but of course
>> it is something I expect to improve if there is interest. Do you see
>> some immediate things to improve to try it with the distributed runners?
>> (I want to do this, also as an excuse to try the FlinkRunner.)
>>
>> Best,
>> -Ismael
>>
>>
>> On Sun, Mar 20, 2016 at 11:13 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>>
>>     By the way, for the "Integration" DSL, in addition to an explicit debug
>>     transform, it would make sense to have an implicit "Tracer". It's
>>     something that I planned: it would allow us to have sampling on
>>     PCollection if the pipeline tracer is enabled (like we do in a Camel
>>     route with the tracer).
>>
>>     Regards
>>     JB
>>
>>     On 03/20/2016 10:14 AM, Ismaël Mejía wrote:
>>
>>         Hello,
>>
>>         I just started playing with Beam and I wanted to debug what happens
>>         between transforms in pipelines. I wrote a simple 'Debug' transform
>>         for this.
>>         The idea is to apply a function based on a predicate to any element
>>         in a collection without changing the collection, or in other words,
>>         a transform that does not transform but produces side effects.
>>
>>         The idea is better illustrated with this simple example:
>>
>>         .apply(FlatMapElements.via((String text) ->
>>                 Arrays.asList(text.split(" ")))
>>             .withOutputType(new TypeDescriptor<String>() {
>>             }))
>>         .apply(Debug
>>             .when((String s) -> s.startsWith("A"))
>>             .with((String s) -> {
>>                 System.out.println(s);
>>                 return null;
>>             }))
>>         .apply(Filter.byPredicate((String text) -> text.length() > 5))
>>         .apply(Debug.print()); // sugared method, same as above
>>
>>         I think this can be useful (at least for debugging purposes). Is
>>         there something like this already in the SDK? If this is not the
>>         case, can you please give me some feedback/ideas to improve my
>>         transform?
>>
>>         Thanks,
>>         -Ismael
>>
>>         ps. You can find the code of the first version of the transform
>>         here:
>>
>>         https://github.com/iemejia/beam-playground/blob/master/src/main/java/org/apache/beam/transforms/Debug.java
>>
>>
>>
>>     --
>>     Jean-Baptiste Onofré
>>     [email protected]
>>     http://blog.nanthrax.net
>>     Talend - http://www.talend.com
>>
>>
>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
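
For readers who want to try the predicate + function idea from this thread without a Beam dependency, here is a minimal plain-Java sketch. It is not the code from the linked Debug.java and it uses no Beam API: the class name DebugSketch, the method name debug, and the use of java.util.function.Predicate and Consumer (instead of a function that returns null) are all assumptions made for illustration. It only shows the contract discussed above: every element is passed through unchanged, and the side effect runs only for elements that match the predicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in for the "Debug" idea; plain collections, not a PTransform.
final class DebugSketch {
    private DebugSketch() {}

    // Returns a copy of the input with the same elements in the same order.
    // The side effect `with` is invoked only for elements accepted by `when`.
    static <T> List<T> debug(List<T> input,
                             Predicate<? super T> when,
                             Consumer<? super T> with) {
        List<T> output = new ArrayList<>(input.size());
        for (T element : input) {
            if (when.test(element)) {
                with.accept(element); // side effect only, element is not modified
            }
            output.add(element);      // every element passes through
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Apache", "Beam", "Avro");
        // Print only the words starting with "A"; the returned list still has all words.
        List<String> result = debug(words, s -> s.startsWith("A"), System.out::println);
        System.out.println(result.equals(words));
    }
}
```

In a distributed runner the predicate and the function would additionally have to be serializable, which is one of the open issues mentioned in the thread; this sketch deliberately ignores that.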
