Hello,

I agree with you, JB: Log is a more appropriate name for the 'print' case.
We can definitely create a richer transform with your ideas, and we can
discuss the details later on, once we start working together.

The more abstract case, which I call Debug because I couldn't find a better
name, is a general transform that can serve as the base for many others that
produce side effects but don't change the data passing through the
PTransform. That's why I consider it a different (more abstract) transform
per se. I implemented the general predicate + function application just to
prove my point; the Log/print case was only a test of one specific case.
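
To make the predicate + function idea concrete, here is a minimal sketch in
plain Java. It deliberately does not use the Beam SDK; the class name
TapSketch and the method tap are hypothetical names chosen only for this
illustration, and a real transform would also have to deal with
serialization and distribution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical illustration only: not the Debug transform and not a Beam API.
final class TapSketch {

    // Returns a copy of the input with every element unchanged, invoking
    // `action` as a side effect on each element that satisfies `when`.
    static <T> List<T> tap(List<T> input,
                           Predicate<? super T> when,
                           Consumer<? super T> action) {
        List<T> output = new ArrayList<>(input.size());
        for (T element : input) {
            if (when.test(element)) {
                action.accept(element);   // side effect only
            }
            output.add(element);          // element passes through untouched
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Apple", "Beam", "Avro");
        List<String> seen = new ArrayList<>();

        List<String> out = tap(words, s -> s.startsWith("A"), seen::add);

        System.out.println(out);   // [Apple, Beam, Avro]
        System.out.println(seen);  // [Apple, Avro]
    }
}
```

The Log/print case is then just tap with an always-true predicate and
System.out::println as the action.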

Since I am new to the Dataflow model, I don't know which unintended
consequences this transform can have (or which good practices a
side-effecting transform must follow). Additionally, I have not yet thought
about how to support more advanced features of the model (e.g. side
inputs/outputs). Any ideas?

But well, this is my hello world in the Dataflow model, so we'll see what's
to come :)

-Ismaël



On Sun, Mar 20, 2016 at 4:18 PM, Jean-Baptiste Onofré <[email protected]>
wrote:

> Hi,
>
> thanks for the update.
>
> IMHO, I would name Debug transform as Log:
>
> .apply(Log.withLevel("DEBUG"))
> .apply(Log.withLevel("INFO").withPattern("%d %m ..."))
> .apply(Log.withLevel("WARN").withMessage("Foo").withStream("System.out"))
>
> It would be more flexible and related to the actual behavior.
>
> I would mimic a bit the Camel log component for instance.
>
> If you don't mind, I will do it with you.
>
> Thanks
> Regards
> JB
>
> On 03/20/2016 12:07 PM, Ismaël Mejía wrote:
>
>> Hi,
>>
>> The code of the transform is here in a playground for Beam experiments I
>> created (it is a bit alpha for the moment, and it does not have comments):
>>
>>
>> https://github.com/iemejia/beam-playground/blob/master/src/main/java/org/apache/beam/transforms/Debug.java
>>
>> Since my initial goal was more of a test scenario in the
>> DirectPipelineRunner I haven't considered yet more advanced logging
>> capabilities and the possible issues of distribution (serialization, in
>> particular of dependencies, as well as exceptions, etc), but of course
>> it is something I expect to improve if there is interest. Do you see
>> any immediate things to improve before trying it with the distributed
>> runners? (I want to do this partly as an excuse to try the FlinkRunner.)
>>
>> Best,
>> -Ismael
>>
>>
>> On Sun, Mar 20, 2016 at 11:13 AM, Jean-Baptiste Onofré <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>     By the way, for the "Integration" DSL, in addition to an explicit debug
>>     transform, it would make sense to have an implicit "Tracer". It's
>>     something that I planned: it would allow us to have sampling on
>>     PCollection if the pipeline tracer is enabled (like we do in a Camel
>>     route with the tracer).
>>
>>     Regards
>>     JB
>>
>>     On 03/20/2016 10:14 AM, Ismaël Mejía wrote:
>>
>>         Hello,
>>
>>         I just started playing with Beam and I wanted to debug what
>>         happens between transforms in pipelines. I wrote a simple
>>         'Debug' transform for this.
>>         The idea is to apply a function based on a predicate to any
>>         element in a
>>         collection without changing the collection, or in other words, a
>>         transform that
>>         does not transform but produces side effects.
>>
>>         The idea is better illustrated with this simple example:
>>
>>               .apply(FlatMapElements.via((String text) ->
>>                   Arrays.asList(text.split(" ")))
>>                 .withOutputType(new TypeDescriptor<String>() {
>>                 }))
>>               .apply(Debug
>>                 .when((String s) -> s.startsWith("A"))
>>                 .with((String s) -> {
>>                   System.out.println(s);
>>                   return null;
>>                 }))
>>               .apply(Filter.byPredicate((String text) -> text.length() > 5))
>>               .apply(Debug.print());  // sugared method, same as above
>>
>>         I think this can be useful (at least for debugging purposes). Is
>>         there something like this already in the SDK? If not, can you
>>         please give me some feedback/ideas to improve my transform?
>>
>>         Thanks,
>>         -Ismael
>>
>>         ps. You can find the code of the first version of the transform
>>         here:
>>
>> https://github.com/iemejia/beam-playground/blob/master/src/main/java/org/apache/beam/transforms/Debug.java
>>
>>
>>
>>     --
>>     Jean-Baptiste Onofré
>>     [email protected] <mailto:[email protected]>
>>     http://blog.nanthrax.net
>>     Talend - http://www.talend.com
>>
>>
>>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
