Re: Debugging pipelines

Jean-Baptiste Onofré Sun, 20 Mar 2016 08:19:10 -0700

Hi,

thanks for the update.


IMHO, I would name the Debug transform Log:

.apply(Log.withLevel("DEBUG"))
.apply(Log.withLevel("INFO").withPattern("%d %m ..."))
.apply(Log.withLevel("WARN").withMessage("Foo").withStream("System.out"))

It would be more flexible and closer to the actual behavior.

For instance, I would mimic the Camel log component a bit.
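
To make the proposal concrete, here is a minimal plain-Java sketch of what such a fluent Log configuration could look like. It is only an illustration under assumptions: the class below is hypothetical and is not an existing Beam or Camel API, log(...) and format(...) are names invented for the sketch, withStream takes a PrintStream rather than a stream name, and withPattern is left out.

```java
import java.io.PrintStream;

/** Hypothetical sketch of a fluent Log configuration; not an existing Beam class. */
final class Log {
    private final String level;
    private final String message;      // optional fixed message; null means "print the element"
    private final PrintStream stream;

    private Log(String level, String message, PrintStream stream) {
        this.level = level;
        this.message = message;
        this.stream = stream;
    }

    /** Entry point: a logger for the given level that writes to System.err by default. */
    static Log withLevel(String level) {
        return new Log(level, null, System.err);
    }

    /** Returns a copy that prints a fixed message instead of the element. */
    Log withMessage(String message) {
        return new Log(level, message, stream);
    }

    /** Returns a copy that writes to the given stream (assumption: a PrintStream, not a name). */
    Log withStream(PrintStream stream) {
        return new Log(level, message, stream);
    }

    /** Builds the line that would be logged for one element. */
    String format(Object element) {
        String body = (message != null) ? message : String.valueOf(element);
        return "[" + level + "] " + body;
    }

    /** Logs the element and returns it unchanged, so the data itself is not modified. */
    <T> T log(T element) {
        stream.println(format(element));
        return element;
    }

    public static void main(String[] args) {
        Log warn = Log.withLevel("WARN").withMessage("Foo").withStream(System.out);
        String same = warn.log("element");   // prints "[WARN] Foo"
        System.out.println(same);            // prints "element"
    }
}
```

Each with... call returns a new immutable instance, which would also keep such a configuration easy to serialize if it ends up inside a real transform.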

If you don't mind, I will do it with you.

Thanks
Regards
JB

On 03/20/2016 12:07 PM, Ismaël Mejía wrote:

Hi,

The code of the transform is here in a playground for Beam experiments I
created (it is a bit alpha for the moment, and it does not have comments):

https://github.com/iemejia/beam-playground/blob/master/src/main/java/org/apache/beam/transforms/Debug.java

Since my initial goal was more of a test scenario in the
DirectPipelineRunner, I haven't yet considered more advanced logging
capabilities or the possible issues of distribution (serialization, in
particular of dependencies, as well as exceptions, etc.), but of course
it is something I expect to improve if there is interest. Do you see
any immediate things to improve before trying it with the distributed
runners? (I want to do this, also as an excuse to try the FlinkRunner.)

Best,
-Ismael


On Sun, Mar 20, 2016 at 11:13 AM, Jean-Baptiste Onofré <[email protected]> wrote:

    By the way, for the "Integration" DSL, in addition to an explicit debug
    transform, it would make sense to have an implicit "Tracer". It's
    something that I planned: it would allow us to sample a PCollection
    when the pipeline tracer is enabled (like we do in a Camel route with
    the tracer).

    Regards
    JB

    On 03/20/2016 10:14 AM, Ismaël Mejía wrote:

        Hello,

        I just started playing with Beam and I wanted to debug what happens
        between transforms in pipelines. I wrote a simple 'Debug' transform
        for this.
        The idea is to apply a function based on a predicate to any element
        in a collection without changing the collection, or in other words,
        a transform that does not transform but produces side effects.

        The idea is better illustrated with this simple example:

              .apply(FlatMapElements.via((String text) -> Arrays.asList(text.split(" ")))
                .withOutputType(new TypeDescriptor<String>() {
                }))
              .apply(Debug
                .when((String s) -> s.startsWith("A"))
                .with((String s) -> {
                  System.out.println(s);
                  return null;
                }))
              .apply(Filter.byPredicate((String text) -> text.length() > 5))
              .apply(Debug.print());  // sugared method, same as above
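
For readers without a Beam setup at hand, the core idea above (run a side effect on the elements matching a predicate, and pass every element through unchanged) can be sketched in plain Java. This is only an illustration under assumptions: DebugSketch and debug(...) are hypothetical names, it works on an in-memory List rather than a PCollection, and it is not the code from the linked Debug.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical stand-in for the Debug transform idea, using a List instead of a PCollection. */
final class DebugSketch {

    /** Runs the action on every element matching the predicate; returns all elements unchanged. */
    static <T> List<T> debug(List<T> input, Predicate<? super T> when, Consumer<? super T> with) {
        List<T> output = new ArrayList<>(input.size());
        for (T element : input) {
            if (when.test(element)) {
                with.accept(element);   // side effect only, the element is not modified
            }
            output.add(element);        // every element passes through, matching or not
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Apache Beam And Flink".split(" "));
        // Prints "Apache" and "And"; the returned list equals the input list.
        List<String> same = debug(words, s -> s.startsWith("A"), System.out::println);
        System.out.println(same.equals(words));   // prints "true"
    }
}
```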

        I think this can be useful (at least for debugging purposes). Is
        there something like this already in the SDK? If this is not the
        case, can you please give me some feedback/ideas to improve my
        transform?

        Thanks,
        -Ismael

        ps. You can find the code of the first version of the transform here:
        https://github.com/iemejia/beam-playground/blob/master/src/main/java/org/apache/beam/transforms/Debug.java



    --
    Jean-Baptiste Onofré
    [email protected]
    http://blog.nanthrax.net
    Talend - http://www.talend.com


--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
