Raghu, Kenneth, Yes, creating a separate class instead of inner one helped to overcome this issue with serialisation. Seems like this a bug in NexmarkLauncher, so I’ll create a jira for this. Thank you for help with this.
Btw, Raghu, are you going to submit a PR from your branch? I think this is exactly what BEAM-4048 <https://issues.apache.org/jira/browse/BEAM-4048> is about (with some adjustment according to what already was merged, for sure). WBR, Alexey > On 11 Apr 2018, at 19:53, Kenneth Knowles <[email protected]> wrote: > > Yea, this is a common issue with serializable anonymous inner classes in > general. It would be nice if Beam Java could have an overarching solution to > limiting the closure to things actually touched. > > Kenn > > On Wed, Apr 11, 2018 at 10:30 AM Raghu Angadi <[email protected] > <mailto:[email protected]>> wrote: > I noticed it too while adding KafkaIO support for Nexmark (this was in > parallel to another PR for KafkaIO that got merged recently). > The anonymous inner class for DoFn is not serializable. I moved it to a > static class in my branch, but didn't test it yet : > https://github.com/rangadi/beam/commit/b49a9eda9f6170ec0979f54438223ab8d2cd466f#diff-c9c486a395311f6f9ee8b9be0a92d907R756 > > <https://github.com/rangadi/beam/commit/b49a9eda9f6170ec0979f54438223ab8d2cd466f#diff-c9c486a395311f6f9ee8b9be0a92d907R756> > . > > One issue with Nexmark benchmarks with sources like Pubsub and KafkaIO is how > the tests are terminated... > Raghu. > > On Wed, Apr 11, 2018 at 9:54 AM Alexey Romanenko <[email protected] > <mailto:[email protected]>> wrote: > Hi all, > > For the moment, I'm working on BEAM-4048 > <https://issues.apache.org/jira/browse/BEAM-4048> to add Kafka source/sink > support with different modes to Nexmark, like it has for PubSub > (SUBSCRIBE_ONLY, PUBLISH_ONLY and COMBINED). It seems that the code will be > similar to what we have for PubSub so I wanted to do some refactoring and > reuse already existed code for Kafka. So, to make sure that nothing will be > broken after this refactoring, I want to run Nexmark with PubSub source/sink > in different modes before and after. > > I tried to do this with PubSub emulator but I have very strange issue related > to pipeline serialisation - it tries to serialise NexmarkLauncher, see this > output > <https://gist.github.com/aromanenko-dev/a42a0a011f9fbd10b8443bf2ca38c68a> > > Could anyone point me out if I do something wrong running this Nexmark > pipeline and how properly to do it with PubSub as source/sink? > > WBR, > Alexey
