Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-05-31 Thread Thomas Weise
Hi, I'm working on putting together a basic runner for Apache Apex. I'm hitting a couple of serialization-related issues when running tests. Apex uses Kryo for serialization by default (and Kryo can delegate to other serialization frameworks). The inner classes of WindowedValue are private and h

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-02 Thread Thomas Weise
rg/apache/beam/runners/spark/translation/SparkRuntimeContext.java#L73 > > I'd be happy to assist with Kryo. > > Thanks, > Amit > > On Wed, Jun 1, 2016 at 7:10 AM Thomas Weise wrote: > > > Hi, > > > > I'm working on putting together a basic runner for Ap

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-03 Thread Thomas Weise
a#L94 > [4] > > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/WindowedValue.java#L515 > > On Thu, Jun 2, 2016 at 10:41 AM, Thomas Weise > wrote: > > > Hi Amit, > > > > Thanks for the help. I impleme

Re: Serialization for org.apache.beam.sdk.util.WindowedValue$*

2016-06-03 Thread Thomas Weise
Amit, Thanks for this pointer as well, CoderHelpers helps indeed! Thomas On Thu, Jun 2, 2016 at 12:51 PM, Amit Sela wrote: > Oh sorry, of course I meant Thomas Groh in my previous email.. But @Thomas > Weise this example > < > https://github.com/apache/incubator-beam/blob/maste

Re: 0.1.0-incubating release

2016-06-03 Thread Thomas Weise
Another consideration for potential future packaging/distribution solutions is how the artifacts line up as files in a flat directory. For that it may be good to have a common prefix in the artifactId and unique artifactId. The name for the source archive (when relying on ASF parent POM) can also

Using GroupAlsoByWindowViaWindowSetDoFn for stream of input element

2016-06-17 Thread Thomas Weise
Hi, I'm looking into using the GroupAlsoByWindowViaWindowSetDoFn to accumulate the windowed state with the elements arriving one by one (stream). Once the window is complete, I would like to emit an Iterable or another form of aggregation of the elements. Is the following supposed to lead to merg

Re: Using GroupAlsoByWindowViaWindowSetDoFn for stream of input element

2016-06-17 Thread Thomas Weise
to perform the incremental aggregation, without building an intermediate collection? Thomas On Fri, Jun 17, 2016 at 8:27 AM, Thomas Weise wrote: > Hi, > > I'm looking into using the GroupAlsoByWindowViaWindowSetDoFn to accumulate > the windowed state with the elements arriving o

Re: Using GroupAlsoByWindowViaWindowSetDoFn for stream of input element

2016-06-21 Thread Thomas Weise
ombineFns, while buffering is suitable for general > use. > > On Fri, Jun 17, 2016 at 9:52 AM, Thomas Weise > wrote: > > > The source for my windowed groupByKey experiment is here: > > > > > > > https://github.com/tweise/incubator-beam/blob/master/runner

Running examples with different runners

2016-06-21 Thread Thomas Weise
Hi, As part of the Apex runner, we have a few unit tests for the supported transformations. Next, I would like to test the WindowedWordCount example. Is there an example of configuring this pipeline for another runner? Is it recommended to supply such configuration as a JUnit test? What is the ge

Re: Running examples with different runners

2016-06-24 Thread Thomas Weise
ubator-beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/TestSparkRunner.java > > > > Section of pom dedicated to enabling runnable on service tests: > > > > > https://github.com/apache/incubator-beam/blob/master/runners/spark/pom.xml#L54 >

Re: Running examples with different runners

2016-06-24 Thread Thomas Weise
specific needs. > > Finally, I believe Ken just made some changes which removed the requirement > to support View.YYY and replaced it with GroupByKey so the no translator > registered for View... may go away. > > > On Fri, Jun 24, 2016 at 4:52 PM, Thomas Weise > wrote: > >

OldDoFn - CounterSet replacement

2016-08-16 Thread Thomas Weise
I'm trying to rebase a PR and adjust for the DoFn changes. CounterSet is gone and there is now AggregatorFactory and I'm looking to fix an existing usage of org.apache.beam.sdk.util.DoFnRunners.simpleRunner. Given the instance of OldDoFn, what is the recommended way to obtain the aggregator facto

Re: OldDoFn - CounterSet replacement

2016-08-17 Thread Thomas Weise
establish the Apex runner baseline and proper support for aggregators isn't part of it; it's something I was planning to take up in a subsequent round. Thomas On Wed, Aug 17, 2016 at 8:14 AM, Ben Chambers wrote: > Hi Thomas! > > On Tue, Aug 16, 2016 at 9:40 PM Thomas Weise

Apex Runner support for View.CreatePCollectionView

2016-09-15 Thread Thomas Weise
Hi, I'm working on the Apex runner ( https://github.com/apache/incubator-beam/pull/540) and based on the integration test results my next target is support for PCollectionView. I looked at the side inputs doc ( https://s.apache.org/beam-side-inputs-1-pager) and see that a suggested implementation

Re: Apex Runner support for View.CreatePCollectionView

2016-09-27 Thread Thomas Weise
; [WindowMappingFn] https://s.apache.org/beam-windowmappingfn-1-pager > <https://s.apache.org/beam-windowmappingfn-1-pager>[PR #737] > https://github.com/apache/incubator-beam/pull/737 > [SideInputHandler] https://github.com/apache/incubator-beam/blob/master/ > runners/core-java/src/mai

Re: Apex Runner support for View.CreatePCollectionView

2016-09-27 Thread Thomas Weise
value and can be tested. This > happens as part of the GroupAlsoByWindowViaWindowSetDoFn (the logic itself > is part of ReduceFnRunner), so if you have state and timers working, you > should see output. > > If this doesn't seem to be happening, maybe you can give some more details? >

Re: Apex Runner support for View.CreatePCollectionView

2016-10-01 Thread Thomas Weise
a new revision of the PR. Thomas On Tue, Sep 27, 2016 at 10:17 PM, Thomas Weise wrote: > I have GroupByKey working, here is a unit test: > > https://github.com/tweise/incubator-beam/blob/BEAM-261. > sideinputs/runners/apex/src/test/java/org/apache/beam/ > runn

Re: [REMINDER] Technical discussion on the mailing list

2016-10-05 Thread Thomas Weise
How about sending just the notifications for creating new JIRA and opening PR to dev@ so that those that are interested can start watching? Thanks, Thomas On Wed, Oct 5, 2016 at 5:33 PM, Dan Halperin wrote: > On Wed, Oct 5, 2016 at 5:13 PM, Daniel Kulp wrote: > > > I just want to give a little

Re: [KUDOS] Contributed runner: Apache Apex!

2016-10-17 Thread Thomas Weise
er lots of review and much thoughtful revision, pull request > #540 > > > has > > > > > been merged to the apex-runner feature branch today. Please do > take a > > > > look, > > > > > and help us put the finishing touches on it to get it ready

Re: [DISCUSS] Sources and Runners

2016-10-19 Thread Thomas Weise
+1 those are probably the most used sources. Hadoop FS has a number of different implementations, HDFS is one of them. On Wed, Oct 19, 2016 at 2:55 AM, Amit Sela wrote: > I agree with Aljoscha about Kafka. > > How about having one integration test for BoundedSource and one for > UnboundedSource

Re: [DISCUSS] Sources and Runners

2016-10-19 Thread Thomas Weise
structure. (See the > performance thread Jason started. IMO the most important issue to solve > first is infrastructure). Please help! > > Dan > > On Wed, Oct 19, 2016 at 7:37 AM, Thomas Weise wrote: > > > +1 those are probably the most used sources. Hadoop FS has a numb

Re: [DISCUSS] Sources and Runners

2016-10-19 Thread Thomas Weise
libs. > > > > On Wed, Oct 19, 2016 at 10:49 PM, Amit Sela > wrote: > > > > > The SparkRunner actually has an embedded Kafka for its unit tests. > > > > > > On Wed, Oct 19, 2016, 20:16 Thomas Weise wrote: > > > > > > &

Re: [ANNOUNCEMENT] New committers!

2016-10-22 Thread Thomas Weise
tors as our newest committers. They have significantly > > contributed > > > to the project in different ways, and we look forward to many more > > > contributions in the future. > > > > > > * Thomas Weise > > > Thomas authored the Apache Apex runner

Re: The Availability of PipelineOptions

2016-10-25 Thread Thomas Weise
+1 On Tue, Oct 25, 2016 at 3:03 AM, Jean-Baptiste Onofré wrote: > +1 > > Agree > > Regards > JB > > On Oct 25, 2016, 12:01, at 12:01, Aljoscha Krettek > wrote: > >+1 This sounds quite straightforward. > > > >On Tue, 25 Oct 2016 at 01:36 Thomas Groh > >wrote: > > > >> Hey everyone, > >>

Apex runner integration tests

2016-10-25 Thread Thomas Weise
The Apex runner has the integration tests enabled, and that causes Travis PR builds to fail with a timeout (they complete in Jenkins). What is the correct setup for this, and when are the tests supposed to run? https://github.com/apache/incubator-beam/blob/apex-runner/runners/apex/pom.xml#L190 Thanks,

Re: [DISCUSS] Current ongoing work on runners

2016-10-25 Thread Thomas Weise
I'm planning to take up the discussion about the Apex runner's current state and proposed next steps in a separate thread. Thanks, Thomas On Tue, Oct 25, 2016 at 10:32 AM, Amit Sela wrote: > SparkRunner status: > > V1 (Spark 1.6.x - DStream/RDD API): > *Batch*: Full model support for batch, continuo

Apex runner status and next steps

2016-10-26 Thread Thomas Weise
Hi, The Apex runner is currently in a feature branch: https://github.com/apache/incubator-beam/tree/apex-runner The focus so far has been on functional completeness. It passes all the integration tests. Apex, with its stateful stream processing architecture, can support all of the concepts in the

Re: [DISCUSS] Merging master -> feature branch

2016-10-26 Thread Thomas Weise
+1 For a merge from master to the feature branch that does not require extra changes, RTC does not add value. It actually delays and burns reviewer time (even mechanics need some) that "real" PRs could benefit from. If adjustments are needed, then the regular process kicks in. Thanks, Thomas On

Re: Can we have more quick start examples ?

2016-10-27 Thread Thomas Weise
The Beam tutorials seem to address this: https://github.com/eljefe6a/beamexample/blob/master/README.md On Thu, Oct 27, 2016 at 8:04 AM, Manu Zhang wrote: > Hey guys, > > I find Beam examples under the examples folder are not easy to run due to > dependency on Google specific services. Even the

[PROPOSAL] Merge apex-runner to master branch

2016-11-08 Thread Thomas Weise
Hi, As per previous discussion [1], I would like to propose to merge the apex-runner branch into master. The runner satisfies the criteria outlined in [2] and merging it to master will give more visibility to other contributors and users. Specifically the Apex runner addresses: - Have at leas

Re: [PROPOSAL] Merge apex-runner to master branch

2016-11-08 Thread Thomas Weise
Nice. I'm +1 modulo one caveat below (hopefully easily addressed). > > On Tue, Nov 8, 2016 at 5:54 AM, Thomas Weise wrote: > > Hi, > > > > As per previous discussion [1], I would like to propose to merge the > > apex-runner branch into master. The runner satisfies t

Re: Configuring Jenkins

2016-11-15 Thread Thomas Weise
Very nice! On Tue, Nov 15, 2016 at 12:52 PM, Aljoscha Krettek wrote: > +1 I like this a lot! > > On Tue, 15 Nov 2016 at 10:37 Jean-Baptiste Onofré wrote: > > > Fantastic Davor ! > > > > I like this approach, I gonna take a deeper look. > > > > Thanks ! > > > > Regards > > JB > > > > On 11/15/2

Re: Including Apex runner in Beam tutorial at Strata - Singapore

2016-11-15 Thread Thomas Weise
The runner currently only executes in embedded mode, and I'm not sure that will change prior to Strata due to its dependency on the next Apex core release. I would suggest aiming for the following: - Create a sample wordcount project that produces the packaging that is required to launch the pipeli

Re: [DISCUSS] Graduation to a top-level project

2016-11-22 Thread Thomas Weise
+1 I would like to mention the welcoming, growing community and the focus on solid processes and testing. On Tue, Nov 22, 2016 at 11:07 AM, Aljoscha Krettek wrote: > +1 > > I'm quite enthusiastic about the growth of the community and the open > discussions! > > On Tue, 22 Nov 2016 at 19:51 Jas

Re: Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Apex #4

2016-12-15 Thread Thomas Weise
Hi Kenn, You could mark it with @Ignore. I would prefer to keep and fix it, as it runs as part of the unit tests and provides basic coverage early on. If you already have ideas about what is wrong with it, please let me know. This test was seen as flaky before. Thanks, Thomas On Thu, Dec 15, 2016 at 12:53

Re: [VOTE] Release 0.4.0-incubating, release candidate #3

2016-12-20 Thread Thomas Weise
+1 - signatures - build from source - run quickstart with Apex runner To get more folks to participate in the release verification, it may be helpful to publish verification steps/guidelines on the web site. Thanks, Thomas On Tue, Dec 20, 2016 at 2:22 AM, Sergio Fernández wrote: > +1 (ipmc bin