Re: contributor permission for Beam Jira tickets

2019-09-08 Thread Agboola Mukhtar
Thank you very much!

On 2019/09/08 05:25:38, Kenneth Knowles  wrote: 
> Welcome! I have added you to the "Contributor" role so you can be assigned
> Jiras.
> 
> Kenn
> 
> On Sat, Sep 7, 2019 at 9:06 PM Agboola Mukhtar 
> wrote:
> 
> > Hi,
> > I'm Agboola Mukhtar. I'm new to the world of open-source and would like to
> > contribute to Apache Beam.
> > I'd like to be added as a contributor for Beam's Jira issue track.
> > My username is Mukhtar
> >
> > Thanks,
> > Mukhtar
> >
> 


Re: [DISCUSS] Supporting multiple Flink versions vs. tech debt

2019-09-08 Thread Kenneth Knowles
In my suggestion, there is no hacking around with source sets at all. There
is no significant logic in the gradle configuration, which is a goal IMO.

On Sun, Sep 8, 2019 at 3:15 AM Maximilian Michels  wrote:

> One minor addition, the "*1.5_to_1.7_utils" directory you listed, is
> already present as the "1.5" directory. Admittedly, it could be renamed to
> make that clearer.
>
> *
> On 08.09.19 10:06, Maximilian Michels wrote:
> > Thanks for your thoughts, David!
> >
> > Thanks Luke, copybara looks interesting.
> >
> > > Thomas and Max expressed needs for addressing the issue with the
> > > current release.
> >
> > I think there is a misunderstanding. Thomas and I expressed the need to
> > review the PR properly before merging it to avoid introducing technical
> > debt. The expectation to open a feature PR a couple days before the
> > release cut and have it, a) reviewed, b) tested, and c) merged, is a lot
> > to ask for the maintainers. We are currently in stage (a) after the
> > first round of reviews.
> >
> > 1.5/1.6 and possibly 1.7 will be removed. This was announced earlier on
> > the mailing list and is tracked here:
> > https://jira.apache.org/jira/browse/BEAM-7962
> >
> > The amount of duplicated files code is very little, two very simple
> > files. After removing 1.5/1.6/1.7 there should be none left. In the PR
> > it looks like there were unnecessary duplications of files. That's why I
> > commented.
> >
> > Due to the above reasons, I don't see the need to change the current way
> > of cross-compiling for Flink versions. Out of David proposals, I still
> > think the current approach is the easiest and least error-prone.
> >
> > Kenn, I appreciate your thoughts. How is your approach different from
> > the current, expect introducing an additional factory method? The source
> > layout is effectively identical to the current. We still have to retain
> > different source directories for each Flink version where Flink internal
> > interfaces change.
> >
> > > Now the code indicates exactly what is going on and all builds are
> > > straightforward. All dependencies are actual build dependencies, not
> > > src dir hacks.
> >
> > Source directories are build dependencies. I can see a minor improvement
> > though in code readability (and that's always great), but still the
> > developer needs to know from which source directory the factory is going
> > to be loaded from, which is very similar to the current approach.
> >
> > Thanks,
> > Max
> >
> > On 07.09.19 20:37, Kenneth Knowles wrote:
> > > What about...
> > >
> > > *5) Architect the code for sharing using straightforward builds and
> > > dependencies*
> > >
> > > Each versioned directory has an implementation of FlinkRunner
> > > constructed from the common source, with version-specific pieces
> > > injected (not with a complex framework, but by just passing as a
> > > parameter).
> > >
> > > I just read over the actual amount of stuff you would have to inject.
> > > It is very small at least through 1.8. Basically:
> > >
> > >  FlinkRunnerBuilder(
> > >Function coderTypeSerializerFactory,
> > >Function
> > > encodedValueTypeSerializerFactory).build() // returns a FlinkRunner
> > >
> > > Now the code indicates exactly what is going on and all builds are
> > > straightforward. All dependencies are actual build dependencies, not
> > > src dir hacks.
> > >
> > > *flink/*
> > > + *common/ *# move src/ to common/src/ for hygiene; compiles against
> > > all supported Flink versions
> > > - build.gradle
> > > + *src/*
> > >- org/apache/beam/runners/flink/common*/FlinkRunnerBuilder.java*
> > > **- ... # the main runner implementation here
> > > + *1.5_to_1.7_utils/*/ # common pluggable parts that work for 1.5
> > > through 1.7/
> > > - build.gradle # maybe makes sense to cross-compile and unit test
> > > against 1.5 through 1.7, though dirs below give coverage too
> > > + *src/*
> > >
> - org/apache/beam/runners/flink/v1_5_to_1_7/*CoderTypeSerializerFactory.java*
> > >
> > >
> - 
> org/apache/beam/runners/flink/v1_5_to_1_7/*EncodedValueTypeSerializerFactory.java*
> > >
> > > + *1.5/*
> > > - build.gradle
> > > + *src/ */# implementation of FlinkRunner compatible with 1.5,
> > > could have some of its own logic plugged in to FlinkRunnerBuilder, and
> > > some 1.5_to_1.7 utils/*
> > > *
> > >- org/apache/beam/runners/flink/*FlinkRunner.java */#
> > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> > > + *1.6/*
> > > - build.gradle
> > > + *src/ */# implementation of FlinkRunner compatible with 1.6,
> > > actually has none of its own logic but it could/*
> > > *
> > >- org/apache/beam/runners/flink/*FlinkRunner.java */#
> > > FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> > > /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> > > + *1.7/*
> > > + *src/ */# implementation of 

Re: clickhouse tests failing

2019-09-08 Thread Lukasz Cwik
Is passing at head on Jenkins:
https://builds.apache.org/job/beam_PreCommit_Java_Cron/1771/testReport/org.apache.beam.sdk.io.clickhouse/

What are the failures your seeing at initialization? (the tests do rely on
setting up zookeeper and other stuff that could fail)

On Fri, Sep 6, 2019 at 12:36 PM Elliotte Rusty Harold 
wrote:

> At head I noticed the following:
>
>
> $ ./gradlew -p sdks/java/io/ check
> Configuration on demand is an incubating feature.
>
> > Task :sdks:java:io:clickhouse:test
>
> org.apache.beam.sdk.io.clickhouse.ClickHouseIOTest > classMethod FAILED
> java.lang.IllegalStateException
>
> org.apache.beam.sdk.io.clickhouse.ClickHouseIOTest > classMethod FAILED
> java.lang.NullPointerException
>
> org.apache.beam.sdk.io.clickhouse.AtomicInsertTest > classMethod FAILED
> java.lang.IllegalStateException
>
> org.apache.beam.sdk.io.clickhouse.AtomicInsertTest > classMethod FAILED
> java.lang.NullPointerException
>
> 29 tests completed, 4 failed
>
> > Task :sdks:java:io:clickhouse:test FAILED
>
> FAILURE: Build failed with an exception.
>
>
> Is anyone else seeing this? Are the tests expected to pass, or is
> there some requirement (e.g. Java 11) that I might be missing?
>
> --
> Elliotte Rusty Harold
> elh...@ibiblio.org
>


Re: [DISCUSS] Supporting multiple Flink versions vs. tech debt

2019-09-08 Thread Thomas Weise
David, thanks for working on the Flink 1.9 support, this is very exciting!

Previous discussion on list/PRs (predating even the JIRA referenced by
Max), already pointed to the removal of support for Flink 1.5/1.6 support.
Although we generally need to consider what the Beam users need (not just
what the Flink community currently supports), I believe there was agreement
on removing 1.5/1.6 and it is inline with the Beam Flink version support
goals.

Given that, can we start with removing these two runner variants from the
build as the first step (separate PR for BEAM-7962)?

Based on Max's analysis, this will eliminate a major concern with the
current state, the duplication of source files.

Then, we can take another look at your rebased PR for 1.9 and see if there
are any candidates left to apply what Kenn has suggested, which would lead
to more fine grained overrides that avoid duplication of code.

Thanks,
Thomas


On Sun, Sep 8, 2019 at 3:15 AM Maximilian Michels  wrote:

> One minor addition, the "*1.5_to_1.7_utils" directory you listed, is
> already present as the "1.5" directory. Admittedly, it could be renamed to
> make that clearer.
>
> *
> On 08.09.19 10:06, Maximilian Michels wrote:
> > Thanks for your thoughts, David!
> >
> > Thanks Luke, copybara looks interesting.
> >
> > > Thomas and Max expressed needs for addressing the issue with the
> > > current release.
> >
> > I think there is a misunderstanding. Thomas and I expressed the need to
> > review the PR properly before merging it to avoid introducing technical
> > debt. The expectation to open a feature PR a couple days before the
> > release cut and have it, a) reviewed, b) tested, and c) merged, is a lot
> > to ask for the maintainers. We are currently in stage (a) after the
> > first round of reviews.
> >
> > 1.5/1.6 and possibly 1.7 will be removed. This was announced earlier on
> > the mailing list and is tracked here:
> > https://jira.apache.org/jira/browse/BEAM-7962
> >
> > The amount of duplicated files code is very little, two very simple
> > files. After removing 1.5/1.6/1.7 there should be none left. In the PR
> > it looks like there were unnecessary duplications of files. That's why I
> > commented.
> >
> > Due to the above reasons, I don't see the need to change the current way
> > of cross-compiling for Flink versions. Out of David proposals, I still
> > think the current approach is the easiest and least error-prone.
> >
> > Kenn, I appreciate your thoughts. How is your approach different from
> > the current, expect introducing an additional factory method? The source
> > layout is effectively identical to the current. We still have to retain
> > different source directories for each Flink version where Flink internal
> > interfaces change.
> >
> > > Now the code indicates exactly what is going on and all builds are
> > > straightforward. All dependencies are actual build dependencies, not
> > > src dir hacks.
> >
> > Source directories are build dependencies. I can see a minor improvement
> > though in code readability (and that's always great), but still the
> > developer needs to know from which source directory the factory is going
> > to be loaded from, which is very similar to the current approach.
> >
> > Thanks,
> > Max
> >
> > On 07.09.19 20:37, Kenneth Knowles wrote:
> > > What about...
> > >
> > > *5) Architect the code for sharing using straightforward builds and
> > > dependencies*
> > >
> > > Each versioned directory has an implementation of FlinkRunner
> > > constructed from the common source, with version-specific pieces
> > > injected (not with a complex framework, but by just passing as a
> > > parameter).
> > >
> > > I just read over the actual amount of stuff you would have to inject.
> > > It is very small at least through 1.8. Basically:
> > >
> > >  FlinkRunnerBuilder(
> > >Function coderTypeSerializerFactory,
> > >Function
> > > encodedValueTypeSerializerFactory).build() // returns a FlinkRunner
> > >
> > > Now the code indicates exactly what is going on and all builds are
> > > straightforward. All dependencies are actual build dependencies, not
> > > src dir hacks.
> > >
> > > *flink/*
> > > + *common/ *# move src/ to common/src/ for hygiene; compiles against
> > > all supported Flink versions
> > > - build.gradle
> > > + *src/*
> > >- org/apache/beam/runners/flink/common*/FlinkRunnerBuilder.java*
> > > **- ... # the main runner implementation here
> > > + *1.5_to_1.7_utils/*/ # common pluggable parts that work for 1.5
> > > through 1.7/
> > > - build.gradle # maybe makes sense to cross-compile and unit test
> > > against 1.5 through 1.7, though dirs below give coverage too
> > > + *src/*
> > >
> - org/apache/beam/runners/flink/v1_5_to_1_7/*CoderTypeSerializerFactory.java*
> > >
> > >
> - 
> org/apache/beam/runners/flink/v1_5_to_1_7/*EncodedValueTypeSerializerFactory.java*
> > >
> > > + *1.5/*
> > > - build.gradle
> > > + *src/ */# implementation of 

Re: [DISCUSS] Supporting multiple Flink versions vs. tech debt

2019-09-08 Thread Maximilian Michels

One minor addition, the "*1.5_to_1.7_utils" directory you listed, is already present as 
the "1.5" directory. Admittedly, it could be renamed to make that clearer.

*
On 08.09.19 10:06, Maximilian Michels wrote:

Thanks for your thoughts, David!

Thanks Luke, copybara looks interesting.

> Thomas and Max expressed needs for addressing the issue with the
> current release.

I think there is a misunderstanding. Thomas and I expressed the need to
review the PR properly before merging it to avoid introducing technical
debt. The expectation to open a feature PR a couple days before the
release cut and have it, a) reviewed, b) tested, and c) merged, is a lot
to ask for the maintainers. We are currently in stage (a) after the
first round of reviews.

1.5/1.6 and possibly 1.7 will be removed. This was announced earlier on
the mailing list and is tracked here:
https://jira.apache.org/jira/browse/BEAM-7962

The amount of duplicated files code is very little, two very simple
files. After removing 1.5/1.6/1.7 there should be none left. In the PR
it looks like there were unnecessary duplications of files. That's why I
commented.

Due to the above reasons, I don't see the need to change the current way
of cross-compiling for Flink versions. Out of David proposals, I still
think the current approach is the easiest and least error-prone.

Kenn, I appreciate your thoughts. How is your approach different from
the current, expect introducing an additional factory method? The source
layout is effectively identical to the current. We still have to retain
different source directories for each Flink version where Flink internal
interfaces change.

> Now the code indicates exactly what is going on and all builds are
> straightforward. All dependencies are actual build dependencies, not
> src dir hacks.

Source directories are build dependencies. I can see a minor improvement
though in code readability (and that's always great), but still the
developer needs to know from which source directory the factory is going
to be loaded from, which is very similar to the current approach.

Thanks,
Max

On 07.09.19 20:37, Kenneth Knowles wrote:
> What about...
>
> *5) Architect the code for sharing using straightforward builds and
> dependencies*
>
> Each versioned directory has an implementation of FlinkRunner
> constructed from the common source, with version-specific pieces
> injected (not with a complex framework, but by just passing as a
> parameter).
>
> I just read over the actual amount of stuff you would have to inject.
> It is very small at least through 1.8. Basically:
>
>      FlinkRunnerBuilder(
>        Function coderTypeSerializerFactory,
>        Function
> encodedValueTypeSerializerFactory).build() // returns a FlinkRunner
>
> Now the code indicates exactly what is going on and all builds are
> straightforward. All dependencies are actual build dependencies, not
> src dir hacks.
>
> *flink/*
> + *common/ *# move src/ to common/src/ for hygiene; compiles against
> all supported Flink versions
>     - build.gradle
>     + *src/*
>        - org/apache/beam/runners/flink/common*/FlinkRunnerBuilder.java*
> **- ... # the main runner implementation here
> + *1.5_to_1.7_utils/*/ # common pluggable parts that work for 1.5
> through 1.7/
>     - build.gradle # maybe makes sense to cross-compile and unit test
> against 1.5 through 1.7, though dirs below give coverage too
> + *src/*
> - org/apache/beam/runners/flink/v1_5_to_1_7/*CoderTypeSerializerFactory.java*
>
> - 
org/apache/beam/runners/flink/v1_5_to_1_7/*EncodedValueTypeSerializerFactory.java*
>
> + *1.5/*
>     - build.gradle
>     + *src/ */# implementation of FlinkRunner compatible with 1.5,
> could have some of its own logic plugged in to FlinkRunnerBuilder, and
> some 1.5_to_1.7 utils/*
> *
>        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> + *1.6/*
>     - build.gradle
>     + *src/ */# implementation of FlinkRunner compatible with 1.6,
> actually has none of its own logic but it could/*
> *
>        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> + *1.7/*
>     + *src/ */# implementation of FlinkRunner compatible with 1.7,
> actually has none of its own logic but it could/*
> *
>        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new
> /v1_5_to_1_7./EncodedValueTypeSerializerFactory())/
> + *1.8/*
>     - build.gradle
> + *src/*
>        - org/apache/beam/runners/flink/*FlinkRunner.java */#
> FlinkRunnerBuilder(new v1_8_CoderTypeSerializerFactory(), new
> EncodedValueTypeSerializerFactory())/
> - org/apache/beam/runners/flink/*v1_8/CoderTypeSerializerFactory.java*
> - 

Re: [DISCUSS] Supporting multiple Flink versions vs. tech debt

2019-09-08 Thread Maximilian Michels

Thanks for your thoughts, David!

Thanks Luke, copybara looks interesting.


Thomas and Max expressed needs for addressing the issue with the current 
release.


I think there is a misunderstanding. Thomas and I expressed the need to 
review the PR properly before merging it to avoid introducing technical 
debt. The expectation to open a feature PR a couple days before the 
release cut and have it, a) reviewed, b) tested, and c) merged, is a lot 
to ask for the maintainers. We are currently in stage (a) after the 
first round of reviews.


1.5/1.6 and possibly 1.7 will be removed. This was announced earlier on 
the mailing list and is tracked here: 
https://jira.apache.org/jira/browse/BEAM-7962


The amount of duplicated files code is very little, two very simple 
files. After removing 1.5/1.6/1.7 there should be none left. In the PR 
it looks like there were unnecessary duplications of files. That's why I 
commented.


Due to the above reasons, I don't see the need to change the current way 
of cross-compiling for Flink versions. Out of David proposals, I still 
think the current approach is the easiest and least error-prone.


Kenn, I appreciate your thoughts. How is your approach different from 
the current, expect introducing an additional factory method? The source 
layout is effectively identical to the current. We still have to retain 
different source directories for each Flink version where Flink internal 
interfaces change.



Now the code indicates exactly what is going on and all builds are 
straightforward. All dependencies are actual build dependencies, not src dir 
hacks.


Source directories are build dependencies. I can see a minor improvement 
though in code readability (and that's always great), but still the 
developer needs to know from which source directory the factory is going 
to be loaded from, which is very similar to the current approach.


Thanks,
Max

On 07.09.19 20:37, Kenneth Knowles wrote:

What about...

*5) Architect the code for sharing using straightforward builds and 
dependencies*


Each versioned directory has an implementation of FlinkRunner 
constructed from the common source, with version-specific pieces 
injected (not with a complex framework, but by just passing as a parameter).


I just read over the actual amount of stuff you would have to inject. It 
is very small at least through 1.8. Basically:


     FlinkRunnerBuilder(
       Function coderTypeSerializerFactory,
       Function 
encodedValueTypeSerializerFactory).build() // returns a FlinkRunner


Now the code indicates exactly what is going on and all builds are 
straightforward. All dependencies are actual build dependencies, not src 
dir hacks.


*flink/*
+ *common/ *# move src/ to common/src/ for hygiene; compiles against all 
supported Flink versions

    - build.gradle
    + *src/*
       - org/apache/beam/runners/flink/common*/FlinkRunnerBuilder.java*
**- ... # the main runner implementation here
+ *1.5_to_1.7_utils/*/ # common pluggable parts that work for 1.5 
through 1.7/
    - build.gradle # maybe makes sense to cross-compile and unit test 
against 1.5 through 1.7, though dirs below give coverage too

+ *src/*
   
- org/apache/beam/runners/flink/v1_5_to_1_7/*CoderTypeSerializerFactory.java*
   
- org/apache/beam/runners/flink/v1_5_to_1_7/*EncodedValueTypeSerializerFactory.java*

+ *1.5/*
    - build.gradle
    + *src/ */# implementation of FlinkRunner compatible with 1.5, could 
have some of its own logic plugged in to FlinkRunnerBuilder, and some 
1.5_to_1.7 utils/*

*
       - org/apache/beam/runners/flink/*FlinkRunner.java */# 
FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new 
/v1_5_to_1_7./EncodedValueTypeSerializerFactory())/

+ *1.6/*
    - build.gradle
    + *src/ */# implementation of FlinkRunner compatible with 1.6, 
actually has none of its own logic but it could/*

*
       - org/apache/beam/runners/flink/*FlinkRunner.java */# 
FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new 
/v1_5_to_1_7./EncodedValueTypeSerializerFactory())/

+ *1.7/*
    + *src/ */# implementation of FlinkRunner compatible with 1.7, 
actually has none of its own logic but it could/*

*
       - org/apache/beam/runners/flink/*FlinkRunner.java */# 
FlinkRunnerBuilder(new v1_5_to_1_7.CoderTypeSerializerFactory(), new 
/v1_5_to_1_7./EncodedValueTypeSerializerFactory())/

+ *1.8/*
    - build.gradle
+ *src/*
       - org/apache/beam/runners/flink/*FlinkRunner.java */# 
FlinkRunnerBuilder(new v1_8_CoderTypeSerializerFactory(), new 
EncodedValueTypeSerializerFactory())/
   
- org/apache/beam/runners/flink/*v1_8/CoderTypeSerializerFactory.java*
   
- org/apache/beam/runners/flink/*v1_8/EncodedValueTypeSerializerFactory.java*


Kenn

On Sat, Sep 7, 2019 at 9:00 AM Lukasz Cwik > wrote:


When we import the Beam code into Google, we also run into issues
where sometimes we need to transform parts of the code. During
import we use