Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Romain Manni-Bucau
On Apr 10, 2018 at 22:59, "Robert Bradshaw" wrote:

On Tue, Apr 10, 2018 at 1:49 PM Romain Manni-Bucau 
wrote:

>
> On Apr 10, 2018 at 21:25, "Robert Bradshaw" wrote:
>
> On Tue, Apr 10, 2018 at 12:10 PM Romain Manni-Bucau 
> wrote:
>
>> This is interesting because it leads to "why do the workers need to do it
>> again instead of reusing the computed one?". Technically the answer is
>> trivial, but in terms of design I think Beam tends to abuse static init
>> blocks - even in the DoFn API - which easily leads to issues when we want to
>> support more than a main (thinking of OSGi for instance).
>>
>> So:
>>
>> 1. Why not use a standard programming model that is not clinit-based? (Perf is
>> indeed not a valid answer.)
>>
>
> The Java language (as far as I know) doesn't have the ability to prohibit
> assigning static values (such as TupleTags) as static members. We can,
> however, detect this (which is what the current code does). It doesn't seem
> to me that code like
>
> public class MyDoFn {
>   public static final TupleTag SOME_OUTPUT_TAG = new TupleTag<>();
>   ...
> }
>
> is "bad practice," especially as this tag will need to be referenced in
> multiple places.
>
>
> It is, as soon as you don't run in a flat classpath environment. In a flat
> classpath it is acceptable and doesn't have many side effects... but Beam
> doesn't know where it runs ;).
>

The problem is, people *will* write this.



This is OK, and it is still OK if the genId/default constructor is deprecated.

Note, however, that the DoFn API encourages this pattern for state and timers
instead of passing them as parameters or injecting them into (non-static)
fields, which would be saner.



> 2. genId should probably be deprecated and considered a bad practice
>>
>
> Is the proposal that we require the user to manually provide unique
> identifiers everywhere? Or for static case like above? (Note that
> accidentally re-using identifiers can lead to subtle incorrect pipeline
> results.)
>
>
> Yep.
>

Yep to which?



Force a value. Static or not is the user's choice; we should just advise
against static.
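
To make the "force a value" option concrete, here is a minimal sketch in plain Java. It does not use Beam: Tag and TagRegistry are hypothetical stand-ins, and the duplicate check is only one assumed way to catch the accidental id reuse that the quoted question warns about.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in types; this is not Beam's TupleTag. */
final class ExplicitTagSketch {

  /** A tag whose id must always be supplied by the caller. */
  static final class Tag {
    private final String id;

    Tag(String id) {
      if (id == null || id.isEmpty()) {
        throw new IllegalArgumentException("an explicit, non-empty id is required");
      }
      this.id = id;
    }

    String getId() {
      return id;
    }
  }

  /** Collects the tags of one pipeline and rejects an id that was already used. */
  static final class TagRegistry {
    private final Set<String> seen = new HashSet<>();

    Tag register(String id) {
      Tag tag = new Tag(id);
      // Set.add returns false when the id is already present.
      if (!seen.add(id)) {
        throw new IllegalStateException("duplicate tag id: " + id);
      }
      return tag;
    }
  }

  public static void main(String[] args) {
    TagRegistry registry = new TagRegistry();
    Tag mainOutput = registry.register("main-output");
    Tag errorOutput = registry.register("error-output");
    System.out.println(mainOutput.getId() + ", " + errorOutput.getId());
    try {
      registry.register("main-output"); // accidental reuse of an id
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With explicit ids nothing depends on where the tag is constructed, so the same code behaves identically whether the field is static or not; the cost is that uniqueness becomes the user's responsibility.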



> And ensure we can serialize a tupletag with an already uuid-generated id
> for instance.
>

Yes, we already do this.


>
>
> This looks like a detail, but for Beam 3 we should ensure we drop the
>> legacy that brings bad practices into our user code.
>>
>> On Apr 10, 2018 at 20:15, "Ben Chambers" wrote:
>>
>>> I believe it doesn't need to be stable across refactoring, only across
>>> all workers executing a specific version of the code. Specifically, it is
>>> used as follows:
>>>
>>> 1. Create a pipeline on the user's machine. It walks the stack until the
>>> static initializer block, which provides an ID.
>>> 2. Send the pipeline to many worker machines.
>>> 3. Each worker machine walks the stack until the static initializer
>>> block (on the same version of the code), receiving the same ID.
>>>
>>> This ensures that the tupletag is the same on all the workers, as well
>>> as on the user's machine, which is critical since it is used as an identifier
>>> across these machines.
>>>
>>> Assigning a UUID would work if all of the machines agreed on the same
>>> tuple ID, which could be accomplished with serialization. Serialization,
>>> however, doesn't work well with static initializers, since those will have
>>> been called to initialize the class at load time.
>>>
>>> On Tue, Apr 10, 2018 at 10:27 AM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
 Well, the issue is more about all the existing tests currently.

 Out of curiosity: how is walking the stack stable, since the stack can
 change? The stop condition is the static block of a class, which can call
 methods and be refactored, and is therefore not stable. Should it be deprecated?


 On Apr 10, 2018 at 19:17, "Robert Bradshaw" wrote:

 If it's too slow perhaps you could use the constructor where you pass
 an explicit id (though in my experience walking the stack isn't that slow).
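
For illustration only, deriving an id from the call stack could look roughly like the sketch below. It is an assumption about the general technique discussed in this thread, not a copy of TupleTag#genId; StackIdSketch, UserCode and idFromStaticInitializer are made-up names.

```java
/** Hypothetical sketch of a stack-derived id; this is not Beam's genId code. */
final class StackIdSketch {

  /**
   * Walks the current stack and returns "class#line" for the first frame that
   * belongs to a static initializer, or null when there is none.
   */
  static String idFromStaticInitializer() {
    for (StackTraceElement frame : new Throwable().getStackTrace()) {
      // The JVM reports a class's static initializer under the name "<clinit>".
      if ("<clinit>".equals(frame.getMethodName())) {
        return frame.getClassName() + "#" + frame.getLineNumber();
      }
    }
    return null;
  }

  /** Stands in for user code that creates a tag in a static field. */
  static final class UserCode {
    static final String TAG_ID = idFromStaticInitializer();
  }

  public static void main(String[] args) {
    // Touching UserCode runs its static initializer, which derives the id.
    System.out.println(UserCode.TAG_ID);
    // Called from a plain method, no static initializer is expected on the stack.
    System.out.println(idFromStaticInitializer());
  }
}
```

An id built this way is the same on every JVM that loads the same build of the class, but it changes as soon as the declaring line moves, which is the refactoring concern raised above. Capturing a stack trace on each construction is also the cost being questioned for test runs.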

 On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau <
 rmannibu...@gmail.com> wrote:

> Oops cross post sorry.
>
> The issue I hit on this thread is that it is used a lot in tests and it slows
> down tests for nothing, like the GenerateSequence ones
>
> On Apr 10, 2018 at 19:00, "Romain Manni-Bucau" wrote:
>
>>
>>
>> On Apr 10, 2018 at 18:40, "Robert Bradshaw" wrote:
>>
>> These values should be, inasmuch as possible, stable across VMs. How
>> slow is slow? Doesn't this happen only once per VM startup?
>>
>>
>> Once per JVM, and IDEA launches a JVM per test, and the daemon does not
>> save enough time: you still go through the whole project and check all
>> upstream deps, it seems.
>>
>> It is <1s with maven vs 5-6s with gradle.
>>
>>
>> On 

Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
On Apr 11, 2018 at 02:30, "Reuven Lax" wrote:

Actually I always found the right-click to run tests to only sometimes work
in Maven, especially if there were changes to dependent AutoValue classes
where code had to be generated. Too often it would fail, and I would then
need to use Maven to rebuild the whole project. It would be cool if Gradle
could do this more reliably than Maven did.


Hmm, I experience exactly the opposite. Tested with IDEA 2017 and 2018.


Reuven

On Tue, Apr 10, 2018 at 8:46 PM Romain Manni-Bucau 
wrote:

> @jb: what did you change? I re-imported the project like 3 times earlier
> today and never got it working acceptably :(
>
> Personally, if importing the project and right-clicking on a test + debug works
> as well as Maven in IDEA, I'd be happy. I can manage other stuff in a console
> even if Gradle reporting is not that efficient for me for now.
>
> On Apr 10, 2018 at 21:37, "Reuven Lax" wrote:
>
>> There are a lot of ideas on how to increase usability, but I think
>> they'll get lost in the thread. I suggest we try to capture them in Jiras.
>>
>> I suggest we also find out what common use patterns are (people on this
>> thread are probably sufficient), as different people will have different
>> workflows. We can then make sure that all common workflows are documented.
>> As an example, one task I often do is to run just checkstyle over a module
>> or the entire project.
>>
>> Reuven
>>
>> On Tue, Apr 10, 2018 at 7:18 PM Jean-Baptiste Onofré 
>> wrote:
>>
>>> FYI, I did a new attempt and it works fine (pretty long). Previous try
>>> failed.
>>>
>>> Regards
>>> JB
>>>
>>> On 10/04/2018 19:52, Kenneth Knowles wrote:
>>> > I've been on Idea+Gradle for ~two months, around the time I added
>>> > https://github.com/apache/beam/pull/4583 and
>>> > https://github.com/apache/beam/pull/4626 to make the import require zero
>>> > user work. I have no fear of deleting my project any time and re-importing.
>>> >
>>> > I agree with not having auto-import on. It is just too slow. I can't
>>> > remember if it was importing too often due to build outputs or if it was
>>> > just that I was messing with the build.gradle files. Anyhow it doesn't
>>> > really add much value.
>>> >
>>> > The gradle runner _is_ able to use submodules and run individual test
>>> > methods, and all that.
>>> >
>>> > Kenn
>>> >
>>> >
>>> > On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau
>>> > > wrote:
>>> >
>>> > Running a test doesn't have the right classpath (IDEA uses out/ instead
>>> > of build/). Then, when you switch to the Gradle runner, the launch uses
>>> > Gradle, which is not able to use submodules directly but reconsiders the
>>> > whole project, which is quite slow for normal dev iterations
>>> > compared to just running the test with the right classpath and a fast
>>> > compile step if needed. I literally lost 1h on something simple with
>>> > that tooling; this is way too much to be acceptable on my side since
>>> > I'm sadly not paid to work on Beam (one day maybe ;)).
>>> >
>>> > Romain Manni-Bucau
>>> >
>>> >
>>> > 2018-04-10 18:27 GMT+02:00 Reuven Lax >> > >:
>>> >  > Romain,
>>> >  >
>>> >  > Can you detail what's not working. I switched my IntelliJ over
>>> to
>>> > Gradle
>>> >  > about two weeks ago, and haven't had any trouble.
>>> >  >
>>> >  > Reuven
>>> >  >
>>> >  > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau
>>> > >
>>> >  > wrote:
>>> >  >>
>>> >  >> Ok, didn't find a way to make it working properly (only
>>> workaround
>>> >  >> with direct commands and no good idea integration for
>>> > debugging). I'm
>>> >  >> back with maven, if anyone knows how to properly solve it let's
>>> > do it.
>>> >  >> If not I think JB point is to consider more than any other
>>> criteria.
>>> >  >>
>>> >  >> Romain Manni-Bucau
>>> >  >>
>>> >  >>
>>> >  >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau
>>> > >:
>>> >  >> > side note: do NOT use auto-import until you are sure you can,
>>> > it locks
>>> >  >> > regularly on beam (pby too big for idea?) and makes idea
>>> ready
>>> > to be
>>> >  >> > killed :(
>>> >  >> >
>>> >  >> > Romain Manni-Bucau
>>> >  >> >
>>> >  >> >
>>> >  >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré
>>> > >:
>>> >  >> >> It's what I did, 

Re: Gradle Status [April 6]

2018-04-10 Thread Reuven Lax
Actually I always found the right-click to run tests to only sometimes work
in Maven, especially if there were changes to dependent AutoValue classes
where code had to be generated. Too often it would fail, and I would then
need to use Maven to rebuild the whole project. It would be cool if Gradle
could do this more reliably than Maven did.

Reuven

On Tue, Apr 10, 2018 at 8:46 PM Romain Manni-Bucau 
wrote:

> @jb: what did you change? I re-imported the project like 3 times earlier
> today and never got it working acceptably :(
>
> Personally, if importing the project and right-clicking on a test + debug works
> as well as Maven in IDEA, I'd be happy. I can manage other stuff in a console
> even if Gradle reporting is not that efficient for me for now.
>
> On Apr 10, 2018 at 21:37, "Reuven Lax" wrote:
>
>> There are a lot of ideas on how to increase usability, but I think
>> they'll get lost in the thread. I suggest we try to capture them in Jiras.
>>
>> I suggest we also find out what common use patterns are (people on this
>> thread are probably sufficient), as different people will have different
>> workflows. We can then make sure that all common workflows are documented.
>> As an example, one task I often do is to run just checkstyle over a module
>> or the entire project.
>>
>> Reuven
>>
>> On Tue, Apr 10, 2018 at 7:18 PM Jean-Baptiste Onofré 
>> wrote:
>>
>>> FYI, I did a new attempt and it works fine (pretty long). Previous try
>>> failed.
>>>
>>> Regards
>>> JB
>>>
>>> On 10/04/2018 19:52, Kenneth Knowles wrote:
>>> > I've been on Idea+Gradle for ~two months, around the time I added
>>> > https://github.com/apache/beam/pull/4583 and
>>> > https://github.com/apache/beam/pull/4626 to make the import require zero
>>> > user work. I have no fear of deleting my project any time and re-importing.
>>> >
>>> > I agree with not having auto-import on. It is just too slow. I can't
>>> > remember if it was importing too often due to build outputs or if it was
>>> > just that I was messing with the build.gradle files. Anyhow it doesn't
>>> > really add much value.
>>> >
>>> > The gradle runner _is_ able to use submodules and run individual test
>>> > methods, and all that.
>>> >
>>> > Kenn
>>> >
>>> >
>>> > On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau
>>> > > wrote:
>>> >
>>> > Running a test doesn't have the right classpath (IDEA uses out/ instead
>>> > of build/). Then, when you switch to the Gradle runner, the launch uses
>>> > Gradle, which is not able to use submodules directly but reconsiders the
>>> > whole project, which is quite slow for normal dev iterations
>>> > compared to just running the test with the right classpath and a fast
>>> > compile step if needed. I literally lost 1h on something simple with
>>> > that tooling; this is way too much to be acceptable on my side since
>>> > I'm sadly not paid to work on Beam (one day maybe ;)).
>>> >
>>> > Romain Manni-Bucau
>>> >
>>> >
>>> > 2018-04-10 18:27 GMT+02:00 Reuven Lax >> > >:
>>> >  > Romain,
>>> >  >
>>> >  > Can you detail what's not working. I switched my IntelliJ over
>>> to
>>> > Gradle
>>> >  > about two weeks ago, and haven't had any trouble.
>>> >  >
>>> >  > Reuven
>>> >  >
>>> >  > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau
>>> > >
>>> >  > wrote:
>>> >  >>
>>> >  >> Ok, didn't find a way to make it working properly (only
>>> workaround
>>> >  >> with direct commands and no good idea integration for
>>> > debugging). I'm
>>> >  >> back with maven, if anyone knows how to properly solve it let's
>>> > do it.
>>> >  >> If not I think JB point is to consider more than any other
>>> criteria.
>>> >  >>
>>> >  >> Romain Manni-Bucau
>>> >  >>
>>> >  >>
>>> >  >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau
>>> > >:
>>> >  >> > side note: do NOT use auto-import until you are sure you can,
>>> > it locks
>>> >  >> > regularly on beam (pby too big for idea?) and makes idea
>>> ready
>>> > to be
>>> >  >> > killed :(
>>> >  >> >
>>> >  >> > Romain Manni-Bucau
>>> >  >> >
>>> >  >> >
>>> >  >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré
>>> > >:
>>> >  >> >> It's what I did, I'm trying a complete reload now (maybe
>>> this
>>> > step
>>> >  >> >> failed).
>>> >  >> >>
>>> >  >> >> On 10/04/2018 

Re: Python postcommit and precommit

2018-04-10 Thread Alan Myrvold
I think we should replace the shell script with a top-level
pythonPostCommit Gradle target, similar to the precommit.

On Mon, Apr 9, 2018 at 12:12 PM Lukasz Cwik  wrote:

> The shell scripts still exist instead of using Gradle. Migrating to Gradle
> as the build system hasn't addressed this (only change in the Gradle
> migration was an improvement where Gradle now creates a virtualenv
> automatically for building).
>
> Alan, any plans to integrate more closely with Gradle going forward
> instead of using shell scripts for task/input/output management?
>
> On Wed, Apr 4, 2018 at 2:11 PM Kenneth Knowles  wrote:
>
>> Was this resolved off list? I think it makes sense to have a
>> dependency-driven build tool as the entry point to these processes. So in
>> our case, Gradle. If setting it up in Gradle/Groovy is a pain, having it
>> shell out seems fine as an implementation detail, but you need to set up
>> inputs/outputs of the Gradle tasks properly.
>>
>> Kenn
>>
>> On Fri, Mar 30, 2018 at 3:30 PM Udi Meiri  wrote:
>>
>>> Hi,
>>>
>>> I noticed that Python precommit runs using this command:
>>>   mvn clean install -pl sdks/python -am -amd
>>> while postcommit invocation is simply a bash script:
>>>   bash sdks/python/run_postcommit.sh
>>>
>>> Both run unit tests via Tox, however since the runtime environment setup
>>> is configured in different files (pom.xml vs shell script), they don't
>>> always agree in their results (precommit is currently succeeding while
>>> postcommit is failing).
>>>
>>> So my naive question is: why does Python precommit run via Maven/Gradle?
>>> Could we not just use a script like run_postcommit.sh?
>>>
>>> (Side note: there's a lot of code/config duplication, such as: pypi
>>> package versions, *.c, *.so, etc. cleanup)
>>>
>>


Re: Gradle Status [April 6]

2018-04-10 Thread Kenneth Knowles
Reuven's point is good.

Once we hit the bare minimum of having things working, let's collect
usability improvements and engineering improvements on a separate JIRA from
the main migration.

I filed https://issues.apache.org/jira/browse/BEAM-4045 for these less
critical issues to separate them from blockers on
https://issues.apache.org/jira/browse/BEAM-3249.

Once we are through the migration, I expect the subtasks might just be
flattened to top-level tasks, but this is a nice view of them.

Kenn

On Tue, Apr 10, 2018 at 1:46 PM Romain Manni-Bucau 
wrote:

> @jb: what did you change? I re-imported the project like 3 times earlier
> today and never got it working acceptably :(
>
> Personally, if importing the project and right-clicking on a test + debug works
> as well as Maven in IDEA, I'd be happy. I can manage other stuff in a console
> even if Gradle reporting is not that efficient for me for now.
>
> On Apr 10, 2018 at 21:37, "Reuven Lax" wrote:
>
>> There are a lot of ideas on how to increase usability, but I think
>> they'll get lost in the thread. I suggest we try to capture them in Jiras.
>>
>> I suggest we also find out what common use patterns are (people on this
>> thread are probably sufficient), as different people will have different
>> workflows. We can then make sure that all common workflows are documented.
>> As an example, one task I often do is to run just checkstyle over a module
>> or the entire project.
>>
>> Reuven
>>
>> On Tue, Apr 10, 2018 at 7:18 PM Jean-Baptiste Onofré 
>> wrote:
>>
>>> FYI, I did a new attempt and it works fine (pretty long). Previous try
>>> failed.
>>>
>>> Regards
>>> JB
>>>
>>> On 10/04/2018 19:52, Kenneth Knowles wrote:
>>> > I've been on Idea+Gradle for ~two months, around the time I added
>>> > https://github.com/apache/beam/pull/4583 and
>>> > https://github.com/apache/beam/pull/4626 to make the import require zero
>>> > user work. I have no fear of deleting my project any time and re-importing.
>>> >
>>> > I agree with not having auto-import on. It is just too slow. I can't
>>> > remember if it was importing too often due to build outputs or if it was
>>> > just that I was messing with the build.gradle files. Anyhow it doesn't
>>> > really add much value.
>>> >
>>> > The gradle runner _is_ able to use submodules and run individual test
>>> > methods, and all that.
>>> >
>>> > Kenn
>>> >
>>> >
>>> > On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau
>>> > > wrote:
>>> >
>>> > Running a test doesn't have the right classpath (IDEA uses out/ instead
>>> > of build/). Then, when you switch to the Gradle runner, the launch uses
>>> > Gradle, which is not able to use submodules directly but reconsiders the
>>> > whole project, which is quite slow for normal dev iterations
>>> > compared to just running the test with the right classpath and a fast
>>> > compile step if needed. I literally lost 1h on something simple with
>>> > that tooling; this is way too much to be acceptable on my side since
>>> > I'm sadly not paid to work on Beam (one day maybe ;)).
>>> >
>>> > Romain Manni-Bucau
>>> >
>>> >
>>> > 2018-04-10 18:27 GMT+02:00 Reuven Lax >> > >:
>>> >  > Romain,
>>> >  >
>>> >  > Can you detail what's not working. I switched my IntelliJ over
>>> to
>>> > Gradle
>>> >  > about two weeks ago, and haven't had any trouble.
>>> >  >
>>> >  > Reuven
>>> >  >
>>> >  > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau
>>> > >
>>> >  > wrote:
>>> >  >>
>>> >  >> Ok, didn't find a way to make it working properly (only
>>> workaround
>>> >  >> with direct commands and no good idea integration for
>>> > debugging). I'm
>>> >  >> back with maven, if anyone knows how to properly solve it let's
>>> > do it.
>>> >  >> If not I think JB point is to consider more than any other
>>> criteria.
>>> >  >>
>>> >  >> Romain Manni-Bucau
>>> >  >>
>>> >  >>
>>> >  >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau
>>> > >:
>>> >  >> > side note: do NOT use auto-import until you are sure you can,
>>> > it locks
>>> >  >> > regularly on beam (pby too big for idea?) and makes idea
>>> ready
>>> > to be
>>> >  >> > killed :(
>>> >  >> >
>>> >  >> > Romain Manni-Bucau
>>> >  >> >
>>> >  >> >
>>> >  >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré
>>> > >:
>>> >  >> >> It's what 

Jenkins build is back to normal : beam_SeedJob #1471

2018-04-10 Thread Apache Jenkins Server
See 



Build failed in Jenkins: beam_SeedJob #1470

2018-04-10 Thread Apache Jenkins Server
See 

--
GitHub pull request #5088 of commit a5ebf9146e9e05427de5b28f8900578a4cec8949, 
no merge conflicts.
Setting status of a5ebf9146e9e05427de5b28f8900578a4cec8949 to PENDING with url 
https://builds.apache.org/job/beam_SeedJob/1470/ and message: 'Build started 
sha1 is merged.'
Using context: Jenkins: Seed Job
[EnvInject] - Loading node environment variables.
Building remotely on beam4 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/5088/*:refs/remotes/origin/pr/5088/*
 > git rev-parse refs/remotes/origin/pr/5088/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/5088/merge^{commit} # timeout=10
Checking out Revision 44b994ae20ca6110d07be04458b0f6fd0cfa6ff7 
(refs/remotes/origin/pr/5088/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 44b994ae20ca6110d07be04458b0f6fd0cfa6ff7
Commit message: "Merge a5ebf9146e9e05427de5b28f8900578a4cec8949 into 
200eec699f254daba4ce586d1a60716aaede7b5e"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_beam_Inventory.groovy
Processing DSL script job_beam_PerformanceTests_Dataflow.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_beam_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_beam_PerformanceTests_JDBC.groovy
Processing DSL script job_beam_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_Python.groovy
Processing DSL script job_beam_PerformanceTests_Spark.groovy
Processing DSL script job_beam_PostCommit_Java_GradleBuild.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Dataflow.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Flink.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy
Processing DSL script job_beam_PostCommit_Java_ValidatesRunner_Spark.groovy
Processing DSL script 
job_beam_PostCommit_Python_ValidatesContainer_Dataflow.groovy
Processing DSL script job_beam_PostCommit_Python_ValidatesRunner_Dataflow.groovy
Processing DSL script job_beam_PostCommit_Python_Verify.groovy
Processing DSL script job_beam_PostRelease_NightlySnapshot.groovy
Processing DSL script job_beam_PreCommit_Go_GradleBuild.groovy
Processing DSL script job_beam_PreCommit_Java_GradleBuild.groovy
ERROR: (job_beam_PreCommit_Java_GradleBuild.groovy, line 40) No such property: 
gradle_switches for class: javaposse.jobdsl.dsl.jobs.FreeStyleJob



Build failed in Jenkins: beam_SeedJob #1469

2018-04-10 Thread Apache Jenkins Server
See 

--
GitHub pull request #5088 of commit 56ba4f1bde57937c7a1c2d30f7588847a6489d30, 
no merge conflicts.
Setting status of 56ba4f1bde57937c7a1c2d30f7588847a6489d30 to PENDING with url 
https://builds.apache.org/job/beam_SeedJob/1469/ and message: 'Build started 
sha1 is merged.'
Using context: Jenkins: Seed Job
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/5088/*:refs/remotes/origin/pr/5088/*
 > git rev-parse refs/remotes/origin/pr/5088/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/5088/merge^{commit} # timeout=10
Checking out Revision e3be874cab35515c626bd839a8697de14b6bf99d 
(refs/remotes/origin/pr/5088/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e3be874cab35515c626bd839a8697de14b6bf99d
Commit message: "Merge 56ba4f1bde57937c7a1c2d30f7588847a6489d30 into 
200eec699f254daba4ce586d1a60716aaede7b5e"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Processing DSL script job_00_seed.groovy
Processing DSL script job_beam_Inventory.groovy
Processing DSL script job_beam_PerformanceTests_Dataflow.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_FileBasedIO_IT_HDFS.groovy
Processing DSL script job_beam_PerformanceTests_HadoopInputFormat.groovy
Processing DSL script job_beam_PerformanceTests_JDBC.groovy
Processing DSL script job_beam_PerformanceTests_MongoDBIO_IT.groovy
Processing DSL script job_beam_PerformanceTests_Python.groovy
Processing DSL script job_beam_PerformanceTests_Spark.groovy
Processing DSL script job_beam_PostCommit_Java_GradleBuild.groovy
ERROR: (common_job_properties.groovy, line 182) No signature of method: static 
common_job_properties.switches() is applicable for argument types: 
(java.lang.String) values: [--info]
Possible solutions: with(groovy.lang.Closure)



Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Robert Bradshaw
On Tue, Apr 10, 2018 at 1:49 PM Romain Manni-Bucau 
wrote:

>
> On Apr 10, 2018 at 21:25, "Robert Bradshaw" wrote:
>
> On Tue, Apr 10, 2018 at 12:10 PM Romain Manni-Bucau 
> wrote:
>
>> This is interesting because it leads to "why do the workers need to do it
>> again instead of reusing the computed one?". Technically the answer is
>> trivial, but in terms of design I think Beam tends to abuse static init
>> blocks - even in the DoFn API - which easily leads to issues when we want to
>> support more than a main (thinking of OSGi for instance).
>>
>> So:
>>
>> 1. Why not use a standard programming model that is not clinit-based? (Perf is
>> indeed not a valid answer.)
>>
>
> The Java language (as far as I know) doesn't have the ability to prohibit
> assigning static values (such as TupleTags) as static members. We can,
> however, detect this (which is what the current code does). It doesn't seem
> to me that code like
>
> public class MyDoFn {
>   public static final TupleTag SOME_OUTPUT_TAG = new TupleTag<>();
>   ...
> }
>
> is "bad practice," especially as this tag will need to be referenced in
> multiple places.
>
>
> It is, as soon as you don't run in a flat classpath environment. In a flat
> classpath it is acceptable and doesn't have many side effects... but Beam
> doesn't know where it runs ;).
>

The problem is, people *will* write this.


> 2. genId should probably be deprecated and considered a bad practice
>>
>
> Is the proposal that we require the user to manually provide unique
> identifiers everywhere? Or for static case like above? (Note that
> accidentally re-using identifiers can lead to subtle incorrect pipeline
> results.)
>
>
> Yep.
>

Yep to which?


> And ensure we can serialize a tupletag with an already uuid-generated id
> for instance.
>

Yes, we already do this.


>
>
> This looks like a detail, but for Beam 3 we should ensure we drop the
>> legacy that brings bad practices into our user code.
>>
>> On Apr 10, 2018 at 20:15, "Ben Chambers" wrote:
>>
>>> I believe it doesn't need to be stable across refactoring, only across
>>> all workers executing a specific version of the code. Specifically, it is
>>> used as follows:
>>>
>>> 1. Create a pipeline on the user's machine. It walks the stack until the
>>> static initializer block, which provides an ID.
>>> 2. Send the pipeline to many worker machines.
>>> 3. Each worker machine walks the stack until the static initializer
>>> block (on the same version of the code), receiving the same ID.
>>>
>>> This ensures that the tupletag is the same on all the workers, as well
>>> as on the user's machine, which is critical since it is used as an identifier
>>> across these machines.
>>>
>>> Assigning a UUID would work if all of the machines agreed on the same
>>> tuple ID, which could be accomplished with serialization. Serialization,
>>> however, doesn't work well with static initializers, since those will have
>>> been called to initialize the class at load time.
>>>
>>> On Tue, Apr 10, 2018 at 10:27 AM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
 Well, the issue is more about all the existing tests currently.

 Out of curiosity: how is walking the stack stable, since the stack can
 change? The stop condition is the static block of a class, which can call
 methods and be refactored, and is therefore not stable. Should it be deprecated?


 On Apr 10, 2018 at 19:17, "Robert Bradshaw" wrote:

 If it's too slow perhaps you could use the constructor where you pass
 an explicit id (though in my experience walking the stack isn't that slow).

 On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau <
 rmannibu...@gmail.com> wrote:

> Oops cross post sorry.
>
> The issue I hit on this thread is that it is used a lot in tests and it slows
> down tests for nothing, like the GenerateSequence ones
>
> On Apr 10, 2018 at 19:00, "Romain Manni-Bucau" wrote:
>
>>
>>
>> On Apr 10, 2018 at 18:40, "Robert Bradshaw" wrote:
>>
>> These values should be, inasmuch as possible, stable across VMs. How
>> slow is slow? Doesn't this happen only once per VM startup?
>>
>>
>> Once per JVM, and IDEA launches a JVM per test, and the daemon does not
>> save enough time: you still go through the whole project and check all
>> upstream deps, it seems.
>>
>> It is <1s with maven vs 5-6s with gradle.
>>
>>
>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau <
>> rmannibu...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> Does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>> stack trace, or can we use any id generator (like
>>> UUID.randomUUID().toString())? Using traces is quite slow under load and
>>> environments where the root stack is not just the "next" level so
>>> 

Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Romain Manni-Bucau
On Apr 10, 2018 at 21:25, "Robert Bradshaw" wrote:

On Tue, Apr 10, 2018 at 12:10 PM Romain Manni-Bucau 
wrote:

> This is interesting because it leads to "why do the workers need to do it
> again instead of reusing the computed one?". Technically the answer is
> trivial, but in terms of design I think Beam tends to abuse static init
> blocks - even in the DoFn API - which easily leads to issues when we want to
> support more than a main (thinking of OSGi for instance).
>
> So:
>
> 1. Why not use a standard programming model that is not clinit-based? (Perf is
> indeed not a valid answer.)
>

The Java language (as far as I know) doesn't have the ability to prohibit
assigning static values (such as TupleTags) as static members. We can,
however, detect this (which is what the current code does). It doesn't seem
to me that code like

public class MyDoFn {
public static final TupleTag SOME_OUTPUT_TAG = new TupleTag<>();
...
}

is "bad practice," especially as this tag will need to be referenced in
multiple places.


It is, as soon as you don't run in a flat classpath environment. In a flat
classpath it is acceptable and doesn't have many side effects... but Beam
doesn't know where it runs ;).



> 2. GenId should probably be deprecated and considered a bad practise
>

Is the proposal that we require the user to manually provide unique
identifiers everywhere? Or for static case like above? (Note that
accidentally re-using identifiers can lead to subtle incorrect pipeline
results.)


Yep. And ensure we can serialize a TupleTag with an already UUID-generated
id, for instance.
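
A minimal sketch of that idea, under assumptions: Tag below is a hypothetical
stand-in, not Beam's TupleTag, and plain Java serialization stands in for
whatever mechanism ships the pipeline to workers. The id is generated once
with UUID.randomUUID() and then travels with the serialized object, so the
receiving side sees the same id instead of computing a new one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.UUID;

public class SerializedIdDemo {

  /** Hypothetical tag whose id is generated once, at construction time. */
  public static final class Tag implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String id;

    public Tag() {
      this.id = UUID.randomUUID().toString();
    }

    public String getId() {
      return id;
    }
  }

  /** Serializes and deserializes a tag, as if it were sent to another JVM. */
  public static Tag roundTrip(Tag tag) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(tag);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        // Deserialization restores the stored field; the Tag constructor is
        // not run again, so no new UUID is generated.
        return (Tag) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }

  public static void main(String[] args) {
    Tag original = new Tag();
    Tag copy = roundTrip(original);
    System.out.println(original.getId().equals(copy.getId()));
  }
}
```

This only works for tags that are actually serialized with the pipeline. A
tag created in a static initializer is rebuilt when the class loads on each
JVM, which is the case the stack-based id is meant to cover.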


> This looks like a detail but for beam 3 we should ensure we drop the legacy
> bringing bad practises in our user code.
>
> On Apr 10, 2018 at 20:15, "Ben Chambers" wrote:
>
>> I believe it doesn't need to be stable across refactoring, only across
>> all workers executing a specific version of the code. Specifically, it is
>> used as follows:
>>
>> 1. Create a pipeline on the user's machine. It walks the stack until the
>> static initializer block, which provides an ID.
>> 2. Send the pipeline to many worker machines.
>> 3. Each worker machine walks the stack until the static initializer block
>> (on the same version of the code), receiving the same ID.
>>
>> This ensures that the tupletag is the same on all the workers, as well as
>> on the user's machine, which is critical since it used as an identifier
>> across these machines.
>>
>> Assigning a UUID would work if all of the machines agreed on the same
>> tuple ID, which could be accomplished with serialization. Serialization,
>> however, doesn't work well with static initializers, since those will have
>> been called to initialize the class at load time.
>>
>> On Tue, Apr 10, 2018 at 10:27 AM Romain Manni-Bucau <
>> rmannibu...@gmail.com> wrote:
>>
>>> Well issue is more about all the existing tests currently.
>>>
>>> Out of curiosity: how walking the stack is stable since the stack can
>>> change? Stop condition is the static block of a class which can use method
>>> so refactoring and therefore is not stable. Should it be deprecated?
>>>
>>>
>>> On Apr 10, 2018 at 19:17, "Robert Bradshaw" wrote:
>>>
>>> If it's too slow perhaps you could use the constructor where you pass an
>>> explicit id (though in my experience walking the stack isn't that slow).
>>>
>>> On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>> Oops, cross-post, sorry.
>>>>
>>>> The issue I hit on this thread is that it is used a lot in tests and it
>>>> slows down tests for nothing, as with the GenerateSequence ones.
>>>>
>>>> On Apr 10, 2018 at 19:00, "Romain Manni-Bucau" wrote:
>>>>
>>>>>
>>>>>
>>>>> On Apr 10, 2018 at 18:40, "Robert Bradshaw" wrote:
>>>>>
>>>>> These values should be, inasmuch as possible, stable across VMs. How
>>>>> slow is slow? Doesn't this happen only once per VM startup?
>>>>>
>>>>>
>>>>> Once per JVM, and IDEA launches a JVM per test, and the daemon does save
>>>>> enough time, you still go through the whole project and check all
>>>>> upstream deps, it seems.
>>>>>
>>>>> It is <1s with Maven vs 5-6s with Gradle.
>>>>>
>>>>>
>>>>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau <
>>>>> rmannibu...@gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> Does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>>>>> stacktrace, or can we use any id generator (like
>>>>>> UUID.randomUUID().toString())? Using traces is quite slow under load
>>>>>> and in environments where the root stack is not just the "next" level,
>>>>>> so skipping it would be nice.
>>>>>>
>>>>>> Romain Manni-Bucau
>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>>>>
>>>>>
>>>>>
>>>


Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
@jb: what did you change? I re-imported the project like 3 times earlier
today and never got it working acceptably :(

Personally, if importing the project and right-clicking on a test + debug
works as well as Maven in IDEA, I'd be happy. I can manage other stuff in a
console, even if Gradle reporting is not that efficient for me for now.

On Apr 10, 2018 at 21:37, "Reuven Lax" wrote:

> There are a lot of ideas on how to increase usability, but I think they'll
> get lost in the thread. I suggest we try to capture them in Jiras.
>
> I suggest we also find out what common use patterns are (people on this
> thread are probably sufficient), as different people will have different
> workflows. We can then make sure that all common workflows are documented.
> As an example, one task I often do is to run just checkstyle over a module
> or the entire project.
>
> Reuven
>
> On Tue, Apr 10, 2018 at 7:18 PM Jean-Baptiste Onofré 
> wrote:
>
>> FYI, I did a new attempt and it works fine (pretty long). Previous try
>> failed.
>>
>> Regards
>> JB
>>
>> On 10/04/2018 19:52, Kenneth Knowles wrote:
>> > I've been on Idea+Gradle for ~two months, around the time I added
>> > https://github.com/apache/beam/pull/4583 and
>> > https://github.com/apache/beam/pull/4626 to make the import require
>> zero
>> > user work. I have no fear of deleting my project any time and
>> re-importing.
>> >
>> > I agree with not having auto-import on. It is just too slow. I can't
>> > remember if it was importing too often due to build outputs or if it
>> was
>> > just that I was messing with the build.gradle files. Anyhow it doesn't
>> > really add much value.
>> >
>> > The gradle runner _is_ able to use submodules and run individual tests
>> > methods, and all that.
>> >
>> > Kenn
>> >
>> >
>> > On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau
>> > > wrote:
>> >
>> > Runner a test doesnt have the right classpath (idea uses out/
>> instead
>> > of build/) then when you switch on gradle runner the launching uses
>> > gradle which is not able to use submodules directly but reconsider
>> the
>> > whole project which is quite slow for normal dev iterations
>> > compare to just run the test with the right classpath and a fast
>> > compile step if needed. I lost literally 1h for something simple
>> with
>> > that tooling, this is way too much to be acceptable on my side since
>> > I'm sadly not paid to work on beam (one day maybe ;)).
>> >
>> > Romain Manni-Bucau
>> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >
>> >
>> > 2018-04-10 18:27 GMT+02:00 Reuven Lax > > >:
>> >  > Romain,
>> >  >
>> >  > Can you detail what's not working. I switched my IntelliJ over to
>> > Gradle
>> >  > about two weeks ago, and haven't had any trouble.
>> >  >
>> >  > Reuven
>> >  >
>> >  > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau
>> > >
>> >  > wrote:
>> >  >>
>> >  >> Ok, didn't find a way to make it working properly (only
>> workaround
>> >  >> with direct commands and no good idea integration for
>> > debugging). I'm
>> >  >> back with maven, if anyone knows how to properly solve it let's
>> > do it.
>> >  >> If not I think JB point is to consider more than any other
>> criteria.
>> >  >>
>> >  >> Romain Manni-Bucau
>> >  >> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >  >>
>> >  >>
>> >  >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau
>> > >:
>> >  >> > side note: do NOT use auto-import until you are sure you can,
>> > it locks
>> >  >> > regularly on beam (pby too big for idea?) and makes idea ready
>> > to be
>> >  >> > killed :(
>> >  >> >
>> >  >> > Romain Manni-Bucau
>> >  >> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >  >> >
>> >  >> >
>> >  >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré
>> > >:
>> >  >> >> It's what I did, I'm trying a complete reload now (maybe this
>> > step
>> >  >> >> failed).
>> >  >> >>
>> >  >> >> On 10/04/2018 16:38, Lukasz Cwik wrote:
>> >  >> >>>
>> >  >> >>> beam-site PR/414 updates the instructions for using Intellij
>> > and how
>> >  >> >>> to
>> >  >> >>> import a module:
>> >  >> >>>
>> >  >> >>> 1. Create an empty IntelliJ project outside of the Beam
>> > source tree.
>> >  >> >>> 2. Under Project Structure > Project, select a Project SDK.
>> >  >> >>> 3. Under Project Structure > Modules, click the + sign to
>> > add a module
>> >  >> >>> and
>> >  >> >>> select "Import Module".
>> >  >> >>>  1. Select 

Re: Gradle Status [April 6]

2018-04-10 Thread Reuven Lax
There are a lot of ideas on how to increase usability, but I think they'll
get lost in the thread. I suggest we try to capture them in Jiras.

I suggest we also find out what common use patterns are (people on this
thread are probably sufficient), as different people will have different
workflows. We can then make sure that all common workflows are documented.
As an example, one task I often do is to run just checkstyle over a module
or the entire project.

Reuven

On Tue, Apr 10, 2018 at 7:18 PM Jean-Baptiste Onofré 
wrote:

> FYI, I did a new attempt and it works fine (pretty long). Previous try
> failed.
>
> Regards
> JB
>
> On 10/04/2018 19:52, Kenneth Knowles wrote:
> > I've been on Idea+Gradle for ~two months, around the time I added
> > https://github.com/apache/beam/pull/4583 and
> > https://github.com/apache/beam/pull/4626 to make the import require
> zero
> > user work. I have no fear of deleting my project any time and
> re-importing.
> >
> > I agree with not having auto-import on. It is just too slow. I can't
> > remember if it was importing too often due to build outputs or if it was
> > just that I was messing with the build.gradle files. Anyhow it doesn't
> > really add much value.
> >
> > The gradle runner _is_ able to use submodules and run individual tests
> > methods, and all that.
> >
> > Kenn
> >
> >
> > On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau
> > > wrote:
> >
> > Runner a test doesnt have the right classpath (idea uses out/ instead
> > of build/) then when you switch on gradle runner the launching uses
> > gradle which is not able to use submodules directly but reconsider
> the
> > whole project which is quite slow for normal dev iterations
> > compare to just run the test with the right classpath and a fast
> > compile step if needed. I lost literally 1h for something simple with
> > that tooling, this is way too much to be acceptable on my side since
> > I'm sadly not paid to work on beam (one day maybe ;)).
> >
> > Romain Manni-Bucau
> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >
> >
> > 2018-04-10 18:27 GMT+02:00 Reuven Lax  > >:
> >  > Romain,
> >  >
> >  > Can you detail what's not working. I switched my IntelliJ over to
> > Gradle
> >  > about two weeks ago, and haven't had any trouble.
> >  >
> >  > Reuven
> >  >
> >  > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau
> > >
> >  > wrote:
> >  >>
> >  >> Ok, didn't find a way to make it working properly (only
> workaround
> >  >> with direct commands and no good idea integration for
> > debugging). I'm
> >  >> back with maven, if anyone knows how to properly solve it let's
> > do it.
> >  >> If not I think JB point is to consider more than any other
> criteria.
> >  >>
> >  >> Romain Manni-Bucau
> >  >> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >  >>
> >  >>
> >  >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau
> > >:
> >  >> > side note: do NOT use auto-import until you are sure you can,
> > it locks
> >  >> > regularly on beam (pby too big for idea?) and makes idea ready
> > to be
> >  >> > killed :(
> >  >> >
> >  >> > Romain Manni-Bucau
> >  >> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >  >> >
> >  >> >
> >  >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré
> > >:
> >  >> >> It's what I did, I'm trying a complete reload now (maybe this
> > step
> >  >> >> failed).
> >  >> >>
> >  >> >> On 10/04/2018 16:38, Lukasz Cwik wrote:
> >  >> >>>
> >  >> >>> beam-site PR/414 updates the instructions for using Intellij
> > and how
> >  >> >>> to
> >  >> >>> import a module:
> >  >> >>>
> >  >> >>> 1. Create an empty IntelliJ project outside of the Beam
> > source tree.
> >  >> >>> 2. Under Project Structure > Project, select a Project SDK.
> >  >> >>> 3. Under Project Structure > Modules, click the + sign to
> > add a module
> >  >> >>> and
> >  >> >>> select "Import Module".
> >  >> >>>  1. Select the directory containing the Beam source tree.
> >  >> >>>  2. Tick the "Import module from external model" button
> > and select
> >  >> >>> Gradle
> >  >> >>> from the list.
> >  >> >>>  3. Tick the following boxes.
> >  >> >>> * Use auto-import
> >  >> >>> * Create separate module per source set
> >  >> >>> * Store generated project files externally
> >  >> >>> * Use default gradle wrapper
> >  >> >>> 4. Delegate build actions to Gradle by 

Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Robert Bradshaw
On Tue, Apr 10, 2018 at 12:10 PM Romain Manni-Bucau 
wrote:

> This is interesting cause it leads to "why do the workers need to do it
> again instead of reusing the computed one?". Technically the answer is
> trivial but in terms of design I think beam tends to abuse static init
> block - even in dofn api - which easily lead to issues when we will want to
> support more than a main (thinking to OSGi for instance).
>
> So:
>
> 1. Why not using a standard programming model not cinit based? (Perf are
> not a valid answer indeed)
>

The Java language (as far as I know) doesn't have the ability to prohibit
assigning static values (such as TupleTags) as static members. We can,
however, detect this (which is what the current code does). It doesn't seem
to me that code like

public class MyDoFn {
public static final TupleTag SOME_OUTPUT_TAG = new TupleTag<>();
...
}

is "bad practice," especially as this tag will need to be referenced in
multiple places.


> 2. GenId should probably be deprecated and considered a bad practise
>

Is the proposal that we require the user to manually provide unique
identifiers everywhere? Or for static case like above? (Note that
accidentally re-using identifiers can lead to subtle incorrect pipeline
results.)
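
A minimal sketch of the accidental re-use problem, under assumptions: Tag is
a hypothetical stand-in identified only by a user-supplied string id (it is
not Beam's TupleTag), and outputs are modeled as a plain map keyed by that id.
Two tags that were meant to be distinct but share an id silently address the
same output.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TagCollisionDemo {

  /** Hypothetical stand-in for a tag identified only by an explicit id. */
  public static final class Tag {
    final String id;

    public Tag(String id) {
      this.id = id;
    }
  }

  /** Outputs keyed by tag id, as a simplified model of tagged outputs. */
  private final Map<String, List<String>> outputs = new HashMap<>();

  public void emit(Tag tag, String value) {
    outputs.computeIfAbsent(tag.id, k -> new ArrayList<>()).add(value);
  }

  public List<String> read(Tag tag) {
    return outputs.getOrDefault(tag.id, new ArrayList<>());
  }

  public static void main(String[] args) {
    Tag errors = new Tag("out");
    Tag results = new Tag("out"); // copy-paste mistake: same id as "errors"

    TagCollisionDemo demo = new TagCollisionDemo();
    demo.emit(errors, "bad record");
    demo.emit(results, "good record");

    // Prints [bad record, good record]: the two outputs were merged.
    System.out.println(demo.read(results));
  }
}
```

Nothing fails here, which is what makes the mistake subtle: the program runs
and simply mixes the two outputs.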

> This looks like a detail but for beam 3 we should ensure we drop the legacy
> bringing bad practises in our user code.
>
> On Apr 10, 2018 at 20:15, "Ben Chambers" wrote:
>
>> I believe it doesn't need to be stable across refactoring, only across
>> all workers executing a specific version of the code. Specifically, it is
>> used as follows:
>>
>> 1. Create a pipeline on the user's machine. It walks the stack until the
>> static initializer block, which provides an ID.
>> 2. Send the pipeline to many worker machines.
>> 3. Each worker machine walks the stack until the static initializer block
>> (on the same version of the code), receiving the same ID.
>>
>> This ensures that the tupletag is the same on all the workers, as well as
>> on the user's machine, which is critical since it used as an identifier
>> across these machines.
>>
>> Assigning a UUID would work if all of the machines agreed on the same
>> tuple ID, which could be accomplished with serialization. Serialization,
>> however, doesn't work well with static initializers, since those will have
>> been called to initialize the class at load time.
>>
>> On Tue, Apr 10, 2018 at 10:27 AM Romain Manni-Bucau <
>> rmannibu...@gmail.com> wrote:
>>
>>> Well issue is more about all the existing tests currently.
>>>
>>> Out of curiosity: how walking the stack is stable since the stack can
>>> change? Stop condition is the static block of a class which can use method
>>> so refactoring and therefore is not stable. Should it be deprecated?
>>>
>>>
>>> On Apr 10, 2018 at 19:17, "Robert Bradshaw" wrote:
>>>
>>> If it's too slow perhaps you could use the constructor where you pass an
>>> explicit id (though in my experience walking the stack isn't that slow).
>>>
>>> On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>> Oops, cross-post, sorry.
>>>>
>>>> The issue I hit on this thread is that it is used a lot in tests and it
>>>> slows down tests for nothing, as with the GenerateSequence ones.
>>>>
>>>> On Apr 10, 2018 at 19:00, "Romain Manni-Bucau" wrote:
>>>>
>>>>>
>>>>>
>>>>> On Apr 10, 2018 at 18:40, "Robert Bradshaw" wrote:
>>>>>
>>>>> These values should be, inasmuch as possible, stable across VMs. How
>>>>> slow is slow? Doesn't this happen only once per VM startup?
>>>>>
>>>>>
>>>>> Once per JVM, and IDEA launches a JVM per test, and the daemon does save
>>>>> enough time, you still go through the whole project and check all
>>>>> upstream deps, it seems.
>>>>>
>>>>> It is <1s with Maven vs 5-6s with Gradle.
>>>>>
>>>>>
>>>>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau <
>>>>> rmannibu...@gmail.com> wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> Does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>>>>> stacktrace, or can we use any id generator (like
>>>>>> UUID.randomUUID().toString())? Using traces is quite slow under load
>>>>>> and in environments where the root stack is not just the "next" level,
>>>>>> so skipping it would be nice.
>>>>>>
>>>>>> Romain Manni-Bucau
>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>>>>
>>>>>
>>>>>
>>>


Re: Gradle Status [April 6]

2018-04-10 Thread Jean-Baptiste Onofré
FYI, I did a new attempt and it works fine (pretty long). Previous try 
failed.


Regards
JB

On 10/04/2018 19:52, Kenneth Knowles wrote:
I've been on Idea+Gradle for ~two months, around the time I added 
https://github.com/apache/beam/pull/4583 and 
https://github.com/apache/beam/pull/4626 to make the import require zero 
user work. I have no fear of deleting my project any time and re-importing.


I agree with not having auto-import on. It is just too slow. I can't 
remember if it was importing too often due to build outputs or if it was 
just that I was messing with the build.gradle files. Anyhow it doesn't 
really add much value.


The gradle runner _is_ able to use submodules and run individual tests 
methods, and all that.


Kenn


On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau wrote:


Runner a test doesnt have the right classpath (idea uses out/ instead
of build/) then when you switch on gradle runner the launching uses
gradle which is not able to use submodules directly but reconsider the
whole project which is quite slow for normal dev iterations
compare to just run the test with the right classpath and a fast
compile step if needed. I lost literally 1h for something simple with
that tooling, this is way too much to be acceptable on my side since
I'm sadly not paid to work on beam (one day maybe ;)).

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 18:27 GMT+02:00 Reuven Lax:
 > Romain,
 >
 > Can you detail what's not working. I switched my IntelliJ over to
Gradle
 > about two weeks ago, and haven't had any trouble.
 >
 > Reuven
 >
 > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau
 > wrote:
 >>
 >> Ok, didn't find a way to make it working properly (only workaround
 >> with direct commands and no good idea integration for
debugging). I'm
 >> back with maven, if anyone knows how to properly solve it let's
do it.
 >> If not I think JB point is to consider more than any other criteria.
 >>
 >> Romain Manni-Bucau
 >> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
 >>
 >>
 >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau:
 >> > side note: do NOT use auto-import until you are sure you can,
it locks
 >> > regularly on beam (pby too big for idea?) and makes idea ready
to be
 >> > killed :(
 >> >
 >> > Romain Manni-Bucau
 >> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
 >> >
 >> >
 >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré:
 >> >> It's what I did, I'm trying a complete reload now (maybe this
step
 >> >> failed).
 >> >>
 >> >> On 10/04/2018 16:38, Lukasz Cwik wrote:
 >> >>>
 >> >>> beam-site PR/414 updates the instructions for using Intellij
and how
 >> >>> to
 >> >>> import a module:
 >> >>>
 >> >>> 1. Create an empty IntelliJ project outside of the Beam
source tree.
 >> >>> 2. Under Project Structure > Project, select a Project SDK.
 >> >>> 3. Under Project Structure > Modules, click the + sign to
add a module
 >> >>> and
 >> >>>     select "Import Module".
 >> >>>      1. Select the directory containing the Beam source tree.
 >> >>>      2. Tick the "Import module from external model" button
and select
 >> >>> Gradle
 >> >>>         from the list.
 >> >>>      3. Tick the following boxes.
 >> >>>         * Use auto-import
 >> >>>         * Create separate module per source set
 >> >>>         * Store generated project files externally
 >> >>>         * Use default gradle wrapper
 >> >>> 4. Delegate build actions to Gradle by going to Settings >
Build,
 >> >>> Execution,
 >> >>>     Deployment > Build Tools > Gradle and checking "Delegate IDE
 >> >>> build/run
 >> >>>     actions to gradle".
 >> >>>
 >> >>> On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré

 >> >>> >> wrote:
 >> >>>
 >> >>>     That's a very important issue for contribution. Up to
now, I used
 >> >>> Maven
 >> >>>     for setup IntelliJ (and it works just fine). If we
remove the
 >> >>> pom.xml,
 >> >>>     we have to support Eclipse and IntelliJ "smoothly".
 >> >>>
 >> >>>     Let me try in IntelliJ.
 >> >>>
 >> >>>     Regards
 >> >>>     JB
 >> >>>
 >> >>>     On 10/04/2018 15:21, Romain Manni-Bucau wrote:
 >> >>>      > You dont have issue due to the build setup with that
option. I

Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
On Apr 10, 2018 at 19:53, "Kenneth Knowles" wrote:

I've been on Idea+Gradle for ~two months, around the time I added
https://github.com/apache/beam/pull/4583 and https://github.com/apache/
beam/pull/4626 to make the import require zero user work. I have no fear of
deleting my project any time and re-importing.



The import works (is slow but works) but then i can run anything "normally".


I agree with not having auto-import on. It is just too slow. I can't
remember if it was importing too often due to build outputs or if it was
just that I was messing with the build.gradle files. Anyhow it doesn't
really add much value.

The gradle runner _is_ able to use submodules and run individual tests
methods, and all that.



If I run Gradle from any IO or any runner module, it still loads the whole
project and checks it, which is very slow compared to just running the
module... and I hope I don't need a specific 50-character command each time
to do it; just "build" would be good.


Kenn


On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau 
wrote:

> Runner a test doesnt have the right classpath (idea uses out/ instead
> of build/) then when you switch on gradle runner the launching uses
> gradle which is not able to use submodules directly but reconsider the
> whole project which is quite slow for normal dev iterations
> compare to just run the test with the right classpath and a fast
> compile step if needed. I lost literally 1h for something simple with
> that tooling, this is way too much to be acceptable on my side since
> I'm sadly not paid to work on beam (one day maybe ;)).
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>
>
> 2018-04-10 18:27 GMT+02:00 Reuven Lax :
> > Romain,
> >
> > Can you detail what's not working. I switched my IntelliJ over to Gradle
> > about two weeks ago, and haven't had any trouble.
> >
> > Reuven
> >
> > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau <
> rmannibu...@gmail.com>
> > wrote:
> >>
> >> Ok, didn't find a way to make it working properly (only workaround
> >> with direct commands and no good idea integration for debugging). I'm
> >> back with maven, if anyone knows how to properly solve it let's do it.
> >> If not I think JB point is to consider more than any other criteria.
> >>
> >> Romain Manni-Bucau
> >> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>
> >>
> >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau :
> >> > side note: do NOT use auto-import until you are sure you can, it locks
> >> > regularly on beam (pby too big for idea?) and makes idea ready to be
> >> > killed :(
> >> >
> >> > Romain Manni-Bucau
> >> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >> >
> >> >
> >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré :
> >> >> It's what I did, I'm trying a complete reload now (maybe this step
> >> >> failed).
> >> >>
> >> >> On 10/04/2018 16:38, Lukasz Cwik wrote:
> >> >>>
> >> >>> beam-site PR/414 updates the instructions for using Intellij and how
> >> >>> to
> >> >>> import a module:
> >> >>>
> >> >>> 1. Create an empty IntelliJ project outside of the Beam source tree.
> >> >>> 2. Under Project Structure > Project, select a Project SDK.
> >> >>> 3. Under Project Structure > Modules, click the + sign to add a
> module
> >> >>> and
> >> >>> select "Import Module".
> >> >>>  1. Select the directory containing the Beam source tree.
> >> >>>  2. Tick the "Import module from external model" button and
> select
> >> >>> Gradle
> >> >>> from the list.
> >> >>>  3. Tick the following boxes.
> >> >>> * Use auto-import
> >> >>> * Create separate module per source set
> >> >>> * Store generated project files externally
> >> >>> * Use default gradle wrapper
> >> >>> 4. Delegate build actions to Gradle by going to Settings > Build,
> >> >>> Execution,
> >> >>> Deployment > Build Tools > Gradle and checking "Delegate IDE
> >> >>> build/run
> >> >>> actions to gradle".
> >> >>>
> >> >>> On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré <
> j...@nanthrax.net
> >> >>> > wrote:
> >> >>>
> >> >>> That's a very important issue for contribution. Up to now, I
> used
> >> >>> Maven
> >> >>> for setup IntelliJ (and it works just fine). If we remove the
> >> >>> pom.xml,
> >> >>> we have to support Eclipse and IntelliJ "smoothly".
> >> >>>
> >> >>> Let me try in IntelliJ.
> >> >>>
> >> >>> Regards
> >> >>> JB
> >> >>>
> >> >>> On 10/04/2018 15:21, Romain Manni-Bucau wrote:
> >> >>>  > You dont have issue due to the build setup with that option.
> I
> >> >>> get:
> >> >>>  >
> >> >>>  > avr. 10, 2018 3:20:10 PM
> >> >>>  > org.apache.beam.runners.direct.DirectTransformExecutor run
> >> >>>  > GRAVE: Error occurred within
> >> >>>  > 

Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Romain Manni-Bucau
This is interesting because it leads to "why do the workers need to do it
again instead of reusing the computed one?". Technically the answer is
trivial, but in terms of design I think Beam tends to abuse static init
blocks - even in the DoFn API - which easily leads to issues when we want to
support more than a main (thinking of OSGi, for instance).

So:

1. Why not use a standard programming model that is not clinit-based? (Perf
is not a valid answer indeed)
2. genId should probably be deprecated and considered a bad practice.

This looks like a detail, but for Beam 3 we should ensure we drop the legacy
that brings bad practices into our user code.

On Apr 10, 2018 at 20:15, "Ben Chambers" wrote:

> I believe it doesn't need to be stable across refactoring, only across all
> workers executing a specific version of the code. Specifically, it is used
> as follows:
>
> 1. Create a pipeline on the user's machine. It walks the stack until the
> static initializer block, which provides an ID.
> 2. Send the pipeline to many worker machines.
> 3. Each worker machine walks the stack until the static initializer block
> (on the same version of the code), receiving the same ID.
>
> This ensures that the tupletag is the same on all the workers, as well as
> on the user's machine, which is critical since it used as an identifier
> across these machines.
>
> Assigning a UUID would work if all of the machines agreed on the same
> tuple ID, which could be accomplished with serialization. Serialization,
> however, doesn't work well with static initializers, since those will have
> been called to initialize the class at load time.
>
> On Tue, Apr 10, 2018 at 10:27 AM Romain Manni-Bucau 
> wrote:
>
>> Well issue is more about all the existing tests currently.
>>
>> Out of curiosity: how walking the stack is stable since the stack can
>> change? Stop condition is the static block of a class which can use method
>> so refactoring and therefore is not stable. Should it be deprecated?
>>
>>
>> On Apr 10, 2018 at 19:17, "Robert Bradshaw" wrote:
>>
>> If it's too slow perhaps you could use the constructor where you pass an
>> explicit id (though in my experience walking the stack isn't that slow).
>>
>> On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau <
>> rmannibu...@gmail.com> wrote:
>>
>>> Oops, cross-post, sorry.
>>>
>>> The issue I hit on this thread is that it is used a lot in tests and it
>>> slows down tests for nothing, as with the GenerateSequence ones.
>>>
>>> On Apr 10, 2018 at 19:00, "Romain Manni-Bucau" wrote:
>>>
>>>>
>>>>
>>>> On Apr 10, 2018 at 18:40, "Robert Bradshaw" wrote:
>>>>
>>>> These values should be, inasmuch as possible, stable across VMs. How
>>>> slow is slow? Doesn't this happen only once per VM startup?
>>>>
>>>>
>>>> Once per JVM, and IDEA launches a JVM per test, and the daemon does save
>>>> enough time, you still go through the whole project and check all
>>>> upstream deps, it seems.
>>>>
>>>> It is <1s with Maven vs 5-6s with Gradle.
>>>>
>>>>
>>>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau <
>>>> rmannibu...@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> Does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>>>> stacktrace, or can we use any id generator (like
>>>>> UUID.randomUUID().toString())? Using traces is quite slow under load
>>>>> and in environments where the root stack is not just the "next" level,
>>>>> so skipping it would be nice.
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>>>
>>>>
>>>>
>>


Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Ben Chambers
I believe it doesn't need to be stable across refactoring, only across all
workers executing a specific version of the code. Specifically, it is used
as follows:

1. Create a pipeline on the user's machine. It walks the stack until the
static initializer block, which provides an ID.
2. Send the pipeline to many worker machines.
3. Each worker machine walks the stack until the static initializer block
(on the same version of the code), receiving the same ID.

This ensures that the TupleTag is the same on all the workers, as well as
on the user's machine, which is critical since it is used as an identifier
across these machines.

Assigning a UUID would work if all of the machines agreed on the same tuple
ID, which could be accomplished with serialization. Serialization, however,
doesn't work well with static initializers, since those will have been
called to initialize the class at load time.
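
The steps above can be sketched roughly as follows. This is only an
illustration under assumptions, not Beam's actual genId code: StackBasedId
and fromStaticInitializer are hypothetical names. The sketch walks the
current stack trace until it reaches a static initializer frame (the JVM
names it <clinit>) and builds an id from that frame's class name and line
number, so every JVM running the same version of the code computes the same
id without any coordination.

```java
public class StackBasedId {

  /**
   * Hypothetical helper: returns an id derived from the nearest static
   * initializer frame on the current stack, or the given fallback when the
   * caller is not running inside a static initializer.
   */
  public static String fromStaticInitializer(String fallback) {
    StackTraceElement[] frames = Thread.currentThread().getStackTrace();
    for (StackTraceElement frame : frames) {
      // "<clinit>" is the method name the JVM reports for static initializers.
      if ("<clinit>".equals(frame.getMethodName())) {
        return frame.getClassName() + "#" + frame.getLineNumber();
      }
    }
    return fallback;
  }

  /** Example holder whose static field is initialized through the helper. */
  public static class Holder {
    public static final String TAG_ID = fromStaticInitializer("no-static-init");
  }

  public static void main(String[] args) {
    // The id names the Holder class and the source line of the static field.
    System.out.println(Holder.TAG_ID);
  }
}
```

The id depends on the class name and line number, so it changes when the
code is refactored, but it is identical on every machine that loads the same
compiled classes, which is the only stability the scheme needs.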

On Tue, Apr 10, 2018 at 10:27 AM Romain Manni-Bucau 
wrote:

> Well issue is more about all the existing tests currently.
>
> Out of curiosity: how walking the stack is stable since the stack can
> change? Stop condition is the static block of a class which can use method
> so refactoring and therefore is not stable. Should it be deprecated?
>
>
> On Apr 10, 2018 at 19:17, "Robert Bradshaw" wrote:
>
> If it's too slow perhaps you could use the constructor where you pass an
> explicit id (though in my experience walking the stack isn't that slow).
>
> On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau 
> wrote:
>
>> Oops cross post sorry.
>>
>> Issue i hit on this thread is it is used a lot in tests abd it slows down
>> tests for nothing like with generatesequence ones
>>
>> On Apr 10, 2018 at 19:00, "Romain Manni-Bucau" wrote:
>>
>>>
>>>
>>> On Apr 10, 2018 at 18:40, "Robert Bradshaw" wrote:
>>>
>>> These values should be, inasmuch as possible, stable across VMs. How
>>> slow is slow? Doesn't this happen only once per VM startup?
>>>
>>>
>>> Once per jvm and idea launches a jvm per test and the daemon does save
>>> enough time, you still go through the whole project and check all upstream
>>> deps it seems.
>>>
>>> It is <1s with maven vs 5-6s with gradle.
>>>
>>>
>>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> Does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>>> stacktrace, or can we use any id generator (like
>>>> UUID.randomUUID().toString())? Using traces is quite slow under load and
>>>> in environments where the root stack is not just the "next" level, so
>>>> skipping it would be nice.
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>>
>>>
>>>
>


Re: Gradle Status [April 6]

2018-04-10 Thread Kenneth Knowles
I've been on Idea+Gradle for ~two months, around the time I added
https://github.com/apache/beam/pull/4583 and
https://github.com/apache/beam/pull/4626 to make the import require zero
user work. I have no fear of deleting my project any time and re-importing.

I agree with not having auto-import on. It is just too slow. I can't
remember if it was importing too often due to build outputs or if it was
just that I was messing with the build.gradle files. Anyhow it doesn't
really add much value.

The Gradle runner _is_ able to use submodules and run individual test
methods, and all that.

Kenn


On Tue, Apr 10, 2018 at 9:31 AM Romain Manni-Bucau 
wrote:

> Runner a test doesnt have the right classpath (idea uses out/ instead
> of build/) then when you switch on gradle runner the launching uses
> gradle which is not able to use submodules directly but reconsider the
> whole project which is quite slow for normal dev iterations
> compare to just run the test with the right classpath and a fast
> compile step if needed. I lost literally 1h for something simple with
> that tooling, this is way too much to be acceptable on my side since
> I'm sadly not paid to work on beam (one day maybe ;)).
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>
>
> 2018-04-10 18:27 GMT+02:00 Reuven Lax :
> > Romain,
> >
> > Can you detail what's not working. I switched my IntelliJ over to Gradle
> > about two weeks ago, and haven't had any trouble.
> >
> > Reuven
> >
> > On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau <
> rmannibu...@gmail.com>
> > wrote:
> >>
> >> Ok, didn't find a way to make it working properly (only workaround
> >> with direct commands and no good idea integration for debugging). I'm
> >> back with maven, if anyone knows how to properly solve it let's do it.
> >> If not I think JB point is to consider more than any other criteria.
> >>
> >> Romain Manni-Bucau
> >> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>
> >>
> >> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau :
> >> > side note: do NOT use auto-import until you are sure you can, it locks
> >> > regularly on beam (pby too big for idea?) and makes idea ready to be
> >> > killed :(
> >> >
> >> > Romain Manni-Bucau
> >> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >> >
> >> >
> >> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré :
> >> >> It's what I did, I'm trying a complete reload now (maybe this step
> >> >> failed).
> >> >>
> >> >> On 10/04/2018 16:38, Lukasz Cwik wrote:
> >> >>>
> >> >>> beam-site PR/414 updates the instructions for using Intellij and how
> >> >>> to
> >> >>> import a module:
> >> >>>
> >> >>> 1. Create an empty IntelliJ project outside of the Beam source tree.
> >> >>> 2. Under Project Structure > Project, select a Project SDK.
> >> >>> 3. Under Project Structure > Modules, click the + sign to add a
> module
> >> >>> and
> >> >>> select "Import Module".
> >> >>>  1. Select the directory containing the Beam source tree.
> >> >>>  2. Tick the "Import module from external model" button and
> select
> >> >>> Gradle
> >> >>> from the list.
> >> >>>  3. Tick the following boxes.
> >> >>> * Use auto-import
> >> >>> * Create separate module per source set
> >> >>> * Store generated project files externally
> >> >>> * Use default gradle wrapper
> >> >>> 4. Delegate build actions to Gradle by going to Settings > Build,
> >> >>> Execution,
> >> >>> Deployment > Build Tools > Gradle and checking "Delegate IDE
> >> >>> build/run
> >> >>> actions to gradle".
> >> >>>
> >> >>> On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré <
> j...@nanthrax.net
> >> >>> > wrote:
> >> >>>
> >> >>> That's a very important issue for contribution. Up to now, I
> used
> >> >>> Maven
> >> >>> for setup IntelliJ (and it works just fine). If we remove the
> >> >>> pom.xml,
> >> >>> we have to support Eclipse and IntelliJ "smoothly".
> >> >>>
> >> >>> Let me try in IntelliJ.
> >> >>>
> >> >>> Regards
> >> >>> JB
> >> >>>
> >> >>> On 10/04/2018 15:21, Romain Manni-Bucau wrote:
> >> >>>  > You dont have issue due to the build setup with that option.
> I
> >> >>> get:
> >> >>>  >
> >> >>>  > avr. 10, 2018 3:20:10 PM
> >> >>>  > org.apache.beam.runners.direct.DirectTransformExecutor run
> >> >>>  > GRAVE: Error occurred within
> >> >>>  >
> org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
> >> >>>  > com.google.common.util.concurrent.ExecutionError:
> >> >>>  > java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
> >> >>>  >
> >> >>>  > ?
> >> >>>  >
> >> >>>  > Romain Manni-Bucau
> >> >>>  > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >> >>>  >
> >> >>>  >
> >> >>>  > 2018-04-10 15:13 GMT+02:00 

Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Romain Manni-Bucau
Well, the issue is more about all the existing tests currently.

Out of curiosity: how is walking the stack stable, given that the stack can
change? The stop condition is the static block of a class, which can call
methods, so a refactoring can change the stack and the result is therefore
not stable. Should it be deprecated?
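
The stop condition discussed here can be illustrated with a small standalone sketch. This is not Beam's actual genId code: the class, method, and id format below are assumptions for illustration. It walks the current stack, looks for a static-initializer frame (which the JVM names `<clinit>`), and derives an id from the initializing class plus a per-class counter.

```java
import java.util.HashMap;
import java.util.Map;

public class StackWalkIdSketch {
    // How many ids each class initializer has requested so far.
    private static final Map<String, Integer> STATIC_INIT_COUNTS = new HashMap<>();

    /** Returns "ClassName#n" when called during a static initializer, else null. */
    static synchronized String idFromStaticInitializer() {
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            // Frames are ordered from the most recent call outwards.
            if ("<clinit>".equals(frame.getMethodName())) {
                int n = STATIC_INIT_COUNTS.merge(frame.getClassName(), 1, Integer::sum) - 1;
                return frame.getClassName() + "#" + n;
            }
        }
        return null; // no static initializer on the stack: no stable id to derive
    }

    /** Hypothetical holder: both fields are assigned inside its static initializer. */
    static class Holder {
        static final String FIRST = idFromStaticInitializer();
        static final String SECOND = viaHelper();

        // The helper adds a frame, but the initializer frame is still on the stack.
        private static String viaHelper() {
            return idFromStaticInitializer();
        }
    }

    public static void main(String[] args) {
        System.out.println(Holder.FIRST);
        System.out.println(Holder.SECOND);
        System.out.println(idFromStaticInitializer());
    }
}
```

In this sketch, routing the call through a helper method does not change the id, because the search only depends on which class initializer is on the stack and on the order of requests within it. Reordering the fields, or moving the call out of static initialization altogether (the last call in `main` yields null when the program is launched directly), does change the outcome, which is the kind of instability the question raises.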

Le 10 avr. 2018 19:17, "Robert Bradshaw"  a écrit :

If it's too slow perhaps you could use the constructor where you pass an
explicit id (though in my experience walking the stack isn't that slow).

On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau 
wrote:

> Oops cross post sorry.
>
> Issue i hit on this thread is it is used a lot in tests abd it slows down
> tests for nothing like with generatesequence ones
>
> Le 10 avr. 2018 19:00, "Romain Manni-Bucau"  a
> écrit :
>
>>
>>
>> Le 10 avr. 2018 18:40, "Robert Bradshaw"  a écrit :
>>
>> These values should be, inasmuch as possible, stable across VMs. How slow
>> is slow? Doesn't this happen only once per VM startup?
>>
>>
>> Once per jvm and idea launches a jvm per test and the daemon does save
>> enough time, you still go through the whole project and check all upstream
>> deps it seems.
>>
>> It is <1s with maven vs 5-6s with gradle.
>>
>>
>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau 
>> wrote:
>>
>>> Hi
>>>
>>> does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>> stacktrace or can we use any id generator (like
>>> UUID.random().toString())? Using traces is quite slow under load and
>>> environments where the root stack is not just the "next" level so
>>> skipping it would be nice.
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>
>>
>>


Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Robert Bradshaw
If it's too slow perhaps you could use the constructor where you pass an
explicit id (though in my experience walking the stack isn't that slow).

On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau 
wrote:

> Oops cross post sorry.
>
> Issue i hit on this thread is it is used a lot in tests abd it slows down
> tests for nothing like with generatesequence ones
>
> Le 10 avr. 2018 19:00, "Romain Manni-Bucau"  a
> écrit :
>
>>
>>
>> Le 10 avr. 2018 18:40, "Robert Bradshaw"  a écrit :
>>
>> These values should be, inasmuch as possible, stable across VMs. How slow
>> is slow? Doesn't this happen only once per VM startup?
>>
>>
>> Once per jvm and idea launches a jvm per test and the daemon does save
>> enough time, you still go through the whole project and check all upstream
>> deps it seems.
>>
>> It is <1s with maven vs 5-6s with gradle.
>>
>>
>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau 
>> wrote:
>>
>>> Hi
>>>
>>> does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>> stacktrace or can we use any id generator (like
>>> UUID.random().toString())? Using traces is quite slow under load and
>>> environments where the root stack is not just the "next" level so
>>> skipping it would be nice.
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>
>>
>>


Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Thomas Groh
It may be reasonable to port most of those TupleTags to have an explicit,
rather than generated ID, which will remove the need to inspect the stack
trace.

However, as mentioned, the constructor shouldn't provide an unstable ID, as
otherwise most pipelines won't work on production runners.

On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau 
wrote:

> Oops cross post sorry.
>
> Issue i hit on this thread is it is used a lot in tests abd it slows down
> tests for nothing like with generatesequence ones
>
> Le 10 avr. 2018 19:00, "Romain Manni-Bucau"  a
> écrit :
>
>>
>>
>> Le 10 avr. 2018 18:40, "Robert Bradshaw"  a écrit :
>>
>> These values should be, inasmuch as possible, stable across VMs. How slow
>> is slow? Doesn't this happen only once per VM startup?
>>
>>
>> Once per jvm and idea launches a jvm per test and the daemon does save
>> enough time, you still go through the whole project and check all upstream
>> deps it seems.
>>
>> It is <1s with maven vs 5-6s with gradle.
>>
>>
>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau 
>> wrote:
>>
>>> Hi
>>>
>>> does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>> stacktrace or can we use any id generator (like
>>> UUID.random().toString())? Using traces is quite slow under load and
>>> environments where the root stack is not just the "next" level so
>>> skipping it would be nice.
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>
>>
>>


Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Romain Manni-Bucau
Le 10 avr. 2018 18:40, "Robert Bradshaw"  a écrit :

These values should be, inasmuch as possible, stable across VMs. How slow
is slow? Doesn't this happen only once per VM startup?


Once per JVM, and IDEA launches a JVM per test. The daemon does not save
enough time: it still seems to go through the whole project and check all
upstream deps.

It is <1s with maven vs 5-6s with gradle.


On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau 
wrote:

> Hi
>
> does org.apache.beam.sdk.values.TupleTag#genId need to get the
> stacktrace or can we use any id generator (like
> UUID.random().toString())? Using traces is quite slow under load and
> environments where the root stack is not just the "next" level so
> skipping it would be nice.
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>


Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Romain Manni-Bucau
Oops, cross-post, sorry.

The issue I hit on this thread is that it is used a lot in tests and it slows
tests down for nothing, as with the GenerateSequence ones.

Le 10 avr. 2018 19:00, "Romain Manni-Bucau"  a
écrit :

>
>
> Le 10 avr. 2018 18:40, "Robert Bradshaw"  a écrit :
>
> These values should be, inasmuch as possible, stable across VMs. How slow
> is slow? Doesn't this happen only once per VM startup?
>
>
> Once per jvm and idea launches a jvm per test and the daemon does save
> enough time, you still go through the whole project and check all upstream
> deps it seems.
>
> It is <1s with maven vs 5-6s with gradle.
>
>
> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau 
> wrote:
>
>> Hi
>>
>> does org.apache.beam.sdk.values.TupleTag#genId need to get the
>> stacktrace or can we use any id generator (like
>> UUID.random().toString())? Using traces is quite slow under load and
>> environments where the root stack is not just the "next" level so
>> skipping it would be nice.
>>
>> Romain Manni-Bucau
>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>
>
>


Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Thomas Groh
In fact, this is explicitly to work with `static final` TupleTags, and
using a non-stable ID isn't feasible.

A static final TupleTag won't be serialized in the closure of an object
that uses it - it will be instantiated independently in any other
ClassLoader, such as on a remote JVM. If you use a constant TupleTag during
pipeline construction, and again during runtime, they must have matching
identifiers, or the system can't correlate the two objects. Use of
something like UUID.randomUUID() would remove our ability to use any constant
values.
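
A minimal sketch of this argument, without Beam on the classpath: `Tag` is a hypothetical stand-in whose identity is its string id, and evaluating the same initializer twice stands in for the pipeline-construction JVM and a worker JVM each running the static initializer independently.

```java
import java.util.Objects;
import java.util.UUID;
import java.util.function.Supplier;

public class StableTagIdSketch {
    /** Hypothetical stand-in for a tag identified by its id; NOT Beam's TupleTag. */
    static final class Tag {
        final String id;

        Tag(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Tag && ((Tag) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    /** Runs the same "static initializer" twice, as two separate JVMs would. */
    static boolean matchesAcrossInitializations(Supplier<Tag> initializer) {
        Tag atConstruction = initializer.get(); // pipeline construction side
        Tag onWorker = initializer.get();       // worker side, initialized independently
        return atConstruction.equals(onWorker);
    }

    public static void main(String[] args) {
        boolean random = matchesAcrossInitializations(
            () -> new Tag(UUID.randomUUID().toString()));
        boolean explicit = matchesAcrossInitializations(() -> new Tag("errors"));
        System.out.println("random ids match:   " + random);
        System.out.println("explicit ids match: " + explicit);
    }
}
```

With a random generator the two sides end up with different ids, so nothing correlates them; with an explicit (or otherwise deterministic) id they agree without any serialization.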

On Tue, Apr 10, 2018 at 9:40 AM Robert Bradshaw  wrote:

> These values should be, inasmuch as possible, stable across VMs. How slow
> is slow? Doesn't this happen only once per VM startup?
>
> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau 
> wrote:
>
>> Hi
>>
>> does org.apache.beam.sdk.values.TupleTag#genId need to get the
>> stacktrace or can we use any id generator (like
>> UUID.random().toString())? Using traces is quite slow under load and
>> environments where the root stack is not just the "next" level so
>> skipping it would be nice.
>>
>> Romain Manni-Bucau
>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>
>


Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-10 Thread Alex Amato
I've gathered a lot of feedback so far and want to make a decision by
Friday, and begin working on related PRs next week.

Please make sure that you provide your feedback before then and I will post
the final decisions made to this thread Friday afternoon.

On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía  wrote:

> Nice, I created a short link so people can refer to it easily in
> future discussions, website, etc.
>
> https://s.apache.org/beam-fn-api-metrics
>
> Thanks for sharing.
>
>
> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw 
> wrote:
> > Thanks for the nice writeup. I added some comments.
> >
> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato  wrote:
> >>
> >> Hello beam community,
> >>
> >> Thank you everyone for your initial feedback on this proposal so far. I
> >> have made some revisions based on the feedback. There were some larger
> >> questions asking about alternatives. For each of these I have added a
> >> section tagged with [Alternatives] and discussed my recommendation as
> well
> >> as as few other choices we considered.
> >>
> >> I would appreciate more feedback on the revised proposal. Please take
> >> another look and let me know
> >>
> >>
> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
> >>
> >> Etienne, I would appreciate it if you could please take another look
> after
> >> the revisions I have made as well.
> >>
> >> Thanks again,
> >> Alex
> >>
> >
>


Re: org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Robert Bradshaw
These values should be, inasmuch as possible, stable across VMs. How slow
is slow? Doesn't this happen only once per VM startup?

On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau 
wrote:

> Hi
>
> does org.apache.beam.sdk.values.TupleTag#genId need to get the
> stacktrace or can we use any id generator (like
> UUID.random().toString())? Using traces is quite slow under load and
> environments where the root stack is not just the "next" level so
> skipping it would be nice.
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>


org.apache.beam.sdk.values.TupleTag#genId and stacktraces?

2018-04-10 Thread Romain Manni-Bucau
Hi

Does org.apache.beam.sdk.values.TupleTag#genId need to get the
stacktrace, or can we use any id generator (like
UUID.randomUUID().toString())? Using traces is quite slow under load and in
environments where the root stack is not just the "next" level, so
skipping it would be nice.

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
Running a test doesn't get the right classpath (IDEA uses out/ instead
of build/). Then, when you switch to the Gradle runner, the launch goes
through Gradle, which is not able to use submodules directly but
re-evaluates the whole project. That is quite slow for normal dev
iterations compared to just running the test with the right classpath and
a fast compile step if needed. I literally lost 1h on something simple with
that tooling, which is way too much to be acceptable on my side since
I'm sadly not paid to work on Beam (one day maybe ;)).

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 18:27 GMT+02:00 Reuven Lax :
> Romain,
>
> Can you detail what's not working. I switched my IntelliJ over to Gradle
> about two weeks ago, and haven't had any trouble.
>
> Reuven
>
> On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau 
> wrote:
>>
>> Ok, didn't find a way to make it working properly (only workaround
>> with direct commands and no good idea integration for debugging). I'm
>> back with maven, if anyone knows how to properly solve it let's do it.
>> If not I think JB point is to consider more than any other criteria.
>>
>> Romain Manni-Bucau
>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>
>>
>> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau :
>> > side note: do NOT use auto-import until you are sure you can, it locks
>> > regularly on beam (pby too big for idea?) and makes idea ready to be
>> > killed :(
>> >
>> > Romain Manni-Bucau
>> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >
>> >
>> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré :
>> >> It's what I did, I'm trying a complete reload now (maybe this step
>> >> failed).
>> >>
>> >> On 10/04/2018 16:38, Lukasz Cwik wrote:
>> >>>
>> >>> beam-site PR/414 updates the instructions for using Intellij and how
>> >>> to
>> >>> import a module:
>> >>>
>> >>> 1. Create an empty IntelliJ project outside of the Beam source tree.
>> >>> 2. Under Project Structure > Project, select a Project SDK.
>> >>> 3. Under Project Structure > Modules, click the + sign to add a module
>> >>> and
>> >>> select "Import Module".
>> >>>  1. Select the directory containing the Beam source tree.
>> >>>  2. Tick the "Import module from external model" button and select
>> >>> Gradle
>> >>> from the list.
>> >>>  3. Tick the following boxes.
>> >>> * Use auto-import
>> >>> * Create separate module per source set
>> >>> * Store generated project files externally
>> >>> * Use default gradle wrapper
>> >>> 4. Delegate build actions to Gradle by going to Settings > Build,
>> >>> Execution,
>> >>> Deployment > Build Tools > Gradle and checking "Delegate IDE
>> >>> build/run
>> >>> actions to gradle".
>> >>>
>> >>> On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré > >>> > wrote:
>> >>>
>> >>> That's a very important issue for contribution. Up to now, I used
>> >>> Maven
>> >>> for setup IntelliJ (and it works just fine). If we remove the
>> >>> pom.xml,
>> >>> we have to support Eclipse and IntelliJ "smoothly".
>> >>>
>> >>> Let me try in IntelliJ.
>> >>>
>> >>> Regards
>> >>> JB
>> >>>
>> >>> On 10/04/2018 15:21, Romain Manni-Bucau wrote:
>> >>>  > You dont have issue due to the build setup with that option. I
>> >>> get:
>> >>>  >
>> >>>  > avr. 10, 2018 3:20:10 PM
>> >>>  > org.apache.beam.runners.direct.DirectTransformExecutor run
>> >>>  > GRAVE: Error occurred within
>> >>>  > org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
>> >>>  > com.google.common.util.concurrent.ExecutionError:
>> >>>  > java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
>> >>>  >
>> >>>  > ?
>> >>>  >
>> >>>  > Romain Manni-Bucau
>> >>>  > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >>>  >
>> >>>  >
>> >>>  > 2018-04-10 15:13 GMT+02:00 Lukasz Cwik > >>> >:
>> >>>  >> I have found that the simplest setup is to delegate the
>> >>> build/test actions
>> >>>  >> to Gradle. This allows you to run unit tests very easily and
>> >>> since its in
>> >>>  >> the same manner that Gradle would have, you know that if its
>> >>> passing it will
>> >>>  >> pass on the command line and on Jenkins. Here is one site that
>> >>> discusses how
>> >>>  >> to set this up:
>> >>>  >>
>> >>>
>> >>>
>> >>> http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
>> >>>  >>
>> >>>  >>
>> >>>  >> On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau
>> >>> >
>> >>>  >> wrote:
>> >>>  >>>
>> >>>  >>> What's the plan to make idea supporting gradle on beam
>> >>> project?

Re: Gradle Status [April 6]

2018-04-10 Thread Reuven Lax
Romain,

Can you detail what's not working? I switched my IntelliJ over to Gradle
about two weeks ago, and haven't had any trouble.

Reuven

On Tue, Apr 10, 2018 at 4:20 PM Romain Manni-Bucau 
wrote:

> Ok, didn't find a way to make it working properly (only workaround
> with direct commands and no good idea integration for debugging). I'm
> back with maven, if anyone knows how to properly solve it let's do it.
> If not I think JB point is to consider more than any other criteria.
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>
>
> 2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau :
> > side note: do NOT use auto-import until you are sure you can, it locks
> > regularly on beam (pby too big for idea?) and makes idea ready to be
> > killed :(
> >
> > Romain Manni-Bucau
> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >
> >
> > 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré :
> >> It's what I did, I'm trying a complete reload now (maybe this step
> failed).
> >>
> >> On 10/04/2018 16:38, Lukasz Cwik wrote:
> >>>
> >>> beam-site PR/414 updates the instructions for using Intellij and how to
> >>> import a module:
> >>>
> >>> 1. Create an empty IntelliJ project outside of the Beam source tree.
> >>> 2. Under Project Structure > Project, select a Project SDK.
> >>> 3. Under Project Structure > Modules, click the + sign to add a module
> and
> >>> select "Import Module".
> >>>  1. Select the directory containing the Beam source tree.
> >>>  2. Tick the "Import module from external model" button and select
> >>> Gradle
> >>> from the list.
> >>>  3. Tick the following boxes.
> >>> * Use auto-import
> >>> * Create separate module per source set
> >>> * Store generated project files externally
> >>> * Use default gradle wrapper
> >>> 4. Delegate build actions to Gradle by going to Settings > Build,
> >>> Execution,
> >>> Deployment > Build Tools > Gradle and checking "Delegate IDE
> build/run
> >>> actions to gradle".
> >>>
> >>> On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré  >>> > wrote:
> >>>
> >>> That's a very important issue for contribution. Up to now, I used
> >>> Maven
> >>> for setup IntelliJ (and it works just fine). If we remove the
> pom.xml,
> >>> we have to support Eclipse and IntelliJ "smoothly".
> >>>
> >>> Let me try in IntelliJ.
> >>>
> >>> Regards
> >>> JB
> >>>
> >>> On 10/04/2018 15:21, Romain Manni-Bucau wrote:
> >>>  > You dont have issue due to the build setup with that option. I
> get:
> >>>  >
> >>>  > avr. 10, 2018 3:20:10 PM
> >>>  > org.apache.beam.runners.direct.DirectTransformExecutor run
> >>>  > GRAVE: Error occurred within
> >>>  > org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
> >>>  > com.google.common.util.concurrent.ExecutionError:
> >>>  > java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
> >>>  >
> >>>  > ?
> >>>  >
> >>>  > Romain Manni-Bucau
> >>>  > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>>  >
> >>>  >
> >>>  > 2018-04-10 15:13 GMT+02:00 Lukasz Cwik  >>> >:
> >>>  >> I have found that the simplest setup is to delegate the
> >>> build/test actions
> >>>  >> to Gradle. This allows you to run unit tests very easily and
> >>> since its in
> >>>  >> the same manner that Gradle would have, you know that if its
> >>> passing it will
> >>>  >> pass on the command line and on Jenkins. Here is one site that
> >>> discusses how
> >>>  >> to set this up:
> >>>  >>
> >>>
> >>>
> http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
> >>>  >>
> >>>  >>
> >>>  >> On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau
> >>> >
> >>>  >> wrote:
> >>>  >>>
> >>>  >>> What's the plan to make idea supporting gradle on beam
> project?
> >>> Do we
> >>>  >>> import the workaround mentionned in
> >>>  >>> https://youtrack.jetbrains.com/issue/IDEA-175172?
> >>>  >>> For the ones who didn't see this issue in action: idea will
> >>> compile in
> >>>  >>> out/ instead of build/ and you will just miss all the
> resources
> >>> you
> >>>  >>> need like some SPI registration which are used by all our
> >>> registrar =>
> >>>  >>> no way to run tests in idea without hacking the configuration
> >>> quite
> >>>  >>> deeply :(
> >>>  >>>
> >>>  >>> Romain Manni-Bucau
> >>>  >>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>>  >>>
> >>>  >>>
> >>>  >>> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot
> >>> >:
> >>>
> >>>

Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
OK, I didn't find a way to make it work properly (only a workaround
with direct commands and no good IDEA integration for debugging). I'm
back on Maven; if anyone knows how to properly solve it, let's do it.
If not, I think JB's point should weigh more than any other criterion.

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 16:41 GMT+02:00 Romain Manni-Bucau :
> side note: do NOT use auto-import until you are sure you can, it locks
> regularly on beam (pby too big for idea?) and makes idea ready to be
> killed :(
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>
>
> 2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré :
>> It's what I did, I'm trying a complete reload now (maybe this step failed).
>>
>> On 10/04/2018 16:38, Lukasz Cwik wrote:
>>>
>>> beam-site PR/414 updates the instructions for using Intellij and how to
>>> import a module:
>>>
>>> 1. Create an empty IntelliJ project outside of the Beam source tree.
>>> 2. Under Project Structure > Project, select a Project SDK.
>>> 3. Under Project Structure > Modules, click the + sign to add a module and
>>> select "Import Module".
>>>  1. Select the directory containing the Beam source tree.
>>>  2. Tick the "Import module from external model" button and select
>>> Gradle
>>> from the list.
>>>  3. Tick the following boxes.
>>> * Use auto-import
>>> * Create separate module per source set
>>> * Store generated project files externally
>>> * Use default gradle wrapper
>>> 4. Delegate build actions to Gradle by going to Settings > Build,
>>> Execution,
>>> Deployment > Build Tools > Gradle and checking "Delegate IDE build/run
>>> actions to gradle".
>>>
>>> On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré >> > wrote:
>>>
>>> That's a very important issue for contribution. Up to now, I used
>>> Maven
>>> for setup IntelliJ (and it works just fine). If we remove the pom.xml,
>>> we have to support Eclipse and IntelliJ "smoothly".
>>>
>>> Let me try in IntelliJ.
>>>
>>> Regards
>>> JB
>>>
>>> On 10/04/2018 15:21, Romain Manni-Bucau wrote:
>>>  > You dont have issue due to the build setup with that option. I get:
>>>  >
>>>  > avr. 10, 2018 3:20:10 PM
>>>  > org.apache.beam.runners.direct.DirectTransformExecutor run
>>>  > GRAVE: Error occurred within
>>>  > org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
>>>  > com.google.common.util.concurrent.ExecutionError:
>>>  > java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
>>>  >
>>>  > ?
>>>  >
>>>  > Romain Manni-Bucau
>>>  > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>  >
>>>  >
>>>  > 2018-04-10 15:13 GMT+02:00 Lukasz Cwik >> >:
>>>  >> I have found that the simplest setup is to delegate the
>>> build/test actions
>>>  >> to Gradle. This allows you to run unit tests very easily and
>>> since its in
>>>  >> the same manner that Gradle would have, you know that if its
>>> passing it will
>>>  >> pass on the command line and on Jenkins. Here is one site that
>>> discusses how
>>>  >> to set this up:
>>>  >>
>>>
>>> http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
>>>  >>
>>>  >>
>>>  >> On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau
>>> >
>>>  >> wrote:
>>>  >>>
>>>  >>> What's the plan to make idea supporting gradle on beam project?
>>> Do we
>>>  >>> import the workaround mentionned in
>>>  >>> https://youtrack.jetbrains.com/issue/IDEA-175172?
>>>  >>> For the ones who didn't see this issue in action: idea will
>>> compile in
>>>  >>> out/ instead of build/ and you will just miss all the resources
>>> you
>>>  >>> need like some SPI registration which are used by all our
>>> registrar =>
>>>  >>> no way to run tests in idea without hacking the configuration
>>> quite
>>>  >>> deeply :(
>>>  >>>
>>>  >>> Romain Manni-Bucau
>>>  >>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>  >>>
>>>  >>>
>>>  >>> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot
>>> >:
>>>
>>>   As a gradle beginner, I could not agree more !
>>>   +1
>>>   Etienne
>>>   Le lundi 09 avril 2018 à 18:47 +0200, Jean-Baptiste Onofré a
>>> écrit :
>>>  
>>>   Hi all,
>>>  
>>>   I did multiple gradle build since last week and I would like
>>> to share
>>>   one of my concern: it's about the communities.
>>>  
>>>   If I think our users won't see any change for them 

Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
Side note: do NOT use auto-import until you are sure you can; it locks up
regularly on Beam (probably too big for IDEA?) and leaves IDEA ready to be
killed :(

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 16:40 GMT+02:00 Jean-Baptiste Onofré :
> It's what I did, I'm trying a complete reload now (maybe this step failed).
>
> On 10/04/2018 16:38, Lukasz Cwik wrote:
>>
>> beam-site PR/414 updates the instructions for using Intellij and how to
>> import a module:
>>
>> 1. Create an empty IntelliJ project outside of the Beam source tree.
>> 2. Under Project Structure > Project, select a Project SDK.
>> 3. Under Project Structure > Modules, click the + sign to add a module and
>> select "Import Module".
>>  1. Select the directory containing the Beam source tree.
>>  2. Tick the "Import module from external model" button and select
>> Gradle
>> from the list.
>>  3. Tick the following boxes.
>> * Use auto-import
>> * Create separate module per source set
>> * Store generated project files externally
>> * Use default gradle wrapper
>> 4. Delegate build actions to Gradle by going to Settings > Build,
>> Execution,
>> Deployment > Build Tools > Gradle and checking "Delegate IDE build/run
>> actions to gradle".
>>
>> On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré > > wrote:
>>
>> That's a very important issue for contribution. Up to now, I used
>> Maven
>> for setup IntelliJ (and it works just fine). If we remove the pom.xml,
>> we have to support Eclipse and IntelliJ "smoothly".
>>
>> Let me try in IntelliJ.
>>
>> Regards
>> JB
>>
>> On 10/04/2018 15:21, Romain Manni-Bucau wrote:
>>  > You dont have issue due to the build setup with that option. I get:
>>  >
>>  > avr. 10, 2018 3:20:10 PM
>>  > org.apache.beam.runners.direct.DirectTransformExecutor run
>>  > GRAVE: Error occurred within
>>  > org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
>>  > com.google.common.util.concurrent.ExecutionError:
>>  > java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
>>  >
>>  > ?
>>  >
>>  > Romain Manni-Bucau
>>  > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>  >
>>  >
>>  > 2018-04-10 15:13 GMT+02:00 Lukasz Cwik > >:
>>  >> I have found that the simplest setup is to delegate the
>> build/test actions
>>  >> to Gradle. This allows you to run unit tests very easily and
>> since its in
>>  >> the same manner that Gradle would have, you know that if its
>> passing it will
>>  >> pass on the command line and on Jenkins. Here is one site that
>> discusses how
>>  >> to set this up:
>>  >>
>>
>> http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
>>  >>
>>  >>
>>  >> On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau
>> >
>>  >> wrote:
>>  >>>
>>  >>> What's the plan to make idea supporting gradle on beam project?
>> Do we
>>  >>> import the workaround mentionned in
>>  >>> https://youtrack.jetbrains.com/issue/IDEA-175172?
>>  >>> For the ones who didn't see this issue in action: idea will
>> compile in
>>  >>> out/ instead of build/ and you will just miss all the resources
>> you
>>  >>> need like some SPI registration which are used by all our
>> registrar =>
>>  >>> no way to run tests in idea without hacking the configuration
>> quite
>>  >>> deeply :(
>>  >>>
>>  >>> Romain Manni-Bucau
>>  >>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>  >>>
>>  >>>
>>  >>> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot
>> >:
>>
>>   As a gradle beginner, I could not agree more !
>>   +1
>>   Etienne
>>   On Monday, 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
>>  
>>   Hi all,
>>  
>>   I did multiple gradle build since last week and I would like
>> to share
>>   one of my concern: it's about the communities.
>>  
>>   If I think our users won't see any change for them due to
>> Gradle build
>>   (I think that most of our users will still use Maven with
>> artifacts
>>   provided by Gradle), I'm more concerned by the dev community
>> and the
>>   contribution.
>>  
>>   Maven is well known and straight forward for a large part of
>> potential
>>   contributors. I think we have to keep in mind that we still
>> have to grow
>>   up our contributors community.
>>  
>>   Today, maybe I'm wrong, but I have the feeling that 

Re: Gradle Status [April 6]

2018-04-10 Thread Jean-Baptiste Onofré

It's what I did, I'm trying a complete reload now (maybe this step failed).

On 10/04/2018 16:38, Lukasz Cwik wrote:
beam-site PR/414 updates the instructions for using Intellij and how to 
import a module:


1. Create an empty IntelliJ project outside of the Beam source tree.
2. Under Project Structure > Project, select a Project SDK.
3. Under Project Structure > Modules, click the + sign to add a module and
    select "Import Module".
     1. Select the directory containing the Beam source tree.
     2. Tick the "Import module from external model" button and select Gradle
        from the list.
     3. Tick the following boxes.
        * Use auto-import
        * Create separate module per source set
        * Store generated project files externally
        * Use default gradle wrapper
4. Delegate build actions to Gradle by going to Settings > Build, Execution,
    Deployment > Build Tools > Gradle and checking "Delegate IDE build/run
    actions to gradle".

On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré > wrote:


That's a very important issue for contribution. Up to now, I have used Maven
to set up IntelliJ (and it works just fine). If we remove the pom.xml,
we have to support Eclipse and IntelliJ "smoothly".

Let me try in IntelliJ.

Regards
JB

On 10/04/2018 15:21, Romain Manni-Bucau wrote:
 > You dont have issue due to the build setup with that option. I get:
 >
 > avr. 10, 2018 3:20:10 PM
 > org.apache.beam.runners.direct.DirectTransformExecutor run
 > GRAVE: Error occurred within
 > org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
 > com.google.common.util.concurrent.ExecutionError:
 > java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
 >
 > ?
 >
 > Romain Manni-Bucau
 > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
 >
 >
 > 2018-04-10 15:13 GMT+02:00 Lukasz Cwik >:
 >> I have found that the simplest setup is to delegate the
build/test actions
 >> to Gradle. This allows you to run unit tests very easily and
since its in
 >> the same manner that Gradle would have, you know that if its
passing it will
 >> pass on the command line and on Jenkins. Here is one site that
discusses how
 >> to set this up:
 >>

http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
 >>
 >>
 >> On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau
>
 >> wrote:
 >>>
 >>> What's the plan to make idea supporting gradle on beam project?
Do we
 >>> import the workaround mentionned in
 >>> https://youtrack.jetbrains.com/issue/IDEA-175172?
 >>> For the ones who didn't see this issue in action: idea will
compile in
 >>> out/ instead of build/ and you will just miss all the resources you
 >>> need like some SPI registration which are used by all our
registrar =>
 >>> no way to run tests in idea without hacking the configuration quite
 >>> deeply :(
 >>>
 >>> Romain Manni-Bucau
 >>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
 >>>
 >>>
 >>> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot
>:
  As a gradle beginner, I could not agree more !
  +1
  Etienne
  On Monday, 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
 
  Hi all,
 
  I did multiple gradle build since last week and I would like
to share
  one of my concern: it's about the communities.
 
  If I think our users won't see any change for them due to
Gradle build
  (I think that most of our users will still use Maven with
artifacts
  provided by Gradle), I'm more concerned by the dev community
and the
  contribution.
 
  Maven is well known and straight forward for a large part of
potential
  contributors. I think we have to keep in mind that we still
have to grow
  up our contributors community.
 
  Today, maybe I'm wrong, but I have the feeling that gradle
build is not
  straight forward (build.gradle includes build_rules.gradle,
gathering
  all taks all together).
 
  I would like to add a task in the gradle "migration" process:
simplify
  the gradle structure and files, and document this.
 
  I know we already have a Jira about the documentation part,
but I would
  like to "polish" and use a clean structure for the Gradle
resources. As
  already quickly discussed, I think that having one gradle file
per tasks
  in the .gradle directory would be helpful.
 
  The 

Re: Gradle Status [April 6]

2018-04-10 Thread Lukasz Cwik
beam-site PR/414 updates the instructions for using Intellij and how to
import a module:

1. Create an empty IntelliJ project outside of the Beam source tree.
2. Under Project Structure > Project, select a Project SDK.
3. Under Project Structure > Modules, click the + sign to add a module and
   select "Import Module".
   1. Select the directory containing the Beam source tree.
   2. Tick the "Import module from external model" button and select Gradle
      from the list.
   3. Tick the following boxes.
      * Use auto-import
      * Create separate module per source set
      * Store generated project files externally
      * Use default gradle wrapper
4. Delegate build actions to Gradle by going to Settings > Build, Execution,
   Deployment > Build Tools > Gradle and checking "Delegate IDE build/run
   actions to gradle".

On Tue, Apr 10, 2018 at 10:34 AM Jean-Baptiste Onofré 
wrote:

> That's a very important issue for contribution. Up to now, I used Maven
> for setup IntelliJ (and it works just fine). If we remove the pom.xml,
> we have to support Eclipse and IntelliJ "smoothly".
>
> Let me try in IntelliJ.
>
> Regards
> JB
>
> On 10/04/2018 15:21, Romain Manni-Bucau wrote:
> > You dont have issue due to the build setup with that option. I get:
> >
> > avr. 10, 2018 3:20:10 PM
> > org.apache.beam.runners.direct.DirectTransformExecutor run
> > GRAVE: Error occurred within
> > org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
> > com.google.common.util.concurrent.ExecutionError:
> > java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
> >
> > ?
> >
> > Romain Manni-Bucau
> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >
> >
> > 2018-04-10 15:13 GMT+02:00 Lukasz Cwik :
> >> I have found that the simplest setup is to delegate the build/test
> >> actions to Gradle. This allows you to run unit tests very easily, and
> >> since they run in the same manner that Gradle would run them, you know
> >> that if a test is passing it will pass on the command line and on
> >> Jenkins. Here is one site that discusses how to set this up:
> >>
> >> http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
> >>
> >>
> >> On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau <
> rmannibu...@gmail.com>
> >> wrote:
> >>>
> >>> What's the plan to make idea supporting gradle on beam project? Do we
> >>> import the workaround mentionned in
> >>> https://youtrack.jetbrains.com/issue/IDEA-175172?
> >>> For the ones who didn't see this issue in action: idea will compile in
> >>> out/ instead of build/ and you will just miss all the resources you
> >>> need like some SPI registration which are used by all our registrar =>
> >>> no way to run tests in idea without hacking the configuration quite
> >>> deeply :(
> >>>
> >>> Romain Manni-Bucau
> >>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>>
> >>>
> >>> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot :
>  As a gradle beginner, I could not agree more !
>  +1
>  Etienne
>  On Monday, 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
> 
>  Hi all,
> 
>  I did multiple gradle build since last week and I would like to share
>  one of my concern: it's about the communities.
> 
>  If I think our users won't see any change for them due to Gradle build
>  (I think that most of our users will still use Maven with artifacts
>  provided by Gradle), I'm more concerned by the dev community and the
>  contribution.
> 
>  Maven is well known and straight forward for a large part of potential
>  contributors. I think we have to keep in mind that we still have to
> grow
>  up our contributors community.
> 
>  Today, maybe I'm wrong, but I have the feeling that gradle build is
> not
>  straight forward (build.gradle includes build_rules.gradle, gathering
>  all taks all together).
> 
>  I would like to add a task in the gradle "migration" process: simplify
>  the gradle structure and files, and document this.
> 
>  I know we already have a Jira about the documentation part, but I
> would
>  like to "polish" and use a clean structure for the Gradle resources.
> As
>  already quickly discussed, I think that having one gradle file per
> tasks
>  in the .gradle directory would be helpful.
> 
>  The goal is really to simplify the contribution.
> 
>  Do you agree if I add a Jira about "Gradle polish" ?
>  Thoughts ?
> 
>  Regards
>  JB
> 
>  On 07/04/2018 04:52, Scott Wegner wrote:
> 
>  Here's an end-of-day update on migration work:
> 
>  * Snapshot unsigned dailies and signed release builds are working
> (!!).
>  PR/5048 [1] merges changes from Luke's branch
>  * python precommit failing... will investigate python precommit
>  Monday
>  * All Precommits are gradle only
>  

Re: Gradle Status [April 6]

2018-04-10 Thread Jean-Baptiste Onofré
That's a very important issue for contribution. Up to now, I have used Maven
to set up IntelliJ (and it works just fine). If we remove the pom.xml,
we have to support Eclipse and IntelliJ "smoothly".


Let me try in IntelliJ.

Regards
JB

On 10/04/2018 15:21, Romain Manni-Bucau wrote:

You dont have issue due to the build setup with that option. I get:

avr. 10, 2018 3:20:10 PM
org.apache.beam.runners.direct.DirectTransformExecutor run
GRAVE: Error occurred within
org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
com.google.common.util.concurrent.ExecutionError:
java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy

?

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 15:13 GMT+02:00 Lukasz Cwik :

I have found that the simplest setup is to delegate the build/test actions
to Gradle. This allows you to run unit tests very easily and since its in
the same manner that Gradle would have, you know that if its passing it will
pass on the command line and on Jenkins. Here is one site that discusses how
to set this up:
http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html


On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau 
wrote:


What's the plan to make IDEA support Gradle on the Beam project? Do we
import the workaround mentioned in
https://youtrack.jetbrains.com/issue/IDEA-175172?
For those who haven't seen this issue in action: IDEA will compile into
out/ instead of build/, and you will miss all the resources you need,
such as the SPI registrations used by all our registrars => no way to
run tests in IDEA without hacking the configuration quite deeply :(
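
A minimal Java sketch, under stated assumptions, of why the missing resources matter: java.util.ServiceLoader discovers implementations through META-INF/services files on the classpath, so when those resource files are not copied next to the compiled classes, the lookup silently returns nothing. DemoRegistrar and SpiLookupSketch are hypothetical names used for illustration only; they are not Beam types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class SpiLookupSketch {
    /** Hypothetical SPI interface standing in for a registrar; not a real Beam type. */
    public interface DemoRegistrar {
        String name();
    }

    /** Collects the names of every implementation ServiceLoader can discover. */
    public static List<String> discover() {
        List<String> found = new ArrayList<>();
        // ServiceLoader reads META-INF/services/<interface binary name> from the classpath.
        for (DemoRegistrar registrar : ServiceLoader.load(DemoRegistrar.class)) {
            found.add(registrar.name());
        }
        return found;
    }

    public static void main(String[] args) {
        // No META-INF/services file ships with this sketch, so nothing is discovered,
        // which mirrors an output directory that lacks the resource files.
        System.out.println("registrars found: " + discover().size());
    }
}
```

An empty result here is not an error from ServiceLoader's point of view, which is why this kind of misconfiguration tends to surface later as a confusing test failure.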

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 10:08 GMT+02:00 Etienne Chauchot :

As a gradle beginner, I could not agree more !
+1
Etienne
On Monday, 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:

Hi all,

I have run multiple Gradle builds since last week and I would like to share
one of my concerns: it's about the communities.

While I think our users won't see any change for them due to the Gradle build
(I think that most of our users will still use Maven with artifacts
provided by Gradle), I'm more concerned about the dev community and
contributions.

Maven is well known and straightforward for a large part of potential
contributors. I think we have to keep in mind that we still have to grow
our contributor community.

Today, maybe I'm wrong, but I have the feeling that the Gradle build is not
straightforward (build.gradle includes build_rules.gradle, gathering
all tasks together).

I would like to add a task to the Gradle "migration" process: simplify
the Gradle structure and files, and document this.

I know we already have a Jira about the documentation part, but I would
like to "polish" and use a clean structure for the Gradle resources. As
already quickly discussed, I think that having one Gradle file per task
in the .gradle directory would be helpful.

The goal is really to simplify the contribution.

Do you agree if I add a Jira about "Gradle polish"?
Thoughts?

Regards
JB

On 07/04/2018 04:52, Scott Wegner wrote:

Here's an end-of-day update on migration work:

* Snapshot unsigned dailies and signed release builds are working (!!).
  PR/5048 [1] merges changes from Luke's branch
* python precommit failing... will investigate python precommit Monday
* All Precommits are gradle only
* All Postcommits except performance tests and Java_JDK_Versions_Test
  use gradle (after PR/5047 [2] merged)
* Nightly snapshot release using gradle is ready; needs PR/5048 to be
  merged before switching
* ValidatesRunner_Spark failing consistently; investigating

Thanks for another productive day of hacking. I'll pick up again on
Monday.

[1] https://github.com/apache/beam/pull/5048
[2] https://github.com/apache/beam/pull/5047


On Fri, Apr 6, 2018 at 11:24 AM Romain Manni-Bucau wrote:

 Why not build a zip per runner containing its stack, just point to
 that zip, and let Beam lazily load the runner:

 --runner=LazyRunner --lazyRunnerDir=... --lazyRunnerOptions=... (or
 fromSystemProperties() if it gets merged one day ;))
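
As a rough illustration of the lazy-loading idea, the following Java sketch builds a class loader over the jars found in a runner directory. It is only a sketch under stated assumptions: LazyRunnerDirSketch, jarUrls, loaderFor and the "runner-dir" default are hypothetical names, and no such LazyRunner exists in Beam as far as this thread shows.

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class LazyRunnerDirSketch {
    /** Lists the jar files directly inside dir as URLs; empty if dir is missing. */
    public static List<URL> jarUrls(File dir) {
        List<URL> urls = new ArrayList<>();
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        if (jars != null) { // listFiles returns null when dir is not a directory
            for (File jar : jars) {
                try {
                    urls.add(jar.toURI().toURL());
                } catch (MalformedURLException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return urls;
    }

    /** Builds a class loader over the runner directory, delegating to parent first. */
    public static URLClassLoader loaderFor(File dir, ClassLoader parent) {
        return new URLClassLoader(jarUrls(dir).toArray(new URL[0]), parent);
    }

    public static void main(String[] args) throws Exception {
        // "runner-dir" is a hypothetical default standing in for --lazyRunnerDir.
        File dir = new File(args.length > 0 ? args[0] : "runner-dir");
        try (URLClassLoader loader = loaderFor(dir, LazyRunnerDirSketch.class.getClassLoader())) {
            System.out.println("jars visible to the runner loader: " + loader.getURLs().length);
        }
    }
}
```

The runner class would then be resolved through this loader rather than the application classpath, which is what would keep each runner's dependency stack isolated.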

 On 6 Apr 2018 at 20:21, "Kenneth Knowles" wrote:

 I'm working on finding a solution for launching the Nexmark
 suite with each runner. This doesn't have to be done via Gradle,
 but we anyhow need built artifacts that don't require user
 classpath intervention.

 It looks to me like the examples are also missing this - they
 have separate configuration e.g. sparkRunnerPreCommit but that
 is overspecified compared to a free-form launching of a main()
 program with a runner profile.

 On Fri, Apr 6, 2018 at 11:09 AM Lukasz Cwik 

Re: Gradle Status [April 6]

2018-04-10 Thread Lukasz Cwik
Romain, I haven't seen that error. At the very top of your test execution
log it gives you the tasks that it is running, for example:
6:41:33 AM: Executing tasks ':beam-sdks-java-core:cleanTest
:beam-sdks-java-core:test --tests
"org.apache.beam.sdk.coders.AvroCoderTest.testAvroCoderEncoding"'...

Which task did it fail on for you?

On Tue, Apr 10, 2018 at 9:21 AM Romain Manni-Bucau 
wrote:

> You dont have issue due to the build setup with that option. I get:
>
> avr. 10, 2018 3:20:10 PM
> org.apache.beam.runners.direct.DirectTransformExecutor run
> GRAVE: Error occurred within
> org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
> com.google.common.util.concurrent.ExecutionError:
> java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy
>
> ?
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>
>
> 2018-04-10 15:13 GMT+02:00 Lukasz Cwik :
> > I have found that the simplest setup is to delegate the build/test
> actions
> > to Gradle. This allows you to run unit tests very easily and since its in
> > the same manner that Gradle would have, you know that if its passing it
> will
> > pass on the command line and on Jenkins. Here is one site that discusses
> how
> > to set this up:
> >
> http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
> >
> >
> > On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau <
> rmannibu...@gmail.com>
> > wrote:
> >>
> >> What's the plan to make idea supporting gradle on beam project? Do we
> >> import the workaround mentionned in
> >> https://youtrack.jetbrains.com/issue/IDEA-175172?
> >> For the ones who didn't see this issue in action: idea will compile in
> >> out/ instead of build/ and you will just miss all the resources you
> >> need like some SPI registration which are used by all our registrar =>
> >> no way to run tests in idea without hacking the configuration quite
> >> deeply :(
> >>
> >> Romain Manni-Bucau
> >> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>
> >>
> >> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot :
> >> > As a gradle beginner, I could not agree more !
> >> > +1
> >> > Etienne
> >> > On Monday, 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
> >> >
> >> > Hi all,
> >> >
> >> > I did multiple gradle build since last week and I would like to share
> >> > one of my concern: it's about the communities.
> >> >
> >> > If I think our users won't see any change for them due to Gradle build
> >> > (I think that most of our users will still use Maven with artifacts
> >> > provided by Gradle), I'm more concerned by the dev community and the
> >> > contribution.
> >> >
> >> > Maven is well known and straight forward for a large part of potential
> >> > contributors. I think we have to keep in mind that we still have to
> grow
> >> > up our contributors community.
> >> >
> >> > Today, maybe I'm wrong, but I have the feeling that gradle build is
> not
> >> > straight forward (build.gradle includes build_rules.gradle, gathering
> >> > all taks all together).
> >> >
> >> > I would like to add a task in the gradle "migration" process: simplify
> >> > the gradle structure and files, and document this.
> >> >
> >> > I know we already have a Jira about the documentation part, but I
> would
> >> > like to "polish" and use a clean structure for the Gradle resources.
> As
> >> > already quickly discussed, I think that having one gradle file per
> tasks
> >> > in the .gradle directory would be helpful.
> >> >
> >> > The goal is really to simplify the contribution.
> >> >
> >> > Do you agree if I add a Jira about "Gradle polish" ?
> >> > Thoughts ?
> >> >
> >> > Regards
> >> > JB
> >> >
> >> > On 07/04/2018 04:52, Scott Wegner wrote:
> >> >
> >> > Here's an end-of-day update on migration work:
> >> >
> >> > * Snapshot unsigned dailies and signed release builds are working
> (!!).
> >> > PR/5048 [1] merges changes from Luke's branch
> >> >* python precommit failing... will investigate python precommit
> >> > Monday
> >> > * All Precommits are gradle only
> >> > * All Postcommits except performance tests and Java_JDK_Versions_Test
> >> > use gradle (after PR/5047 [2] merged)
> >> > * Nightly snapshot release using gradle is ready; needs PR/5048 to be
> >> > merged before switching
> >> > * ValidatesRunner_Spark failing consistently; investigating
> >> >
> >> > Thanks for another productive day of hacking. I'll pick up again on
> >> > Monday.
> >> >
> >> > [1] https://github.com/apache/beam/pull/5048
> >> > [2] https://github.com/apache/beam/pull/5047
> >> >
> >> >
> >> > On Fri, Apr 6, 2018 at 11:24 AM Romain Manni-Bucau
> >> > > wrote:
> >> >
> >> > Why building a zip per runner which its stack and just pointing
> out
> >> > on that zip and let beam lazy load the runner:
> >> >
> >> > --runner=LazyRunner --lazyRunnerDir=... --lazyRunnerOptions=...
> 

Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-04-10 Thread Lukasz Cwik
Nightly snapshots have migrated to being produced via Gradle. Alan Myrvold
and a few others have been working on adding automated tests that validate
those nightly snapshots by running through the quickstarts available on the
website.

Please try out the nightly snapshot and report bugs as sub-tasks underneath
BEAM-3249 related to dependency issues you find.

On Tue, Apr 10, 2018 at 8:43 AM Ismaël Mejía  wrote:

> +1000 to Romain's point on dependencies. We have to pay obsessive
> attention to the consistency of the dependencies; this is critical for
> users, and we cannot radically change the produced artifacts or we risk
> breaking their applications.
>
>
> On Tue, Apr 10, 2018 at 6:56 AM, Romain Manni-Bucau
>  wrote:
> > Yes, but I never saw anyone grabbing the sources from dist in the Maven
> > world, whereas I did see people using the Maven dependency plugin to grab
> > the sources and the POM and rebuild the modules. I'm not saying it is the
> > best practice, but Beam will always be Maven for most Java users, so we
> > must be very careful about that.
> >
> > Personally, I only need dependencies to respect the provided/compile/test
> > scopes (not shadow, which corrupts a POM ;)).
> >
> > As an aside, JetBrains builds its plugin repository client with Gradle
> > and uploads to Bintray the same kind of POM that we have in the mentioned
> > branch (just GAV, no dependencies, etc.). It is fully broken on the
> > consumer side and requires users to bypass the dependency management
> > goodness and go to the sources to find all the build constraints
> > (Java-compatible version, packagings, ...). This is why I mentioned that
> > generating some build plugins is important and that keeping profiles
> > would not break consumers.
> >
> > On 10 Apr 2018 at 05:34, "Jean-Baptiste Onofré" wrote:
> >>
> >> Hi Luke,
> >>
> >> you are right, from an Apache perspective, the only required artifact is
> >> the source tarball on dist (which should be buildable).
> >>
> >> There is no requirement for the ones on Maven; they are more a
> >> convenience for our users.
> >>
> >> Regards
> >> JB
> >>
> >> On 04/09/2018 09:56 PM, Lukasz Cwik wrote:
> >> > Romain, I was under the impression that the source tar ball that is
> >> > uploaded to www.apache.org/dist/ is required to be buildable and is a
> >> > separate deliverable from the artifacts (jars
> >> > (source/test/javadoc/...)/poms) uploaded to
> >> > https://repository.apache.org/service/local/staging/deploy/maven2.
> >> >
> >> > The source tar ball uploaded to www.apache.org/dist/ will contain the
> >> > gradle build files allowing one to reproduce the artifacts (jars
> >> > (source/test/javadoc)/poms).
> >> >
> >> > On Mon, Apr 9, 2018 at 3:44 PM Romain Manni-Bucau <
> rmannibu...@gmail.com
> >> > > wrote:
> >> >
> >> >
> >> >
> >> > On 9 Apr 2018 at 16:06, "Lukasz Cwik" wrote:
> >> >
> >> >
> >> >
> >> > On Mon, Apr 9, 2018 at 10:02 AM Romain Manni-Bucau
> >> > > wrote:
> >> >
> >> > I got the same with that PR applied and the previous
> >> > command. Is using
> >> > your fork needed?
> >> >
> >> > No, you can also use https://github.com/apache/beam/pull/5048
> >> >
> >> >
> >> > Is there any PR to import it?
> >> >
> >> > Yes, https://github.com/apache/beam/pull/5048
> >> >
> >> >
> >> >
> >> > OK, so it doesn't work: it generates a POM with neither parent nor
> >> > dependencies, which is a bare minimum but not enough, since exploding
> >> > the sources jar and running the POM should build a valid jar.
> >> >
> >> >
> >> > In any case, to come back to the actual topic, master is not ready to
> >> > be released with that yet.
> >> >
> >> >
> >> > Romain Manni-Bucau
> >> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >> >
> >> >
> >> > 2018-04-09 15:56 GMT+02:00 Lukasz Cwik  >> > >:
> >> > > Romain,
> >> > > The gradle based release process has an open PR in
> >> > > https://github.com/apache/beam/pull/5048 to merge to
> >> > master.
> >> > > I thought you were running the commands from
> >> > > https://github.com/lukecwik/incubator-beam/tree/gradle
> >> > >
> >> > > On Mon, Apr 9, 2018 at 9:13 AM Romain Manni-Bucau
> >> > >
> >> > > wrote:
> >> > >>
> >> > >> @Lukasz: same with gradlew and release option, pom is
> >> > empty (no
> >> > parent, no
> >> > >> 

Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
Don't you hit issues due to the build setup with that option? I get:

avr. 10, 2018 3:20:10 PM
org.apache.beam.runners.direct.DirectTransformExecutor run
GRAVE: Error occurred within
org.apache.beam.runners.direct.DirectTransformExecutor@66761b7a
com.google.common.util.concurrent.ExecutionError:
java.lang.NoClassDefFoundError: net/bytebuddy/NamingStrategy

?
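
One way to narrow down an error like the one above is to check whether the class is visible to the class loader that runs the test. The following is a minimal Java sketch, assuming nothing about Beam itself; ClasspathCheckSketch and isOnClasspath are hypothetical names, and the class name is simply the one from the NoClassDefFoundError with '/' replaced by '.'.

```java
public class ClasspathCheckSketch {
    /** Returns true when the named class can be located, without initializing it. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "net.bytebuddy.NamingStrategy";
        System.out.println(name + " present: " + isOnClasspath(name));
    }
}
```

Running this with the same classpath as the failing test would show whether the dependency is missing at runtime or only failing to link.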

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 15:13 GMT+02:00 Lukasz Cwik :
> I have found that the simplest setup is to delegate the build/test actions
> to Gradle. This allows you to run unit tests very easily and since its in
> the same manner that Gradle would have, you know that if its passing it will
> pass on the command line and on Jenkins. Here is one site that discusses how
> to set this up:
> http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html
>
>
> On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau 
> wrote:
>>
>> What's the plan to make idea supporting gradle on beam project? Do we
>> import the workaround mentionned in
>> https://youtrack.jetbrains.com/issue/IDEA-175172?
>> For the ones who didn't see this issue in action: idea will compile in
>> out/ instead of build/ and you will just miss all the resources you
>> need like some SPI registration which are used by all our registrar =>
>> no way to run tests in idea without hacking the configuration quite
>> deeply :(
>>
>> Romain Manni-Bucau
>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>
>>
>> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot :
>> > As a gradle beginner, I could not agree more !
>> > +1
>> > Etienne
>> > On Monday, 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
>> >
>> > Hi all,
>> >
>> > I did multiple gradle build since last week and I would like to share
>> > one of my concern: it's about the communities.
>> >
>> > If I think our users won't see any change for them due to Gradle build
>> > (I think that most of our users will still use Maven with artifacts
>> > provided by Gradle), I'm more concerned by the dev community and the
>> > contribution.
>> >
>> > Maven is well known and straight forward for a large part of potential
>> > contributors. I think we have to keep in mind that we still have to grow
>> > up our contributors community.
>> >
>> > Today, maybe I'm wrong, but I have the feeling that gradle build is not
>> > straight forward (build.gradle includes build_rules.gradle, gathering
>> > all taks all together).
>> >
>> > I would like to add a task in the gradle "migration" process: simplify
>> > the gradle structure and files, and document this.
>> >
>> > I know we already have a Jira about the documentation part, but I would
>> > like to "polish" and use a clean structure for the Gradle resources. As
>> > already quickly discussed, I think that having one gradle file per tasks
>> > in the .gradle directory would be helpful.
>> >
>> > The goal is really to simplify the contribution.
>> >
>> > Do you agree if I add a Jira about "Gradle polish" ?
>> > Thoughts ?
>> >
>> > Regards
>> > JB
>> >
>> > On 07/04/2018 04:52, Scott Wegner wrote:
>> >
>> > Here's an end-of-day update on migration work:
>> >
>> > * Snapshot unsigned dailies and signed release builds are working (!!).
>> > PR/5048 [1] merges changes from Luke's branch
>> >* python precommit failing... will investigate python precommit
>> > Monday
>> > * All Precommits are gradle only
>> > * All Postcommits except performance tests and Java_JDK_Versions_Test
>> > use gradle (after PR/5047 [2] merged)
>> > * Nightly snapshot release using gradle is ready; needs PR/5048 to be
>> > merged before switching
>> > * ValidatesRunner_Spark failing consistently; investigating
>> >
>> > Thanks for another productive day of hacking. I'll pick up again on
>> > Monday.
>> >
>> > [1] https://github.com/apache/beam/pull/5048
>> > [2] https://github.com/apache/beam/pull/5047
>> >
>> >
>> > On Fri, Apr 6, 2018 at 11:24 AM Romain Manni-Bucau
>> > > wrote:
>> >
>> > Why building a zip per runner which its stack and just pointing out
>> > on that zip and let beam lazy load the runner:
>> >
>> > --runner=LazyRunner --lazyRunnerDir=... --lazyRunnerOptions=... (or
>> > the fromSystemProperties() if it gets merged a day ;))
>> >
>> > On 6 Apr 2018 at 20:21, "Kenneth Knowles" wrote:
>> >
>> > I'm working on finding a solution for launching the Nexmark
>> > suite with each runner. This doesn't have to be done via Gradle,
>> > but we anyhow need built artifacts that don't require user
>> > classpath intervention.
>> >
>> > It looks to me like the examples are also missing this - they
>> > have separate configuration e.g. sparkRunnerPreCommit but that
>> > is overspecified compared to a 

Re: Gradle Status [April 6]

2018-04-10 Thread Lukasz Cwik
I have found that the simplest setup is to delegate the build/test actions
to Gradle. This allows you to run unit tests very easily, and since they run
in the same manner that Gradle would run them, you know that if a test is
passing it will pass on the command line and on Jenkins. Here is one site
that discusses how to set this up:
http://mrhaki.blogspot.com/2016/11/gradle-goodness-delegate-build-and-run.html


On Tue, Apr 10, 2018 at 8:45 AM Romain Manni-Bucau 
wrote:

> What's the plan to make IDEA support Gradle on the Beam project? Do we
> import the workaround mentioned in
> https://youtrack.jetbrains.com/issue/IDEA-175172?
> For those who haven't seen this issue in action: IDEA will compile into
> out/ instead of build/, and you will miss all the resources you need,
> such as the SPI registrations used by all our registrars => no way to
> run tests in IDEA without hacking the configuration quite deeply :(
>
> Romain Manni-Bucau
> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>
>
> 2018-04-10 10:08 GMT+02:00 Etienne Chauchot :
> > As a gradle beginner, I could not agree more !
> > +1
> > Etienne
> > On Monday, 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
> >
> > Hi all,
> >
> > I did multiple gradle build since last week and I would like to share
> > one of my concern: it's about the communities.
> >
> > If I think our users won't see any change for them due to Gradle build
> > (I think that most of our users will still use Maven with artifacts
> > provided by Gradle), I'm more concerned by the dev community and the
> > contribution.
> >
> > Maven is well known and straight forward for a large part of potential
> > contributors. I think we have to keep in mind that we still have to grow
> > up our contributors community.
> >
> > Today, maybe I'm wrong, but I have the feeling that gradle build is not
> > straight forward (build.gradle includes build_rules.gradle, gathering
> > all taks all together).
> >
> > I would like to add a task in the gradle "migration" process: simplify
> > the gradle structure and files, and document this.
> >
> > I know we already have a Jira about the documentation part, but I would
> > like to "polish" and use a clean structure for the Gradle resources. As
> > already quickly discussed, I think that having one gradle file per tasks
> > in the .gradle directory would be helpful.
> >
> > The goal is really to simplify the contribution.
> >
> > Do you agree if I add a Jira about "Gradle polish" ?
> > Thoughts ?
> >
> > Regards
> > JB
> >
> > On 07/04/2018 04:52, Scott Wegner wrote:
> >
> > Here's an end-of-day update on migration work:
> >
> > * Snapshot unsigned dailies and signed release builds are working (!!).
> > PR/5048 [1] merges changes from Luke's branch
> >    * python precommit failing... will investigate python precommit Monday
> > * All Precommits are gradle only
> > * All Postcommits except performance tests and Java_JDK_Versions_Test
> > use gradle (after PR/5047 [2] merged)
> > * Nightly snapshot release using gradle is ready; needs PR/5048 to be
> > merged before switching
> > * ValidatesRunner_Spark failing consistently; investigating
> >
> > Thanks for another productive day of hacking. I'll pick up again on Monday.
> >
> > [1] https://github.com/apache/beam/pull/5048
> > [2] https://github.com/apache/beam/pull/5047
> >
> >
> > On Fri, Apr 6, 2018 at 11:24 AM Romain Manni-Bucau wrote:
> >
> > Why not build a zip per runner with its stack, point to that zip,
> > and let Beam lazy-load the runner:
> >
> > --runner=LazyRunner --lazyRunnerDir=... --lazyRunnerOptions=... (or
> > fromSystemProperties() if it gets merged one day ;))
> >
> > On 6 Apr 2018 20:21, "Kenneth Knowles" wrote:
> >
> > I'm working on finding a solution for launching the Nexmark
> > suite with each runner. This doesn't have to be done via Gradle,
> > but we anyhow need built artifacts that don't require user
> > classpath intervention.
> >
> > It looks to me like the examples are also missing this - they
> > have separate configuration e.g. sparkRunnerPreCommit but that
> > is overspecified compared to a free-form launching of a main()
> > program with a runner profile.
> >
> > On Fri, Apr 6, 2018 at 11:09 AM Lukasz Cwik wrote:
> >
> > Romain, are you talking about the profiles that exist as
> > part of the archetype examples?
> >
> > If so, then those still exist and haven't been changed. If
> > not, can you provide a link to the profile in a pom file to
> > be clearer?
> >
> > On Fri, Apr 6, 2018 at 12:40 PM Romain Manni-Bucau
> > 

Re: Gradle Status [April 6]

2018-04-10 Thread Romain Manni-Bucau
What's the plan to make IDEA support Gradle on the Beam project? Do we
import the workaround mentioned in
https://youtrack.jetbrains.com/issue/IDEA-175172?
For those who haven't seen this issue in action: IDEA compiles into
out/ instead of build/, so you miss all the resources you need, such as
the SPI registrations used by all our registrars. As a result, there is
no way to run tests in IDEA without hacking the configuration quite
deeply :(

Romain Manni-Bucau
@rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book


2018-04-10 10:08 GMT+02:00 Etienne Chauchot :
> As a gradle beginner, I could not agree more !
> +1
> Etienne
> On Monday 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
>
> Hi all,
>
> I have run multiple Gradle builds since last week and I would like to share
> one of my concerns: it's about the communities.
>
> While I think our users won't see any change due to the Gradle build
> (I think that most of our users will still use Maven with artifacts
> provided by Gradle), I'm more concerned about the dev community and
> contributions.
>
> Maven is well known and straightforward for a large part of potential
> contributors. I think we have to keep in mind that we still have to grow
> our contributor community.
>
> Today, maybe I'm wrong, but I have the feeling that the Gradle build is not
> straightforward (build.gradle includes build_rules.gradle, gathering
> all tasks together).
>
> I would like to add a task in the gradle "migration" process: simplify
> the gradle structure and files, and document this.
>
> I know we already have a Jira about the documentation part, but I would
> like to "polish" and use a clean structure for the Gradle resources. As
> already quickly discussed, I think that having one Gradle file per task
> in the .gradle directory would be helpful.
>
> The goal is really to simplify the contribution.
>
> Do you agree if I add a Jira about "Gradle polish" ?
> Thoughts ?
>
> Regards
> JB
>
> On 07/04/2018 04:52, Scott Wegner wrote:
>
> Here's an end-of-day update on migration work:
>
> * Snapshot unsigned dailies and signed release builds are working (!!).
> PR/5048 [1] merges changes from Luke's branch
>    * python precommit failing... will investigate python precommit Monday
> * All Precommits are gradle only
> * All Postcommits except performance tests and Java_JDK_Versions_Test
> use gradle (after PR/5047 [2] merged)
> * Nightly snapshot release using gradle is ready; needs PR/5048 to be
> merged before switching
> * ValidatesRunner_Spark failing consistently; investigating
>
> Thanks for another productive day of hacking. I'll pick up again on Monday.
>
> [1] https://github.com/apache/beam/pull/5048
> [2] https://github.com/apache/beam/pull/5047
>
>
> On Fri, Apr 6, 2018 at 11:24 AM Romain Manni-Bucau wrote:
>
> Why not build a zip per runner with its stack, point to that zip,
> and let Beam lazy-load the runner:
>
> --runner=LazyRunner --lazyRunnerDir=... --lazyRunnerOptions=... (or
> fromSystemProperties() if it gets merged one day ;))
>
> On 6 Apr 2018 20:21, "Kenneth Knowles" wrote:
>
> I'm working on finding a solution for launching the Nexmark
> suite with each runner. This doesn't have to be done via Gradle,
> but we anyhow need built artifacts that don't require user
> classpath intervention.
>
> It looks to me like the examples are also missing this - they
> have separate configuration e.g. sparkRunnerPreCommit but that
> is overspecified compared to a free-form launching of a main()
> program with a runner profile.
>
> On Fri, Apr 6, 2018 at 11:09 AM Lukasz Cwik wrote:
>
> Romain, are you talking about the profiles that exist as
> part of the archetype examples?
>
> If so, then those still exist and haven't been changed. If
> not, can you provide a link to the profile in a pom file to
> be clearer?
>
> On Fri, Apr 6, 2018 at 12:40 PM Romain Manni-Bucau wrote:
>
> Hi Scott,
>
> is it right that 2 doesn't handle the hierarchy anymore
> and that it doesn't handle profiles for runners as it
> currently does with Maven?
>
>
> Romain Manni-Bucau
> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
>
> 

Re: [PROPOSAL] Preparing 2.5.0 release next week

2018-04-10 Thread Ismaël Mejía
+1000 to Romain's point on dependencies. We have to pay obsessive
attention to the consistency of the dependencies: this is critical for
users, and we cannot radically change the produced artifacts or we risk
breaking their applications.


On Tue, Apr 10, 2018 at 6:56 AM, Romain Manni-Bucau wrote:
> Yes, but I have never seen anyone grab the sources from dist in the Maven
> world, whereas I have seen people use the Maven dependency plugin to grab
> the sources and the pom and rebuild the modules. I'm not saying it is the
> best practice, but Beam will always be Maven for most Java users, so we must
> be very careful about that.
>
> Personally, I only need dependencies to respect the provided/compile/test
> scopes (not shadow, which corrupts a pom ;)).
>
> As an anecdote, JetBrains builds its plugin repository client with Gradle
> and uploads to Bintray the same kind of pom that we have in the mentioned
> branch (just the GAV, no dependencies, etc.). It is fully broken on the
> consumer side and requires users to bypass the dependency management
> goodness and go to the sources to find all the build constraints (compatible
> Java version, packagings, ...). This is why I mentioned that generating some
> build plugins is important and that keeping profiles would not break
> consumers.
>
> On 10 Apr 2018 05:34, "Jean-Baptiste Onofré" wrote:
>>
>> Hi Luke,
>>
>> You are right: from an Apache perspective, the only required artifact is
>> the source tarball on dist (which should be buildable).
>>
>> There is no requirement for the ones on Maven; they are more a convenience
>> for our users.
>>
>> Regards
>> JB
>>
>> On 04/09/2018 09:56 PM, Lukasz Cwik wrote:
>> > Romain, I was under the impression that the source tarball that is
>> > uploaded to www.apache.org/dist/ is required to be buildable
>> > and is a separate deliverable from the artifacts (jars
>> > (source/test/javadoc/...)/poms) uploaded to
>> > https://repository.apache.org/service/local/staging/deploy/maven2.
>> >
>> > The source tarball uploaded to www.apache.org/dist/ will contain the
>> > Gradle build files, allowing one to reproduce the artifacts (jars
>> > (source/test/javadoc)/poms).
>> >
>> > On Mon, Apr 9, 2018 at 3:44 PM Romain Manni-Bucau wrote:
>> >
>> >
>> >
>> > On 9 Apr 2018 16:06, "Lukasz Cwik" wrote:
>> >
>> >
>> >
>> > On Mon, Apr 9, 2018 at 10:02 AM Romain Manni-Bucau wrote:
>> >
>> > I got the same with that PR applied and the previous command. Is using
>> > your fork needed?
>> >
>> > No, you can also use https://github.com/apache/beam/pull/5048
>> >
>> >
>> > Is there any PR to import it?
>> >
>> > Yes, https://github.com/apache/beam/pull/5048
>> >
>> >
>> >
>> > OK, so it doesn't work: it generates a pom with neither parent nor
>> > dependencies, which is a bare minimum but not enough, since exploding
>> > the sources jar and running the pom should build a valid jar.
>> >
>> >
>> > In any case, master is not ready to be released with that yet, to come
>> > back to the actual topic.
>> >
>> >
>> > Romain Manni-Bucau
>> > @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >
>> >
>> > 2018-04-09 15:56 GMT+02:00 Lukasz Cwik:
>> > > Romain,
>> > > The gradle based release process has an open PR in
>> > > https://github.com/apache/beam/pull/5048 to merge to master.
>> > > I thought you were running the commands from
>> > > https://github.com/lukecwik/incubator-beam/tree/gradle
>> > >
>> > > On Mon, Apr 9, 2018 at 9:13 AM Romain Manni-Bucau wrote:
>> > >>
>> > >> @Lukasz: same with gradlew and the release option: the pom is empty (no
>> > >> parent, no dependencies, no description anymore, which is needed since
>> > >> Central poms use it for doc purposes).
>> > >>
>> > >>
>> > >> Romain Manni-Bucau
>> > >> @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
>> > >>
>> > >> 2018-04-09 15:00 GMT+02:00 Reuven Lax:
>> > >>>
>> > >>> Is everything needed merged into master?
>> > >>>
>> > >>> If so, why don't we try doing it with Gradle, but "fail fast" back to
>> > >>> Maven if 

Re: DirectRunner in test - await completion of workers threads?

2018-04-10 Thread Ismaël Mejía
It seems there is still an issue with teardown not being called in
failed tasks; I just created BEAM-4040 to track it.

On Thu, Apr 5, 2018 at 4:45 PM, Tim Robertson  wrote:
> Will do - I'll report the result on https://github.com/apache/beam/pull/4905
>
> On Thu, Apr 5, 2018 at 11:45 AM, Ismaël Mejía  wrote:
>>
>> For info, Romain's PR was merged today. Tim, can you confirm whether this
>> fixes the issue?
>>
>> On Sun, Apr 1, 2018 at 9:21 PM, Tim Robertson 
>> wrote:
>> > Thanks all.
>> >
>> > I went with what I outlined above, which you can see in this test.
>> >
>> > https://github.com/timrobertson100/beam/blob/BEAM-3848/sdks/java/io/solr/src/test/java/org/apache/beam/sdk/io/solr/SolrIOTest.java#L285
>> >
>> > That forms part of this PR https://github.com/apache/beam/pull/4956
>> >
>> > I'll monitor Romain's PR and back it out when appropriate.
>> >
>> >
>> >
>> >
>> >
>> > On Sun, Apr 1, 2018 at 8:20 PM, Jean-Baptiste Onofré 
>> > wrote:
>> >>
>> >> Indeed. It's exactly what Romain's PR is about.
>> >>
>> >> Regards
>> >> JB
>> >> On 1 Apr 2018, at 19:33, Reuven Lax wrote:
>> >>>
>> >>> Correct - teardown is currently run in the direct runner, but
>> >>> asynchronously. I believe Romain's pending PRs should solve this for
>> >>> your use case.
>> >>>
>> >>> On Sun, Apr 1, 2018 at 3:13 AM Tim Robertson <timrobertson...@gmail.com>
>> >>> wrote:
>> 
>>  Thanks for confirming Romain - also for the very fast reply!
>> 
>>  I'll continue with the workaround and reference BEAM-3409 inline as
>>  justification.
>>  I'm trying to wrap this up before travel next week, but if I get a
>>  chance I'll try and run this scenario (BEAM-3848) with your patch.
>> 
>> 
>> 
>>  On Sun, Apr 1, 2018 at 12:05 PM, Romain Manni-Bucau
>>   wrote:
>> >
>> > Hi
>> >
>> > I have the same blocker and created
>> >
>> > https://github.com/apache/beam/pull/4790 and
>> > https://github.com/apache/beam/pull/4965 to solve part of it
>> >
>> >
>> >
>> > On 1 Apr 2018 11:35, "Tim Robertson" <timrobertson...@gmail.com> wrote:
>> >
>> > Hi devs
>> >
>> > I'm working on SolrIO tests for failure scenarios (i.e. an exception
>> > will come out of the pipeline execution). I see that the exception is
>> > surfaced to the driver while "direct-runner-worker" threads are still
>> > running. This causes issues because:
>> >
>> >   1. The Solr tests do thread leak detection, and a solrClient.close()
>> > is what removes the object
>> >   2. @Teardown, which is what would close the solrClient, is not
>> > necessarily called
>> >
>> > I can unregister all the solrClients that have been spawned. However, I
>> > have seen race conditions where there are still threads running, creating
>> > and registering clients. I need to somehow ensure that all workers related
>> > to the pipeline execution are indeed finished, so that no new clients are
>> > created after the first exception is passed up.
>> >
>> > Currently I have this (pseudo code) which works, but I suspect someone
>> > can suggest a better approach:
>> >
>> > // store the state of clients registered for object leak check
>> > Set<Object> existingClients = registeredSolrClients();
>> > try {
>> >   pipeline.run();
>> > } catch (Pipeline.PipelineExecutionException e) {
>> >   // Hack: await all bundle workers completing
>> >   while (namedThreadStillExists("direct-runner-worker")) {
>> >     Thread.sleep(100);
>> >   }
>> >
>> >   // remove all solrClients created in this execution only
>> >   // since the teardown may not have done so
>> >   for (Object o : ObjectReleaseTracker.OBJECTS.keySet()) {
>> >     if (o instanceof SolrClient && !existingClients.contains(o)) {
>> >       ObjectReleaseTracker.release(o);
>> >     }
>> >   }
>> >
>> >   // now we can do our assertions
>> >   expectedLogs.verifyWarn(
>> >       String.format(SolrIO.Write.WriteFn.RETRY_ATTEMPT_LOG, 1));
>> > }
>> >
>> >
>> > Please do point out the obvious if I am missing it - I am a newbie here...
>> >
>> > Thank you all very much,
>> > Tim
>> > ( timrobertson...@gmail.com on the slack apache/beam channel)
>> >
>> >
>> >
>> 
>> >
>
>
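
Editorial note: the pseudo code quoted in this thread calls a helper named
namedThreadStillExists that is never shown. A minimal sketch of what such a
helper could look like follows, using only the JDK. The class name
WorkerThreadWait, the method awaitThreadsGone and the timeout parameter are
assumptions made for illustration; they are not part of Beam, Solr, or the
original test. A bounded wait replaces the open-ended loop so that a stuck
worker fails the test instead of hanging it.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; none of these names come from Beam or Solr. */
final class WorkerThreadWait {

  /** Returns true if any live thread has a name containing the fragment. */
  static boolean namedThreadStillExists(String nameFragment) {
    // Thread.getAllStackTraces() returns a snapshot keyed by every live thread.
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      if (t.isAlive() && t.getName().contains(nameFragment)) {
        return true;
      }
    }
    return false;
  }

  /**
   * Polls until no matching thread remains or the timeout elapses.
   * Returns true if all matching threads are gone, false on timeout
   * or if the calling thread is interrupted while waiting.
   */
  static boolean awaitThreadsGone(String nameFragment, long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (namedThreadStillExists(nameFragment)) {
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }
}
```

This only observes a snapshot of the JVM's threads: it cannot stop a runner
from starting new workers after the check, so it remains a workaround of the
same kind as the one described in the thread, not a replacement for the
runner awaiting its own workers.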


Re: Gradle Status [April 6]

2018-04-10 Thread Etienne Chauchot
As a gradle beginner, I could not agree more ! 
+1
Etienne
On Monday 09 April 2018 at 18:47 +0200, Jean-Baptiste Onofré wrote:
> Hi all,
> 
> I have run multiple Gradle builds since last week and I would like to share
> one of my concerns: it's about the communities.
>
> While I think our users won't see any change due to the Gradle build
> (I think that most of our users will still use Maven with artifacts
> provided by Gradle), I'm more concerned about the dev community and
> contributions.
>
> Maven is well known and straightforward for a large part of potential
> contributors. I think we have to keep in mind that we still have to grow
> our contributor community.
>
> Today, maybe I'm wrong, but I have the feeling that the Gradle build is not
> straightforward (build.gradle includes build_rules.gradle, gathering
> all tasks together).
> 
> I would like to add a task in the gradle "migration" process: simplify 
> the gradle structure and files, and document this.
> 
> I know we already have a Jira about the documentation part, but I would
> like to "polish" and use a clean structure for the Gradle resources. As
> already quickly discussed, I think that having one Gradle file per task
> in the .gradle directory would be helpful.
> 
> The goal is really to simplify the contribution.
> 
> Do you agree if I add a Jira about "Gradle polish" ?
> Thoughts ?
> 
> Regards
> JB
> 
> On 07/04/2018 04:52, Scott Wegner wrote:
> > 
> > Here's an end-of-day update on migration work:
> > 
> > * Snapshot unsigned dailies and signed release builds are working (!!). 
> > PR/5048 [1] merges changes from Luke's branch
> >    * python precommit failing... will investigate python precommit Monday
> > * All Precommits are gradle only
> > * All Postcommits except performance tests and Java_JDK_Versions_Test  
> > use gradle (after PR/5047 [2] merged)
> > * Nightly snapshot release using gradle is ready; needs PR/5048 to be 
> > merged before switching
> > * ValidatesRunner_Spark failing consistently; investigating
> > 
> > Thanks for another productive day of hacking. I'll pick up again on Monday.
> > 
> > [1] https://github.com/apache/beam/pull/5048
> > [2] https://github.com/apache/beam/pull/5047
> > 
> > 
> > On Fri, Apr 6, 2018 at 11:24 AM Romain Manni-Bucau wrote:
> >
> > Why not build a zip per runner with its stack, point to that zip,
> > and let Beam lazy-load the runner:
> >
> > --runner=LazyRunner --lazyRunnerDir=... --lazyRunnerOptions=... (or
> > fromSystemProperties() if it gets merged one day ;))
> >
> > On 6 Apr 2018 20:21, "Kenneth Knowles" <k...@google.com> wrote:
> > 
> > I'm working on finding a solution for launching the Nexmark
> > suite with each runner. This doesn't have to be done via Gradle,
> > but we anyhow need built artifacts that don't require user
> > classpath intervention.
> > 
> > It looks to me like the examples are also missing this - they
> > have separate configuration e.g. sparkRunnerPreCommit but that
> > is overspecified compared to a free-form launching of a main()
> > program with a runner profile.
> > 
> > On Fri, Apr 6, 2018 at 11:09 AM Lukasz Cwik <lc...@google.com> wrote:
> > 
> > Romain, are you talking about the profiles that exist as
> > part of the archetype examples?
> > 
> > If so, then those still exist and haven't been changed. If
> > not, can you provide a link to the profile in a pom file to
> > be clearer?
> > 
> > On Fri, Apr 6, 2018 at 12:40 PM Romain Manni-Bucau wrote:
> >
> > Hi Scott,
> >
> > is it right that 2 doesn't handle the hierarchy anymore
> > and that it doesn't handle profiles for runners as it
> > currently does with Maven?
> > 
> > 
> > Romain Manni-Bucau
> > @rmannibucau | Blog | Old Blog | Github | LinkedIn | Book
> > 
> > 
> > 
> > 2018-04-06 18:32 GMT+02:00 Scott Wegner:
> > 
> > I wanted to start a thread to summarize the current
> > state of Gradle migration. We've made lots of good
> > progress so far this week. Here's the status from
> > what I can tell-- please add or correct anything I