Re: [DISCUSS] Beam public roadmap

2018-10-19 Thread Ahmet Altay
I looked at #6718, and I think this is great as a starting point and not just a
mock. I particularly like that:
- It divides the roadmap along major component areas (SDKs, runners,
portability). This is good because (a) it provides a complete top-down
picture and (b) it allows groups of people working in these areas to build
their own mini-roadmaps. This makes sense to me because the people
with the most context in those components likely already have some
vision of the future of those components and are already
working towards realizing it. Now they can share it with the rest of the
community and users in a structured way.
- There is an index page that pulls major bits
from each individual roadmap and provides a coherent list of where the
project is going. It would be very easy for users to just look at this page
and get a sense of where the project is going.

I believe this breakdown makes it easier for most folks in the
community to participate in the process of building a roadmap. In my
opinion, we can merge Kenn's _mock_ and ask people to start filling in the
areas they care about.

Ahmet

On Wed, Oct 17, 2018 at 7:23 AM, Kenneth Knowles  wrote:

> I mocked up a little something on https://github.com/apache/beam/pull/6718
> .
>
> Kenn
>
> On Sun, Oct 14, 2018 at 5:33 PM Thomas Weise  wrote:
>
>> Indeed, our current in-progress subsection isn't visible enough. It is
>> also too coarse grained. Perhaps we can replace it with a list of current
>> and proposed initiatives?
>>
>> I could see the index live on the web site, but would prefer individual,
>> per-initiative pages to live on the wiki. That way they are easy to
>> maintain by respective contributors.
>>
>> Thanks
>>
>> On Fri, Oct 12, 2018 at 8:06 PM Kenneth Knowles  wrote:
>>
>>> I think we can easily steer clear of those concerns. It should not look
>>> like a company's roadmap. This is just a term that users search for and ask
>>> for. It might be an incremental improvement on
>>> https://beam.apache.org/contribute/#works-in-progress to present it more
>>> for users, to just give them a picture of the trajectory. For example,
>>> Beam Python on Flink
>>> would probably be of considerable interest but it is buried at
>>> https://beam.apache.org/contribute/portability/#status.
>>>
>>> Kenn
>>>
>>> On Fri, Oct 12, 2018 at 6:49 PM Thomas Weise  wrote:
>>>
 As I understand it the term "roadmap" is not favored. It may convey the
 impression of an outside entity that controls what is being worked on and
 when. At least in theory contributions are volunteer work and individuals
 decide what they take up. There are projects that have a "list of
 initiatives" or "improvement proposals" that are either in idea phase or
 ongoing. Those provide an idea of what is on the radar and perhaps that is
 sufficient for those looking for the overall direction?


 On Fri, Oct 12, 2018 at 3:08 PM Kenneth Knowles 
 wrote:

> Did some searching about to see what other projects have done. Most
> OSS projects with open governance don't actually have such a thing AFAICT.
> Here are some from various [types of] projects. Please contribute links
> for any project you can think of that might be interesting examples.
>
> My personal favorite for readability and content is Bazel. It does not
> do timelines, but says what they are most focused on. It has fewer,
> larger items than our "Ongoing Projects" section. Then some breakouts into
> roadmaps for sub-bits.
>
> Apache Flink (roadmap doc is stale, FLIPs nice and readable though)
>  - https://cwiki.apache.org/confluence/display/FLINK/Flink+Release+and+Feature+Plan
>  - https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>
> Apache Spark (no roadmap doc I could find, SPIPs not in real readable
> format):
>  - https://spark.apache.org/improvement-proposals.html
>
> Apache Apex
>  - http://apex.apache.org/roadmap.html
>
> Apache Calcite Avatica
>  - https://calcite.apache.org/avatica/docs/roadmap.html
>
> Apache Kafka
>  - https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan
>
> Tensorflow
>  - https://www.tensorflow.org/community/roadmap
>
> Kubernetes
>  - https://github.com/kubernetes/kubernetes/milestones
>
> Firefox
>  - https://wiki.mozilla.org/Firefox/Roadmap
>
> Servo
>  - https://github.com/servo/servo/wiki/Roadmap
>
> Bazel
>  - https://bazel.build/roadmap.html
>
> Kenn
>
> On Fri, Oct 12, 2018 at 10:34 AM Tim Robertson <
> timrobertson...@gmail.com> wrote:
>
>> Thanks Kenn,
>>
>> I think this is a very good idea.
>>
>> My preference would be part of the website 

Re: Docker missing on Beam15

2018-10-19 Thread Yifan Zou
I got "Failed to restart docker.service: Interactive authentication required"
while trying to restart docker on beam15.
Does anyone have permission to do that? Or do we need to ask Apache Infra
for help?

Thanks.
Yifan

On Fri, Oct 19, 2018 at 2:51 PM Ankur Goenka  wrote:

> Hi,
>
> Can we restart docker as it seems to have fixed the issue for others
> https://github.com/moby/moby/issues/31849 ?
>
> Thanks,
> Ankur
>
> On Fri, Oct 19, 2018 at 1:11 PM Yifan Zou  wrote:
>
>> Hi,
>>
>> The docker has been installed on all Jenkins VMs. The image build process
>> was interrupted by a grpc connection issue.
>>
>> *11:02:12* Starting process 'command 'docker''. Working directory: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_VR_Flink/src/sdks/python/container/build/docker Command: docker build --no-cache -t jenkins-docker-apache.bintray.io/beam/python:latest .
>> *11:02:12* Successfully started process 'command 'docker''
>> *11:02:12* Sending build context to Docker daemon  17.65MB
>> *11:02:12* Step 1/9 : FROM python:2-stretch
>> *11:02:12*  ---> 3c43a5d4034a
>> *11:02:12* Step 2/9 : MAINTAINER "Apache Beam "
>> *11:02:12*  ---> Running in f86bad9aef9c
>> *11:02:12*  ---> 610a5dec907e
>> *11:02:12* Removing intermediate container f86bad9aef9c
>> *11:02:12* Step 3/9 : RUN apt-get update && apt-get install -y libsnappy-dev libyaml-dev && rm -rf /var/lib/apt/lists/*
>> *11:02:12*  ---> Running in 5e9b67be03f9
>> *11:02:12* grpc: the connection is unavailable
>>
>>
>> - Yifan
>>
>>
>>
>> On Fri, Oct 19, 2018 at 12:45 PM Ankur Goenka  wrote:
>>
>>> Hi,
>>>
>>> Flink Validates Runner test cases are failing on Beam 15 because docker
>>> is not installed.
>>> Failing tasks
>>> https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/buildTimeTrend
>>> Can we install docker on all the machines as the Portable Validates
>>> Runner tests need it.
>>>
>>> Thanks,
>>> Ankur
>>>
>>


Re: [DISCUSS] Publish vendored dependencies independently

2018-10-19 Thread Lukasz Cwik
I have tried several times to improve the build system and IntelliJ
integration and each attempt ended with little progress when dealing with
vendored code. My latest attempt has been the most promising: I take
the vendored classes/jars and decompile them, generating source that
IntelliJ can then use. I have a branch[1] that demonstrates the idea. It
works pretty well (and, up until the change where we started vendoring gRPC,
was impractical to do). Instructions to try it out are:

# Clean up any remnants of prior builds/IntelliJ projects
git clean -fdx
# Generate the source for vendored/shaded modules
./gradlew decompile

# Remove the "generated" Java sources for protos so they don't
# conflict with the decompiled sources.
rm -rf model/pipeline/build/generated/source/proto
rm -rf model/job-management/build/generated/source/proto
rm -rf model/fn-execution/build/generated/source/proto
# Import the project into IntelliJ; most code completion now works,
# though there are still some issues with a few classes.
# Note that the Java decompiler doesn't generate valid source, so you
# still need to delegate to Gradle for build/run/test actions.
# Other decompilers may do a better/worse job but I haven't tried them.


The problems that I face are that the generated Java source from the protos
and the decompiled source from the compiled version of that source post
shading are both being imported as content roots and then conflict. Also,
the CFR decompiler isn't producing valid source. If people could try others
and report their mileage, we may find one that works, and then we would be
able to use IntelliJ to build/run our code and not need to delegate all our
build/run/test actions to Gradle.

After all the attempts I have made, vendoring the dependencies outside of
the project seems like a sane approach, and unless someone wants to take a
stab at building on the progress I have made above, I would go with what
Kenn is suggesting, even though it will mean that we will need to perform
releases every time we want to change the version of one of our vendored
dependencies.

1: https://github.com/lukecwik/incubator-beam/tree/intellij


On Fri, Oct 19, 2018 at 10:43 AM Kenneth Knowles  wrote:

> Another reason to push on this is to get build times down. Once only
> generated proto classes use the shadow plugin we'll cut the build time in
> ~half? And there is no reason to constantly re-vendor.
>
> Kenn
>
> On Fri, Oct 19, 2018 at 10:39 AM Kenneth Knowles  wrote:
>
>> Hi all,
>>
>> A while ago we had pretty good consensus that we should vendor
>> ("pre-shade") specific dependencies, and there's start on it here:
>> https://github.com/apache/beam/tree/master/vendor
>>
>> IntelliJ notes:
>>
>>  - With shading, it is hard (impossible?) to step into dependency code in
>> IntelliJ's debugger, because the actual symbol at runtime does not match
>> what is in the external jars
>>
>
Intellij can step through the classes if they were published outside the
project since it can decompile them. The source won't be legible.
Decompiling the source as in my example effectively shows the same issue.


>  - With vendoring, if the vendored dependencies are part of the project
>> then IntelliJ gets confused because it operates on source, not the produced
>> jars
>>
>
Yes, I tried several ways to get intellij to ignore the source and use the
output jars but no luck.


> The second one has a quick fix for most cases*: don't make the vendored
>> dep a subproject, but just separately build and publish it. Since a
>> vendored dependency should change much more infrequently (or if we bake the
>> version into the name, ~never) this means we publish once and save headache
>> and build time forever.
>>
>> WDYT? Have I overlooked something? How about we set up vendored versions
>> of guava, protobuf, grpc, and publish them. We don't have to actually start
>> using them yet, and can do it incrementally.
>>
>
Currently we are relocating code depending on the version string. If the
major version is >= 1, we use only the major version within the package
string and rely on semantic versioning provided by the dependency to not
break people. If the major version is 0, we assume the dependency is
unstable and use the full version as part of the package string during
relocation.
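
To make the rule above concrete, here is a minimal Python sketch of how the
relocated package prefix could be derived from a version string. It is only an
illustration of the policy as described, not the actual build logic: the base
prefix "org.apache.beam.vendor", the "v" tag, and the function name are
hypothetical, and the version is assumed to start with a numeric major
component.

```python
def relocated_prefix(artifact: str, version: str,
                     base: str = "org.apache.beam.vendor") -> str:
    """Return a package prefix for a vendored dependency.

    `base` and the "v" tag format are hypothetical names for illustration.
    """
    # Assumes the version starts with a numeric major component, e.g. "20.0".
    major = int(version.split(".")[0])
    if major >= 1:
        # Stable dependency: trust semantic versioning, keep only the major.
        tag = f"v{major}"
    else:
        # Major version 0: treat as unstable and encode the full version.
        tag = "v" + version.replace(".", "_").replace("-", "_")
    return f"{base}.{artifact}.{tag}"


print(relocated_prefix("guava", "20.0"))
print(relocated_prefix("somelib", "0.9.1"))
```

With this scheme a 1.x -> 1.y bump keeps the same package name, while any
bump of a 0.x dependency produces a new one.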

The downsides of using the full version string for relocated packages:
1) Users will end up with multiple copies of dependencies that differ only
by the minor or patch version, increasing the size of their application.
2) Bumping up the version of a dependency now requires the import statements
in all Java files to be updated (not too difficult with some sed/grep
skills)
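
For downside 2, the import rewrite could look roughly like the sketch below.
It is a Python stand-in for the sed/grep approach, under the assumption that
only `import` lines need to change; fully qualified references in code bodies
would need separate handling. The package names in the example are made up.

```python
import re


def bump_relocated_imports(java_source: str, old_prefix: str,
                           new_prefix: str) -> str:
    """Rewrite Java import lines from one relocated prefix to another."""
    # Match "import" or "import static" followed by the old prefix, and
    # require a "." right after it so that v0_9_1 does not match v0_9_10.
    pattern = re.compile(
        r"^(\s*import\s+(?:static\s+)?)" + re.escape(old_prefix) + r"(?=\.)",
        re.MULTILINE,
    )
    # A function replacement avoids escape processing in new_prefix.
    return pattern.sub(lambda m: m.group(1) + new_prefix, java_source)


example = (
    "package x;\n"
    "import org.example.vendor.foo.v0_9_1.Bar;\n"
    "import static org.example.vendor.foo.v0_9_1.Baz.qux;\n"
    "import org.example.vendor.foo.v0_9_10.Other;\n"
)
print(bump_relocated_imports(
    example, "org.example.vendor.foo.v0_9_1", "org.example.vendor.foo.v0_9_2"))
```

Running this over every .java file in the tree would be the equivalent of the
sed one-liner; the lookahead is there so a longer version string sharing the
same leading characters is left alone.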

The upside of using the full version string in the relocated package:
1) We don't have to worry about whether a dependency maintains semantic
versioning which means our users won't have to worry about that either.
2) This increases the odds that a user will load multiple slightly
different versions of the same dependency which is known to be 

[DISCUSS] Move beam_SeedJob notifications to another email address

2018-10-19 Thread Rui Wang
Hi Community,

I have seen some Jenkins build failure/back-to-normal emails on dev@ over
the last several months. It seems to me that this setting is coded in
https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_00_seed.groovy#L100
.

In the link above, the comment says the seed job is very important so the
notification emails should be sent to dev@.

I am wondering whether it is still true that we always want to see such
notifications on dev@. If such notifications have become spam on dev@, can
we move them to either commits@ or another dedicated email address (maybe
create a new one)?

-Rui


Re: What is required for LTS releases? (was: [PROPOSAL] Prepare Beam 2.8.0 release)

2018-10-19 Thread Kenneth Knowles
Pinging this. I think Beam should have a live LTS branch.

I want to suggest a different approach: choose something already released
to be LTS. This way it has had some usage and we have some confidence there
are no critical problems.

So how about we make 2.7 the first LTS branch?

Kenn

On Wed, Oct 10, 2018 at 8:18 AM Romain Manni-Bucau 
wrote:

> Some time ago JB spoke about a Beam roadmap. I tend to think this
> discussion does not make any sense without a clear roadmap. The rationale
> here is that a roadmap will give you the future changes
> and the potential future versions (we spoke a few times of Beam 3). This
> does not have to be very factual; a slice of 3 months is OK at that stage.
> However, if you don't have that,
> you can say 2.8 will be LTS and we support 2 versions, but if 2.9 and 2.10
> introduce breaking changes, then
> it leads to an LTS 2.8 that is no longer supported. This is just an example,
> but the ratio "cost(project) / gain(user)" depends 100% on the plans for the
> project. Technically there is no blocker to supporting all releases
> for life, but would any PMC have the will to release Beam 0.x now? The
> point of an LTS for a user is to plan investment; if we are in the previous
> case it does not help IMHO.
> So maybe grab back the Beam enhancement plans and assign them some fix
> versions before defining what the support model of Beam can be.
>
> Just the 2 cts of an outsider.
>
> Romain Manni-Bucau
> @rmannibucau
>
>
> Le mer. 10 oct. 2018 à 17:10, Chamikara Jayalath  a
> écrit :
>
>>
>>
>> On Wed, Oct 10, 2018 at 2:56 AM Robert Bradshaw 
>> wrote:
>>
>>> On Wed, Oct 10, 2018 at 9:37 AM Ismaël Mejía  wrote:
>>>
 The simplest thing we can do is just to pin all the deps of the LTS
 and not move them in any maintenance release unless there is a strong
 reason to do so.

 The next subject is to make maintainers aware of which release will be
 the LTS in advance so they decide what to do with the dependency
 versions. In my previous mail I mentioned all the possible cases that
 can happen with dependencies and it is clear that one unified policy
 won’t satisfy everyone. So better to let the maintainers (who can also
 ask for user feedback on the ML) decide about versions before the
 release.

 Alexey’s question is still a really important issue, and has been so
 far ignored: what happens with the ‘Experimental’ APIs in the LTS?
 Options are:

 (1) We keep consistent with Experimental, which means that they are
 still not guaranteed (note that this does not mean that they will be
 broken arbitrarily).
 (2) We are consistent with the LTS approach, which makes them ‘non
 experimental’ for the LTS so we will guarantee the functionality/API
 stable.

 I personally have conflicting opinions. I would like to favor (1) but
 this is not consistent with the whole idea of LTS so probably (2) is
 wiser.

>>>
>>> Yeah, I think (2) is the only viable option.
>>>
>>
>> I think the important thing here is that future releases on an LTS branch
>> will be patch (bugfix) releases, so I don't think we can/should make
>> API/functionality changes (even if the change is experimental and/or
>> backwards compatible).
>>
>> I think the same goes for dependency changes. If the change is to fix a
>> known bug we can do that in a patch release, but if it's to add more
>> functionality that should probably come in a new minor release instead of
>> a patch release.
>>
>> This is why I think we should be a bit careful about "rushed" changes to
>> major functionality of Beam going into LTS releases. Just my 2 cents.
>>
>> Thanks,
>> Cham
>>
>>
>>>
>>>
 Finally I also worry about Tim’s remarks on performance and quality,
 even if some of these things effectively can be fixed in a subsequent
 LTS release. Users will probably prefer an LTS to start with Beam and
 if the performance/quality of the LTS is poor, this can hurt perception
 of the project.

>>>
>>> Yes, for this reason I think it's important to consider what goes into
>>> an LTS as well as what happens after. Almost by definition, using an LTS is
>>> choosing stability over cutting edge features. I'd rather major feature X
>>> goes in after LTS, and lives in a couple of releases gaining fixes and
>>> improvements before being released as part of the next LTS, than quickly
>>> making it into an LTS while brand new (both due to the time period before
>>> we refine it, and the extra work of porting refinements back).
>>>
>>> Or maybe LTS-users are unlikely to pick up a x.y.0 release anyway,
>>> waiting for at least x.y.1?
>>>
>>> Come to think of it, do we even have to 

Re: Docker missing on Beam15

2018-10-19 Thread Ankur Goenka
Hi,

Can we restart docker as it seems to have fixed the issue for others
https://github.com/moby/moby/issues/31849 ?

Thanks,
Ankur

On Fri, Oct 19, 2018 at 1:11 PM Yifan Zou  wrote:

> Hi,
>
> The docker has been installed on all Jenkins VMs. The image build process
> was interrupted by a grpc connection issue.
>
> *11:02:12* Starting process 'command 'docker''. Working directory: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_VR_Flink/src/sdks/python/container/build/docker Command: docker build --no-cache -t jenkins-docker-apache.bintray.io/beam/python:latest .
> *11:02:12* Successfully started process 'command 'docker''
> *11:02:12* Sending build context to Docker daemon  17.65MB
> *11:02:12* Step 1/9 : FROM python:2-stretch
> *11:02:12*  ---> 3c43a5d4034a
> *11:02:12* Step 2/9 : MAINTAINER "Apache Beam "
> *11:02:12*  ---> Running in f86bad9aef9c
> *11:02:12*  ---> 610a5dec907e
> *11:02:12* Removing intermediate container f86bad9aef9c
> *11:02:12* Step 3/9 : RUN apt-get update && apt-get install -y libsnappy-dev libyaml-dev && rm -rf /var/lib/apt/lists/*
> *11:02:12*  ---> Running in 5e9b67be03f9
> *11:02:12* grpc: the connection is unavailable
>
>
> - Yifan
>
>
>
> On Fri, Oct 19, 2018 at 12:45 PM Ankur Goenka  wrote:
>
>> Hi,
>>
>> Flink Validates Runner test cases are failing on Beam 15 because docker
>> is not installed.
>> Failing tasks
>> https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/buildTimeTrend
>> Can we install docker on all the machines as the Portable Validates
>> Runner tests need it.
>>
>> Thanks,
>> Ankur
>>
>


Re: Docker missing on Beam15

2018-10-19 Thread Yifan Zou
Hi,

Docker has been installed on all Jenkins VMs. The image build process
was interrupted by a grpc connection issue.

*11:02:12* Starting process 'command 'docker''. Working directory: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_VR_Flink/src/sdks/python/container/build/docker Command: docker build --no-cache -t jenkins-docker-apache.bintray.io/beam/python:latest .
*11:02:12* Successfully started process 'command 'docker''
*11:02:12* Sending build context to Docker daemon  17.65MB
*11:02:12* Step 1/9 : FROM python:2-stretch
*11:02:12*  ---> 3c43a5d4034a
*11:02:12* Step 2/9 : MAINTAINER "Apache Beam "
*11:02:12*  ---> Running in f86bad9aef9c
*11:02:12*  ---> 610a5dec907e
*11:02:12* Removing intermediate container f86bad9aef9c
*11:02:12* Step 3/9 : RUN apt-get update && apt-get install -y libsnappy-dev libyaml-dev && rm -rf /var/lib/apt/lists/*
*11:02:12*  ---> Running in 5e9b67be03f9
*11:02:12* grpc: the connection is unavailable


- Yifan



On Fri, Oct 19, 2018 at 12:45 PM Ankur Goenka  wrote:

> Hi,
>
> Flink Validates Runner test cases are failing on Beam 15 because docker is
> not installed.
> Failing tasks
> https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/buildTimeTrend
> Can we install docker on all the machines as the Portable Validates Runner
> tests need it.
>
> Thanks,
> Ankur
>


Re: a new contributor

2018-10-19 Thread Ankur Goenka
Welcome Heejong!

On Fri, Oct 19, 2018 at 12:27 PM Rui Wang  wrote:

> Welcome!
>
> -Rui
>
> On Fri, Oct 19, 2018 at 11:55 AM Robin Qiu  wrote:
>
>> Welcome, Heejong!
>>
>> On Fri, Oct 19, 2018 at 11:55 AM Ahmet Altay  wrote:
>>
>>> Welcome!
>>>
>>> On Fri, Oct 19, 2018 at 11:48 AM, Heejong Lee 
>>> wrote:
>>>
 Hi,

 I just wanted to introduce myself as a new contributor. I'm a new
 member of Apache Beam team at Google and will be working on IO modules.
 Happy to meet you all!

 Thanks,
 Heejong

>>>
>>>


Docker missing on Beam15

2018-10-19 Thread Ankur Goenka
Hi,

Flink Validates Runner test cases are failing on Beam 15 because docker is
not installed.
Failing tasks:
https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/buildTimeTrend
Can we install docker on all the machines, as the Portable Validates Runner
tests need it?

Thanks,
Ankur


Re: [ANNOUNCE] New committers, October 2018

2018-10-19 Thread Rui Wang
Congrats and thanks for your contributions!

-Rui

On Fri, Oct 19, 2018 at 11:55 AM Ahmet Altay  wrote:

> Congratulations to both of you! :)
>
> On Fri, Oct 19, 2018 at 11:52 AM, Robin Qiu  wrote:
>
>> Congrats, Xinyu and Ankur!
>>
>> On Fri, Oct 19, 2018 at 11:51 AM Daniel Oliveira 
>> wrote:
>>
>>> Congratulations!
>>>
>>> On Fri, Oct 19, 2018 at 8:27 AM Thomas Weise  wrote:
>>>
 Congrats!


 On Fri, Oct 19, 2018 at 7:24 AM Ismaël Mejía  wrote:

> Congratulations guys and welcome !
> On Fri, Oct 19, 2018 at 4:12 PM Jean-Baptiste Onofré 
> wrote:
> >
> > Congrats and welcome aboard !
> >
> > Regards
> > JB
> >
> > On 19/10/2018 16:09, Kenneth Knowles wrote:
> > > Hi all,
> > >
> > > Hot on the tail of the summer announcement comes our pre-Hallowe'en
> > > celebration.
> > >
> > > Please join me and the rest of the Beam PMC in welcoming the
> following
> > > new committers:
> > >
> > >  - Xinyu Liu, author/maintainer of the Samza runner
> > >  - Ankur Goenka, major contributor to portability efforts
> > >
> > > And, as before, while I've noted some areas of contribution for
> each,
> > > most important is that they are a valued part of our Beam
> community that
> > > the PMC trusts with the responsibilities of a Beam committer [1].
> > >
> > > A big thanks to both for their contributions.
> > >
> > > Kenn
> > >
> > > [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>

>


Re: a new contributor

2018-10-19 Thread Rui Wang
Welcome!

-Rui

On Fri, Oct 19, 2018 at 11:55 AM Robin Qiu  wrote:

> Welcome, Heejong!
>
> On Fri, Oct 19, 2018 at 11:55 AM Ahmet Altay  wrote:
>
>> Welcome!
>>
>> On Fri, Oct 19, 2018 at 11:48 AM, Heejong Lee  wrote:
>>
>>> Hi,
>>>
>>> I just wanted to introduce myself as a new contributor. I'm a new member
>>> of Apache Beam team at Google and will be working on IO modules. Happy to
>>> meet you all!
>>>
>>> Thanks,
>>> Heejong
>>>
>>
>>


Re: a new contributor

2018-10-19 Thread Robin Qiu
Welcome, Heejong!

On Fri, Oct 19, 2018 at 11:55 AM Ahmet Altay  wrote:

> Welcome!
>
> On Fri, Oct 19, 2018 at 11:48 AM, Heejong Lee  wrote:
>
>> Hi,
>>
>> I just wanted to introduce myself as a new contributor. I'm a new member
>> of Apache Beam team at Google and will be working on IO modules. Happy to
>> meet you all!
>>
>> Thanks,
>> Heejong
>>
>
>


Re: a new contributor

2018-10-19 Thread Ahmet Altay
Welcome!

On Fri, Oct 19, 2018 at 11:48 AM, Heejong Lee  wrote:

> Hi,
>
> I just wanted to introduce myself as a new contributor. I'm a new member
> of Apache Beam team at Google and will be working on IO modules. Happy to
> meet you all!
>
> Thanks,
> Heejong
>


Re: [ANNOUNCE] New committers, October 2018

2018-10-19 Thread Ahmet Altay
Congratulations to both of you! :)

On Fri, Oct 19, 2018 at 11:52 AM, Robin Qiu  wrote:

> Congrats, Xinyu and Ankur!
>
> On Fri, Oct 19, 2018 at 11:51 AM Daniel Oliveira 
> wrote:
>
>> Congratulations!
>>
>> On Fri, Oct 19, 2018 at 8:27 AM Thomas Weise  wrote:
>>
>>> Congrats!
>>>
>>>
>>> On Fri, Oct 19, 2018 at 7:24 AM Ismaël Mejía  wrote:
>>>
 Congratulations guys and welcome !
 On Fri, Oct 19, 2018 at 4:12 PM Jean-Baptiste Onofré 
 wrote:
 >
 > Congrats and welcome aboard !
 >
 > Regards
 > JB
 >
 > On 19/10/2018 16:09, Kenneth Knowles wrote:
 > > Hi all,
 > >
 > > Hot on the tail of the summer announcement comes our pre-Hallowe'en
 > > celebration.
 > >
 > > Please join me and the rest of the Beam PMC in welcoming the
 following
 > > new committers:
 > >
 > >  - Xinyu Liu, author/maintainer of the Samza runner
 > >  - Ankur Goenka, major contributor to portability efforts
 > >
 > > And, as before, while I've noted some areas of contribution for
 each,
 > > most important is that they are a valued part of our Beam community
 that
 > > the PMC trusts with the responsibilities of a Beam committer [1].
 > >
 > > A big thanks to both for their contributions.
 > >
 > > Kenn
 > >
 > > [1] https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
 >
 > --
 > Jean-Baptiste Onofré
 > jbono...@apache.org
 > http://blog.nanthrax.net
 > Talend - http://www.talend.com

>>>


Re: [ANNOUNCE] New committers, October 2018

2018-10-19 Thread Robin Qiu
Congrats, Xinyu and Ankur!

On Fri, Oct 19, 2018 at 11:51 AM Daniel Oliveira 
wrote:

> Congratulations!
>
> On Fri, Oct 19, 2018 at 8:27 AM Thomas Weise  wrote:
>
>> Congrats!
>>
>>
>> On Fri, Oct 19, 2018 at 7:24 AM Ismaël Mejía  wrote:
>>
>>> Congratulations guys and welcome !
>>> On Fri, Oct 19, 2018 at 4:12 PM Jean-Baptiste Onofré 
>>> wrote:
>>> >
>>> > Congrats and welcome aboard !
>>> >
>>> > Regards
>>> > JB
>>> >
>>> > On 19/10/2018 16:09, Kenneth Knowles wrote:
>>> > > Hi all,
>>> > >
>>> > > Hot on the tail of the summer announcement comes our pre-Hallowe'en
>>> > > celebration.
>>> > >
>>> > > Please join me and the rest of the Beam PMC in welcoming the
>>> following
>>> > > new committers:
>>> > >
>>> > >  - Xinyu Liu, author/maintainer of the Samza runner
>>> > >  - Ankur Goenka, major contributor to portability efforts
>>> > >
>>> > > And, as before, while I've noted some areas of contribution for each,
>>> > > most important is that they are a valued part of our Beam community
>>> that
>>> > > the PMC trusts with the responsibilities of a Beam committer [1].
>>> > >
>>> > > A big thanks to both for their contributions.
>>> > >
>>> > > Kenn
>>> > >
>>> > > [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>> >
>>> > --
>>> > Jean-Baptiste Onofré
>>> > jbono...@apache.org
>>> > http://blog.nanthrax.net
>>> > Talend - http://www.talend.com
>>>
>>


Re: [ANNOUNCE] New committers, October 2018

2018-10-19 Thread Daniel Oliveira
Congratulations!

On Fri, Oct 19, 2018 at 8:27 AM Thomas Weise  wrote:

> Congrats!
>
>
> On Fri, Oct 19, 2018 at 7:24 AM Ismaël Mejía  wrote:
>
>> Congratulations guys and welcome !
>> On Fri, Oct 19, 2018 at 4:12 PM Jean-Baptiste Onofré 
>> wrote:
>> >
>> > Congrats and welcome aboard !
>> >
>> > Regards
>> > JB
>> >
>> > On 19/10/2018 16:09, Kenneth Knowles wrote:
>> > > Hi all,
>> > >
>> > > Hot on the tail of the summer announcement comes our pre-Hallowe'en
>> > > celebration.
>> > >
>> > > Please join me and the rest of the Beam PMC in welcoming the following
>> > > new committers:
>> > >
>> > >  - Xinyu Liu, author/maintainer of the Samza runner
>> > >  - Ankur Goenka, major contributor to portability efforts
>> > >
>> > > And, as before, while I've noted some areas of contribution for each,
>> > > most important is that they are a valued part of our Beam community
>> that
>> > > the PMC trusts with the responsibilities of a Beam committer [1].
>> > >
>> > > A big thanks to both for their contributions.
>> > >
>> > > Kenn
>> > >
>> > > [1]
>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>> >
>> > --
>> > Jean-Baptiste Onofré
>> > jbono...@apache.org
>> > http://blog.nanthrax.net
>> > Talend - http://www.talend.com
>>
>


a new contributor

2018-10-19 Thread Heejong Lee
Hi,

I just wanted to introduce myself as a new contributor. I'm a new member of
the Apache Beam team at Google and will be working on IO modules. Happy to
meet you all!

Thanks,
Heejong


Re: [DISCUSS] Publish vendored dependencies independently

2018-10-19 Thread Kenneth Knowles
Another reason to push on this is to get build times down. Once only
generated proto classes use the shadow plugin we'll cut the build time in
~half? And there is no reason to constantly re-vendor.

Kenn

On Fri, Oct 19, 2018 at 10:39 AM Kenneth Knowles  wrote:

> Hi all,
>
> A while ago we had pretty good consensus that we should vendor
> ("pre-shade") specific dependencies, and there's start on it here:
> https://github.com/apache/beam/tree/master/vendor
>
> IntelliJ notes:
>
>  - With shading, it is hard (impossible?) to step into dependency code in
> IntelliJ's debugger, because the actual symbol at runtime does not match
> what is in the external jars
>  - With vendoring, if the vendored dependencies are part of the project
> then IntelliJ gets confused because it operates on source, not the produced
> jars
>
> The second one has a quick fix for most cases*: don't make the vendored
> dep a subproject, but just separately build and publish it. Since a
> vendored dependency should change much more infrequently (or if we bake the
> version into the name, ~never) this means we publish once and save headache
> and build time forever.
>
> WDYT? Have I overlooked something? How about we set up vendored versions
> of guava, protobuf, grpc, and publish them. We don't have to actually start
> using them yet, and can do it incrementally.
>
> (side note: what do other projects like Flink do?)
>
> Kenn
>
> *for generated proto classes, they need to be altered after being
> generated so shading happens there, but actually only relocation and the
> shared vendored dep should work elsewhere in the project
>


[DISCUSS] Publish vendored dependencies independently

2018-10-19 Thread Kenneth Knowles
Hi all,

A while ago we had pretty good consensus that we should vendor
("pre-shade") specific dependencies, and there's a start on it here:
https://github.com/apache/beam/tree/master/vendor

IntelliJ notes:

 - With shading, it is hard (impossible?) to step into dependency code in
IntelliJ's debugger, because the actual symbol at runtime does not match
what is in the external jars
 - With vendoring, if the vendored dependencies are part of the project
then IntelliJ gets confused because it operates on source, not the produced
jars

The second one has a quick fix for most cases*: don't make the vendored dep
a subproject, but just separately build and publish it. Since a vendored
dependency should change much more infrequently (or if we bake the version
into the name, ~never) this means we publish once and save headache and
build time forever.

WDYT? Have I overlooked something? How about we set up vendored versions of
guava, protobuf, grpc, and publish them. We don't have to actually start
using them yet, and can do it incrementally.

(side note: what do other projects like Flink do?)

Kenn

*for generated proto classes, they need to be altered after being generated
so shading happens there, but actually only relocation and the shared
vendored dep should work elsewhere in the project


Re: [ANNOUNCE] New committers, October 2018

2018-10-19 Thread Thomas Weise
Congrats!


On Fri, Oct 19, 2018 at 7:24 AM Ismaël Mejía  wrote:

> Congratulations guys and welcome !
> On Fri, Oct 19, 2018 at 4:12 PM Jean-Baptiste Onofré 
> wrote:
> >
> > Congrats and welcome aboard !
> >
> > Regards
> > JB
> >
> > On 19/10/2018 16:09, Kenneth Knowles wrote:
> > > Hi all,
> > >
> > > Hot on the tail of the summer announcement comes our pre-Hallowe'en
> > > celebration.
> > >
> > > Please join me and the rest of the Beam PMC in welcoming the following
> > > new committers:
> > >
> > >  - Xinyu Liu, author/maintainer of the Samza runner
> > >  - Ankur Goenka, major contributor to portability efforts
> > >
> > > And, as before, while I've noted some areas of contribution for each,
> > > > most important is that they are a valued part of our Beam community
> > > > that the PMC trusts with the responsibilities of a Beam committer [1].
> > >
> > > A big thanks to both for their contributions.
> > >
> > > Kenn
> > >
> > > [1]
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
>


Re: [ANNOUNCE] New committers, October 2018

2018-10-19 Thread Ismaël Mejía
Congratulations guys and welcome !
On Fri, Oct 19, 2018 at 4:12 PM Jean-Baptiste Onofré  wrote:
>
> Congrats and welcome aboard !
>
> Regards
> JB
>
> On 19/10/2018 16:09, Kenneth Knowles wrote:
> > Hi all,
> >
> > Hot on the tail of the summer announcement comes our pre-Hallowe'en
> > celebration.
> >
> > Please join me and the rest of the Beam PMC in welcoming the following
> > new committers:
> >
> >  - Xinyu Liu, author/maintainer of the Samza runner
> >  - Ankur Goenka, major contributor to portability efforts
> >
> > And, as before, while I've noted some areas of contribution for each,
> > most important is that they are a valued part of our Beam community that
> > the PMC trusts with the responsibilities of a Beam committer [1].
> >
> > A big thanks to both for their contributions.
> >
> > Kenn
> >
> > [1] 
> > https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com


Re: [ANNOUNCE] New committers, October 2018

2018-10-19 Thread Jean-Baptiste Onofré
Congrats and welcome aboard !

Regards
JB

On 19/10/2018 16:09, Kenneth Knowles wrote:
> Hi all,
> 
> Hot on the tail of the summer announcement comes our pre-Hallowe'en
> celebration.
> 
> Please join me and the rest of the Beam PMC in welcoming the following
> new committers:
> 
>  - Xinyu Liu, author/maintainer of the Samza runner
>  - Ankur Goenka, major contributor to portability efforts
> 
> And, as before, while I've noted some areas of contribution for each,
> most important is that they are a valued part of our Beam community that
> the PMC trusts with the responsibilities of a Beam committer [1].
> 
> A big thanks to both for their contributions.
> 
> Kenn
> 
> [1] 
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer

-- 
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: Python SDK worker / portable Flink runner performance improvements

2018-10-19 Thread Kenneth Knowles
This is really cool news. Pretty awesome to move from the "get it to run"
phase to the "get it to run faster" phase of this project.

Streaming testing: In Java there's a synthetic source (GenerateSequence /
CountingSource) for testing. Maybe in this case I'd say porting to py is
worth it?
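
For readers who have not used the Java source, here is a rough sketch of the behaviour in plain Python. It does not use any Beam API: `generate_sequence` and its parameters are hypothetical names, and an actual port would have to be written as a Beam source or transform rather than a generator.

```python
import itertools
import time
from typing import Iterator, Optional


def generate_sequence(start: int = 0,
                      stop: Optional[int] = None,
                      elements_per_second: Optional[float] = None) -> Iterator[int]:
    """Yield start, start + 1, ... up to (but excluding) stop.

    With stop=None the sequence is unbounded, which is the case that matters
    for streaming tests. elements_per_second optionally throttles the output.
    """
    counter = itertools.count(start) if stop is None else iter(range(start, stop))
    delay = 0.0 if not elements_per_second else 1.0 / elements_per_second
    for value in counter:
        yield value
        if delay:
            time.sleep(delay)


# Bounded use: behaves like a small batch input.
print(list(generate_sequence(0, 5)))

# Unbounded use: take only the first few elements of an endless stream.
print(list(itertools.islice(generate_sequence(start=10), 3)))
```

The unbounded, optionally throttled case is the one that would exercise streaming mode, since the input never finishes on its own.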

Kenn

On Wed, Oct 17, 2018 at 2:00 PM Lukasz Cwik  wrote:

> Thanks, this was useful for me since I have been away these past couple of
> weeks.
>
> On Wed, Oct 17, 2018 at 8:45 AM Thomas Weise  wrote:
>
>> Hi,
>>
>> As you may have noticed, some of the contributors are working on enabling
>> the Python support on Flink. The upcoming 2.8 release is going to include
>> much of the functionality and we are now shifting gears to stability and
>> performance.
>>
>> There have been some basic fixes already (logging, memory leak) and at
>> this point we see very low throughput in streaming mode. Improvements are
>> in-flight:
>>
>> https://issues.apache.org/jira/browse/BEAM-5760
>> https://issues.apache.org/jira/browse/BEAM-5521
>>
>> There has been discussion and preliminary work to improve support for
>> testing as well (streaming mode). The Python SDK currently doesn't have any
>> (open source) streaming connectors, but we have added a Flink native
>> transform that can be used for testing:
>>
>> https://issues.apache.org/jira/browse/BEAM-5707
>>
>> I'm starting this thread here so that it is easier for more folks to get
>> involved and stay in sync.
>>
>> Thanks,
>> Thomas
>>
>>
>>
>>


[ANNOUNCE] New committers, October 2018

2018-10-19 Thread Kenneth Knowles
Hi all,

Hot on the tail of the summer announcement comes our pre-Hallowe'en
celebration.

Please join me and the rest of the Beam PMC in welcoming the following new
committers:

 - Xinyu Liu, author/maintainer of the Samza runner
 - Ankur Goenka, major contributor to portability efforts

And, as before, while I've noted some areas of contribution for each, most
important is that they are a valued part of our Beam community that the PMC
trusts with the responsibilities of a Beam committer [1].

A big thanks to both for their contributions.

Kenn

[1]
https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer


Re: Python SDK worker / portable Flink runner performance improvements

2018-10-19 Thread Maximilian Michels
Thanks Thomas, I think it is important to start looking at performance
and improved test coverage.

While we have the basic functionality, state and timers still have to be
implemented for the Portable FlinkRunner. These two will allow
full testing/optimization:

State:  https://issues.apache.org/jira/browse/BEAM-2918 (pending PR)
Timers: https://issues.apache.org/jira/browse/BEAM-4681

-Max

On 17.10.18 22:59, Lukasz Cwik wrote:
> Thanks, this was useful for me since I have been away these past couple
> of weeks.
>
> On Wed, Oct 17, 2018 at 8:45 AM Thomas Weise wrote:
>
>> Hi,
>>
>> As you may have noticed, some of the contributors are working on
>> enabling the Python support on Flink. The upcoming 2.8 release is
>> going to include much of the functionality and we are now shifting
>> gears to stability and performance.
>>
>> There have been some basic fixes already (logging, memory leak) and
>> at this point we see very low throughput in streaming mode.
>> Improvements are in-flight:
>>
>> https://issues.apache.org/jira/browse/BEAM-5760
>> https://issues.apache.org/jira/browse/BEAM-5521
>>
>> There has been discussion and preliminary work to improve support
>> for testing as well (streaming mode). The Python SDK currently
>> doesn't have any (open source) streaming connectors, but we have
>> added a Flink native transform that can be used for testing:
>>
>> https://issues.apache.org/jira/browse/BEAM-5707
>>
>> I'm starting this thread here so that it is easier for more folks to
>> get involved and stay in sync.
>>
>> Thanks,
>> Thomas





Re: Does anyone have a strong intelliJ setup?

2018-10-19 Thread Maximilian Michels
Yes, I have the same issue for sources (not binaries). I usually end up 
manually fetching sources and adding them to IntelliJ.


On 18.10.18 18:16, Alexey Romanenko wrote:

Does anyone else have a problem fetching the source code of external
dependencies? I always get this error (see attached picture): it doesn't
fetch source artifacts.




On 16 Oct 2018, at 20:06, Lukasz Cwik wrote:


I also reached out on the Gradle forum asking how to get IntelliJ to
use a subproject's output jars instead of the output classes:
https://discuss.gradle.org/t/how-to-get-intellij-to-use-module-output-jars-instead-of-output-classes/27794

This would solve lots of problems with how IntelliJ integrates with
Gradle, but I haven't received any responses yet.


On Tue, Oct 16, 2018 at 11:03 AM Ryan Williams wrote:


Thanks for this info and work! A couple relevant notes:

  * There is a #beam-intellij Slack channel where I tried to
    collect some info a few weeks ago when I was debugging
    IntelliJ issues
  * I tried to figure out where IntelliJ stores the info about the
    vendored JARs we manually add to various modules, so we could
    automate adding them, but failed so far.
      o afaict it is not in the .idea directory in the project;
        I'm not sure where it goes
  * I had some early exchanges on YouTrack with JetBrains folks
    about specific issues and possibly opening the Gradle plugin
    to outside contributors, but haven't heard anything back in a
    few months:
      o IDEA-195908: project import gets corrupted when certain
        libraries are present in the local Maven cache (~/.m2)
      o IDEA-197980: IntelliJ doesn't understand vendored classes
        (while the CLI does)
      o IDEA-198150: can the Gradle plugin be open-sourced?


On Tue, Oct 16, 2018 at 12:45 PM Scott Wegner <sc...@apache.org> wrote:

FYI, I've opened BEAM-5762 to track the work to document and
improve IntelliJ integration. It's broken down into sub-tasks
for documenting individual scenarios. I've grabbed a couple;
if you're feeling motivated feel free to grab one or two to
help out!

https://issues.apache.org/jira/browse/BEAM-5762 Improve
IntelliJ support and documentation


On Wed, Oct 10, 2018 at 12:16 PM Rui Wang <ruw...@google.com> wrote:

I left my tips for running *Java* unit tests in IntelliJ (they work
for me all the time). I assumed that people mostly use
IntelliJ for Java development.

If there are cases where people use IntelliJ to
develop in other languages (maybe because of the power of
plugins?), we might need to create separate sections for
those cases.

-Rui

On Wed, Oct 10, 2018 at 11:46 AM Scott Wegner
<sc...@apache.org> wrote:

Last week I migrated all previous content from the
website into wiki pages for IntelliJ [1] and Eclipse
[2] (thanks Thomas Weise for the pointers).

The next step is to incorporate all the tips that
people have mentioned here and fill in any other gaps
we have. Here's how I'd like to get started:

1) Focus on IntelliJ first. I don't use Eclipse and I
don't have the expertise to make this experience
great. I'd be glad if somebody else picked this up.
2) Re-organize the wiki page into a set of high-level
developer tasks that we support; things like "Setting
up IntelliJ IDE from scratch", "Performing a full
build", "Building and testing a single module", "Running
a single unit test", "Running an IT for a particular
runner", "Recovering from project corruption", "Common
errors"
3) Work on one section at a time, filling in
step-by-step instructions that are prescriptive and
easy to validate.

And I'd love some help! Here's what you could do to help:

* Respond to this email with any high-level "developer
scenarios" that I've forgotten above. Things that you
should be able to do in an IDE and we should document
for all contributors.
* Add your tips and work-arounds; I'll be collecting
as much as I can in this working doc before organizing
it into the wiki: