Hi all, I don't seem to have the permissions to create a Kanban board or even assign tasks to myself. Who could help me with this?
I've updated the coders package pull request [1] and added the applied strategy to the proposal document [2]. It would be great to get some feedback on this, so we can start moving forward with other subpackages.

Kind regards,
Robbe

[1] https://github.com/apache/beam/pull/4990
[2] https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=sharing

On Mon, 2 Apr 2018 at 21:07 Robbe Sneyders <robbe.sneyd...@ml6.eu> wrote:

> Hello Robert,
>
> I think a Kanban board on Jira as proposed by Ahmet can be helpful for
> this. I'll look into setting one up tomorrow.
>
> In the meantime, you can find the first pull request with the updated
> coders package here:
> https://github.com/apache/beam/pull/4990
>
> Kind regards,
> Robbe
>
> On Fri, 30 Mar 2018 at 18:01 Robert Bradshaw <rober...@google.com> wrote:
>
>> On Fri, Mar 30, 2018 at 8:39 AM Robbe Sneyders <robbe.sneyd...@ml6.eu>
>> wrote:
>>
>>> Thanks Ahmet and Robert,
>>>
>>> I think we can work on different subpackages in parallel, but it's
>>> important to apply the same strategy everywhere. I'm currently working
>>> on applying step 1 (was mostly done already) and 2 of the proposal to
>>> the coders subpackage to create a first pull request. We can then
>>> discuss the applied strategy in detail before merging and applying it
>>> to the other subpackages.
>>>
>>
>> Sounds good. Again, could you document (in a more permanent/easy to look
>> up state than email) when packages are started/done?
>>
>>
>>> This strategy also includes the choice of automated tools. I'm focusing
>>> on writing python 3 code with python 2 compatibility, which means
>>> depending on the future package instead of the six package (which is
>>> already used in some places in the current code base). I have already
>>> noticed that this indeed requires a lot of manual work after running
>>> the automated script.
>>> The future package supports python 3.3+ compatibility, so I don't think
>>> there is a higher cost supporting 3.4 compared to 3.5+.
>>>
>>
>> Sure. It may incur a higher maintenance burden long-term though.
>> (Basically, if we go out the door with 3.4 it's a promise to support it
>> for some time to come.)
>>
>>
>>> I have already added a tox environment to run pylint2 with the --py3k
>>> argument per updated subpackage, which should help avoid regression
>>> between step 2 and step 3 of the proposal. This update will be pushed
>>> with the first pull request.
>>>
>>> Kind regards,
>>> Robbe
>>>
>>>
>>> On Fri, 30 Mar 2018 at 02:22 Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> Thank you, Robbe, for your offer to help with contribution here. I
>>>> read over your doc and the one thing I'd like to add is that this work
>>>> is very parallelizable, but if we have enough people looking at it
>>>> we'll want some way to coordinate so as to not overlap work (or just
>>>> waste time discovering what's been done). Tracking individual JIRAs
>>>> and PRs gets unwieldy; perhaps a spreadsheet with modules/packages on
>>>> one axis and the various automated/manual conversions along the other
>>>> would be helpful?
>>>>
>>>> A note on automated tools: they're sometimes overly conservative, so
>>>> we should be sure to review the changes manually. (A typical example
>>>> of this is unnecessarily importing six.moves.xrange when there was no
>>>> big reason to use xrange over range in Python 2, or conversely using
>>>> list(range(...)) in Python 3.)
>>>>
>>>> Also, +1 to targeting 3.4+ and upgrading tox to prevent regressions.
>>>> If there's a cost to supporting 3.4 as opposed to requiring 3.5+ we
>>>> should identify it and decide that before widespread announcement.
>>>>
>>>> On Tue, Mar 27, 2018 at 2:27 PM Ahmet Altay <al...@google.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 27, 2018 at 7:12 AM, Holden Karau <hol...@pigscanfly.ca>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Tue, Mar 27, 2018 at 4:27 AM Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Anand,
>>>>>>>
>>>>>>> Thanks for the feedback.
>>>>>>>
>>>>>>> It should be no problem to run everything on DataflowRunner as well.
>>>>>>> Are there any performance tests in place to check for performance
>>>>>>> regressions?
>>>>>>>
>>>>>>
>>>>> Yes, there is a suite (
>>>>> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PerformanceTests_Python.groovy).
>>>>> It may not be very comprehensive and seems to have been failing for a
>>>>> while. I would not block python 3 work on performance for now. That
>>>>> is the unfortunate state of things.
>>>>>
>>>>> If anybody in the community is interested, this would be a great
>>>>> opportunity to help with benchmarks in general.
>>>>>
>>>>>
>>>>>>
>>>>>>> Some questions were raised in the proposal document which I want to
>>>>>>> add to this conversation:
>>>>>>>
>>>>>>> The first comment was about the targeted python 3 versions. We
>>>>>>> proposed to target 3.6 since it is the latest version available and
>>>>>>> added 3.5 because 3.6 adoption seems rather low (hard to find any
>>>>>>> relevant sources on this though).
>>>>>>> If the beam community prefers 3.4, I would propose to target 3.4
>>>>>>> only during porting and add 3.5 and 3.6 later so we don't slow down
>>>>>>> the porting progress. 3.4 has the advantage of already being
>>>>>>> installed on the workers and allows pySpark pipelines to be moved
>>>>>>> over to beam more easily.
>>>>>>> It would be great to get some opinions on this.
>>>>>>>
>>>>>>
>>>>> My preference is to support 3.4+.
>>>>> I searched a bit on the web to understand the usage statistics for
>>>>> python 3; it seems like python 3.4 has ~20% usage and python 3.4+ has
>>>>> 99% (
>>>>> https://semaphoreci.com/blog/2017/10/18/python-versions-used-in-commercial-projects-in-2017.html).
>>>>> Based on that, I think it makes sense to support it.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>> Another comment was made on how to avoid regression during the
>>>>>>> porting progress.
>>>>>>> After applying step 1 and step 2, no python 3 compatibility lint
>>>>>>> warnings should remain, so it would be great if we could enforce
>>>>>>> this check for every pull request on an already updated subpackage.
>>>>>>> After applying step 3, all tests should run on python 3, so again it
>>>>>>> would be great if we can enforce these per updated subpackage.
>>>>>>> Any insights on how to best accomplish this?
>>>>>>>
>>>>>> So you can look at some of the recent changes to tox.ini in the git
>>>>>> log to see what we’ve done so far around this. I suspect you can
>>>>>> repeat that same pattern.
>>>>>>
>>>>>
>>>>> +1 updating tox.ini and adding new checks to run_mini_py3lint.sh would
>>>>> help a lot to prevent regressions.
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> Robbe
>>>>>>>
>>>>>>> On Fri, 23 Mar 2018 at 19:59 Ahmet Altay <al...@google.com> wrote:
>>>>>>>
>>>>>>>> Thank you Robbe.
>>>>>>>>
>>>>>>>> I reviewed the document; it looks reasonable to me. I will touch on
>>>>>>>> some points that were not mentioned:
>>>>>>>> - Runners exercise different code paths. Doing auto conversions and
>>>>>>>> focusing on DirectRunner is not enough. It is worthwhile to run
>>>>>>>> things on DataflowRunner as well. This can be triggered from
>>>>>>>> Jenkins. It will validate that we are still compatible for python 2.
>>>>>>>> - Similar to above but with an eye on perf regressions.
>>>>>>>>
>>>>>>>> For project tracking on JIRA, please feel free to create any new
>>>>>>>> issues, close stale ones, or take ownership of any open issues. All
>>>>>>>> JIRAs should be assigned to the people actively working on them. If
>>>>>>>> you want to track it in a separate way, you can also propose that.
>>>>>>>> (For example, a kanban board, which is fully supported in JIRA, is
>>>>>>>> used for the portability effort.)
>>>>>>>>
>>>>>>>> I will also call out to a few other people in addition to Holden
>>>>>>>> who helped out or showed interest in helping with Python 3:
>>>>>>>> @cclaus, @luke-zhu, @udim, @robertwb, @charlesccychen, @tvalentyn.
>>>>>>>> You can include these people (and myself) for reviews and other
>>>>>>>> questions that you have.
>>>>>>>>
>>>>>>>> Welcome again, and looking forward to your contributions.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Ahmet
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Mar 23, 2018 at 9:27 AM, Robbe Sneyders <
>>>>>>>> robbe.sneyd...@ml6.eu> wrote:
>>>>>>>>
>>>>>>>>> Hello everyone,
>>>>>>>>>
>>>>>>>>> In the next month(s), my colleague Matthias and I will commit a
>>>>>>>>> lot of time and effort to python 3 support for beam and we would
>>>>>>>>> like to discuss the best way to go forward with this.
>>>>>>>>>
>>>>>>>>> We have drawn up a document [1] with a high level outline of the
>>>>>>>>> proposed approach and would like to get your feedback on this.
>>>>>>>>>
>>>>>>>>> The main Jira issue [2] for python 3 support has been mostly
>>>>>>>>> inactive for the past year. Other smaller issues have been opened,
>>>>>>>>> but it's hard to track the general progress. It would be great if
>>>>>>>>> anyone could offer some insights on how to best handle this
>>>>>>>>> project on Jira.
>>>>>>>>>
>>>>>>>>> @Holden Karau, you seem to have already put in a lot of effort to
>>>>>>>>> add python 3 support, so it would be great to get your insights
>>>>>>>>> and find a way to merge our efforts.
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>> Robbe
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=sharing
>>>>>>>>>
>>>>>>>>> [2] https://issues.apache.org/jira/browse/BEAM-1251
>>>>>>>>> --
>>>>>>>>> Robbe Sneyders
>>>>>>>>> ML6 Gent <https://ml6.eu/>
>>>>>>>>> <https://www.google.be/maps/place/ML6/@51.037408,3.7044893,17z/data=!3m1!4b1!4m5!3m4!1s0x47c37161feeca14b:0xb8f72585fdd21c90!8m2!3d51.037408!4d3.706678?hl=nl>
>>>>>>>>> M: +32 474 71 31 08
>>>>>>>>>
>>>>>> --
>>>>>> Twitter: https://twitter.com/holdenkarau
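
To make the point raised in this thread about overly conservative automated conversions a bit more concrete, here is a minimal sketch in plain Python. The function names are hypothetical and are not taken from the Beam code base; the sketch only assumes the standard library. It shows the two cases mentioned: iterating over a small range, where plain range is fine on both Python 2 and 3 and no six.moves.xrange import is needed, and the case where an actual list is required, which is the only place list(range(...)) is warranted.

```python
def sum_of_squares(n):
    """Iterate with plain range.

    On Python 3 range is lazy; on Python 2 it builds a list, which is
    harmless for small n, so no xrange shim is needed here.
    """
    total = 0
    for i in range(n):
        total += i * i
    return total


def reversed_indices(n):
    """Return indices in reverse order as a real list.

    list(range(...)) is justified here because the result is mutated
    in place, which a Python 3 range object does not support.
    """
    indices = list(range(n))
    indices.reverse()
    return indices


if __name__ == "__main__":
    print(sum_of_squares(4))
    print(reversed_indices(3))
```

When reviewing the output of an automated tool, the first pattern is where a wrapped or imported xrange can usually be simplified back to range, and the second is where the list(...) wrapper should stay.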