On Tue, Mar 27, 2018 at 7:12 AM, Holden Karau <hol...@pigscanfly.ca> wrote:

>
> On Tue, Mar 27, 2018 at 4:27 AM Robbe Sneyders <robbe.sneyd...@ml6.eu>
> wrote:
>
>> Hi Anand,
>>
>> Thanks for the feedback.
>>
>> It should be no problem to run everything on DataflowRunner as well.
>> Are there any performance tests in place to check for performance
>> regressions?
>>
>
Yes, there is a suite (
https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PerformanceTests_Python.groovy).
It may not be very comprehensive and seems to have been failing for a while.
I would not block python 3 work on performance for now. That is the
unfortunate state of things.

If anybody in the community is interested, this would be a great
opportunity to help with benchmarks in general.


>
>> Some questions were raised in the proposal document which I want to add
>> to this conversation:
>>
>> The first comment was about the targeted python 3 versions. We proposed
>> to target 3.6 since it is the latest version available and added 3.5
>> because 3.6 adoption seems rather low (hard to find any relevant sources on
>> this though).
>> If the beam community prefers 3.4, I would propose to target 3.4 only
>> during porting and add 3.5 and 3.6 later so we don't slow down the porting
>> progress. 3.4 has the advantage of already being installed on the workers
>> and allows pySpark pipelines to be moved over to beam more easily.
>> It would be great to get some opinions on this.
>>
>
My preference is to support 3.4+. I searched a bit on the web to understand
the usage statistics for python 3. It seems that python 3.4 has ~20% usage
and python 3.4+ has 99% (
https://semaphoreci.com/blog/2017/10/18/python-versions-used-in-commercial-projects-in-2017.html).
Based on that, I think it makes sense to support 3.4.



>
>> Another comment was made on how to avoid regressions during the porting
>> process.
>> After applying step 1 and step 2, no python 3 compatibility lint warnings
>> should remain, so it would be great if we could enforce this check for
>> every pull request on an already updated subpackage.
>> After applying step 3, all tests should run on python 3, so again it
>> would be great if we can enforce these per updated subpackage.
>> Any insights on how to best accomplish this?
>>
> So you can look at some of the recent changes to tox.ini in the git log to
> see what we’ve done so far around this. I suspect you can repeat that same
> pattern.
>

+1 updating tox.ini and adding new checks to run_mini_py3lint.sh would help
a lot to prevent regressions.
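
To make the idea concrete, here is a minimal sketch of a per-subpackage
check, assuming it is run under a Python 3 interpreter (for example from a
tox environment). It only verifies that every file in an already ported
subpackage still parses as Python 3. The function name and the subpackage
list are hypothetical and are not taken from tox.ini or
run_mini_py3lint.sh.

```python
import os

# Hypothetical list of subpackages that have already been ported.
PORTED_SUBPACKAGES = ["ported_pkg"]


def find_py3_syntax_errors(root, subpackages):
    """Return (path, message) pairs for .py files that fail to parse."""
    errors = []
    for subpackage in subpackages:
        for dirpath, _, filenames in os.walk(os.path.join(root, subpackage)):
            for filename in sorted(filenames):
                if not filename.endswith(".py"):
                    continue
                path = os.path.join(dirpath, filename)
                with open(path, "rb") as handle:
                    source = handle.read()
                try:
                    # compile() only parses the source; nothing is executed.
                    compile(source, path, "exec")
                except SyntaxError as exc:
                    errors.append((path, exc.msg))
    return errors
```

A wrapper script could call find_py3_syntax_errors(".", PORTED_SUBPACKAGES)
and exit with a non-zero status when the list is not empty, so a pull
request that reintroduces python 2 only syntax fails early. This catches
syntax problems only; running the tests on python 3 is still needed to
catch behavioral differences.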



>
>> Thanks,
>> Robbe
>>
>> On Fri, 23 Mar 2018 at 19:59 Ahmet Altay <al...@google.com> wrote:
>>
>>> Thank you Robbe.
>>>
>>> I reviewed the document and it looks reasonable to me. I will touch on some
>>> points that were not mentioned:
>>> - Runners exercise different code paths. Doing auto conversions and
>>> focusing on DirectRunner is not enough. It is worthwhile to run things on
>>> DataflowRunner as well. This can be triggered from Jenkins. It will
>>> validate that we are still compatible with python 2.
>>> - Similar to above but with an eye on perf regressions.
>>>
>>> For project tracking on JIRA, please feel free to create any new issues,
>>> close stale ones, or take ownership of any open issues. All JIRAs should be
>>> assigned to the people actively working on them. If you want to track it in
>>> a separate way, you can also propose that. (For example, the portability
>>> effort uses a kanban board, which is fully supported in JIRA.)
>>>
>>> I will also call out to a few other people in addition to Holden who
>>> helped out or showed interest in helping with Python 3. @cclaus, @luke-zhu,
>>> @udim, @robertwb, @charlesccychen, @tvalentyn. You can include these
>>> people (and myself) for reviews and other questions that you have.
>>>
>>> Welcome again, and looking forward to your contributions.
>>>
>>> Thank you,
>>> Ahmet
>>>
>>>
>>>
>>> On Fri, Mar 23, 2018 at 9:27 AM, Robbe Sneyders <robbe.sneyd...@ml6.eu>
>>> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> In the next month(s), my colleague Matthias and I will commit a lot of
>>>> time and effort to python 3 support for beam and we would like to discuss
>>>> the best way to go forward with this.
>>>>
>>>> We have drawn up a document [1] with a high level outline of the
>>>> proposed approach and would like to get your feedback on this.
>>>>
>>>> The main Jira issue [2] for python 3 support has been mostly inactive
>>>> for the past year. Other smaller issues have been opened, but it's hard to
>>>> track the general progress. It would be great if anyone could offer some
>>>> insights on how to best handle this project on Jira.
>>>>
>>>> @Holden Karau, you seem to have already put in a lot of effort to add
>>>> python 3 support, so it would be great to get your insights and find a way
>>>> to merge our efforts.
>>>>
>>>> Kind regards,
>>>> Robbe
>>>>
>>>> [1] https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=sharing
>>>> [2] https://issues.apache.org/jira/browse/BEAM-1251
>>>> --
>>>>
>>>> [image: https://ml6.eu] <https://ml6.eu/>
>>>>
>>>> * Robbe Sneyders*
>>>>
>>>> ML6 Gent
>>>> <https://www.google.be/maps/place/ML6/@51.037408,3.7044893,17z/data=!3m1!4b1!4m5!3m4!1s0x47c37161feeca14b:0xb8f72585fdd21c90!8m2!3d51.037408!4d3.706678?hl=nl>
>>>>
>>>> M: +32 474 71 31 08 <+32%20474%2071%2031%2008>
>>>>
>>>
> --
> Twitter: https://twitter.com/holdenkarau
>
