[ 
https://issues.apache.org/jira/browse/BEAM-3981?focusedWorklogId=88952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-88952
 ]

ASF GitHub Bot logged work on BEAM-3981:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Apr/18 14:36
            Start Date: 09/Apr/18 14:36
    Worklog Time Spent: 10m 
      Work Description: RobbeSneyders opened a new pull request #5053: 
[BEAM-3981] Futurize coders subpackage
URL: https://github.com/apache/beam/pull/5053
 
 
   This pull request prepares the coders subpackage for Python 3 support. This 
pull request is the first of a series in which all subpackages will be updated 
using the same approach.
   This approach has been documented 
[here](https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=drive_web&ouid=104989467967390032603)
 and the WIP pull request can be found at #4990.
   
   The approach used can be summarized as follows:
   
   - The future package provides tools to forward-port our code, while the six 
package focuses more on backporting. Whenever possible, we will therefore use 
the future package instead of the six package.
   The future package provides backported Python 3 builtins, which can be used 
to write Python 3 style code with Python 2 compatibility.
   
   - One of the biggest problems in porting Python 2 code to Python 3 is the 
changed handling of strings and bytes.
   
     - The future package provides backported Python 3 `str` and `bytes` types. 
While these can be convenient for writing Python 2/3 compatible code in end 
products, we don’t believe they are the right choice for Beam.
   These backported types are new classes, subclassed from existing Python 2 
builtins (e.g. `from future.builtins import int` imports the class 
`future.builtins.newint.newint`, which is a subclass of the Python 2 `long` 
type).
   While these new classes behave like the Python 3 types, they don’t give the 
same results when used in type checks, which are used constantly in Beam (e.g. 
typecoders, typechecks, …).
   
     - Instead, we propose to rewrite everything using the default `str` and 
`bytes` types. On Python 2, the `bytes` type is an alias for `str`. On Python 
3, `str` is equivalent to Python 2 `unicode`.
   A consistent behaviour between Python 2/3 can be reached by using the 
`bytes` type whenever `str` behaviour is desired on Python 2 and `bytes` 
behaviour is desired on Python 3 (= bytes data).
   The `unicode` type can be used whenever `unicode` behaviour is desired on 
Python 2 and `str` behaviour is desired on Python 3 (= text data). The 
`unicode` type is not available in Python 3, which can be solved by adding
       ```
       try:
         unicode           # pylint: disable=unicode-builtin
       except NameError:
         unicode = str
       ```
       at the top of the module.
   
     - All string literals which represent `bytes` should be marked as `b''`. 
String literals representing `unicode` in test modules should not be marked 
`u''`. These will automatically be interpreted as `unicode` literals in Python 
3, but we still want to test for unmarked Python 2 code.
   Do not use the `from __future__ import unicode_literals` import since [its 
changes are too implicit and introduce a risk of subtle regressions on Python 
2](http://python-future.org/unicode_literals.html). 
   
   - The approach for the `long` / `int` types is equivalent to the `unicode` 
/ `str` approach outlined above.
   The `long` type is not available in Python 3, since `int` now has `long` 
behaviour. This can be solved by adding
       ```
       try:
         long          # pylint: disable=long-builtin
       except NameError:
         long = int
       ```
       at the top of the module.
   
   - Regressions should be avoided as much as possible between the application 
of step 2 and step 3. This document proposes to take the following measures to 
keep the probability of regression as low as possible:
     - Add the following import to the top of every module:
   `from __future__ import absolute_import`
   We can also add the following imports to the top of every module. This 
ensures that no new code can be added using, for instance, the old Python 2 
division, and adds consistency across modules. We would like to hear the 
community’s opinion on this.
       ```
       from __future__ import division
       from __future__ import print_function
       ```
     - A new tox environment has been added which runs `pylint --py3k` to check 
for Python 3 compatibility.
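
   As a rough illustration of the string, bytes and integer handling described 
in the list above, the sketch below combines the `unicode` / `long` shims with 
a small type-based lookup. The helper `to_bytes`, the class `BackportedInt`, 
the `CODER_NAMES` registry and `lookup_coder_name` are hypothetical names made 
up for this example and are not part of Beam or of the future package; 
`BackportedInt` only stands in for a backported type that subclasses a builtin.

```python
# Compatibility shims from the list above: Python 3 has no `unicode` or
# `long` builtins, so the names are aliased to `str` and `int` there.
try:
    unicode  # pylint: disable=unicode-builtin
except NameError:
    unicode = str

try:
    long  # pylint: disable=long-builtin
except NameError:
    long = int


def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: encode text data, pass bytes data through."""
    if isinstance(value, unicode):   # text data on both Python 2 and 3
        return value.encode(encoding)
    if isinstance(value, bytes):     # bytes data on both Python 2 and 3
        return value
    raise TypeError("expected text or bytes, got %s" % type(value).__name__)


class BackportedInt(int):
    """Hypothetical stand-in for a backported type; not the real newint."""


# Hypothetical registry keyed on the exact type of a value.
CODER_NAMES = {int: "int_coder", bytes: "bytes_coder", unicode: "text_coder"}


def lookup_coder_name(value):
    """Return the name registered for value's exact type, or None."""
    return CODER_NAMES.get(type(value))


encoded = to_bytes(u"caf\u00e9")
native_name = lookup_coder_name(5)
backported_name = lookup_coder_name(BackportedInt(5))
```

   In this sketch the subclass instance still passes `isinstance(..., int)`, 
but the exact-type lookup does not find it, which is the kind of mismatch the 
proposal tries to avoid by sticking to the default builtin types.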
   
   @aaltay @charlesccychen 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 88952)
    Time Spent: 5h 50m  (was: 5h 40m)

> Futurize and fix python 2 compatibility for coders package
> ----------------------------------------------------------
>
>                 Key: BEAM-3981
>                 URL: https://issues.apache.org/jira/browse/BEAM-3981
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Robbe
>            Assignee: Ahmet Altay
>            Priority: Major
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Run automatic conversion with futurize tool on coders subpackage and fix 
> python 2 compatibility. This prepares the subpackage for python 3 support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
