[
https://issues.apache.org/jira/browse/BEAM-3981?focusedWorklogId=88952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-88952
]
ASF GitHub Bot logged work on BEAM-3981:
----------------------------------------
Author: ASF GitHub Bot
Created on: 09/Apr/18 14:36
Start Date: 09/Apr/18 14:36
Worklog Time Spent: 10m
Work Description: RobbeSneyders opened a new pull request #5053:
[BEAM-3981] Futurize coders subpackage
URL: https://github.com/apache/beam/pull/5053
This pull request prepares the coders subpackage for Python 3 support. This
pull request is the first of a series in which all subpackages will be updated
using the same approach.
This approach has been documented
[here](https://docs.google.com/document/d/1xDG0MWVlDKDPu_IW9gtMvxi2S9I0GB0VDTkPhjXT0nE/edit?usp=drive_web&ouid=104989467967390032603)
and the WIP pull request can be found at #4990.
The approach can be summarized as follows:
- The future package provides tools to forward-port our code, while the six
package focuses more on backporting. Whenever possible, we will therefore use
the future package instead of the six package.
The future package provides backported Python 3 builtins, which can be used
to write Python 3 style code with Python 2 compatibility.
- One of the biggest problems in porting Python 2 code to Python 3 is the
changed handling of strings and bytes.
- The future package provides backported Python 3 `str` and `bytes` types.
While these can be convenient to write Python 2/3 compatible code in end
products, we don’t believe they are the right choice for Beam.
These backported types are new classes, subclassed from existing Python 2
builtins (e.g. `from future.builtins import int` imports the class
`future.builtins.newint.newint`, which is a subclass of the Python 2 `long`
type).
While these new classes behave like the Python 3 types, they don’t give the
same results when used in type checks, which are used throughout Beam (e.g.
typecoders, typechecks, …).
- Instead, we propose to rewrite everything using the default `str` and
`bytes` types. On Python 2, the `bytes` type is an alias for `str`. On Python
3, `str` is equivalent to Python 2 `unicode`.
A consistent behaviour between Python 2/3 can be reached by using the
`bytes` type whenever `str` behaviour is desired on Python 2 and `bytes`
behaviour is desired on Python 3 (= bytes data).
The `unicode` type can be used whenever `unicode` behaviour is desired on
Python 2 and `str` behaviour is desired on Python 3 (= text data). The
`unicode` type is not available in Python 3, which can be solved by adding
```
try:
  unicode  # pylint: disable=unicode-builtin
except NameError:
  unicode = str
```
at the top of the module.
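As a minimal sketch of how the `bytes` / `unicode` convention above might look in a module, the snippet below combines the shim with a small helper. The helper name `encode_text` is hypothetical and is not part of Beam; it only illustrates type checks against the default types.

```python
try:
  unicode  # pylint: disable=unicode-builtin
except NameError:
  unicode = str  # Python 3: the text type is called str


def encode_text(value):
  """Hypothetical helper: return bytes data for bytes or text input."""
  if isinstance(value, bytes):
    # Bytes data on Python 3, plain str on Python 2: pass through unchanged.
    return value
  if isinstance(value, unicode):
    # Text data: unicode on Python 2, str on Python 3.
    return value.encode('utf-8')
  raise TypeError('expected bytes or unicode, got %s' % type(value).__name__)
```

Because the checks use the builtin types rather than the backported `future` classes, `isinstance` behaves the same way on both interpreters.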
- All string literals which represent `bytes` should be marked as `b''`.
String literals representing `unicode` in test modules should not be marked
`u''`. These will automatically be interpreted as `unicode` literals in Python 3,
but we still want to test for unmarked Python 2 code.
Do not use the `from __future__ import unicode_literals` import since [its
changes are too implicit and introduce a risk of subtle regressions on Python
2](http://python-future.org/unicode_literals.html).
- The approach for the `long` / `int` types is equivalent to the `unicode`
/ `str` approach outlined above.
The `long` type is not available in Python 3, since `int` now has `long`
behaviour. This can be solved by adding
```
try:
  long  # pylint: disable=long-builtin
except NameError:
  long = int
```
at the top of the module.
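A short sketch of a type check written against the `long` shim follows. The function name `is_integer` is an illustrative assumption, not an existing Beam function.

```python
try:
  long  # pylint: disable=long-builtin
except NameError:
  long = int  # Python 3: int already has long behaviour


def is_integer(value):
  """Hypothetical check that accepts int and long, but not bool."""
  # bool is a subclass of int, so it is excluded explicitly.
  return isinstance(value, (int, long)) and not isinstance(value, bool)
```

On Python 3 the tuple `(int, long)` simply contains `int` twice, so the same check works unchanged on both versions.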
- Regression should be avoided as much as possible between the application
of step 2 and step 3. This document proposes to take the following measures to
keep the probability of regression as low as possible:
- Add the following import to the top of every module:
`from __future__ import absolute_import`
We can also add the following imports to the top of every module. This
ensures that no new code can be added using, for instance, the old Python 2
division, and adds consistency across modules. We would like to hear the
community’s opinion on this.
```
from __future__ import division
from __future__ import print_function
```
- A new tox environment has been added which runs `pylint --py3k` to check
for Python 3 compatibility.
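To illustrate what the `__future__` imports listed above change, here is a small sketch. The function `mean` is a made-up example and does not come from the coders subpackage.

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function


def mean(values):
  """Hypothetical example: average of a non-empty list of numbers."""
  # With `division` imported, `/` is true division on Python 2 as well.
  return sum(values) / len(values)


# With `print_function` imported, print is a function on both versions.
print('mean:', mean([1, 2]))
# Floor division has to be requested explicitly with `//`.
print('floor:', 7 // 2)
```

Without the `division` import, `mean([1, 2])` would silently return `1` on Python 2, which is the kind of regression these imports are meant to rule out.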
@aaltay @charlesccychen
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 88952)
Time Spent: 5h 50m (was: 5h 40m)
> Futurize and fix python 2 compatibility for coders package
> ----------------------------------------------------------
>
> Key: BEAM-3981
> URL: https://issues.apache.org/jira/browse/BEAM-3981
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Assignee: Ahmet Altay
> Priority: Major
> Time Spent: 5h 50m
> Remaining Estimate: 0h
>
> Run automatic conversion with futurize tool on coders subpackage and fix
> python 2 compatibility. This prepares the subpackage for python 3 support.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)