Stephan Hoyer created BEAM-5431:
-----------------------------------
Summary: StarMap transform for Python SDK
Key: BEAM-5431
URL: https://issues.apache.org/jira/browse/BEAM-5431
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Reporter: Stephan Hoyer
Assignee: Ahmet Altay
I'd like to propose a new high-level transform "StarMap" for the Python SDK.
The transform would be syntactic sugar for ParDo, like Map, but would
automatically unpack arguments like
[itertools.starmap|https://docs.python.org/3/library/itertools.html#itertools.starmap]
from Python's standard library.
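For readers unfamiliar with itertools.starmap, here is a minimal standard-library sketch of the difference in calling convention. The sample data and the describe function are made up for illustration; no Beam code is involved.

```python
from itertools import starmap

# Illustrative (key, value) tuples, similar in shape to grouped or combined output.
pairs = [("a", 1), ("b", 2), ("c", 3)]


def describe(key, value):
    """Format one key/value pair; takes two positional arguments."""
    return f"{key}={value}"


# map() hands each tuple over as a single argument, so the caller must unpack it.
with_map = list(map(lambda kv: describe(kv[0], kv[1]), pairs))

# starmap() unpacks each tuple into positional arguments before calling describe.
with_starmap = list(starmap(describe, pairs))
```

Both lists hold the same strings; the only difference is who does the unpacking.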
The use-case is to handle applying functions to tuples of arguments, which is a
common pattern when using Beam's combine and group-by transforms. Right now,
it's common to write functions with manual unpacking, e.g.,
{code:java}
def my_func(inputs):
    key, value = inputs
    ...

beam.Map(my_func)
{code}
StarMap offers a much more readable alternative:
{code:java}
def my_func(key, value):
    ...

beam.StarMap(my_func)
{code}
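To make the intended semantics concrete, here is a rough sketch of an adapter with the same effect, written in plain Python. The name star is hypothetical and not part of Beam, and the body of my_func is made up. The assumption is that such a wrapper could be passed to beam.Map, which is not exercised here.

```python
import functools


def star(fn):
    """Hypothetical helper: adapt fn(*args) so it accepts one tuple argument."""

    @functools.wraps(fn)
    def wrapper(args):
        # Unpack the single tuple element into positional arguments.
        return fn(*args)

    return wrapper


def my_func(key, value):
    # Illustrative body; the original issue leaves it elided.
    return (key, value * 2)


# Assumed pipeline usage would be beam.Map(star(my_func)); here the wrapper
# is simply applied to a small list of tuples instead.
results = [star(my_func)(element) for element in [("a", 1), ("b", 2)]]
```

A built-in StarMap would remove the need for each user to carry a helper like this.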
The need for StarMap is especially pressing with the advent of Python 3 support
and the eventual wind-down of Python 2. Currently, it's common to achieve this
pattern with tuple parameter unpacking in a lambda, e.g.,
beam.Map(lambda (k, v): my_func(k, v)), but that syntax was removed in Python 3
(PEP 3113). My internal search of Google's codebase turns up quite a few matches
for "beam\.Map\(lambda\ \(", none of which would work on Python 3.
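As a small illustration of the incompatibility, the sketch below checks that the Python 2 lambda fails to compile under Python 3 and shows one Python 3 compatible spelling. The body of my_func is made up for the example.

```python
def my_func(k, v):
    # Illustrative body only.
    return (k, v + 1)


# Python 2 spelling with tuple parameter unpacking. It is kept as a string
# because it cannot even be parsed by a Python 3 interpreter.
py2_source = "lambda (k, v): my_func(k, v)"
try:
    compile(py2_source, "<example>", "eval")
    py2_syntax_ok = True
except SyntaxError:
    py2_syntax_ok = False

# Python 3 compatible spelling: accept the tuple as one argument, then unpack.
py3_lambda = lambda kv: my_func(*kv)
result = py3_lambda(("a", 1))
```

The lambda kv: my_func(*kv) form works on both Python 2 and Python 3, at the cost of the readability that StarMap is meant to restore.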