[ 
https://issues.apache.org/jira/browse/BEAM-5615?focusedWorklogId=153755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-153755
 ]

ASF GitHub Bot logged work on BEAM-5615:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Oct/18 01:43
            Start Date: 12/Oct/18 01:43
    Worklog Time Spent: 10m 
      Work Description: tvalentyn commented on issue #6570: [BEAM-5615] fix cmp 
is an invalid keyword for sort function in python 3
URL: https://github.com/apache/beam/pull/6570#issuecomment-429176130
 
 
   I took a closer look at `Top.Of()` API and I think my original analysis is 
correct - the API accepts both `compare` and `key` functions, and current 
behavior for this case is that we apply `key` function first, and the `compare` 
function afterwards. Disregarding `compare` function when `key` is supplied 
would be a breaking API change. Although `Top.Of()` docstring says: 
   ```
   compare should be an implementation of "a < b" taking at least two arguments
   (a and b). Additional arguments and side inputs specified in the apply call
   become additional arguments to the comparator.  Defaults to the natural
   ordering of the elements.
   
   The arguments 'key' and 'reverse' may instead be passed as keyword
   arguments, and have the same meaning as for Python's sort functions.
   ```
   "Instead" can be interpreted as "`compare` and `key` should be exclusive", 
however this is not enforced by the SDK, and the behavior of `Top.Of()` is 
consistent with behavior of `sorted()` built-in function on Python 2, except 
that `Top.Of()` requires that `compare` returns a boolean, while `sorted()` on 
`Python 2` requires that `cmp` returns -1, 0, 1. In Python 3, `sorted()` no 
longer accepts `cmp`.
   
   Things we can do:
   1) Fix the docstring, and explicitly allow `compare` and `key`. This is the 
less restrictive option for our users, although SDK code is becomes more 
convoluted.
   2) Gradually enforce the docstring - require `compare` and `key`  to be 
exclusive parameters. This would slightly simplify the SDK.
   3) Gradually deprecate `compare` parameter. Assume the natural ordering by 
default, and require a `key` function for custom order. This will also simplify 
SDK code and will be more consistent with `sorted()` signature on Python 3.
   
   3 may be more Pythonic option, 1 is the least restrictive for users, also it 
does not require API deprecation. 
   
   I propose we do a combination 1 and 3 by targeting 3 for Beam 3.0 (or first 
SDK that drops Python 2 support), while doing 1 in short term. 
   
   WDYT @aaltay @robertwb.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 153755)
    Time Spent: 3h 10m  (was: 3h)

> Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword 
> argument for this function
> -----------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-5615
>                 URL: https://issues.apache.org/jira/browse/BEAM-5615
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-harness
>            Reporter: Valentyn Tymofieiev
>            Assignee: Juta Staes
>            Priority: Major
>             Fix For: Not applicable
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
>  line 89, in test_top
>     names)  # Note parameter passed to comparator.
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
>     return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 467, in apply
>     label or transform.label)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 477, in apply
>     return self.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
>     pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
>     return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
>     return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 759, in expand
>     return self._fn(pcoll, *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 185, in Of
>     TopCombineFn(n, compare, key, reverse), *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
>     return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
>     pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
>     return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
>     return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1251, in expand
>     default_value = combine_fn.apply([], *self.args, **self.kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 623, in apply
>     *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 362, in extract_output
>     self._sort_buffer(buffer, lt)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 295, in _sort_buffer
>     key=self._key_fn)
> TypeError: 'cmp' is an invalid keyword argument for this function



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to