[ 
https://issues.apache.org/jira/browse/BEAM-6695?focusedWorklogId=236533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-236533
 ]

ASF GitHub Bot logged work on BEAM-6695:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/May/19 20:14
            Start Date: 02/May/19 20:14
    Worklog Time Spent: 10m 
      Work Description: ttanay commented on issue #8206: [BEAM-6695] Latest 
PTransform for Python SDK
URL: https://github.com/apache/beam/pull/8206#issuecomment-488815622
 
 
   @aaltay @robinyqiu 
   I agree that `_check_instance_type` is not the way to go. 
   
   Type validation is needed inside the staticmethod `add_timestamp` of 
PerKey because, when using `with_input_types`, the type evaluation does not 
handle PCollections of WindowedValue as this case requires: it should 
evaluate the value attribute of each WindowedValue. 
   For example, this is from the test: 
[test_per_key](https://github.com/ttanay/beam/blob/beam-6695/sdks/python/apache_beam/transforms/combiners_test.py#L415)
  
   ```python
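   # Imports assumed from the surrounding test module (combiners_test.py)
   import apache_beam.transforms.combiners as combine
   from apache_beam.testing.test_pipeline import TestPipeline
   from apache_beam.testing.util import assert_that, equal_to
   from apache_beam.transforms import window
   from apache_beam.transforms.core import Create

   # Elements to create PCollection from: (key, value) pairs with timestamps
   elem_list = [window.GlobalWindows.windowed_value(('a', 1), 300),
                window.GlobalWindows.windowed_value(('b', 3), 100),
                window.GlobalWindows.windowed_value(('a', 2), 200)]

   with TestPipeline() as p:
     pc = p | Create(elem_list)
     latest = pc | combine.Latest.PerKey()
     # 'a' keeps 1 (timestamp 300 beats 200); 'b' keeps 3
     assert_that(latest, equal_to([('a', 1), ('b', 3)]))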
   ```
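   As a rough illustration of why the test expects `[('a', 1), ('b', 3)]`, 
here is a plain-Python sketch of "latest per key" semantics. It does not use 
Beam at all; `latest_per_key` is a hypothetical helper introduced only for 
this illustration and is not part of the PR.

```python
# Plain-Python sketch (no Beam dependency) of "latest per key" semantics.
# `latest_per_key` is a hypothetical helper, not part of the Beam API.

def latest_per_key(timestamped_kvs):
    """Return {key: value}, keeping the value with the greatest timestamp."""
    best = {}  # key -> (timestamp, value)
    for (key, value), timestamp in timestamped_kvs:
        if key not in best or timestamp > best[key][0]:
            best[key] = (timestamp, value)
    return {key: value for key, (_, value) in best.items()}

# Same (element, timestamp) pairs as in the test above.
elements = [(('a', 1), 300), (('b', 3), 100), (('a', 2), 200)]
result = latest_per_key(elements)
print(sorted(result.items()))  # [('a', 1), ('b', 3)]
```

   For key `'a'`, the value `1` carries timestamp 300 and so wins over `2` 
at timestamp 200.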
   In the staticmethod `add_timestamp` of the Latest PTransform, the element 
type of `pc` is evaluated as WindowedValue when using `with_input_types`. It 
should be KV, since the value attribute of the WindowedValue object is a KV 
(`('a', 1)`), which is what is evaluated when using `_check_instance_type`.
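
   The mismatch can be pictured with a small stand-alone sketch. 
`FakeWindowedValue` and `is_kv` are hypothetical stand-ins written for this 
illustration; they are not Beam's classes and do not reproduce how Beam's 
type hints are actually evaluated. The sketch only shows that a check applied 
to the wrapper sees the wrapper, while a check applied to its `value` 
attribute sees the KV.

```python
class FakeWindowedValue(object):
    """Hypothetical stand-in for a windowed element: a value plus a timestamp."""

    def __init__(self, value, timestamp):
        self.value = value
        self.timestamp = timestamp


def is_kv(obj):
    """True if obj looks like a key-value pair, i.e. a 2-tuple."""
    return isinstance(obj, tuple) and len(obj) == 2


element = FakeWindowedValue(('a', 1), 300)

checked_on_wrapper = is_kv(element)      # the wrapper itself is not a KV
checked_on_value = is_kv(element.value)  # the wrapped value is a KV
print(checked_on_wrapper, checked_on_value)
```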
   
   Is this expected behaviour? If not, I'd love to fix this. It would solve the 
problem.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 236533)
    Time Spent: 7.5h  (was: 7h 20m)

> Latest transform for Python SDK
> -------------------------------
>
>                 Key: BEAM-6695
>                 URL: https://issues.apache.org/jira/browse/BEAM-6695
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Ahmet Altay
>            Assignee: Tanay Tummalapalli
>            Priority: Minor
>          Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Add a PTransform and Combine.CombineFn for computing the latest element in a 
> PCollection.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
