[ 
https://issues.apache.org/jira/browse/BEAM-6694?focusedWorklogId=285426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-285426
 ]

ASF GitHub Bot logged work on BEAM-6694:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 31/Jul/19 01:29
            Start Date: 31/Jul/19 01:29
    Worklog Time Spent: 10m 
      Work Description: aaltay commented on pull request #9153: [BEAM-6694] Added Approximate Quantile Transform on Python SDK
URL: https://github.com/apache/beam/pull/9153#discussion_r309002847
 
 

 ##########
 File path: sdks/python/apache_beam/transforms/stats.py
 ##########
 @@ -234,3 +239,419 @@ def extract_output(accumulator):
 
   def display_data(self):
     return {'sample_size': self._sample_size}
+
+
+class ApproximateQuantiles(object):
+  """
+  PTransform for getting an idea of the data distribution using approximate
+  N-tiles (e.g. quartiles, percentiles, etc.), either globally or per-key.
+  """
+
+  @staticmethod
+  def _display_data(num_quantiles, compare, key, reverse):
+    return {
+        'num_quantiles': DisplayDataItem(num_quantiles, label="Quantile Count"),
+        'compare': DisplayDataItem(compare.__class__,
+                                   label='Record Comparer FN'),
+        'key': DisplayDataItem(key.__class__, label='Record Comparer Key'),
+        'reverse': DisplayDataItem(reverse.__class__, label='Is reversed')
 
 Review comment:
   reverse is a boolean, so we should capture its value (True or False) rather
   than its type, which is always bool.
   
   Similarly for compare: __class__ will be either NoneType or function, which
   is not helpful. (The displayed value could be compare.__name__, for example,
   when compare is not None.)
   
   Similarly for key: we should display something meaningful to users.
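
One possible way to act on this comment, as a minimal sketch only. The helper names `describe_callable` and `display_values` are hypothetical and do not come from the pull request. The values are returned as a plain dict rather than wrapped in DisplayDataItem so that the snippet runs without Beam installed; in the transform itself each value would presumably still be passed to DisplayDataItem with its label.

```python
import functools


def describe_callable(fn):
  """Return a readable name for a callable, or None if fn is None.

  Hypothetical helper, not part of the Beam API.
  """
  if fn is None:
    return None
  # functools.partial objects have no __name__, so describe the wrapped
  # function instead.
  if isinstance(fn, functools.partial):
    return describe_callable(fn.func)
  # Functions, lambdas and builtins expose __name__; other callables (for
  # example instances that define __call__) fall back to their class name.
  return getattr(fn, '__name__', fn.__class__.__name__)


def display_values(num_quantiles, compare, key, reverse):
  """Build display values that show what was configured, not its type."""
  return {
      'num_quantiles': num_quantiles,
      'compare': describe_callable(compare),
      'key': describe_callable(key),
      # Report the actual flag (True or False) rather than the type bool.
      'reverse': reverse,
  }
```

With this approach a caller that passes `key=len` and `reverse=True` would see the name `len` and the value `True`, while an unset `compare` stays `None` and could be omitted from the display.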
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 285426)
    Time Spent: 4h  (was: 3h 50m)

> ApproximateQuantiles transform for Python SDK
> ---------------------------------------------
>
>                 Key: BEAM-6694
>                 URL: https://issues.apache.org/jira/browse/BEAM-6694
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Ahmet Altay
>            Assignee: Shehzaad Nakhoda
>            Priority: Minor
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> Add PTransforms for getting an idea of a PCollection's data distribution 
> using approximate N-tiles (e.g. quartiles, percentiles, etc.), either 
> globally or per-key.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateQuantiles.java



