[ https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=239514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-239514 ]
ASF GitHub Bot logged work on BEAM-6693:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/May/19 23:42
            Start Date: 08/May/19 23:42
    Worklog Time Spent: 10m
      Work Description: aaltay commented on pull request #8535: [BEAM-6693] ApproximateUnique transform for Python SDK
URL: https://github.com/apache/beam/pull/8535#discussion_r282294445

##########
File path: sdks/python/setup.py
##########
@@ -125,6 +125,7 @@ def get_version():
     'pyvcf>=0.6.8,<0.7.0; python_version < "3.0"',
     'pyyaml>=3.12,<4.0.0',
     'typing>=3.6.0,<3.7.0; python_version < "3.5.0"',
+    'mmh3>=2.5.1; python_version >= "2.7"',

Review comment:
A few questions:
- What is this dependency? The PyPI page says this is a wrapper. Does it require other things to be installed? Is this the only option we have?
- You do not need the python_version >= "2.7" part, because all our SDKs support only py >= 2.7.
- Can you add an upper bound here? Maybe <3.0.0.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 239514)
    Time Spent: 0.5h  (was: 20m)

> ApproximateUnique transform for Python SDK
> ------------------------------------------
>
>                 Key: BEAM-6693
>                 URL: https://issues.apache.org/jira/browse/BEAM-6693
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Ahmet Altay
>            Assignee: Hannah Jiang
>            Priority: Minor
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a
> PCollection and the number of distinct values associated with each key in a
> PCollection of KVs.
> It should offer the same API as its Java counterpart:
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
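
For readers unfamiliar with the kind of estimate the issue asks for, here is a minimal sketch of one common approach, a k-minimum-values estimator over hashed elements. It is not taken from the pull request or from the Java ApproximateUnique class. The names `approximate_unique`, `_hash64`, and `sample_size` are hypothetical, and `hashlib` from the standard library stands in for `mmh3` so the example has no third-party dependency.

```python
import hashlib
import heapq

HASH_SPACE = 2 ** 64  # size of the 64-bit hash range used below


def _hash64(value):
    """Map a value to a 64-bit integer (hashlib used here as a stand-in for mmh3)."""
    digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def approximate_unique(elements, sample_size=64):
    """Estimate the number of distinct elements from the smallest hashes seen.

    Hypothetical helper: keeps the `sample_size` smallest distinct hash values
    and extrapolates from how densely they fill the hash space.
    """
    heap = []      # negated hashes, so heap[0] is the largest hash kept
    kept = set()   # same hashes, for constant-time duplicate checks
    for element in elements:
        h = _hash64(element)
        if h in kept:
            continue
        if len(heap) < sample_size:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # Replace the largest kept hash with the new, smaller one.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < sample_size:
        # Fewer distinct hashes than the sample size: the count is exact.
        return len(heap)
    kth_smallest = max(-heap[0], 1)
    return int((sample_size - 1) * HASH_SPACE / kth_smallest)
```

A larger `sample_size` trades memory for accuracy; the relative error of this kind of estimator shrinks roughly with the square root of the sample size. In a Beam pipeline the same logic would normally live in a CombineFn so that partial samples can be merged across workers, which this single-process sketch does not attempt.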