[
https://issues.apache.org/jira/browse/BEAM-6693?focusedWorklogId=257176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-257176
]
ASF GitHub Bot logged work on BEAM-6693:
----------------------------------------
Author: ASF GitHub Bot
Created on: 10/Jun/19 20:34
Start Date: 10/Jun/19 20:34
Worklog Time Spent: 10m
Work Description: Hannah-Jiang commented on pull request #8799:
[BEAM-6693] replace mmh3 with default hash function
URL: https://github.com/apache/beam/pull/8799#discussion_r292179692
##########
File path: sdks/python/apache_beam/transforms/stats_test.py
##########
@@ -266,27 +251,7 @@ def test_approximate_unique_global_by_error_with_samll_population(self):
>> beam.ApproximateUnique.Globally(error=est_err))
assert_that(result, equal_to([actual_count]),
- label='assert:global_by_error_with_samll_population')
- pipeline.run()
-
- def test_approximate_unique_global_by_error_with_big_population(self):
Review comment:
This test checks the accuracy of the estimation algorithm rather than the
functionality of the transform. With the default py3 hash function it fails
occasionally (about 2% of runs) because the estimate falls outside the
expected error range, so I decided to remove it.
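To illustrate why such a test can be flaky, here is a minimal sketch of a sample-based distinct-count estimator. It is not the Beam implementation: the k-minimum-values technique, the helper names (`salted_hash`, `estimate_unique`, `failure_rate`) and the use of a salted `hashlib.blake2b` as a stand-in for a per-process seeded default hash are all assumptions made for illustration.

```python
import hashlib
import heapq

HASH_SPACE = 2 ** 64  # size of the 64-bit hash space used below


def salted_hash(value, salt):
    """Hypothetical stand-in for a seeded default hash: 64-bit blake2b digest."""
    data = f"{salt}:{value}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def estimate_unique(values, sample_size, salt=0):
    """Estimate the number of distinct values from the smallest hashes.

    For brevity this collects every distinct hash first; a streaming version
    would keep only the `sample_size` smallest hashes in a bounded heap.
    """
    hashes = {salted_hash(v, salt) for v in values}
    if len(hashes) < sample_size:
        # Fewer distinct hashes than the sample size: the count is exact.
        return len(hashes)
    kth_smallest = heapq.nsmallest(sample_size, hashes)[-1]
    # k-minimum-values estimate: (k - 1) divided by the normalized k-th hash.
    return (sample_size - 1) * HASH_SPACE // kth_smallest


def failure_rate(population, sample_size, max_error, trials=20):
    """Fraction of salts for which the relative error exceeds max_error."""
    values = range(population)
    failures = 0
    for salt in range(trials):
        estimate = estimate_unique(values, sample_size, salt)
        if abs(estimate - population) / population > max_error:
            failures += 1
    return failures / trials
```

Calling `failure_rate(10000, 1024, max_error)` with progressively tighter `max_error` values shows the point of the comment: the estimate is correct only within a probabilistic bound, so an assertion on a fixed error range will fail for some fraction of hash seeds even when the transform works as intended.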
Issue Time Tracking
-------------------
Worklog Id: (was: 257176)
Time Spent: 12h (was: 11h 50m)
> ApproximateUnique transform for Python SDK
> ------------------------------------------
>
> Key: BEAM-6693
> URL: https://issues.apache.org/jira/browse/BEAM-6693
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Ahmet Altay
> Assignee: Hannah Jiang
> Priority: Minor
> Fix For: 2.14.0
>
> Time Spent: 12h
> Remaining Estimate: 0h
>
> Add a PTransform for estimating the number of distinct elements in a
> PCollection and the number of distinct values associated with each key in a
> PCollection of KVs.
> It should offer the same API as its Java counterpart:
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateUnique.java
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)