[
https://issues.apache.org/jira/browse/BEAM-12550?focusedWorklogId=672245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672245
]
ASF GitHub Bot logged work on BEAM-12550:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Oct/21 22:47
Start Date: 29/Oct/21 22:47
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on pull request #15809:
URL: https://github.com/apache/beam/pull/15809#issuecomment-955088276
Thanks @svetakvsundhar! Could you take a look at the failing PreCommit
checks?
- The PythonDocs failure may not be caused by your change.
- PythonLint runs pylint, and PythonFormatter runs yapf. The Python Tips wiki
page should have some info to help you run these locally (if not, let me know).
Python PreCommit runs all the unit tests. It looks like your change actually
breaks one of the tests in frames_test.py. `AggregationTest` runs some tests
on all of the aggregation methods (including skew and kurtosis), and it
currently expects skew and kurtosis not to be parallelizable:
https://github.com/apache/beam/blob/a5a0bd26ded0117240b2f6a967eb9f4c65209e6c/sdks/python/apache_beam/dataframe/frames_test.py#L1402-L1411
But you've fixed that now! To make these tests pass you should just need to
remove skew from that list.
Issue Time Tracking
-------------------
Worklog Id: (was: 672245)
Remaining Estimate: 0h
Time Spent: 10m
> Implement parallelizable skew and kurtosis
> -------------------------------------------
>
> Key: BEAM-12550
> URL: https://issues.apache.org/jira/browse/BEAM-12550
> Project: Beam
> Issue Type: Improvement
> Components: dsl-dataframe
> Reporter: Brian Hulette
> Assignee: Svetak Vihaan Sundhar
> Priority: P3
> Time Spent: 10m
> Remaining Estimate: 0h
>
> skew and kurtosis should be parallelizable/liftable, using an approach
> similar to the one used for [std and
> var|https://github.com/apache/beam/blob/a0f5e932d8a9aa491b16361abdc629b5e9a483f6/sdks/python/apache_beam/dataframe/frames.py#L1307-L1310].
> See
> https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Higher-order_statistics
> for information on extending that approach to the third and fourth central
> moments, which are needed for skew and kurtosis.
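
A minimal sketch of the idea in the issue description, assuming the pairwise merge formulas from the linked Wikipedia section. Each partition is reduced to a count, a mean, and the sums of the 2nd, 3rd, and 4th powers of deviations from the mean, and two such summaries can then be merged without revisiting the data. The names `Moments`, `moments_of`, `merge`, and `combine_partitions` are hypothetical and are not part of Beam's API. The final statistics here are the plain population estimators, not the bias-corrected ones pandas reports.

```python
from dataclasses import dataclass
from functools import reduce
import math


@dataclass(frozen=True)
class Moments:
    """Per-partition summary (hypothetical type, not a Beam class)."""
    n: int       # number of values
    mean: float  # arithmetic mean
    m2: float    # sum of (x - mean)**2
    m3: float    # sum of (x - mean)**3
    m4: float    # sum of (x - mean)**4


def moments_of(values):
    """Summarize one partition directly from its values."""
    n = len(values)
    if n == 0:
        return Moments(0, 0.0, 0.0, 0.0, 0.0)
    mean = sum(values) / n
    devs = [v - mean for v in values]
    return Moments(
        n, mean,
        sum(d**2 for d in devs),
        sum(d**3 for d in devs),
        sum(d**4 for d in devs))


def merge(a, b):
    """Combine two summaries using the pairwise update formulas."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta**2 * a.n * b.n / n
    m3 = (a.m3 + b.m3
          + delta**3 * a.n * b.n * (a.n - b.n) / n**2
          + 3 * delta * (a.n * b.m2 - b.n * a.m2) / n)
    m4 = (a.m4 + b.m4
          + delta**4 * a.n * b.n * (a.n**2 - a.n * b.n + b.n**2) / n**3
          + 6 * delta**2 * (a.n**2 * b.m2 + b.n**2 * a.m2) / n**2
          + 4 * delta * (a.n * b.m3 - b.n * a.m3) / n)
    return Moments(n, mean, m2, m3, m4)


def combine_partitions(partitions):
    """Summarize each partition independently, then merge the summaries."""
    return reduce(merge, (moments_of(p) for p in partitions), moments_of([]))


def population_skew(m):
    """Population skewness; NaN when the variance is zero or n is zero."""
    if m.n == 0 or m.m2 == 0:
        return math.nan
    return math.sqrt(m.n) * m.m3 / m.m2**1.5


def population_excess_kurtosis(m):
    """Population excess kurtosis; NaN when the variance is zero or n is zero."""
    if m.n == 0 or m.m2 == 0:
        return math.nan
    return m.n * m.m4 / m.m2**2 - 3.0
```

Because `merge` only needs the five summary values from each side, the per-partition step can run in parallel and only the small summaries have to be brought together, which is the property the issue asks for.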