[ 
https://issues.apache.org/jira/browse/BEAM-12550?focusedWorklogId=672245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-672245
 ]

ASF GitHub Bot logged work on BEAM-12550:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Oct/21 22:47
            Start Date: 29/Oct/21 22:47
    Worklog Time Spent: 10m 
      Work Description: TheNeuralBit commented on pull request #15809:
URL: https://github.com/apache/beam/pull/15809#issuecomment-955088276


   Thanks @svetakvsundhar! Could you take a look at the failing PreCommit 
checks?
   
   - The PythonDocs failure may not be caused by your change
   - PythonLint runs pylint, and PythonFormatter runs yapf. The Python Tips wiki 
page should have instructions for running both locally (if not, let me know)
   
   Python PreCommit runs all the unit tests. It looks like your change actually 
breaks one of the tests in frames_test.py. We run some tests on all of the 
aggregation methods (including skew and kurtosis) in `AggregationTest`. It 
expects skew and kurtosis to not be parallelizable: 
https://github.com/apache/beam/blob/a5a0bd26ded0117240b2f6a967eb9f4c65209e6c/sdks/python/apache_beam/dataframe/frames_test.py#L1402-L1411
   
   But you've fixed that now! To make these tests pass you should just need to 
remove skew from that list.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


Issue Time Tracking
-------------------

            Worklog Id:     (was: 672245)
    Remaining Estimate: 0h
            Time Spent: 10m

> Implement parallelizable skew and kurtosis 
> -------------------------------------------
>
>                 Key: BEAM-12550
>                 URL: https://issues.apache.org/jira/browse/BEAM-12550
>             Project: Beam
>          Issue Type: Improvement
>          Components: dsl-dataframe
>            Reporter: Brian Hulette
>            Assignee: Svetak Vihaan Sundhar
>            Priority: P3
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> skew and kurtosis should be parallelizable/liftable by using a similar 
> [approach as std and 
> var|https://github.com/apache/beam/blob/a0f5e932d8a9aa491b16361abdc629b5e9a483f6/sdks/python/apache_beam/dataframe/frames.py#L1307-L1310].
>  See 
> https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Higher-order_statistics
> which has information on extending that approach to calculating the third and 
> fourth central moments, needed for skew and kurtosis.
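
The approach described above can be illustrated with a minimal sketch: each partition is summarized by its count, mean, and second through fourth central moment sums, and two summaries are merged with the pairwise update formulas from the linked Wikipedia section. The names `Moments`, `moments_of`, `merge`, `skew`, and `kurtosis` are hypothetical and are not taken from frames.py. The sketch computes the plain population skewness and excess kurtosis; any bias correction (such as the one pandas applies) would be a further step on top of the merged moments and is left out here.

```python
import math
from dataclasses import dataclass
from functools import reduce


@dataclass
class Moments:
    """Per-partition summary: count, mean, and central moment sums."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of (x - mean)**2
    m3: float = 0.0  # sum of (x - mean)**3
    m4: float = 0.0  # sum of (x - mean)**4


def moments_of(values):
    """Summarize one partition directly from its values."""
    n = len(values)
    if n == 0:
        return Moments()
    mean = sum(values) / n
    d = [v - mean for v in values]
    return Moments(n, mean,
                   sum(x**2 for x in d),
                   sum(x**3 for x in d),
                   sum(x**4 for x in d))


def merge(a, b):
    """Combine two summaries without revisiting the underlying values."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta**2 * a.n * b.n / n
    m3 = (a.m3 + b.m3
          + delta**3 * a.n * b.n * (a.n - b.n) / n**2
          + 3 * delta * (a.n * b.m2 - b.n * a.m2) / n)
    m4 = (a.m4 + b.m4
          + delta**4 * a.n * b.n * (a.n**2 - a.n * b.n + b.n**2) / n**3
          + 6 * delta**2 * (a.n**2 * b.m2 + b.n**2 * a.m2) / n**2
          + 4 * delta * (a.n * b.m3 - b.n * a.m3) / n)
    return Moments(n, mean, m2, m3, m4)


def skew(m):
    """Population skewness (no bias correction)."""
    return math.sqrt(m.n) * m.m3 / m.m2**1.5


def kurtosis(m):
    """Population excess kurtosis (no bias correction)."""
    return m.n * m.m4 / m.m2**2 - 3.0


# Usage: summarize partitions independently, then merge the summaries.
partitions = [[1.0, 2.0, 4.0], [8.0, 16.0], [32.0]]
combined = reduce(merge, (moments_of(p) for p in partitions))
direct = moments_of([x for p in partitions for x in p])
```

Because `merge` only needs the summaries, the per-partition step can run in parallel and the merge can be applied in any grouping, which is what makes the aggregation liftable.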



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
