[
https://issues.apache.org/jira/browse/BEAM-9547?focusedWorklogId=610188&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-610188
]
ASF GitHub Bot logged work on BEAM-9547:
----------------------------------------
Author: ASF GitHub Bot
Created on: 14/Jun/21 07:26
Start Date: 14/Jun/21 07:26
Worklog Time Spent: 10m
Work Description: TheNeuralBit commented on a change in pull request
#14908:
URL: https://github.com/apache/beam/pull/14908#discussion_r650152641
##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -1541,6 +1541,19 @@ def repeat(self, repeats, axis):
"repeat(repeats=) value must be an int or a "
f"DeferredSeries (encountered {type(repeats)}).")
+ @frame_base.with_docs_from(pd.Series)
Review comment:
Hm, good thing you asked for this. When I wrote a test for it I realized that
argsort is actually order-sensitive. It returns indexes that can be used with
loc to impose the sorted order, so the result depends on the order in which
argsort observes the data.
I think I had in mind that it returned "this element is the Nth largest",
which would be independent of the input ordering. I think we should just make
this WontImplement(order-sensitive). The rest of this PR could still be useful
though.
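
To make the order-sensitivity concrete, here is a minimal sketch in plain Python rather than pandas or Beam. The helpers `argsort` and `rank` are hypothetical stand-ins written for this illustration: the first returns the positions that would sort the input, the second answers "this element is the Nth smallest" per label (ties are ignored for simplicity).

```python
def argsort(values):
    """Return the positions that would sort `values` in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


def rank(labelled):
    """Map each label to its 1-based rank by value (ties not handled)."""
    ordered = sorted(labelled, key=lambda pair: pair[1])
    return {label: n for n, (label, _) in enumerate(ordered, start=1)}


# The same labelled data, observed in two different orders.
rows = [("a", 30), ("b", 10), ("c", 20)]
shuffled = [("c", 20), ("a", 30), ("b", 10)]

# Positional results change when the input order changes.
argsort_rows = argsort([v for _, v in rows])
argsort_shuffled = argsort([v for _, v in shuffled])

# Per-label ranks do not depend on the input order.
rank_rows = rank(rows)
rank_shuffled = rank(shuffled)

print(argsort_rows, argsort_shuffled)
print(rank_rows == rank_shuffled)
```

The positional output differs between the two orderings while the per-label ranks agree, which is the distinction drawn in the comment above.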
Review comment:
Done. I also generalized the logic for indexing `loc` with a DeferredSeries of
labels (it is no longer limited to the integer dtype case) and made `loc`
available on `DeferredSeries`. Could you take another look?
Issue Time Tracking
-------------------
Worklog Id: (was: 610188)
Time Spent: 117h 40m (was: 117.5h)
> Implement all pandas operations (or raise WontImplementError)
> -------------------------------------------------------------
>
> Key: BEAM-9547
> URL: https://issues.apache.org/jira/browse/BEAM-9547
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-core
> Reporter: Brian Hulette
> Assignee: Robert Bradshaw
> Priority: P2
> Labels: dataframe-api
> Time Spent: 117h 40m
> Remaining Estimate: 0h
>
> We should have an implementation for every DataFrame, Series, and GroupBy
> method. Everything that's not possible to implement should get a default
> implementation that raises WontImplementError
> See https://github.com/apache/beam/pull/10757#discussion_r389132292
> Progress at the individual operation level is tracked in a
> [spreadsheet|https://docs.google.com/spreadsheets/d/1hHAaJ0n0k2tw465ORs5tfdy4Lg0DnGWIQ53cLjAhel0/edit],
> consider requesting edit access if you'd like to help out.
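
The "default implementation that raises WontImplementError" described above could look roughly like the following sketch. It is not Beam's actual code: `WontImplementError` is redefined here as a stand-in so the example is self-contained, and `wont_implement` and `DeferredSeriesSketch` are hypothetical names chosen for illustration.

```python
class WontImplementError(NotImplementedError):
    """Stand-in error type, defined locally so this sketch runs on its own."""


def wont_implement(name, reason):
    """Build a stub method that always raises, naming the operation and reason."""
    def stub(self, *args, **kwargs):
        raise WontImplementError(f"{name}() is not supported: {reason}")
    stub.__name__ = name
    return stub


class DeferredSeriesSketch:
    """Hypothetical deferred type; only the unsupported stub is shown."""
    argsort = wont_implement("argsort", "order-sensitive")


try:
    DeferredSeriesSketch().argsort()
except WontImplementError as exc:
    message = str(exc)

print(message)
```

Generating the stub from a name and a reason keeps the error message uniform across operations, so callers can tell why a given method is unavailable.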