[GitHub] [beam] TheNeuralBit commented on a change in pull request #14517: [BEAM-12029] Make WontImplementErrors more helpful for order-sensitive operations

GitBox Wed, 14 Apr 2021 18:11:55 -0700


TheNeuralBit commented on a change in pull request #14517:
URL: https://github.com/apache/beam/pull/14517#discussion_r613688586




##########
File path: sdks/python/apache_beam/dataframe/frames.py
##########
@@ -1393,9 +1430,16 @@ def fill_dataframe(*args):
 
 
 
-  cummax = cummin = cumsum = cumprod = frame_base.wont_implement_method(
-      'order-sensitive')
-  diff = frame_base.wont_implement_method('order-sensitive')
+  cummax = frame_base.order_sensitive_method(pd.DataFrame, 'cummax')
+  cummin = frame_base.order_sensitive_method(pd.DataFrame, 'cummin')
+  cumprod = frame_base.order_sensitive_method(pd.DataFrame, 'cumprod')
+  cumsum = frame_base.order_sensitive_method(pd.DataFrame, 'cumsum')
+  diff = frame_base.order_sensitive_method(pd.DataFrame, 'diff')

Review comment:
       Ahh I see. Yeah I think you're right we could do that and it wouldn't be 
too expensive. The problem there is it would be a divergence from pandas, which 
always ignores the index and computes the diff based on the order of the data. 
That feels like a nitpicky argument, but it could be surprising for someone 
very familiar with pandas, so we've opted to just disallow cases like this.
   
   There are some cases where we've added Beam-only arguments that make 
operations feasible in Beam (e.g. `nlargest(keep='any')`, 
`unique(as_series=True)`). We could consider that here, something like 
`diff(by_index=True)`.
   I filed BEAM-12171 to track this idea, thanks :)
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [beam] TheNeuralBit commented on a change in pull request #14517: [BEAM-12029] Make WontImplementErrors more helpful for order-sensitive operations

Reply via email to