nikfio opened a new issue, #41049:
URL: https://github.com/apache/arrow/issues/41049

   ### Describe the enhancement requested
   
   Hello everyone,
   
   
   I am looking for a way to perform a `group_by` on a Table over equally 
spaced intervals, defined by a `freq` value applied to the key column (the 
first argument of `group_by`, documented as 'Name of the grouped columns').
   
   In pandas this can be done as follows (most compact form):
   
   `
   df = dataframe.set_index('timestamp').resample(freq).interpolate(method='nearest')
   `
   
   In polars it can be done like this (this would be the closest analogue 
for a pyarrow Table `group_by`):
   
   `
   dataframe.group_by_dynamic(
       'timestamp',
       every=tf,
   ).agg(col('value').first().alias('first'))
   `
   
   
   Does anyone have a suggestion?
   I think this is needed functionality and that Table should support it.
   To implement this feature, I was thinking about:
   
   1. partitioning the Table into equal parts according to a specified `freq` 
input value, as polars does
   2. executing the group_by on each partition
   3. concatenating the partitions into one final Table
   
   What do you think about this approach?
   How could I do step 1?
   
   
   Thanks,
   Nick 
   
   
   ### Component(s)
   
   Python

