milenkovicm commented on issue #7858:
URL: 
https://github.com/apache/arrow-datafusion/issues/7858#issuecomment-1770357507

   Thanks @kazuyukitanimura , @alamb for your comments, 
   
   I guess there is no perfect way to address 1.
   
   Another alternative for 1 might be to sort data in `GroupValuesRows`. 
   We could crate a `SortedGroupValuesRows` which would have 
`BTreeMap<GroupingKeyBlob, offset>` instead of 
   
   
https://github.com/apache/arrow-datafusion/blob/b6e4c8238031ae17373d9ae3be2def4b57645e42/datafusion/physical-plan/src/aggregates/group_values/row.rs#L47
   
   this would make state sorted by key, thus no need for `RecordBatch` sorting 
later. 
   
   There should be conversion from `GroupValuesRows` to `SortedGroupValuesRows` 
which would involve copying  just a single table entry from Map to BTree.  This 
conversion (sorting) would be performed on the first spill, 
`SortedGroupValuesRows` would then converted to a batch and spilled. Once the 
first spill has been done AggregateExec should continue using 
`SortedGroupValuesRows` for its state,  as we know at that points that keys 
needed to be sortedfor eventual spill or merge. 
   
   I get your idea regarding point 3, @kazuyukitanimura. I've tried to tune 
memory and batch size, what I observed that once the system is in border area 
this `if` will fail execution. It is quite indeterministic when it's going to 
fail. 
   I agree that spill should not produce a lot of small file, but in practice I 
believe small spill would be just a temporary, until one of the bigger consumer 
spills their state, and they are as well under memory pressure. Thus, believe 
removing this particular checks would improve robustness.  wdyt?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to