westonpace commented on issue #10073:
URL: https://github.com/apache/datafusion/issues/10073#issuecomment-2573298788

   > FWIW I'm still seeing the same issue through LanceDB 
(https://github.com/lancedb/lance/issues/2119#issuecomment-2136414811).
   
   This isn't necessarily indicative, as Lance lags behind DataFusion (currently 
we are at 42, which is 4 months behind).  However, I just updated my local Lance 
to release 44 (which should contain the potential fix @alamb is alluding to) 
and confirmed that the issue is still not fixed.
   
   This also doesn't surprise me.  I think the issue here is not 
double-counting but rather that a string array uses more memory after 
sorting than it did before sorting (and so we run out of memory trying 
to spill).
   
   I'll try to find some time today to create a pure DataFusion reproducer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

