zifengyu commented on PR #14158:
URL: https://github.com/apache/arrow/pull/14158#issuecomment-1325893750

   This feature is exactly what we need to adopt Acero. I tried adding 
ExecBatch ordering and implemented the limit operator in our product. Here is 
what we saw in testing. 
   
   1. Finishing the node (and notifying the downstream node) is somewhat 
difficult because the input and output batch counts differ. In our case, the 
node can finish either when the limit number of rows has been reached or when 
the upstream node finishes producing before the limit is reached. The former 
happens in the queue's deliver task, while the latter happens in FetchNode's 
InputFinished. We did not find an easy way to synchronize these two components, 
so we moved the queue inside the node and added a counter to track the rows 
sent.
   
   2. We also need an `offset` setting in the limit operator to skip the first 
few rows. Could this be included in FetchNode so that we can switch back to the 
Acero node in the future?
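
   To make the two points above concrete, here is a minimal sketch of the 
row-counter idea with an `offset` added. It is not Acero code: `FetchCounter` 
and `BatchSlice` are hypothetical names, batches are represented only by their 
row counts, and the sketch assumes batches arrive in order on a single thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not part of Acero: describes which rows of an
// incoming batch should be forwarded downstream.
struct BatchSlice {
  int64_t start;   // index of the first row to forward
  int64_t length;  // number of rows to forward
};

// Hypothetical helper, not part of Acero: counts skipped and sent rows so
// that both finish conditions (limit reached, upstream exhausted) are
// decided in one place.
class FetchCounter {
 public:
  FetchCounter(int64_t offset, int64_t limit)
      : to_skip_(offset), to_send_(limit), finished_(limit == 0) {
    assert(offset >= 0 && limit >= 0);
  }

  // Call once per batch, in order. Returns the slice to forward, or
  // std::nullopt if no rows of this batch should be emitted.
  std::optional<BatchSlice> OnBatch(int64_t num_rows) {
    if (finished_) return std::nullopt;
    // Consume the remaining offset first.
    const int64_t skipped = std::min(to_skip_, num_rows);
    to_skip_ -= skipped;
    // Forward at most the rows still allowed by the limit.
    const int64_t length = std::min(to_send_, num_rows - skipped);
    to_send_ -= length;
    if (to_send_ == 0) finished_ = true;  // limit reached
    if (length == 0) return std::nullopt;
    return BatchSlice{skipped, length};
  }

  // Call when upstream reports that it has no more batches.
  void OnInputFinished() { finished_ = true; }

  bool finished() const { return finished_; }

 private:
  int64_t to_skip_;  // rows still to be dropped for the offset
  int64_t to_send_;  // rows still allowed by the limit
  bool finished_;
};
```

   With `offset = 3` and `limit = 4`, a 2-row batch is dropped entirely and a 
following 5-row batch yields the slice starting at row 1 with 4 rows, after 
which the counter reports finished. A real node would additionally need to 
guard this state against concurrent callers.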
   
   In any case, this proposal is critical to our use of Acero, and we are 
looking forward to its release.
   
   

