Vibhatha Lakmal Abeykoon created ARROW-17183:
------------------------------------------------

             Summary: [C++] Adding ExecNode with Sort and Fetch capability
                 Key: ARROW-17183
                 URL: https://issues.apache.org/jira/browse/ARROW-17183
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++
            Reporter: Vibhatha Lakmal Abeykoon
            Assignee: Vibhatha Lakmal Abeykoon


In the Substrait integration with Acero, one required capability is fetching 
records from both sorted and unsorted data.

A Fetch operation selects `K` records starting at a given offset, for instance 
picking 10 records after skipping the first 5. This can be defined as a Slice 
operation, and the records can easily be extracted in a sink node. 
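
To make the Slice semantics concrete, here is a minimal sketch in plain C++ 
over a `std::vector`. The `Fetch` helper is hypothetical and is not an Acero 
API. It assumes that an offset past the end yields an empty result and that a 
short input yields fewer than `count` records.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an Acero API: return up to `count` records
// after skipping the first `offset` records.
template <typename T>
std::vector<T> Fetch(const std::vector<T>& records, std::size_t offset,
                     std::size_t count) {
  // Clamp the start so an offset past the end yields an empty result.
  const std::size_t begin = std::min(offset, records.size());
  // Clamp the end so a short input yields fewer than `count` records.
  const std::size_t end = begin + std::min(count, records.size() - begin);
  assert(end <= records.size());
  return std::vector<T>(records.begin() + begin, records.begin() + end);
}
```

With 20 input records, `Fetch(records, 5, 10)` returns the records at 
positions 5 through 14.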

A Sort and Fetch operation applies when a Fetch must be executed on sorted 
data. The main issue is that a sort node cannot be followed by a fetch node, 
because all existing node definitions that support sorting are sink nodes. 
Since no node can follow a sink node, this functionality has to take place in 
a single node. 

One way to do this is to define a sink node in which the records are sorted 
and then a set of items is fetched, although this is not a perfect solution 
for fetch and sort. 
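
As a rough illustration of sorting and fetching in a single step, the sketch 
below uses `std::partial_sort` so that only the first `offset + count` 
positions are ordered. The `SortThenFetch` helper is hypothetical and operates 
on an in-memory vector rather than on Acero batches. Note that 
`std::partial_sort` is not stable, so a node that must preserve the order of 
equal keys would need a different approach.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an Acero API: sort `records` with `less` and
// return up to `count` records after skipping the first `offset`.
template <typename T, typename Less>
std::vector<T> SortThenFetch(std::vector<T> records, std::size_t offset,
                             std::size_t count, Less less) {
  const std::size_t begin = std::min(offset, records.size());
  const std::size_t end = begin + std::min(count, records.size() - begin);
  assert(end <= records.size());
  // Only the first `end` positions need to be in sorted order; the rest of
  // the input is never returned.
  std::partial_sort(records.begin(), records.begin() + end, records.end(),
                    less);
  return std::vector<T>(records.begin() + begin, records.begin() + end);
}
```

Because the whole input has to be seen before the first record can be 
emitted, this step accumulates its input, which matches the description of 
doing the work in a sink node.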

Another dilemma is what happens if sort is followed by a fetch. In that case, 
there has to be a flag to select the order of the operations. 
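
One possible shape for such a flag is sketched below, assuming the dilemma is 
about choosing between sorting before fetching and fetching before sorting. 
The `OperationOrder` enum and the `ApplySortAndFetch` function are 
hypothetical names for illustration and are not existing Acero or Substrait 
options.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flag, not an existing Acero or Substrait option.
enum class OperationOrder { kSortThenFetch, kFetchThenSort };

// Hypothetical helper: apply sort and fetch in the order given by `order`.
std::vector<int> ApplySortAndFetch(std::vector<int> records,
                                   std::size_t offset, std::size_t count,
                                   OperationOrder order) {
  const std::size_t begin = std::min(offset, records.size());
  const std::size_t end = begin + std::min(count, records.size() - begin);
  assert(end <= records.size());
  if (order == OperationOrder::kSortThenFetch) {
    // Sort everything, then take the window from the sorted data.
    std::sort(records.begin(), records.end());
    return std::vector<int>(records.begin() + begin, records.begin() + end);
  }
  // Take the window from the input order, then sort only that window.
  std::vector<int> window(records.begin() + begin, records.begin() + end);
  std::sort(window.begin(), window.end());
  return window;
}
```

For the input `{9, 3, 7, 1, 8, 2}` with offset 1 and count 3, sorting first 
gives `{2, 3, 7}` while fetching first gives `{1, 3, 7}`, so the two orders 
are not interchangeable.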

The objective of this ticket is to discuss a viable, efficient solution and 
to add new nodes or a method to execute this logic.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)