[ https://issues.apache.org/jira/browse/ARROW-17183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570840#comment-17570840 ]

Jeroen van Straten edited comment on ARROW-17183 at 7/25/22 11:03 AM:
----------------------------------------------------------------------

{quote}Keeping an index column is an interesting idea and making it optional 
via extensions could possibly be a good way to handle ordering. But I am not 
quite sure, if that has to be a feature of Acero or an attribute of the table 
itself.
{quote}
The way I see it, the index column would be generated during the read, just like 
the metadata scalar columns that reads already append.

A single Arrow IPC file, Parquet file, or in-memory table always has some 
defined order. The zero-based indices of those rows are the indices I'm 
referring to; not to be confused with database keys and indices. AFAIK the 
number of rows in each chunk can be (somewhat?) efficiently read from metadata 
prior to parallelizing, so the start index for each chunk can be computed. I 
can't gauge how difficult it would be to implement this, but I don't think 
there is or should be a fundamental issue here.
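As a rough sketch of the arithmetic (plain pyarrow rather than Acero's scan 
machinery; the table, column name, and chunk sizes are made up): the start index 
of each chunk is just the running sum of the preceding chunk lengths, so every 
chunk can append its slice of the zero-based row index without coordinating with 
the other chunks.
{code:python}
import pyarrow as pa

# Toy table standing in for a chunked read; in a real scan the per-chunk
# row counts would come from file metadata before any data is materialized.
table = pa.table({"x": ["a", "b", "c", "d", "e"]})
batches = table.to_batches(max_chunksize=2)

# Start index of each chunk = running sum of the preceding chunk lengths.
starts, offset = [], 0
for batch in batches:
    starts.append(offset)
    offset += batch.num_rows

# Each chunk then appends its own slice of the zero-based row index.
indexed = [
    pa.RecordBatch.from_arrays(
        batch.columns
        + [pa.array(range(start, start + batch.num_rows), type=pa.int64())],
        names=batch.schema.names + ["row_index"],
    )
    for start, batch in zip(starts, batches)
]
print(pa.Table.from_batches(indexed))
{code}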

Between files/partitions, I imagine you could just order by file path, 
alphabetically or something. Generally speaking, anything that is stable across 
subsequent accesses to immutable data would be good enough when the storage 
method carries no inherent expectation of ordering. Choosing the storage method 
is the responsibility of the user, so weird orderings that stem from the storage 
method should not come as a surprise.
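A trivial sketch of what "stable" could mean in practice (the glob pattern and 
layout are hypothetical): sort the file paths once, and the pair (file ordinal, 
row index within the file) repeats exactly on every read of the same immutable 
dataset.
{code:python}
import glob

# Lexicographic path order is stable across repeated reads of immutable files,
# so it is good enough as the inter-file component of the ordering.
paths = sorted(glob.glob("dataset/part-*.parquet"))  # hypothetical layout

# (file ordinal, row index within the file) then forms a total order.
for file_ordinal, path in enumerate(paths):
    print(file_ordinal, path)
{code}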
{quote}Should a built Acero exec-plan be optimized before running internally? I 
guess that would be an important feature, but I am not quite sure if we have a 
plan for this kind of an implementation.
{quote}
In general, if you want something to run efficiently on some specific 
architecture, you can't expect to rely solely on a generic optimizer with no 
knowledge of that architecture. So I don't think you can ever avoid 
Acero-specific optimizations. I do however think that you can get ~90% of the 
Acero-specific optimizations done in the same tree traversal that you need for 
the Substrait-to-Acero conversion anyway, as the more complex grunt work of 
pushing filters and projections through joins and such would already have been 
done at the Substrait level. So that's what I've been proposing.
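To illustrate what I mean by doing those optimizations during the conversion 
itself, here is a deliberately toy sketch (plain Python; the SubstraitRel and 
AceroDecl classes and the factory names are made up, not taken from either 
library): local rewrites such as collapsing adjacent filters happen as the tree 
is walked, with no separate optimizer pass.
{code:python}
from dataclasses import dataclass
from typing import List, Optional

# Toy stand-ins for Substrait relations and Acero declarations.
@dataclass
class SubstraitRel:
    kind: str                          # e.g. "read", "filter"
    payload: object = None             # predicate, table name, ...
    input: Optional["SubstraitRel"] = None

@dataclass
class AceroDecl:
    factory: str
    options: List[object]
    inputs: List["AceroDecl"]

def convert(rel: SubstraitRel) -> AceroDecl:
    """Convert bottom-up, applying local Acero-specific rewrites on the way."""
    child = convert(rel.input) if rel.input is not None else None

    if rel.kind == "read":
        return AceroDecl("scan", [rel.payload], [])

    if rel.kind == "filter":
        # Local rewrite done in the same traversal: collapse a filter whose
        # child is also a filter into one node carrying both predicates.
        if child is not None and child.factory == "filter":
            return AceroDecl("filter", child.options + [rel.payload], child.inputs)
        return AceroDecl("filter", [rel.payload], [child])

    raise NotImplementedError(rel.kind)

# A filter on top of a filter on top of a read collapses into two nodes.
plan = SubstraitRel("filter", "b > 0",
                    SubstraitRel("filter", "a > 0",
                                 SubstraitRel("read", "t")))
print(convert(plan))
{code}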

ETA:
{quote}But I imagined Acero and Substrait to be orthogonal as far as 
implementation wise and decision making, but if it is the goal to support 
Substrait in a very deeper level, these points are very important in 
re-thinking to improve if any features are missing.
{quote}
I don't think this orthogonality is a bad thing, actually. However, if the goal 
is to become a fully-featured query engine, using Substrait or otherwise, you do 
at least need to satisfy the expectations that come with that. Anecdotally, in 
every database I've ever queried, running the same query twice has returned the 
results in the same order.


> [C++] Adding ExecNode with Sort and Fetch capability
> ----------------------------------------------------
>
>                 Key: ARROW-17183
>                 URL: https://issues.apache.org/jira/browse/ARROW-17183
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Vibhatha Lakmal Abeykoon
>            Assignee: Vibhatha Lakmal Abeykoon
>            Priority: Major
>
> In Substrait integrations with Acero, one required piece of functionality is 
> the ability to fetch records, both sorted and unsorted.
> The fetch operation is defined as selecting `K` records starting at an offset; 
> for instance, pick 10 records after skipping the first 5. This can be defined 
> as a slice operation, and the records can easily be extracted in a sink node 
> (see the sketch after this summary). 
> Sort-and-fetch applies when we need to execute a fetch operation on sorted 
> data. The main issue is that we cannot have a sort node followed by a fetch 
> node, because all existing node definitions supporting sort are based on sink 
> nodes, and no node can come after a sink. This functionality therefore has to 
> take place in a single node. 
> This is not a perfect solution for fetch and sort, but one way to do it is to 
> define a sink node in which the records are sorted and a set of items is then 
> fetched. 
> Another dilemma is what happens if a sort is followed by a fetch; in that 
> case, there has to be a flag controlling the order of the operations. 
> The objective of this ticket is to discuss a viable, efficient solution and to 
> add new nodes or a method for executing such logic.
>  
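As a toy illustration of the slice-based fetch described above (pyarrow on an 
in-memory table, not the proposed Acero node; the column name and values are 
made up): fetch is just an offset-plus-length slice, and sort-then-fetch is the 
same slice applied to the sorted result.
{code:python}
import pyarrow as pa

# Toy table for illustration only.
table = pa.table({"v": list(range(19, -1, -1))})   # 19, 18, ..., 0

# Fetch: take 10 records after skipping the first 5.
fetched = table.slice(offset=5, length=10)

# Sort followed by fetch: order first, then slice the ordered result.
sorted_then_fetched = table.sort_by([("v", "ascending")]).slice(offset=5, length=10)
print(sorted_then_fetched.column("v").to_pylist())   # [5, 6, ..., 14]
{code}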



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
