[ https://issues.apache.org/jira/browse/ARROW-12873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352115#comment-17352115 ]
Michal Nowakiewicz commented on ARROW-12873:
--------------------------------------------
I don't have anything against adding arbitrary tags to batches. In my
experience working on query execution I haven't needed this kind of
functionality, which could mean either that I haven't faced the specific
problems (likely) or that some of the problems that require tags can be solved
in other ways.
One thing I am wondering about, which is maybe a bit academic, is what rules
for preserving tags operators (ExecNodes) must follow. Let's say I implement a
new one that splits the rows of an input ExecBatch into two new ExecBatches.
Would I know what to do with the tags? Or do we say that tags are for
pass-through operators only (operators that do not generate new ExecBatches)?
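To make the splitting question concrete, here is a minimal sketch of one
possible policy: every output batch shares the tags of the input batch it was
derived from. Everything in it is hypothetical ({{TaggedBatch}} is a stand-in
for ExecBatch, string-valued tags, the {{SplitBatch}} helper); none of it is
existing Arrow API, and whether sharing is the right rule is exactly the open
question.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical tag storage: string keys and values, shared and immutable.
using TagMap = std::map<std::string, std::string>;

// Hypothetical stand-in for an ExecBatch: some rows plus optional tags.
struct TaggedBatch {
  std::vector<int64_t> rows;
  std::shared_ptr<const TagMap> tags;
};

// Splits `input` into (rows matching pred, remaining rows).
// Assumed policy: both outputs point at the same tags as the input.
template <typename Pred>
std::pair<TaggedBatch, TaggedBatch> SplitBatch(const TaggedBatch& input, Pred pred) {
  TaggedBatch matched{{}, input.tags};
  TaggedBatch rest{{}, input.tags};
  for (int64_t row : input.rows) {
    (pred(row) ? matched : rest).rows.push_back(row);
  }
  // Every input row ends up in exactly one output batch.
  assert(matched.rows.size() + rest.rows.size() == input.rows.size());
  return {std::move(matched), std::move(rest)};
}

}  // namespace sketch
```

Sharing works for tags that describe provenance (such as a fragment of
origin), but it would be wrong for tags that describe the batch as a whole
(such as a row count or an ordering hint), which is why an explicit rule seems
necessary.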
> [C++][Compute] Support tagging ExecBatches with arbitrary extra information
> ---------------------------------------------------------------------------
>
> Key: ARROW-12873
> URL: https://issues.apache.org/jira/browse/ARROW-12873
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Ben Kietzman
> Priority: Major
>
> Ideally, ExecBatches could be tagged with arbitrary optional objects for
> tracing purposes and to transmit execution hints from one ExecNode to another.
> These should *not* be explicit members like ExecBatch::selection_vector is,
> since they may not originate from the arrow library. For an example within
> the arrow project: {{libarrow_dataset}} will be used to produce ScanNodes and
> WriteNodes, and it's useful to tag scanned batches with their {{Fragment}}
> of origin. However, adding {{ExecBatch::fragment}} would result in a cyclic
> dependency.
> To facilitate this tagging capability, we would need a type-erased container,
> something like
> {code}
> struct AnySet {
>   void* Get(tag_t tag);
>   void Set(tag_t tag, void* value, FnOnce<void(void*)> destructor);
> };
> {code}
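For illustration, the {{AnySet}} interface quoted above could be fleshed out
roughly as follows. This is only a sketch under stated assumptions: {{tag_t}}
is taken to be a string (the issue does not define it), {{std::function}}
stands in for Arrow's {{FnOnce}} so that the example is self-contained, and
the {{SetOwned}} helper is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

namespace sketch {

// Assumption: tags are plain strings; the issue leaves tag_t unspecified.
using tag_t = std::string;

// Type-erased container: stores void* values together with their destructors.
class AnySet {
 public:
  AnySet() = default;
  AnySet(const AnySet&) = delete;             // values have a single owner
  AnySet& operator=(const AnySet&) = delete;

  ~AnySet() {
    for (auto& entry : entries_) entry.second.Destroy();
  }

  // Returns the stored value, or nullptr if the tag is absent.
  void* Get(const tag_t& tag) const {
    auto it = entries_.find(tag);
    return it == entries_.end() ? nullptr : it->second.value;
  }

  // Stores `value` under `tag`; any previous value is destroyed first.
  void Set(const tag_t& tag, void* value, std::function<void(void*)> destructor) {
    assert(value != nullptr);  // nullptr is reserved to mean "absent" in Get()
    Entry& entry = entries_[tag];
    entry.Destroy();
    entry.value = value;
    entry.destructor = std::move(destructor);
  }

 private:
  struct Entry {
    void* value = nullptr;
    std::function<void(void*)> destructor;
    void Destroy() {
      if (value != nullptr && destructor) destructor(value);
      value = nullptr;
    }
  };
  std::unordered_map<tag_t, Entry> entries_;
};

// Hypothetical convenience wrapper: the set takes ownership of a heap object.
template <typename T>
void SetOwned(AnySet& set, const tag_t& tag, T* value) {
  set.Set(tag, value, [](void* p) { delete static_cast<T*>(p); });
}

}  // namespace sketch
```

Because the container only sees {{void*}}, the compute library would not need
to know about {{Fragment}}; the dataset library would supply both the value
and its destructor, which avoids the cyclic dependency mentioned above.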