JerAguilon commented on code in PR #41874:
URL: https://github.com/apache/arrow/pull/41874#discussion_r1619131719


##########
cpp/src/arrow/acero/unmaterialized_table.h:
##########
@@ -204,15 +231,79 @@ class UnmaterializedCompositeTable {
     return builder.Append(data + offset0, offset1 - offset0);
   }
 
+  arrow::Result<std::vector<CompositeEntry>> FlattenSlices(int table_index) {
+    std::vector<CompositeEntry> flattened_blocks;
+
+    arrow::RecordBatch* active_rb = NULL;
+    size_t start = -1;
+    size_t end = -1;
+
+    for (const auto& slice : slices) {

Review Comment:
   For context, the asof-join works by creating one `CompositeEntry` for each 
output row.
   
   Since these so-called "contiguous inputs" are `Slice`able, we squash 
consecutive entries that point into the same record batch down into a single 
range. For example, suppose `slices` has an LHS table that looks like this:
   
   ```
   {rb_addr: 1234, start: 1, end: 2},
   {rb_addr: 1234, start: 2, end: 3},
   {rb_addr: 1234, start: 3, end: 4},
   {rb_addr: 4321, start: 100001, end: 100002},
   {rb_addr: 4321, start: 100002, end: 100003},
   ```
   
   This function squashes those entries down to a compact vector:
   
   ```
   {rb_addr: 1234, start: 1, end: 4},
   {rb_addr: 4321, start: 100001, end: 100003},
   ```
   
   We can then use these ranges to quickly slice the appropriate output column(s).
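
To make the squashing step concrete, here is a minimal standalone sketch of the idea. It is not the code in this PR: `Entry` and `SquashEntries` are hypothetical names, `rb_addr` is a plain integer standing in for the record batch pointer, and I am assuming `[start, end)` is half-open, as the example above suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for CompositeEntry: a half-open row range
// [start, end) inside the record batch identified by rb_addr.
struct Entry {
  std::uintptr_t rb_addr;
  std::uint64_t start;
  std::uint64_t end;
};

// Merge runs of entries that reference the same batch and whose ranges
// touch (previous end == next start). Anything else starts a new run.
std::vector<Entry> SquashEntries(const std::vector<Entry>& slices) {
  std::vector<Entry> flattened;
  for (const Entry& e : slices) {
    assert(e.start <= e.end);  // assumed invariant of each entry
    if (!flattened.empty() && flattened.back().rb_addr == e.rb_addr &&
        flattened.back().end == e.start) {
      flattened.back().end = e.end;  // extend the current run
    } else {
      flattened.push_back(e);  // different batch or a gap: new run
    }
  }
  return flattened;
}
```

Feeding it the five entries from the example should yield the two merged ranges shown above. Entries that share a batch but leave a gap between ranges are kept separate in this sketch.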


