Re: [PR] GH-41873: [Acero][C++] Reduce the asof-join overhead by minimizing copies of the left hand side [arrow]

via GitHub Wed, 29 May 2024 08:59:14 -0700


JerAguilon commented on code in PR #41874:
URL: https://github.com/apache/arrow/pull/41874#discussion_r1619136116



##########
cpp/src/arrow/acero/unmaterialized_table.h:
##########
@@ -79,11 +94,22 @@ class UnmaterializedCompositeTable {
     DCHECK_LE(Size(), (uint64_t)std::numeric_limits<int64_t>::max());
     std::vector<std::shared_ptr<arrow::Array>> arrays(schema->num_fields());
 
-#define MATERIALIZE_CASE(id)                                                          \
-  case arrow::Type::id: {                                                             \
-    using T = typename arrow::TypeIdTraits<arrow::Type::id>::Type;                    \
-    ARROW_ASSIGN_OR_RAISE(arrays.at(i_col), materializeColumn<T>(field_type, i_col)); \
-    break;                                                                            \
+    std::optional<std::unordered_map<int, std::vector<CompositeEntry>>> contiguous_blocks;
+    if (contiguous_srcs.size() > 0) {
+      contiguous_blocks = std::unordered_map<int, std::vector<CompositeEntry>>();
+      contiguous_blocks.value().reserve(contiguous_srcs.size());
+      for (int src_table : contiguous_srcs) {
+        ARROW_ASSIGN_OR_RAISE(auto flattened_blocks, FlattenSlices(src_table));

Review Comment:
   Note we do this _outside_ of `materializeColumn`. Let's say the LHS of the
asof join has 1000 columns that we want to output. We don't want to flatten the
LHS slices 1000 times, once per column. Instead, we flatten just once per
contiguous table and let every column reuse the result.
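
To make the cost argument concrete, here is a minimal standalone sketch of the hoisting. It does not use the Arrow API: `Slice`, `FlattenSlicesSketch`, `MaterializeAllColumns`, and the `g_flatten_calls` counter are all hypothetical names invented for illustration, and it assumes that "flattening" means merging adjacent row ranges of one source table, which may differ from what `FlattenSlices` actually does in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a run of rows [start, end) from one source table.
struct Slice {
  int src_table;
  int64_t start;
  int64_t end;
};

// Counts flatten invocations so the saving is observable.
static int g_flatten_calls = 0;

// Hypothetical analogue of FlattenSlices: merges back-to-back row ranges that
// belong to `src_table` into larger contiguous blocks.
std::vector<Slice> FlattenSlicesSketch(const std::vector<Slice>& slices,
                                       int src_table) {
  ++g_flatten_calls;
  std::vector<Slice> out;
  for (const Slice& s : slices) {
    if (s.src_table != src_table) continue;
    assert(s.start <= s.end);
    if (!out.empty() && out.back().end == s.start) {
      out.back().end = s.end;  // extend the previous block
    } else {
      out.push_back(s);  // start a new block
    }
  }
  return out;
}

// Flattens once per contiguous source table, then reuses the flattened blocks
// for every output column. Returns the number of block copies performed.
int64_t MaterializeAllColumns(const std::vector<Slice>& slices,
                              const std::vector<int>& contiguous_srcs,
                              int num_columns) {
  std::unordered_map<int, std::vector<Slice>> blocks;
  blocks.reserve(contiguous_srcs.size());
  for (int src : contiguous_srcs) {
    blocks.emplace(src, FlattenSlicesSketch(slices, src));
  }

  int64_t copies = 0;
  for (int col = 0; col < num_columns; ++col) {
    for (const auto& kv : blocks) {
      // One bulk copy per flattened block, instead of one per original slice.
      copies += static_cast<int64_t>(kv.second.size());
    }
  }
  return copies;
}
```

With 1000 columns and a single contiguous source, the flatten step runs once rather than 1000 times; only the cheap per-block copy loop scales with the column count.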



