[GitHub] [arrow] save-buffer commented on a change in pull request #11579: ARROW-13643: [C++][Compute] Implement outer join with support for residual predicates

GitBox Tue, 09 Nov 2021 11:41:33 -0800


save-buffer commented on a change in pull request #11579:
URL: https://github.com/apache/arrow/pull/11579#discussion_r745954520




##########
File path: cpp/src/arrow/compute/exec/hash_join_node.cc
##########
@@ -43,32 +43,49 @@ bool HashJoinSchema::IsTypeSupported(const DataType& type) {
   return is_fixed_width(id) || is_binary_like(id) || is_large_binary_like(id);
 }
 
-Result<std::vector<FieldRef>> HashJoinSchema::VectorDiff(const Schema& schema,
-                                                         const std::vector<FieldRef>& a,
-                                                         const std::vector<FieldRef>& b) {
-  std::unordered_set<int> b_paths;
-  for (size_t i = 0; i < b.size(); ++i) {
-    ARROW_ASSIGN_OR_RAISE(auto match, b[i].FindOne(schema));
-    b_paths.insert(match[0]);
+Result<std::vector<FieldRef>> HashJoinSchema::ComputePayload(
+    const Schema& schema, const std::vector<FieldRef>& output,
+    const std::vector<FieldRef>& filter, const std::vector<FieldRef>& keys) {
+  // payload = (output + filter) - keys, with no duplicates

Review comment:
       This should be `right_schema[5 - len(left_schema)]`. The pathological case is
`len(left_schema) == 1`: with `%`, the index would always evaluate to `0`.
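
To illustrate the difference, here is a minimal sketch that assumes the index `5` refers to a position in the concatenated (left, then right) field list. The helper names `RightIndexBySubtraction` and `RightIndexByModulo` are hypothetical and are not part of the Arrow code base; they only contrast the two ways of mapping a combined index to a right-side index.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not an Arrow API): maps an index into the concatenated
// (left ++ right) field list to an index into the right schema alone.
// Assumes combined_index >= left_size, i.e. the field really is on the right.
inline std::size_t RightIndexBySubtraction(std::size_t combined_index,
                                           std::size_t left_size) {
  assert(combined_index >= left_size);
  return combined_index - left_size;
}

// The modulo variant, shown only for comparison.
// Assumes left_size > 0. With left_size == 1 it returns 0 for every input.
inline std::size_t RightIndexByModulo(std::size_t combined_index,
                                      std::size_t left_size) {
  assert(left_size > 0);
  return combined_index % left_size;
}
```

With a single-column left schema, `RightIndexBySubtraction(5, 1)` gives `4`, while `RightIndexByModulo(5, 1)` gives `0`, as it would for any other combined index. The two also disagree for larger left schemas, for example `5 - 2 == 3` versus `5 % 2 == 1`.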




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

