Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3286: Software prefetching for hash table build.
......................................................................

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 296:   TupleRow* expr_values_row_;
I personally feel like the original design where the buffers are embedded in the hash table context is a bad idea, so I'm not sure about adding more infrastructure on top of the existing hack.

I did the plumbing to pull the buffer out of the hash table a while back (for a patch that didn't end up delivering much benefit): http://gerrit.cloudera.org/#/c/2691/4/be/src/exec/hash-table.h

If I cleaned up that plumbing patch would it be useful for this?


http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 337:   hash_values_.reset(new uint32_t[state->batch_size()]);
> Ideally, we should allocate this on the stack. To do so, we need to have a
Even if the input batch is big, could you just process a subset of it at a time?


-- 
To view, visit http://gerrit.cloudera.org:8080/2896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib85e7fc162ad25c849b9e716b629e226697cd940
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
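
For reference, the "process a subset of it at a time" suggestion on partitioned-hash-join-node.cc line 337 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code in the patch: kChunkSize, HashKey, ToyHashTable and BuildInChunks are hypothetical names, the hash function is a generic multiplicative hash rather than Impala's, and the table is a toy array of bucket counters. The idea it shows is that a fixed-size chunk lets the hash-value buffer live on the stack regardless of state->batch_size(), while keeping the two-pass "hash and prefetch, then insert" structure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical chunk size (not from the patch). 256 * 4 bytes = 1 KiB of stack.
constexpr std::size_t kChunkSize = 256;

// Stand-in hash function (multiplicative hash); NOT Impala's hash.
inline uint32_t HashKey(uint64_t key) {
  return static_cast<uint32_t>((key * 0x9E3779B97F4A7C15ULL) >> 32);
}

// Toy stand-in for the hash table: each bucket only counts its insertions.
struct ToyHashTable {
  explicit ToyHashTable(std::size_t num_buckets) : buckets(num_buckets, 0) {
    // The bucket count must be a non-zero power of two for the mask below.
    assert(num_buckets != 0 && (num_buckets & (num_buckets - 1)) == 0);
  }
  std::size_t BucketIndex(uint32_t hash) const {
    return hash & (buckets.size() - 1);
  }
  std::vector<uint32_t> buckets;
};

// Builds the table from 'keys', handling at most kChunkSize rows at a time so
// that the hash-value buffer has a fixed size and can be stack-allocated.
void BuildInChunks(const std::vector<uint64_t>& keys, ToyHashTable* ht) {
  uint32_t hash_values[kChunkSize];  // On the stack; independent of batch size.
  for (std::size_t start = 0; start < keys.size(); start += kChunkSize) {
    const std::size_t n = std::min(kChunkSize, keys.size() - start);

    // Pass 1: compute the hashes for this chunk and prefetch the buckets.
    for (std::size_t i = 0; i < n; ++i) {
      hash_values[i] = HashKey(keys[start + i]);
#if defined(__GNUC__) || defined(__clang__)
      // Prefetch for write (second argument 1); this is only a hint.
      __builtin_prefetch(&ht->buckets[ht->BucketIndex(hash_values[i])], 1);
#endif
    }

    // Pass 2: insert, by which time the prefetched buckets may be in cache.
    for (std::size_t i = 0; i < n; ++i) {
      ++ht->buckets[ht->BucketIndex(hash_values[i])];
    }
  }
}
```

Whether a chunk of this size gives the prefetches enough time to complete before the insert pass is something that would need measuring; the value 256 is a placeholder.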
