Casey Ching has posted comments on this change. Change subject: IMPALA-3105: rework handling of tuple buffer sizing in RowBatch ......................................................................
Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/2473/2/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: Line 61: int RowBatch::ComputeCapacity(const RowDescriptor& row_desc, int max_capacity) { Just curious, any idea about roughly what the ideal batch size is? Maybe it would be different if the batch is going across the network. I wonder how the caller will know what a good max_capacity value is. Line 62: // Treat 0 byte rows as 1 byte for computing capacity to avoid divide-by-zero. : int row_size = max(1, row_desc.GetRowSize()); I guess you are thinking accuracy here doesn't matter because max_capacity will be the limiting value? If so maybe say that. Line 64: // Leave some capacity for variable-length data, otherwise we will not be able to : // fit many rows into a batch in many cases. : int FIXED_LEN_BUFFER_LIMIT = AT_CAPACITY_MEM_USAGE / 4; Same here, are you assuming that if there are no variable length slots then max_capacity will be the limiting value? -- To view, visit http://gerrit.cloudera.org:8080/2473 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Idfd9cd681875821c1c379d97586d3f4850aae622 Gerrit-PatchSet: 2 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Casey Ching <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
