GitHub user prashanth-vasudev commented on a diff in the pull request:
https://github.com/apache/incubator-trafodion/pull/936#discussion_r98654201
--- Diff: core/sql/generator/GenRelMisc.cpp ---
@@ -3071,7 +3074,16 @@ short Sort::generateTdb(Generator * generator,
sort_options->scratchFreeSpaceThresholdPct() = threshold;
sort_options->sortMaxHeapSize() = (short)getDefault(SORT_MAX_HEAP_SIZE_MB);
sort_options->mergeBufferUnit() = (short)getDefault(SORT_MERGE_BUFFER_UNIT_56KB);
+
sort_options->scratchIOBlockSize() = (Int32)getDefault(SCRATCH_IO_BLOCKSIZE_SORT);
+ if(sortRecLen >= sort_options->scratchIOBlockSize())
+ {
+ Int32 maxScratchIOBlockSize = (Int32)getDefault(SCRATCH_IO_BLOCKSIZE_SORT_MAX);
+ GenAssert(sortRecLen <= maxScratchIOBlockSize,
+ "sortRecLen is greater than SCRATCH_IO_BLOCKSIZE_SORT_MAX");
+ sort_options->scratchIOBlockSize() = MINOF(sortRecLen * 128, maxScratchIOBlockSize);
--- End diff --
The default sort I/O block size is 512 KB. This block is essentially the
buffer used to write to the scratch file. In most cases the sort record length
received from the compiler is much smaller, so many records fit into the
default 512 KB buffer. In some cases the sort record length is 512 KB or more,
and this is where SCRATCH_IO_BLOCKSIZE_SORT_MAX kicks in. With a record length
of 512 KB or more, 128 times the record length is at least 64 MB. Sort
allocates many blocks, especially during the merge phase, so a smaller block
size is preferred; at the same time, a block that holds at most one record is
also undesirable. The user can control the upper limit through the
SCRATCH_IO_BLOCKSIZE_SORT_MAX cqd. Currently this cqd defaults to 5 MB.
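
To make the arithmetic concrete, here is a minimal standalone sketch of the
selection logic in the diff. It is not the Trafodion code: the function name
chooseScratchIOBlockSize and the two constants are hypothetical stand-ins for
the cqd defaults quoted above (512 KB and 5 MB), a plain assert takes the place
of GenAssert, and 64-bit integers are used so the multiplication by 128 stays
safe in the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the cqd defaults mentioned in the comment.
constexpr std::int64_t kDefaultBlockSize = 512 * 1024;       // SCRATCH_IO_BLOCKSIZE_SORT
constexpr std::int64_t kMaxBlockSize     = 5 * 1024 * 1024;  // SCRATCH_IO_BLOCKSIZE_SORT_MAX

// Mirrors the block-size choice in the diff:
//  - records smaller than the default block keep the default block size;
//  - otherwise the block is sized for up to 128 records, capped at the max.
std::int64_t chooseScratchIOBlockSize(std::int64_t sortRecLen,
                                      std::int64_t defaultBlockSize = kDefaultBlockSize,
                                      std::int64_t maxBlockSize = kMaxBlockSize)
{
  if (sortRecLen < defaultBlockSize)
    return defaultBlockSize;

  // Stand-in for GenAssert: a record must fit in the largest allowed block.
  assert(sortRecLen <= maxBlockSize &&
         "sortRecLen is greater than SCRATCH_IO_BLOCKSIZE_SORT_MAX");

  return std::min(sortRecLen * 128, maxBlockSize);
}
```

With the defaults assumed here, any record of 512 KB or more yields
sortRecLen * 128 of at least 64 MB, so the result is always capped at the
5 MB maximum. The factor of 128 only takes effect if the max cqd is raised
above 64 MB.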