[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1075 ---
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1075#discussion_r158150320 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java --- @@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) { if (limit > 0) { mergeLimit = Math.max(limit, MIN_MERGE_LIMIT); } else { - mergeLimit = Integer.MAX_VALUE; + mergeLimit = DEFAULT_MERGE_LIMIT; --- End diff -- There may be a misunderstanding of how config options work. We define the defaults in Drill's own source code: `drill-module.conf` in each module. (Here it is in `java-exec`.) To change the default option, we change the value in `drill-module.conf`. In the highly unlikely case that a user has overridden this value in `drill-override.conf`, their value will be used. But, the option is not documented in `drill-override-example.conf` so it is very, very unlikely that anyone created an override. (The property is meant to be internal, for use in tests.) So, rather than introducing yet another variable, we might as well use the existing config property. This has the added advantage that, if experience suggests that we need a smaller or larger limit for some scenarios, we can make the adjustment in the field via the config system. ---
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1075#discussion_r157885846 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java --- @@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) { if (limit > 0) { mergeLimit = Math.max(limit, MIN_MERGE_LIMIT); } else { - mergeLimit = Integer.MAX_VALUE; + mergeLimit = DEFAULT_MERGE_LIMIT; --- End diff -- IMO, it is better to change the default to avoid upgrade problems. In an upgrade scenario, users may simply overwrite `drill-override.conf` from their prior installations and forget to set the merge limit. Is there a reason not to change the default merge limit? ---
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1075#discussion_r157830252 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/managed/SortConfig.java --- @@ -84,7 +85,7 @@ public SortConfig(DrillConfig config) { if (limit > 0) { mergeLimit = Math.max(limit, MIN_MERGE_LIMIT); } else { - mergeLimit = Integer.MAX_VALUE; + mergeLimit = DEFAULT_MERGE_LIMIT; --- End diff -- The merge limit is already a config option. (I'd forgotten about that.) The comment on the config option says "Limit on the number of spilled batches that can be merged in a single pass." So, let's just set that default (in `drill-override-conf`) to your new value of 128 and leave the code here unchanged. ---
[GitHub] drill pull request #1075: DRILL-6030: Managed sort should minimize number of...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1075 DRILL-6030: Managed sort should minimize number of batches in a k-way merge @paul-rogers Please review You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-6030 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1075.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1075 commit f40d962cf35574c0406937de1ee86ef853234b3e Author: Vlad RozovDate: 2017-12-17T17:25:55Z DRILL-6030: Managed sort should minimize number of batches in a k-way merge ---