[
https://issues.apache.org/jira/browse/DRILL-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903474#comment-15903474
]
Paul Rogers commented on DRILL-5342:
------------------------------------
Code movements include:
* Move the bulk of the external sort implementation from {{ExternalSortBatch}}
(ESB) to a new {{SortImpl}} class. The ESB becomes primarily an implementation
of Drill's Volcano-inspired "next" protocol. {{SortImpl}} does the actual sort
work. {{SortImpl}} has a narrow, sub-operator interface that does not depend on
global state.
* Move memory management code to a new {{SortMemoryManager}} class to allow
direct testing of that code. This code was heavily unit tested with the two
memory algorithms (per-batch memory setup and consolidation spill decisions)
undergoing significant simplification.
* The spill logic was split into two parts. Previously, the code would check a
memory condition, then directly spill as a result of that decision. Now, the
logic itself resides in the {{SortMemoryManager}}, takes the required
parameters, and returns a consolidation decision: spill, merge or do nothing.
The result is much easier to test. (Resulting in many code simplifications.)
* Move configuration into a new {{SortConfig}} class to allow direct unit
testing.
* The prior version had taken a first crack at pulling the details of code
generation into a separate class: {{OperatorCodeGen}}. This pass breaks the
prior class into "wrappers" (need a better name) that encapsulate each
algorithm for which we generate code. The wrapper holds the code generation
logic (previously in {{OperatorCodeGen}} and the code that does the work to use
the generated code (previously in various classes.)
* Some rationalization was done to the {{Sort}}, {{ExternalSort}} and {{Order}}
operator definitions to move constants from inside code to declared constants
and so on to allow unit testing.
* Similar adjustments were done to the operator definition base class:
{{AbstractBase}}.
* Statistics-related code moved to a new {{SortMetrics}} class to allow testing
of metrics gathering separate from the rest of the code. The metrics themselves
did not change, only how they are packaged.
> Refactor "managed" external sort for unit tests
> -----------------------------------------------
>
> Key: DRILL-5342
> URL: https://issues.apache.org/jira/browse/DRILL-5342
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Tools, Build & Test
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> The external sort was heavily refactored in 1.10 to allow adding memory
> management. (See DRILL-5080.) That effort focused on breaking up the large
> functions into smaller chunks to allow us to more easily modify the bits
> relevant to adding memory management.
> This ticket discusses refining those changes to create modular bits that can
> be unit tested individually. For the most part, code will remain unchanged
> functionally, but functions will shift from class to class to break
> dependencies. Also, interfaces will be narrowed to remove global dependencies
> on things like {{FragmentContext}} or {{OperatorContext}}.
> The code reviewer(s), when using GitHub, will simply see large changes. Notes
> below identify what was actually changed vs. what was merely moved.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)