paul-rogers opened a new pull request #1633: DRILL-7024: Refactor ColumnWriter to simplify type-conversion shim URL: https://github.com/apache/drill/pull/1633 DRILL-7006 added a type conversion "shim" within the row set framework. Basically, we insert a "shim" column writer that takes data in one form (String, say), and does reader-specific conversions to a target format (INT, say). The code works fine, but the shim class ends up needing to override a bunch of methods which it then passes along to the base writer. This PR refactors the code so that the conversion shim is simpler. The idea is simple: break each column writer into two parts: an "events" class that handles coordination of vector indexes, skipping nulls, vector overflow, harvesting and restarting batches, and so on. These operations always occur on the base writer: the one that manages actual value vectors. The other part is the column writer itself: the part that offers the various "set" methods for setting values. This is the interface that type conversion shims need to implement. So, how do we tie the two parts together? As it turns out, the code already had most of the solution: most of the "events" methods were already defined via a `WriterEvents` interface. Prior to this PR, every column writer (including shims) implemented that interface. So, we just need to have concrete writers implement the interface, but not conversion shims. Also, the existing code has an `ObjectWriter`: the class that tells us if a writer is a scalar, a map, a union, etc. This then becomes the logical place to connect the two aspects of a writer. When asking for a scalar writer, we return the writer aspect (which may be a shim.) This PR adds a new method to get the events aspect, which returns the concrete writer. Most of the work here is simply to adjust client code to be aware of this new protocol. Also, some classes were also renamed to make the new structure clearer. To make testing easier, all row-set related tests were marked with a new `RowSetTest` category. This allows running all related tests from the command line with a single command: ``` mvn test -Dgroups=org.apache.drill.categories.RowSetTests ``` In my branch, I get a test failure when I run all tests: ``` TestPStoreProviders.verifyZkStore:67 ยป NoSuchElement ``` However, since that test does not use the code touched here, I assume that the failure is unrelated to this change.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
