[ https://issues.apache.org/jira/browse/DRILL-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424957#comment-16424957 ]
ASF GitHub Bot commented on DRILL-6202:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1144

My two cents...

DrillBuf is the only memory-level abstraction that low-level Drill code should reference. The UDLE and other bits should be fully encapsulated. This guideline lets us evolve the representation if we ever need to do so.

The original design appeared to be that value vectors would be the primary interface to memory. But a great many issues made that difficult, not least of which is that vector access methods are heavily typed, resulting in far too much casting. Also, the mutator methods try to do the full operation, leading to inefficiency (especially around VarChars).

A more general rule is that application code should work with vectors until it can migrate to working with the result set loader or reader. (We should probably call these the row set emitter and collector to be more Hadoop-like...) The higher-level abstractions handle the grunt work currently spread throughout operators. (And, to answer a prior question: we want to use the row set abstractions so we have a uniform way to write to vectors, to control batch size, and to handle schema issues and so on during writes, and so we have a standard way to handle indirection vectors and vector navigation during reads.)

Ideally, only the vector mutator or row set loader implementation works with DrillBuf to do actual data reads and writes.

In an early version, the row set loader code used `PlatformDependent` to avoid bounds checks. But, with @vrozov's improvements, doing so became unnecessary -- a nice improvement. Still, bounds checks should be done during tests: it is handy to work with a safety net.

Since bounds checks are optional (turned off in production), the changes here make good sense: no code should count on bounds checks from the "unchecked" methods, for the simple reason that the checks are normally off.
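To make the last point concrete, here is a minimal sketch of why exception-driven resizing only works while checks are on. It is not Drill code: `ToyBuf`, `ToyVector`, the `checked` flag, and the 64-byte slack region are all hypothetical stand-ins, with the slack modelling the memory an unchecked write can land in once it runs past the end of a slice.

```java
public class BoundsCheckSketch {

    /** Toy stand-in for a buffer slice. Hypothetical; NOT Drill's DrillBuf. */
    static final class ToyBuf {
        final byte[] backing;   // underlying allocation, larger than the slice
        final int capacity;     // logical size of this slice
        final boolean checked;  // models the optional bounds-check switch

        ToyBuf(int capacity, int slack, boolean checked) {
            this.backing = new byte[capacity + slack];
            this.capacity = capacity;
            this.checked = checked;
        }

        void setByte(int index, byte value) {
            if (checked && (index < 0 || index >= capacity)) {
                throw new IndexOutOfBoundsException(
                    "index " + index + ", capacity " + capacity);
            }
            // With checks off, this may silently write past the slice.
            backing[index] = value;
        }
    }

    /** Toy vector with two write strategies. Assumes index >= 0. */
    static final class ToyVector {
        ToyBuf buf;
        final boolean checked;
        int reallocCount = 0;

        ToyVector(int capacity, boolean checked) {
            this.checked = checked;
            this.buf = new ToyBuf(capacity, 64, checked);
        }

        /** Doubles the slice and copies only the bytes inside the old slice. */
        void realloc() {
            ToyBuf bigger = new ToyBuf(buf.capacity * 2, 64, checked);
            System.arraycopy(buf.backing, 0, bigger.backing, 0, buf.capacity);
            buf = bigger;
            reallocCount++;
        }

        /** Fragile: grows only if the buffer happens to throw. */
        void setViaException(int index, byte value) {
            while (true) {
                try {
                    buf.setByte(index, value);
                    return;
                } catch (IndexOutOfBoundsException e) {
                    realloc();
                }
            }
        }

        /** Robust: compares against capacity itself, checks on or off. */
        void setViaCapacityCheck(int index, byte value) {
            while (index >= buf.capacity) {
                realloc();
            }
            buf.setByte(index, value);
        }
    }

    public static void main(String[] args) {
        for (boolean checked : new boolean[] {true, false}) {
            ToyVector fragile = new ToyVector(4, checked);
            fragile.setViaException(10, (byte) 7);
            ToyVector robust = new ToyVector(4, checked);
            robust.setViaCapacityCheck(10, (byte) 7);
            System.out.println("checked=" + checked
                + " fragile reallocs=" + fragile.reallocCount
                + " robust reallocs=" + robust.reallocCount);
        }
    }
}
```

In this sketch, writing index 10 into a 4-byte vector grows it twice (to 16) under the explicit capacity check regardless of the flag. The exception-driven path does the same only when `checked` is true; with the flag off it never reallocates, and the byte lands outside the slice.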

That said, if there is a reason to have "checked" access, we could provide such methods. Those methods would throw the `IndexOutOfBoundsException`. That is, the checked methods would recreate the original "get/set" methods prior to @vrozov's improvements. I can't think of a reason to do that off the top of my head, but someone might present a valid use case.

> Deprecate usage of IndexOutOfBoundsException to re-alloc vectors
> ----------------------------------------------------------------
>
>                 Key: DRILL-6202
>                 URL: https://issues.apache.org/jira/browse/DRILL-6202
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>            Priority: Major
>             Fix For: 1.14.0
>
>
> As bounds checking may be enabled or disabled, using
> IndexOutOfBoundsException to resize vectors is unreliable. It works only when
> bounds checking is enabled.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)