[ 
https://issues.apache.org/jira/browse/DRILL-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424957#comment-16424957
 ] 

ASF GitHub Bot commented on DRILL-6202:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1144
  
    My two cents... DrillBuf is the only memory-level abstraction that 
(low-level) Drill code should reference. The UDLE (`UnsafeDirectLittleEndian`) 
and other bits should be fully 
encapsulated. This guideline lets us evolve the representation if we ever need 
to do so.
    
    The original design appeared to be that value vectors would be the primary 
interface to memory. But, a great many issues made that difficult, not least of 
which is that vector access methods are heavily typed, resulting in far too 
much casting. Also, the mutator methods try to do the full operation, leading 
to inefficiency (especially around VarChars).
    
    A more general rule is that application code should work with vectors until 
it can migrate to working with the result set loader or reader. (We should 
probably call these the row set emitter and collector to be more 
Hadoop-like...) The higher-level abstractions handle the grunt work currently 
spread throughout operators.
    
    (And, to answer a prior question: we want to use the row set abstractions 
so we have a uniform way to write to vectors, to control batch size, and to 
handle schema issues and so on when writing, and a standard way to handle 
indirection vectors and vector navigation when reading.)
    
    Ideally, only the vector mutator or row set loader implementation works 
with DrillBuf to do actual data reads and writes. In an early version, the row 
set loader code used `PlatformDependent` to avoid bounds checks. But, with 
@vrozov's improvements, doing so became unnecessary -- a nice improvement.
    
    Still, bounds checks should be done during tests: it is handy to work with 
a safety net.
    
    Since bounds checks are optional (turned off in production), the 
changes here make good sense: no code should count on bounds checks from the 
"unchecked" methods for the simple reason that the checks are normally off.
    
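    To illustrate the point, here is a minimal sketch of growing a vector with 
an explicit capacity check instead of catching an exception. 
`GrowableIntVector`, `setSafe` and `capacity` are hypothetical names on a plain 
`int[]`, not Drill's vector or DrillBuf API; the array merely stands in for a 
buffer whose own bounds checks may be switched off.

```java
import java.util.Arrays;

// Hypothetical stand-in for a value vector; not Drill's actual API.
final class GrowableIntVector {
  private int[] data;

  GrowableIntVector(int initialCapacity) {
    data = new int[Math.max(1, initialCapacity)];
  }

  int capacity() {
    return data.length;
  }

  // Checks capacity explicitly before writing, so growth never depends on
  // an IndexOutOfBoundsException thrown by the underlying storage.
  void setSafe(int index, int value) {
    if (index < 0) {
      throw new IllegalArgumentException("negative index: " + index);
    }
    if (index >= data.length) {
      // Grow to at least index + 1, doubling when that is enough.
      int newCapacity = Math.max(Math.addExact(index, 1), data.length * 2);
      data = Arrays.copyOf(data, newCapacity);
    }
    data[index] = value;
  }

  int get(int index) {
    return data[index];
  }
}
```

    Because the resize is driven by the capacity comparison, the behavior is 
the same whether or not the underlying buffer performs bounds checks.
    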
    That said, if there is a reason to have "checked" access, we could provide 
such methods. Those methods would throw `IndexOutOfBoundsException`. That 
is, the checked methods would recreate the original "get/set" methods prior to 
@vrozov's improvements. I can't think of a reason to do that off the top of my 
head, but someone might present a valid use case.
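    
    If such methods were ever wanted, they might look roughly like the sketch 
below. `IntBufferSketch`, `getIntChecked` and `setIntChecked` are hypothetical 
names, not part of DrillBuf, and the `int[]` is only a stand-in for direct 
memory.

```java
// Hypothetical sketch; names are illustrative, not DrillBuf's API.
final class IntBufferSketch {
  private final int[] data;

  IntBufferSketch(int capacity) {
    data = new int[capacity];
  }

  // "Unchecked" accessors: stand-ins for methods whose bounds checks may be
  // disabled. (A Java array still checks; direct memory access would not.)
  int getInt(int index) {
    return data[index];
  }

  void setInt(int index, int value) {
    data[index] = value;
  }

  // "Checked" variants: always validate the index, whatever the global
  // bounds-checking setting is, then delegate to the unchecked accessor.
  int getIntChecked(int index) {
    checkIndex(index);
    return getInt(index);
  }

  void setIntChecked(int index, int value) {
    checkIndex(index);
    setInt(index, value);
  }

  private void checkIndex(int index) {
    if (index < 0 || index >= data.length) {
      throw new IndexOutOfBoundsException(
          "index: " + index + ", capacity: " + data.length);
    }
  }
}
```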


> Deprecate usage of IndexOutOfBoundsException to re-alloc vectors
> ----------------------------------------------------------------
>
>                 Key: DRILL-6202
>                 URL: https://issues.apache.org/jira/browse/DRILL-6202
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Vlad Rozov
>            Assignee: Vlad Rozov
>            Priority: Major
>             Fix For: 1.14.0
>
>
> As bounds checking may be enabled or disabled, using 
> IndexOutOfBoundsException to resize vectors is unreliable. It works only when 
> bounds checking is enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
