[
https://issues.apache.org/jira/browse/DRILL-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423828#comment-16423828
]
ASF GitHub Bot commented on DRILL-6202:
---------------------------------------
Github user parthchandra commented on the issue:
https://github.com/apache/drill/pull/1144
I think we need to include a few other folks in this. @paul-rogers and
@sachouche have also looked into the issue of excessive bounds checking and
ways to write to direct memory with minimum overhead.
Both Salim and Paul have done work that tries to eliminate the excessive
checking and use `PlatformDependent` directly, so it might be the right time
to agree on the right approach here. At a high level, I believe there is
agreement that we need to 1) reduce bounds checking to (preferably) once per
write, and 2) minimise the number of function calls before memory is actually
written to.
We have three layers where we could potentially check bounds - i) the
operators, ii) the vectors, iii) DrillBuf. Right now, we do so at each level,
and multiple times at that. Paul's work on batch sizing provides us with a
layer that gives us the bounds check guarantees at the operator level. This
means we could potentially use the value vectors' `set` methods (as opposed to
the `setSafe` methods) and DrillBuf can use `PlatformDependent` directly.
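To make the "check once at the operator level" idea concrete, here is a
minimal sketch. It does not use Drill's classes: `ToyIntVector`,
`ensureCapacity`, `writeBatch` and the `checks` counter are hypothetical names
made up for illustration, and a plain Java array stands in for direct memory
(the JVM still bounds-checks array access, so the counter only models the
explicit checks we would write ourselves).

```java
import java.util.Arrays;

// Hypothetical sketch; these are NOT Drill's value vector or DrillBuf classes.
final class BatchWriteSketch {

    static final class ToyIntVector {
        private int[] data;
        int checks; // number of explicit capacity checks performed so far

        ToyIntVector(int capacity) {
            data = new int[capacity];
        }

        /** Fast path: the caller guarantees index < current capacity. */
        void set(int index, int value) {
            data[index] = value;
        }

        /** Safe path: checks capacity (and grows if needed) on every call. */
        void setSafe(int index, int value) {
            checks++;
            if (index >= data.length) {
                data = Arrays.copyOf(data, Math.max(index + 1, data.length * 2));
            }
            data[index] = value;
        }

        /** A single up-front check that covers a whole batch of rows. */
        void ensureCapacity(int rowCount) {
            checks++;
            if (rowCount > data.length) {
                data = Arrays.copyOf(data, rowCount);
            }
        }

        int get(int index) {
            return data[index];
        }
    }

    /** Operator-level guarantee: one check per batch, then unchecked writes. */
    static void writeBatch(ToyIntVector vector, int[] rows) {
        vector.ensureCapacity(rows.length);
        for (int i = 0; i < rows.length; i++) {
            vector.set(i, rows[i]);
        }
    }

    /** What we effectively do today: one check per value written. */
    static void writeBatchSafe(ToyIntVector vector, int[] rows) {
        for (int i = 0; i < rows.length; i++) {
            vector.setSafe(i, rows[i]);
        }
    }
}
```

In this toy version `writeBatch` performs one capacity check per batch, while
`writeBatchSafe` performs one per value. The real saving would of course
depend on how the batch sizing layer communicates its guarantee to the vectors.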
Some caveats -
UDLEs (`UnsafeDirectLittleEndian`) check for and enforce little-endianness.
Checking for endianness is important for value vectors because they assume
little endian, but the enforcement is sometimes not so desirable. Drill's Java
client uses the same DrillBuf backed by a UDLE, and that means that client
applications can no longer run on big endian machines (and yes, I have heard
this request from end users).
However, the fact is that UDLEs are an intrinsic part of the drill-memory
design [1] [2]. Eliminating UDLEs can lead to re-doing large parts of very
well tested code.
The caveat to using the vector `set` methods is that the `setSafe` methods
provide resizing capability that operators have come to rely upon. Switching
from `setSafe` to `set` breaks practically every operator.
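As a rough illustration of why the resizing behaviour matters, and of why
driving it off an exception is fragile, consider the sketch below. It is an
assumption-laden toy, not Drill code: `ToyBuf`, its `boundsChecking` flag and
both helper methods are hypothetical, and a larger int array (`arena`) stands
in for direct memory so that an unchecked write past `capacity` lands silently
in "someone else's" memory instead of throwing.

```java
// Hypothetical sketch; ToyBuf is NOT DrillBuf and the flag is not a Drill option.
final class ReallocSketch {

    /** A logical window of `capacity` ints over a larger arena. */
    static final class ToyBuf {
        final int[] arena = new int[64]; // slots past capacity belong to others
        int capacity;
        final boolean boundsChecking;

        ToyBuf(int capacity, boolean boundsChecking) {
            this.capacity = capacity;
            this.boundsChecking = boundsChecking;
        }

        void setInt(int index, int value) {
            if (boundsChecking && (index < 0 || index >= capacity)) {
                throw new IndexOutOfBoundsException(
                    "index " + index + ", capacity " + capacity);
            }
            arena[index] = value; // with checks off, may write past capacity
        }

        void grow() {
            if (capacity >= arena.length) {
                throw new IllegalStateException("arena exhausted");
            }
            capacity = Math.min(capacity * 2, arena.length);
        }
    }

    /** Resize driven by the exception: only works if setInt actually throws. */
    static boolean setWithExceptionResize(ToyBuf buf, int index, int value) {
        try {
            buf.setInt(index, value);
        } catch (IndexOutOfBoundsException e) {
            while (index >= buf.capacity) {
                buf.grow();
            }
            buf.setInt(index, value);
        }
        return index < buf.capacity; // true if the write landed inside the buffer
    }

    /** Resize driven by an explicit capacity comparison, independent of the flag. */
    static boolean setWithExplicitResize(ToyBuf buf, int index, int value) {
        while (index >= buf.capacity) {
            buf.grow();
        }
        buf.setInt(index, value);
        return index < buf.capacity;
    }
}
```

With `boundsChecking` off, the exception-driven variant never grows the buffer
and the write ends up outside it, whereas the explicit variant behaves the same
either way. Any operator moved from `setSafe` to `set` would need some
equivalent of that explicit step, or an operator-level guarantee, before
writing.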
[1]
https://github.com/jacques-n/drill/blob/DRILL-4144/exec/memory/base/src/main/java/org/apache/drill/exec/memory/README.md
[2]
https://docs.google.com/nonceSigner?nonce=nj279efks0ro0&continue=https://doc-0o-as-docs.googleusercontent.com/docs/securesc/gipu3hlcf22l6svruqr71h7qe2k3djum/5v7eb2cm4bghq76nj658ai5hkk9h52ur/1522749600000/11021365158327727934/11021365158327727934/0B6CmYjIAywyCalVwcURkaFlkc1U?e%3Ddownload&hash=41l8jspccbj1pp63750c5von8ol4ijtl
> Deprecate usage of IndexOutOfBoundsException to re-alloc vectors
> ----------------------------------------------------------------
>
> Key: DRILL-6202
> URL: https://issues.apache.org/jira/browse/DRILL-6202
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Vlad Rozov
> Assignee: Vlad Rozov
> Priority: Major
> Fix For: 1.14.0
>
>
> As bounds checking may be enabled or disabled, using
> IndexOutOfBoundsException to resize vectors is unreliable. It works only when
> bounds checking is enabled.