[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1144 ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r184242257 --- Diff: src/main/resources/checkstyle-config.xml --- @@ -30,10 +30,15 @@ + --- End diff -- So what do you tell the user when you get a runtime exception (any exception) that is the result of a bug? It is silly to show them a stack trace that does not help them. It is better to let the user know that there was an internal error and they should log a bug with support or look for help on the dev list. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r183571786 --- Diff: src/main/resources/checkstyle-config.xml --- @@ -30,10 +30,15 @@ + --- End diff -- IMO the same applies to all run-time exceptions (NPE, CCE, IOBE) and likely conversion for NPE is the same as for IOBE. Note that most likely the root cause of an IOBE or NPE is a bug and not an end user error, misconfiguration, invalid input or a lack of resources, so end user won't be able to "fix" NPE or IOBE. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r183530901 --- Diff: src/main/resources/checkstyle-config.xml --- @@ -30,10 +30,15 @@ + --- End diff -- I think IOBE should be caught and converted to a UserException so that the end user does not get an incomprehensible error message. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r180598658 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -534,15 +534,11 @@ public void setSafe(int index, byte[] bytes) { assert index >= 0; final int currentOffset = offsetVector.getAccessor().get(index); - offsetVector.getMutator().setSafe(index + 1, currentOffset + bytes.length); - try { -data.setBytes(currentOffset, bytes, 0, bytes.length); - } catch (IndexOutOfBoundsException e) { -while (data.capacity() < currentOffset + bytes.length) { - reAlloc(); -} -data.setBytes(currentOffset, bytes, 0, bytes.length); + while (data.capacity() < currentOffset + bytes.length) { --- End diff -- I'm on the fence on this one. The change from 18 months ago resulted in a performance improvement of a few percent for variable length columns. This change essentially undoes that and takes us back to where we were. It seems that with the recent changes to batch sizing, we are on the path to eliminating the realloc behaviour that is part of the performance bottleneck for the setSafe methods. So, perhaps, it might be ok to let this one stay. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179231151 --- Diff: src/main/resources/checkstyle-config.xml --- @@ -30,10 +30,15 @@ + --- End diff -- I commented out the default rule value with the addition of IndexOutOfBoundsException. IndexOutOfBoundsException can be used, but it is not common to handle it. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179229891 --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java --- @@ -211,7 +211,7 @@ protected boolean canExpand(int delta) { protected void overflowed() { if (listener == null) { - throw new IndexOutOfBoundsException("Overflow not supported"); + throw new UnsupportedOperationException("Overflow not supported"); --- End diff -- I also considered `OversizedAllocationException` but IMO `UnsupportedOperationException` is a better fit. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179226227 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/DrillBuf.java --- @@ -777,23 +778,20 @@ public int getActualMemoryConsumed() { * @return A hex dump in a String. */ public String toHexString(final int start, final int length) { -final int roundedStart = (start / LOG_BYTES_PER_ROW) * LOG_BYTES_PER_ROW; - -final StringBuilder sb = new StringBuilder("buffer byte dump\n"); -int index = roundedStart; -for (int nLogged = 0; nLogged < length; nLogged += LOG_BYTES_PER_ROW) { - sb.append(String.format(" [%05d-%05d]", index, index + LOG_BYTES_PER_ROW - 1)); - for (int i = 0; i < LOG_BYTES_PER_ROW; ++i) { -try { - final byte b = getByte(index++); - sb.append(String.format(" 0x%02x", b)); -} catch (IndexOutOfBoundsException ioob) { - sb.append(" "); -} +Preconditions.checkArgument(start >= 0); +final StringBuilder sb = new StringBuilder("buffer byte dump"); +final int end = Math.min(length, this.length - start); +for (int i = 0; i < end; i++) { + if (i % LOG_BYTES_PER_ROW == 0) { +sb.append(String.format("%n [%05d-%05d]", i + start, Math.min(i + LOG_BYTES_PER_ROW - 1, end - 1) + start)); } - sb.append('\n'); + byte b = _getByte(i + start); + sb.append(" 0x").append(HEX_CHAR[b >> 4]).append(HEX_CHAR[b & 0x0F]); --- End diff -- The goal is to avoid new String allocation for every output byte inside DrillBuf. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179023466 --- Diff: src/main/resources/checkstyle-config.xml --- @@ -30,10 +30,15 @@ + --- End diff -- Do we want to do this? Why are all these forbidden? `Throwable` was needed by @Ben-Zvi in his very first Drill bug fix. `RuntimeException` is the root of the unchecked exceptions beloved by Drill code and is used as a general "something went wrong" placeholder. These are all commented out, but still... Finally, `IndexOutOfBoundsException` is a perfectly fine exception. Adding this line is probably handy to catch all reference as part of this fix, but probably not helpful as a general policy. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179022090 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/DrillBuf.java --- @@ -777,23 +778,20 @@ public int getActualMemoryConsumed() { * @return A hex dump in a String. */ public String toHexString(final int start, final int length) { -final int roundedStart = (start / LOG_BYTES_PER_ROW) * LOG_BYTES_PER_ROW; - -final StringBuilder sb = new StringBuilder("buffer byte dump\n"); -int index = roundedStart; -for (int nLogged = 0; nLogged < length; nLogged += LOG_BYTES_PER_ROW) { - sb.append(String.format(" [%05d-%05d]", index, index + LOG_BYTES_PER_ROW - 1)); - for (int i = 0; i < LOG_BYTES_PER_ROW; ++i) { -try { - final byte b = getByte(index++); - sb.append(String.format(" 0x%02x", b)); -} catch (IndexOutOfBoundsException ioob) { - sb.append(" "); -} +Preconditions.checkArgument(start >= 0); +final StringBuilder sb = new StringBuilder("buffer byte dump"); +final int end = Math.min(length, this.length - start); +for (int i = 0; i < end; i++) { + if (i % LOG_BYTES_PER_ROW == 0) { +sb.append(String.format("%n [%05d-%05d]", i + start, Math.min(i + LOG_BYTES_PER_ROW - 1, end - 1) + start)); } - sb.append('\n'); + byte b = _getByte(i + start); + sb.append(" 0x").append(HEX_CHAR[b >> 4]).append(HEX_CHAR[b & 0x0F]); --- End diff -- Is this complexity needed relative to the original `format(" 0x02x", b)` implementation? Is this code performance sensitive enough to justify the extra complexity? ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179022533 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -534,15 +534,11 @@ public void setSafe(int index, byte[] bytes) { assert index >= 0; final int currentOffset = offsetVector.getAccessor().get(index); - offsetVector.getMutator().setSafe(index + 1, currentOffset + bytes.length); - try { -data.setBytes(currentOffset, bytes, 0, bytes.length); - } catch (IndexOutOfBoundsException e) { -while (data.capacity() < currentOffset + bytes.length) { - reAlloc(); -} -data.setBytes(currentOffset, bytes, 0, bytes.length); + while (data.capacity() < currentOffset + bytes.length) { --- End diff -- One trick used in the row set reader code is to avoid redundant re-allocs by computing the new size and rounding to a power of two. This is handy at the start with the vector doubles from 256 to 512 to 1K and the data size is, say, 600 bytes. However, it is not clear that such an optimization is as necessary as it once was: the team's been doing a good job at adding up-front vector allocation where it was missing, reducing the gratuitous reallocs. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179023079 --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java --- @@ -211,7 +211,7 @@ protected boolean canExpand(int delta) { protected void overflowed() { if (listener == null) { - throw new IndexOutOfBoundsException("Overflow not supported"); + throw new UnsupportedOperationException("Overflow not supported"); --- End diff -- Maybe change to `OversizedAllocationException`. See `FixedValueVectors.setInitialCapacity()`. ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179023675 --- Diff: src/main/resources/checkstyle-suppressions.xml --- @@ -16,4 +16,13 @@ +
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1144#discussion_r179022338 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -534,15 +534,11 @@ public void setSafe(int index, byte[] bytes) { assert index >= 0; final int currentOffset = offsetVector.getAccessor().get(index); - offsetVector.getMutator().setSafe(index + 1, currentOffset + bytes.length); - try { -data.setBytes(currentOffset, bytes, 0, bytes.length); - } catch (IndexOutOfBoundsException e) { -while (data.capacity() < currentOffset + bytes.length) { - reAlloc(); -} -data.setBytes(currentOffset, bytes, 0, bytes.length); + while (data.capacity() < currentOffset + bytes.length) { --- End diff -- If we compare this implementation to the one from, say, a year or 18 months ago, I think we're back where we started. This would actually be a good use case for a "checkedSetBytes" method that does the bounds checks. Still, whether done here or in the underlying method, the `if` statement is needed, so no savings in using a "checked" method, ---
[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1144 DRILL-6202: Deprecate usage of IndexOutOfBoundsException to re-alloc vectors You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-6202 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1144.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1144 commit 2af94a07340f9f13aa152822c2c8d37568ab44ab Author: Vlad RozovDate: 2018-03-01T17:36:05Z DRILL-6202: Deprecate usage of IndexOutOfBoundsException to re-alloc vectors ---