[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1144


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-25 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r184242257
  
--- Diff: src/main/resources/checkstyle-config.xml ---
@@ -30,10 +30,15 @@
 
 
 
+
--- End diff --

So what do you tell the user when you get a runtime exception (any 
exception) that is the result of a bug? It is silly to show them a stack trace 
that does not help them. It is better to let the user know that there was an 
internal error and they should log a bug with support or look for help on the 
dev list.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-23 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r183571786
  
--- Diff: src/main/resources/checkstyle-config.xml ---
@@ -30,10 +30,15 @@
 
 
 
+
--- End diff --

IMO the same applies to all run-time exceptions (NPE, CCE, IOBE) and likely 
conversion for NPE is the same as for IOBE. Note that most likely the root 
cause of an IOBE or NPE is a bug and not an end user error, misconfiguration, 
invalid input or a lack of resources, so end user won't be able to "fix" NPE or 
IOBE.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-23 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r183530901
  
--- Diff: src/main/resources/checkstyle-config.xml ---
@@ -30,10 +30,15 @@
 
 
 
+
--- End diff --

I think IOBE should be caught and converted to a UserException so that the 
end user does not get an incomprehensible error message. 


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-10 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r180598658
  
--- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java 
---
@@ -534,15 +534,11 @@ public void setSafe(int index, byte[] bytes) {
   assert index >= 0;
 
   final int currentOffset = offsetVector.getAccessor().get(index);
-  offsetVector.getMutator().setSafe(index + 1, currentOffset + 
bytes.length);
-  try {
-data.setBytes(currentOffset, bytes, 0, bytes.length);
-  } catch (IndexOutOfBoundsException e) {
-while (data.capacity() < currentOffset + bytes.length) {
-  reAlloc();
-}
-data.setBytes(currentOffset, bytes, 0, bytes.length);
+  while (data.capacity() < currentOffset + bytes.length) {
--- End diff --

I'm on the fence on this one. The change from 18 months ago resulted in a 
performance improvement of a few percent for variable length columns. This 
change essentially undoes that and takes us back to where we were. It seems 
that with the recent changes to batch sizing, we are on the path to eliminating 
the realloc behaviour that is part of the performance bottleneck for the 
setSafe methods. So, perhaps, it might be ok to let this one stay.  


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-04 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179231151
  
--- Diff: src/main/resources/checkstyle-config.xml ---
@@ -30,10 +30,15 @@
 
 
 
+
--- End diff --

I commented out the default rule value with the addition of 
IndexOutOfBoundsException. IndexOutOfBoundsException can be used, but it is not 
common to handle it.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-04 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179229891
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java
 ---
@@ -211,7 +211,7 @@ protected boolean canExpand(int delta) {
 
   protected void overflowed() {
 if (listener == null) {
-  throw new IndexOutOfBoundsException("Overflow not supported");
+  throw new UnsupportedOperationException("Overflow not supported");
--- End diff --

I also considered `OversizedAllocationException` but IMO 
`UnsupportedOperationException` is a better fit.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-04 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179226227
  
--- Diff: exec/memory/base/src/main/java/io/netty/buffer/DrillBuf.java ---
@@ -777,23 +778,20 @@ public int getActualMemoryConsumed() {
* @return A hex dump in a String.
*/
   public String toHexString(final int start, final int length) {
-final int roundedStart = (start / LOG_BYTES_PER_ROW) * 
LOG_BYTES_PER_ROW;
-
-final StringBuilder sb = new StringBuilder("buffer byte dump\n");
-int index = roundedStart;
-for (int nLogged = 0; nLogged < length; nLogged += LOG_BYTES_PER_ROW) {
-  sb.append(String.format(" [%05d-%05d]", index, index + 
LOG_BYTES_PER_ROW - 1));
-  for (int i = 0; i < LOG_BYTES_PER_ROW; ++i) {
-try {
-  final byte b = getByte(index++);
-  sb.append(String.format(" 0x%02x", b));
-} catch (IndexOutOfBoundsException ioob) {
-  sb.append(" ");
-}
+Preconditions.checkArgument(start >= 0);
+final StringBuilder sb = new StringBuilder("buffer byte dump");
+final int end = Math.min(length, this.length - start);
+for (int i = 0; i < end; i++) {
+  if (i % LOG_BYTES_PER_ROW == 0) {
+sb.append(String.format("%n [%05d-%05d]", i + start, Math.min(i + 
LOG_BYTES_PER_ROW - 1, end - 1) + start));
   }
-  sb.append('\n');
+  byte b = _getByte(i + start);
+  sb.append(" 0x").append(HEX_CHAR[b >> 4]).append(HEX_CHAR[b & 0x0F]);
--- End diff --

The goal is to avoid new String allocation for every output byte inside 
DrillBuf.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-03 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179023466
  
--- Diff: src/main/resources/checkstyle-config.xml ---
@@ -30,10 +30,15 @@
 
 
 
+
--- End diff --

Do we want to do this? Why are all these forbidden? `Throwable` was needed 
by @Ben-Zvi in his very first Drill bug fix. `RuntimeException` is the root of 
the unchecked exceptions beloved by Drill code and is used as a general 
"something went wrong" placeholder.

These are all commented out, but still...

Finally, `IndexOutOfBoundsException` is a perfectly fine exception. Adding 
this line is probably handy to catch all reference as part of this fix, but 
probably not helpful as a general policy.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-03 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179022090
  
--- Diff: exec/memory/base/src/main/java/io/netty/buffer/DrillBuf.java ---
@@ -777,23 +778,20 @@ public int getActualMemoryConsumed() {
* @return A hex dump in a String.
*/
   public String toHexString(final int start, final int length) {
-final int roundedStart = (start / LOG_BYTES_PER_ROW) * 
LOG_BYTES_PER_ROW;
-
-final StringBuilder sb = new StringBuilder("buffer byte dump\n");
-int index = roundedStart;
-for (int nLogged = 0; nLogged < length; nLogged += LOG_BYTES_PER_ROW) {
-  sb.append(String.format(" [%05d-%05d]", index, index + 
LOG_BYTES_PER_ROW - 1));
-  for (int i = 0; i < LOG_BYTES_PER_ROW; ++i) {
-try {
-  final byte b = getByte(index++);
-  sb.append(String.format(" 0x%02x", b));
-} catch (IndexOutOfBoundsException ioob) {
-  sb.append(" ");
-}
+Preconditions.checkArgument(start >= 0);
+final StringBuilder sb = new StringBuilder("buffer byte dump");
+final int end = Math.min(length, this.length - start);
+for (int i = 0; i < end; i++) {
+  if (i % LOG_BYTES_PER_ROW == 0) {
+sb.append(String.format("%n [%05d-%05d]", i + start, Math.min(i + 
LOG_BYTES_PER_ROW - 1, end - 1) + start));
   }
-  sb.append('\n');
+  byte b = _getByte(i + start);
+  sb.append(" 0x").append(HEX_CHAR[b >> 4]).append(HEX_CHAR[b & 0x0F]);
--- End diff --

Is this complexity needed relative to the original `format(" 0x02x", b)` 
implementation? Is this code performance sensitive enough to justify the extra 
complexity?


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-03 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179022533
  
--- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java 
---
@@ -534,15 +534,11 @@ public void setSafe(int index, byte[] bytes) {
   assert index >= 0;
 
   final int currentOffset = offsetVector.getAccessor().get(index);
-  offsetVector.getMutator().setSafe(index + 1, currentOffset + 
bytes.length);
-  try {
-data.setBytes(currentOffset, bytes, 0, bytes.length);
-  } catch (IndexOutOfBoundsException e) {
-while (data.capacity() < currentOffset + bytes.length) {
-  reAlloc();
-}
-data.setBytes(currentOffset, bytes, 0, bytes.length);
+  while (data.capacity() < currentOffset + bytes.length) {
--- End diff --

One trick used in the row set reader code is to avoid redundant re-allocs 
by computing the new size and rounding to a power of two. This is handy at the 
start with the vector doubles from 256 to 512 to 1K and the data size is, say, 
600 bytes.

However, it is not clear that such an optimization is as necessary as it 
once was: the team's been doing a good job at adding up-front vector allocation 
where it was missing, reducing the gratuitous reallocs.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-03 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179023079
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java
 ---
@@ -211,7 +211,7 @@ protected boolean canExpand(int delta) {
 
   protected void overflowed() {
 if (listener == null) {
-  throw new IndexOutOfBoundsException("Overflow not supported");
+  throw new UnsupportedOperationException("Overflow not supported");
--- End diff --

Maybe change to `OversizedAllocationException`. See 
`FixedValueVectors.setInitialCapacity()`.


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-03 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179023675
  
--- Diff: src/main/resources/checkstyle-suppressions.xml ---
@@ -16,4 +16,13 @@
 
 
   
+  

[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-04-03 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1144#discussion_r179022338
  
--- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java 
---
@@ -534,15 +534,11 @@ public void setSafe(int index, byte[] bytes) {
   assert index >= 0;
 
   final int currentOffset = offsetVector.getAccessor().get(index);
-  offsetVector.getMutator().setSafe(index + 1, currentOffset + 
bytes.length);
-  try {
-data.setBytes(currentOffset, bytes, 0, bytes.length);
-  } catch (IndexOutOfBoundsException e) {
-while (data.capacity() < currentOffset + bytes.length) {
-  reAlloc();
-}
-data.setBytes(currentOffset, bytes, 0, bytes.length);
+  while (data.capacity() < currentOffset + bytes.length) {
--- End diff --

If we compare this implementation to the one from, say, a year or 18 months 
ago, I think we're back where we started.

This would actually be a good use case for a "checkedSetBytes" method that 
does the bounds checks. Still, whether done here or in the underlying method, 
the `if` statement is needed, so no savings in using a "checked" method,


---


[GitHub] drill pull request #1144: DRILL-6202: Deprecate usage of IndexOutOfBoundsExc...

2018-03-01 Thread vrozov
GitHub user vrozov opened a pull request:

https://github.com/apache/drill/pull/1144

DRILL-6202: Deprecate usage of IndexOutOfBoundsException to re-alloc vectors



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/drill DRILL-6202

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1144.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1144


commit 2af94a07340f9f13aa152822c2c8d37568ab44ab
Author: Vlad Rozov 
Date:   2018-03-01T17:36:05Z

DRILL-6202: Deprecate usage of IndexOutOfBoundsException to re-alloc vectors




---