[GitHub] drill pull request #1058: DRILL-6002: Avoid memory copy from direct buffer t...

2018-01-10 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1058#discussion_r160841030
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/SpillSet.java ---
@@ -107,7 +107,7 @@
  * nodes provide insufficient local disk space)
  */
 
-private static final int TRANSFER_SIZE = 32 * 1024;
+private static final int TRANSFER_SIZE = 1024 * 1024;
--- End diff --

Is a 1MB buffer excessive? The point of a buffer is to ensure we write in 
units of a disk block. For the local file system, experience showed no gain 
after 32K. In the MapR FS, each write is in units of 1 MB. Does Hadoop have a 
preferred size?

Given this variation, if we need large buffers, should we choose a buffer 
size based on the underlying file system? For example, is there a preferred 
size for S3?

32K didn't seem large enough to worry about, even if we had 1000 fragments 
busily spilling. But 1MB? 1000 * 1 MB = 1GB, which starts becoming significant, 
especially in light of our efforts to reduce heap usage. Should we worry?
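If we do end up choosing the buffer by file system, a hypothetical selection helper might look like this (the scheme names and sizes below are illustrative assumptions for discussion, not anything SpillSet defines today):

```java
// Hypothetical sketch: pick the spill transfer buffer size from the
// file-system scheme. All values here are assumptions for illustration.
public class TransferSizes {
  static int transferSizeFor(String fsScheme) {
    switch (fsScheme) {
      case "file":   return 32 * 1024;    // local FS: no gain observed past 32K
      case "maprfs": return 1024 * 1024;  // MapR FS writes in 1 MB units
      default:       return 128 * 1024;   // middle ground for HDFS, S3, etc.
    }
  }
}
```

A lookup like this would keep the 32K local-disk behavior while letting DFS targets opt into larger writes.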


---


[GitHub] drill issue #1069: DRILL-5994 Added webserver maxThreads configuration optio...

2018-01-10 Thread vrozov
Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1069
  
@MitchelLabonte My proposal is to limit the number of acceptors instead of allowing the maximum number of threads in the connection pool to be increased. If the number of acceptors is limited, the number of "needed" threads will be limited too, and the exception will not be raised.
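A rough sketch of that idea, assuming a helper that clamps the acceptor count (the helper name and the clamp value of 2 are illustrative; the result would be handed to Jetty's connector configuration rather than raising maxThreads):

```java
// Sketch: cap acceptors at a small constant instead of deriving the
// count from available processors. The clamp value 2 is illustrative.
public class AcceptorLimit {
  static int defaultAcceptors(int availableProcessors) {
    // at least 1 acceptor, never more than 2, no matter the core count
    return Math.min(2, Math.max(1, availableProcessors));
  }
}
```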


---


[GitHub] drill issue #1069: DRILL-5994 Added webserver maxThreads configuration optio...

2018-01-10 Thread MitchelLabonte
Github user MitchelLabonte commented on the issue:

https://github.com/apache/drill/pull/1069
  
@vrozov The web server definitely doesn't need this many threads, but without this option Drill won't launch at all on a machine with more than 200 cores; instead it throws an exception:

Exception in thread "main" org.apache.drill.exec.exception.DrillStartupException: Failure during initial startup of Drillbit:
    at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:313)
    at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:289)
    at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:285)
Caused by: java.lang.IllegalStateException: Insufficient max threads in ThreadPool: max=200 < needed=206
    at org.eclipse.jetty.server.Server.doStart(Server.java:321)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.apache.drill.exec.server.rest.WebServer.start(WebServer.java:197)
    at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:140)
    at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:309)
    ... 2 more

The configuration option exists so that startup does not hit this exception on a machine with more than 200 cores.


---


[jira] [Created] (DRILL-6080) Sort incorrectly limits batch size to 65535 records rather than 65536

2018-01-10 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-6080:
--

 Summary: Sort incorrectly limits batch size to 65535 records rather than 65536
 Key: DRILL-6080
 URL: https://issues.apache.org/jira/browse/DRILL-6080
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.12.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Priority: Minor
 Fix For: 1.13.0


Drill places an upper limit of 64K (65,536 decimal) on the number of rows in a batch. When we index records, the indexes run from 0 to 64K-1, that is, 0 to 65,535.

The sort code incorrectly uses {{Character.MAX_VALUE}} (65,535) as the maximum row count. So, if an incoming batch uses the full 64K size, sort ends up splitting batches unnecessarily.

The fix is to instead use the correct constant, {{ValueVector.MAX_ROW_COUNT}}.
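The off-by-one is easy to see in plain Java; {{Character.MAX_VALUE}} is 0xFFFF = 65,535, one short of the 64K row limit:

```java
// Character.MAX_VALUE is 65535 (0xFFFF), while the intended batch limit
// is 1 << 16 = 65536; using the former silently loses one row per batch.
public class RowLimitCheck {
  static final int CHAR_MAX = Character.MAX_VALUE; // 65535
  static final int MAX_ROW_COUNT = 1 << 16;        // 65536 (64K)
}
```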



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #1069: DRILL-5994 Added webserver maxThreads configuration optio...

2018-01-10 Thread vrozov
Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1069
  
@MitchelLabonte @arina-ielchiieva I don't think that Drill needs that many threads/acceptors to handle HTTP(S) requests, as it is not a real web (REST API) server. For proper resource utilization, it would be better to limit the number of acceptors to a small value (say, 2 or 4 by default) instead of the current default, which uses the number of available processors and can be huge on machines with many cores (32 or more). @paul-rogers What is your take on this?


---


[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

2018-01-10 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1082#discussion_r160803438
  
--- Diff: distribution/src/resources/drill-config.sh ---
@@ -180,18 +251,46 @@ else
   fi
 fi
 
-# Default memory settings if none provided by the environment or
+# Checking if being executed in context of Drillbit and not SQLLine
+if [ "$DRILLBIT_CONTEXT" == "1" ]; then 
+  # *-auto.sh allows for distrib/user specific checks to be done
+  distribAuto="$DRILL_CONF_DIR/distrib-auto.sh"
+  if [ ! -r "$distribAuto" ]; then distribAuto="$DRILL_HOME/conf/distrib-auto.sh"; fi
+  if [ ! -r "$distribAuto" ]; then distribAuto=""; fi
+  drillAuto="$DRILL_CONF_DIR/drill-auto.sh"
+  if [ ! -r "$drillAuto" ]; then drillAuto="$DRILL_HOME/conf/drill-auto.sh"; fi
+  if [ ! -r "$drillAuto" ]; then drillAuto=""; fi
+
+  # Enforcing checks in order (distrib-auto.sh , drill-auto.sh)
+  # (NOTE: A script is executed only if it has relevant executable lines)
+  if [ -n "$distribAuto" ] && [ $(executableLineCount $distribAuto) -gt 0 ]; then
+    . "$distribAuto"
+    if [ $? -gt 0 ]; then fatal_error "Aborting Drill Startup due failed checks from $distribAuto"; fi
+  fi
+  if [ -n "$drillAuto" ] && [ $(executableLineCount $drillAuto) -gt 0 ]; then
--- End diff --

Passed the checks for the file to the renamed function: 
`checkExecutableLineCount`


---


[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...

2018-01-10 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1082#discussion_r160802962
  
--- Diff: distribution/src/assemble/bin.xml ---
@@ -345,6 +345,16 @@
   0755
   conf
 
+
+  src/resources/drill-auto.sh
--- End diff --

Modifying the base commit (#1081) to reflect the name change from 
`[distrib/drill]-auto.sh` to `[distrib/drill]-setup.sh`


---


[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160759375
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java ---
@@ -19,44 +19,283 @@
 
 import io.netty.buffer.DrillBuf;
 
-public class SqlPatternContainsMatcher extends AbstractSqlPatternMatcher {
+/** SQL Pattern Contains implementation */
+public final class SqlPatternContainsMatcher extends AbstractSqlPatternMatcher {
+  private final MatcherFcn matcherFcn;
 
   public SqlPatternContainsMatcher(String patternString) {
     super(patternString);
+
+    // Pattern matching is 1) a CPU intensive operation and 2) pattern and input dependent. The conclusion is
+    // that there is no single implementation that can do it all well. So, we use multiple implementations
+    // chosen based on the pattern length.
+    if (patternLength == 1) {
+      matcherFcn = new Matcher1();
+    } else if (patternLength == 2) {
+      matcherFcn = new Matcher2();
+    } else if (patternLength == 3) {
+      matcherFcn = new Matcher3();
+    } else if (patternLength < 10) {
+      matcherFcn = new MatcherN();
+    } else {
+      matcherFcn = new BoyerMooreMatcher();
+    }
   }
 
   @Override
   public int match(int start, int end, DrillBuf drillBuf) {
+    return matcherFcn.match(start, end, drillBuf);
+  }
+
+  // ------------------------------------------------------------------
+  // Inner Data Structure
+  // ------------------------------------------------------------------
+
+  /** Abstract matcher class to allow us pick the most efficient implementation */
+  private abstract class MatcherFcn {
+    protected final byte[] patternArray;
+
+    protected MatcherFcn() {
+      assert patternByteBuffer.hasArray();
+
+      patternArray = patternByteBuffer.array();
+    }
+
+    /**
+     * @return 1 if the pattern was matched; 0 otherwise
+     */
+    protected abstract int match(int start, int end, DrillBuf drillBuf);
+  }
+
+  /** Handles patterns with length one */
+  private final class Matcher1 extends MatcherFcn {
 
-    if (patternLength == 0) { // Everything should match for null pattern string
-      return 1;
+    private Matcher1() {
+      super();
     }
 
-    final int txtLength = end - start;
+    /** {@inheritDoc} */
+    @Override
+    protected final int match(int start, int end, DrillBuf drillBuf) {
+      final int lengthToProcess = end - start;
+      final byte firstPattByte  = patternArray[0];
 
-    // no match if input string length is less than pattern length
-    if (txtLength < patternLength) {
+      // simplePattern string has meta characters i.e % and _ and escape characters removed.
+      // so, we can just directly compare.
+      for (int idx = 0; idx < lengthToProcess; idx++) {
+        byte inputByte = drillBuf.getByte(start + idx);
+
+        if (firstPattByte != inputByte) {
+          continue;
+        }
+        return 1;
+      }
       return 0;
     }
+  }
 
+  /** Handles patterns with length two */
+  private final class Matcher2 extends MatcherFcn {
 
-    final int outerEnd = txtLength - patternLength;
+    private Matcher2() {
+      super();
+    }
 
-    outer:
-    for (int txtIndex = 0; txtIndex <= outerEnd; txtIndex++) {
+    /** {@inheritDoc} */
+    @Override
+    protected final int match(int start, int end, DrillBuf drillBuf) {
+      final int lengthToProcess = end - start - 1;
+      final byte firstPattByte  = patternArray[0];
+      final byte secondPattByte = patternArray[1];
 
       // simplePattern string has meta characters i.e % and _ and escape characters removed.
       // so, we can just directly compare.
-      for (int patternIndex = 0; patternIndex < patternLength; patternIndex++) {
-        if (patternByteBuffer.get(patternIndex) != drillBuf.getByte(start + txtIndex + patternIndex)) {
-          continue outer;
+      for (int idx = 0; idx < lengthToProcess; idx++) {
+        final byte firstInByte = drillBuf.getByte(start + idx);
+
+        if (firstPattByte != firstInByte) {
+          continue;
+        } else {
+          final byte secondInByte = drillBuf.getByte(start + idx +1);
+
+          if (secondInByte == secondPattByte) {
+            return 1;
+          }
         }
       }
+      return 0;
+    }
+  }
+
+  /** Handles patterns with length three */
+  private final class Matcher3 extends MatcherFcn {
 
-

[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160763490
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java ---

[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160753323
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java ---

[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160765882
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/expr/fn/impl/TestSqlPatterns.java ---
@@ -446,6 +446,61 @@ public void testSqlPatternComplex() {
     assertEquals(1, sqlPatternComplex.match(0, byteBuffer.limit(), drillBuf)); // should match
   }
 
+  @Test
+  public void testSqlPatternContainsMUltipleMatchers() {
--- End diff --

Paul, the test suite already has many other tests for SQL matching. But I agree, they might now hit only a few of these matchers. I will add more tests.
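For reference, the length-based dispatch those tests need to cover can be modeled stand-alone roughly like this (a byte[] stands in for DrillBuf; class and method names are illustrative, not Drill's):

```java
// Stand-alone model of the "contains" dispatch: a specialized scan for
// single-byte patterns, and a general sliding comparison otherwise.
public class ContainsDispatch {
  interface Matcher { int match(byte[] input); }

  static Matcher forPattern(byte[] pattern) {
    if (pattern.length == 1) {
      final byte b = pattern[0];
      return input -> {               // length-1 fast path: simple scan
        for (byte x : input) { if (x == b) { return 1; } }
        return 0;
      };
    }
    return input -> {                 // general case: sliding compare
      outer:
      for (int i = 0; i + pattern.length <= input.length; i++) {
        for (int j = 0; j < pattern.length; j++) {
          if (input[i + j] != pattern[j]) { continue outer; }
        }
        return 1;
      }
      return 0;
    };
  }
}
```

Tests exercising each pattern-length bucket (1, 2, 3, under 10, and the Boyer-Moore path) would then drive every specialized branch.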


---


[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160755097
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java ---

[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160758370
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java ---

[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160764681
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java ---

[GitHub] drill pull request #1072: DRILL-5879: Improved SQL Pattern Contains Performa...

2018-01-10 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1072#discussion_r160754694
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java
 ---
@@ -19,44 +19,283 @@
 
 import io.netty.buffer.DrillBuf;
 
-public class SqlPatternContainsMatcher extends AbstractSqlPatternMatcher {
+/** SQL Pattern Contains implementation */
+public final class SqlPatternContainsMatcher extends AbstractSqlPatternMatcher {
+  private final MatcherFcn matcherFcn;
 
   public SqlPatternContainsMatcher(String patternString) {
 super(patternString);
+
+// Pattern matching is 1) a CPU intensive operation and 2) pattern and input dependent. The conclusion is
+// that there is no single implementation that can do it all well. So, we use multiple implementations
+// chosen based on the pattern length.
+if (patternLength == 1) {
+  matcherFcn = new Matcher1();
+} else if (patternLength == 2) {
+  matcherFcn = new Matcher2();
+} else if (patternLength == 3) {
--- End diff --

Good point Paul!


---
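The comment in the diff above explains the idea: dispatch to a matcher specialized for the pattern length, since very short patterns can skip the inner comparison loop entirely. A minimal, hypothetical sketch of that dispatch, using a plain byte[] instead of Drill's DrillBuf (all names here are illustrative, not Drill's actual API):

```java
import java.nio.charset.StandardCharsets;

public class ContainsMatcherSketch {
    interface Matcher {
        boolean match(byte[] input);
    }

    // Choose an implementation based on pattern length, mirroring the
    // Matcher1/Matcher2/Matcher3 selection discussed in the review.
    static Matcher forPattern(byte[] pattern) {
        if (pattern.length == 1) {
            // Single-byte patterns need no inner loop: scan for one byte.
            final byte p0 = pattern[0];
            return input -> {
                for (byte b : input) {
                    if (b == p0) {
                        return true;
                    }
                }
                return false;
            };
        }
        // General case: naive scan comparing the pattern at each offset.
        return input -> {
            outer:
            for (int i = 0; i + pattern.length <= input.length; i++) {
                for (int j = 0; j < pattern.length; j++) {
                    if (input[i + j] != pattern[j]) {
                        continue outer;
                    }
                }
                return true;
            }
            return false;
        };
    }

    public static void main(String[] args) {
        Matcher m = forPattern("ll".getBytes(StandardCharsets.UTF_8));
        System.out.println(m.match("hello".getBytes(StandardCharsets.UTF_8))); // true
        System.out.println(m.match("world".getBytes(StandardCharsets.UTF_8))); // false
    }
}
```

The specialized branch buys speed by removing per-byte loop bookkeeping; the trade-off, as the review notes, is more code paths to maintain.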


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160792114
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/util/DecimalUtility.java ---
@@ -159,9 +159,20 @@ public static BigDecimal 
getBigDecimalFromSparse(DrillBuf data, int startIndex,
 }
 
 public static BigDecimal getBigDecimalFromDrillBuf(DrillBuf bytebuf, 
int start, int length, int scale) {
+  if (length <= 0) {
+// if the length is somehow non-positive, interpret this as zero
+//System.out.println("getBigDecimal forces 0 with start " + start 
+ " len " + length);
+  try {
+  throw new Exception("hi there");
--- End diff --

Yes, I'll remove the friendly "hi there".  It was for debugging, I guess.


---
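The defensive check discussed above can be sketched in isolation. This is an assumption-laden stand-in for Drill's getBigDecimalFromDrillBuf, not the actual method: it decodes a big-endian two's-complement unscaled value from a byte range and treats a non-positive length as zero rather than failing:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

public class DecimalDecode {
    static BigDecimal fromBytes(byte[] buf, int start, int length, int scale) {
        if (length <= 0) {
            // Interpret a non-positive length as the value zero instead of
            // throwing, as the patch under review does.
            return BigDecimal.ZERO.setScale(scale);
        }
        // Unscaled value is the big-endian two's-complement byte range.
        byte[] unscaled = Arrays.copyOfRange(buf, start, start + length);
        return new BigDecimal(new BigInteger(unscaled), scale);
    }
}
```

For example, bytes {0x01, 0x2C} decode to the unscaled value 300, which at scale 2 is 3.00.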


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160791971
  
--- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java 
---
@@ -539,7 +553,12 @@ public void setValueLengthSafe(int index, int length) {
 }
 
 
-public void setSafe(int index, int start, int end, DrillBuf buffer){
+<#if type.minor == "VarDecimal">
--- End diff --

Is this what you're talking about?  It looks OK to me.
<#if type.minor == "VarDecimal">
public void setSafe(int index, int start, int end, DrillBuf buffer, int scale)
<#else>
public void setSafe(int index, int start, int end, DrillBuf buffer)
</#if>  <#-- type.minor -->


---


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160790684
  
--- Diff: exec/vector/src/main/codegen/templates/ComplexWriters.java ---
@@ -99,7 +99,7 @@ public void write(Nullable${minor.class?cap_first}Holder 
h) {
 
   <#if !(minor.class == "Decimal9" || minor.class == "Decimal18" || 
minor.class == "Decimal28Sparse" || minor.class == "Decimal38Sparse" || 
minor.class == "Decimal28Dense" || minor.class == "Decimal38Dense")>
  public void write${minor.class}(<#list fields as field>${field.type} ${field.name}<#if field_has_next>, </#if></#list>) {
-mutator.addSafe(idx(), <#list fields as field>${field.name}<#if field_has_next>, </#if></#list>);
+mutator.addSafe(idx(), <#list fields as field><#if field.name == "scale"><#break></#if>${field.name}<#if field_has_next && fields[field_index+1].name != "scale" >, </#if></#list>);
--- End diff --

I had a hard time understanding how this codegen stuff works, so some of 
the code I wrote was due to my struggling to understand how it is SUPPOSED to 
work.  As such, I found that it needed to "break" out of code generation just 
after the "scale" argument, to avoid compilation failure of the generated code 
(IIRC, it was quite a while ago), and that is why I coded it this way.  What 
exact change are you suggesting?  To AND the field.name == "scale" check 
with a check that minor.class == "VarDecimal"?


---


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160786416
  
--- Diff: 
exec/java-exec/src/main/codegen/templates/StringOutputRecordWriter.java ---
@@ -146,7 +146,7 @@ public void writeField() throws IOException {
 // TODO: error check
 addField(fieldId, reader.readObject().toString());
 
-  <#elseif minor.class == "VarChar" || minor.class == "Var16Char" || 
minor.class == "VarBinary">
+  <#elseif minor.class == "VarChar" || minor.class == "Var16Char" || 
minor.class == "VarBinary" || minor.class == "VarDecimal">
--- End diff --

Yes, that works


---


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160785721
  
--- Diff: exec/java-exec/src/main/codegen/templates/SqlAccessors.java ---
@@ -127,6 +127,25 @@ public String getString(int index) {
 }
   <#break>
 
+<#case "VarDecimal">
+
+@Override
+public String getString(int index){
+<#if mode=="Nullable">
+if(ac.isNull(index)){
+  return null;
+}
+
+try {
--- End diff --

The getString method can throw NumberFormatException for BigDecimal, 
according to the javadoc: 
https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html.  Should it 
just throw the exception?  If yes, I'll remove the try/catch.


---
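For reference, the NumberFormatException in question comes from BigDecimal's String constructor. A tiny hypothetical sketch of the catch-and-return-null alternative being weighed above (the method name is invented, not Drill's accessor):

```java
import java.math.BigDecimal;

public class NfeDemo {
    // Returns the canonical string form of a decimal, or null when the
    // input text is not a valid BigDecimal.
    static String toStringOrNull(String raw) {
        try {
            return new BigDecimal(raw).toString();
        } catch (NumberFormatException e) {
            return null; // the alternative is to rethrow and let the caller handle it
        }
    }
}
```

Propagating the unchecked exception is equally valid; the choice is whether callers of getString should see a null or a failure for malformed data.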


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160779892
  
--- Diff: 
exec/java-exec/src/main/codegen/templates/Decimal/DecimalFunctions.java ---
@@ -102,7 +111,578 @@
 <#-- For each DECIMAL... type (in DecimalTypes.tdd) ... -->
 <#list comparisonTypesDecimal.decimalTypes as type>
 
-<#if type.name.endsWith("Sparse")>
+<#if type.name.endsWith("VarDecimal")>
+
+<@pp.changeOutputFile 
name="/org/apache/drill/exec/expr/fn/impl/${type.name}Functions.java" />
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.drill.exec.expr.fn.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import 
org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.fn.FunctionGenerationHelper;
+import org.apache.drill.exec.expr.holders.*;
+import org.apache.drill.exec.record.RecordBatch;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.DrillBuf;
+
+import java.nio.ByteBuffer;
+
+@SuppressWarnings("unused")
+public class ${type.name}Functions {
+private static void initBuffer(DrillBuf buffer) {
+// for VarDecimal, this method of setting initial size is actually 
only a very rough heuristic.
+int size = (${type.storage} * 
(org.apache.drill.exec.util.DecimalUtility.INTEGER_SIZE));
+buffer = buffer.reallocIfNeeded(size);
+ }
+
+@FunctionTemplate(name = "subtract", scope = 
FunctionTemplate.FunctionScope.DECIMAL_ADD_SCALE, nulls = 
NullHandling.NULL_IF_NULL)
--- End diff --

I will try adding that with checkPrecision=false (one size fits all 
precisions), e.g.:
@FunctionTemplate(name = "add",
scope = FunctionTemplate.FunctionScope.SIMPLE,
returnType = FunctionTemplate.ReturnType.DECIMAL_ADD_SCALE,
nulls = NullHandling.NULL_IF_NULL,
checkPrecisionRange = false)



---


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160777281
  
--- Diff: 
exec/java-exec/src/main/codegen/templates/Decimal/CastIntDecimal.java ---
@@ -68,15 +68,31 @@ public void setup() {
 
 public void eval() {
 out.scale = (int) scale.value;
+
+<#if !type.to.endsWith("VarDecimal")>
 out.precision = (int) precision.value;
+
 
-<#if type.to == "Decimal9" || type.to == "Decimal18">
+<#if type.to.endsWith("VarDecimal")>
+out.start = 0;
+out.buffer = buffer;
+String s = Long.toString((long)in.value);
+for (int i = 0; i < out.scale; ++i) {  // add 0's to get unscaled integer
+s += "0";
--- End diff --

Agreed


---
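The loop in the diff above builds the unscaled value by appending "0" characters to a string. The same scaling can be done with integer arithmetic; this hypothetical sketch shows the equivalent computation (it is not the code under review, and the method name is illustrative):

```java
import java.math.BigInteger;

public class ScaleDemo {
    // Unscaled representation of an integer at a given decimal scale:
    // value * 10^scale, with no string building required.
    static BigInteger unscaled(long value, int scale) {
        return BigInteger.valueOf(value).multiply(BigInteger.TEN.pow(scale));
    }
}
```

For instance, casting the int 42 to a decimal with scale 3 yields the unscaled value 42000; the approach also handles negative inputs without special-casing the sign.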


[GitHub] drill pull request #570: DRILL-4834 decimal implementation is vulnerable to ...

2018-01-10 Thread daveoshinsky
Github user daveoshinsky commented on a diff in the pull request:

https://github.com/apache/drill/pull/570#discussion_r160776544
  
--- Diff: 
exec/java-exec/src/main/codegen/templates/Decimal/DecimalAggrTypeFunctions2.java
 ---
@@ -108,9 +108,12 @@ public void output() {
 out.buffer = buffer;
 out.start  = 0;
 out.scale = outputScale.value;
-out.precision = 38;
java.math.BigDecimal average = ((java.math.BigDecimal)(value.obj)).divide(java.math.BigDecimal.valueOf(count.value, 0), out.scale, java.math.BigDecimal.ROUND_HALF_UP);
+<#if type.inputType.contains("VarDecimal")>
--- End diff --

Yes, I will try "<#if !type.inputType.contains("VarDecimal")>"


---
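The average computation quoted in the diff can be reproduced standalone. A small sketch with illustrative names, using the same divide-by-count-at-output-scale pattern (RoundingMode.HALF_UP is the modern spelling of BigDecimal.ROUND_HALF_UP):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class AvgDemo {
    // Divide the running sum by the row count, rounding to the output scale.
    static BigDecimal average(BigDecimal sum, long count, int outScale) {
        return sum.divide(BigDecimal.valueOf(count, 0), outScale, RoundingMode.HALF_UP);
    }
}
```

For example, averaging a sum of 10.00 over 3 rows at output scale 2 gives 3.33. Specifying the scale and rounding mode is what makes the division safe for non-terminating quotients like this one.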


[GitHub] drill issue #1086: DRILL-6076: Reduce the default memory from a total of 13G...

2018-01-10 Thread priteshm
Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/1086
  
@paul-rogers, @parthchandra can you review/comment on this change?


---


[HANGOUT] Minutes Jan 9, 2018

2018-01-10 Thread Vlad Rozov
Attendees: Arina, Paul, Pritesh, Vitalii, Volodymyr, Vova, Vlad, Sorabh, 
Anil, Hanumath, Boaz, Karthik


Discussion topics and action items:

- Support Java 8 in 1.13.0 by default. Community to continue work on 
several JIRAs (see 
https://lists.apache.org/thread.html/47f5a82c840cdc16075b9270a2df5e1d803a800684396768179d398f@%3Cdev.drill.apache.org%3E)
- Support Java 9 (released Sep 2017). File JIRA to research compilation 
errors and run-time support.
- Support for Hadoop 3.x. Wait for Hadoop distros to adopt and support 
Hadoop 3.x
- Parquet-mr 1.9.0 upgrade. Check dev@drill for background of the issue 
and file JIRA to summarize findings (see 
https://lists.apache.org/thread.html/ace52bf41f68e1996dcef7ec6f6c78cba9c065f776760751ef15aa56@%3Cdev.drill.apache.org%3E)
- Add support for Yarn in 1.13.0. Review Yarn PR. Document the 
functionality.
- Hangouts. Consider sponsoring Google hangout account for Apache to 
allow more participants. Switch to weekly hangout schedule.


Thank you,

Vlad


[jira] [Resolved] (DRILL-5833) Parquet reader fails with assertion error for Decimal9, Decimal18 types

2018-01-10 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5833.

Resolution: Fixed

> Parquet reader fails with assertion error for Decimal9, Decimal18 types
> ---
>
> Key: DRILL-5833
> URL: https://issues.apache.org/jira/browse/DRILL-5833
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.13.0
>
>
> The {{TestParquetWriter.testDecimal()}} test recently failed. As it turns 
> out, this test never ran properly before against the "old" Parquet reader. 
> Because the {{store.parquet.use_new_reader}} was left at a previous value, 
> sometimes the test would run against the "new" reader (and succeed) or 
> against the "old" reader (and fail.)
> Once the test was forced to run against the "old" reader, it fails deep in 
> the Parquet record reader in 
> {{DrillParquetGroupConverter.getConverterForType()}}.
> The code attempts to create a Decimal9 writer by calling 
> {{SingleMapWriter.decimal9(String name)}} to create the writer. However, the 
> code in this method says:
> {code}
>   public Decimal9Writer decimal9(String name) {
> // returns existing writer
> final FieldWriter writer = fields.get(name.toLowerCase());
> assert writer != null;
> return writer;
>   }
> {code}
> And, indeed, the assertion is triggered.
> As it turns out, the code for Decimal28 shows the proper solution:
> {code}
> mapWriter.decimal28Sparse(name, metadata.getScale(), metadata.getPrecision())
> {code}
> That is, pass the scale and precision to this form of the method which 
> actually creates the writer:
> {code}
>   public Decimal9Writer decimal9(String name, int scale, int precision) {
> {code}
> Applying the same pattern to for the Parquet Decimal9 and Decimal18 types 
> allows the above test to get past the asserts. Given this error, it is clear 
> that this test could never have run, and so the error in the Parquet reader 
> was never detected.
> It also turns out that the test itself is wrong, reversing the validation and 
> test queries:
> {code}
>   public void runTestAndValidate(String selection, String 
> validationSelection, String inputTable, String outputFile) throws Exception {
> try {
>   deleteTableIfExists(outputFile);
>   ...
>   // Query reads from the input (JSON) table
>   String query = String.format("SELECT %s FROM %s", selection, 
> inputTable);
>   String create = "CREATE TABLE " + outputFile + " AS " + query;
>   // validate query reads from the output (Parquet) table
>   String validateQuery = String.format("SELECT %s FROM " + outputFile, 
> validationSelection);
>   test(create);
>   testBuilder()
>   .unOrdered()
>   .sqlQuery(query) // Query under test is input query
>   .sqlBaselineQuery(validateQuery) // Baseline query is output query
>   .go();
> {code}
> Given this, it is the Parquet data that is wrong, not the baseline.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1078: DRILL-6054: don't try to split the filter when it ...

2018-01-10 Thread chunhui-shi
Github user chunhui-shi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1078#discussion_r160741806
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/partition/FindPartitionConditions.java
 ---
@@ -195,8 +195,16 @@ private void popOpStackAndBuildFilter() {
  * For all other operators we clear the children if one of the
  * children is a no push.
  */
-assert currentOp.getOp().getKind() == SqlKind.AND;
-newFilter = currentOp.getChildren().get(0);
+if (currentOp.getOp().getKind() == SqlKind.AND) {
+  newFilter = currentOp.getChildren().get(0);
+  for(OpState opState : opStack) {
--- End diff --

done.


---


[GitHub] drill pull request #1078: DRILL-6054: don't try to split the filter when it ...

2018-01-10 Thread chunhui-shi
Github user chunhui-shi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1078#discussion_r160741771
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/partition/FindPartitionConditions.java
 ---
@@ -228,13 +236,16 @@ private boolean isHolisticExpression(RexCall call) {
 return false;
   }
 
+  protected boolean inputRefToPush(RexInputRef inputRef) {
--- End diff --

This is intentionally made 'protected' for future extension.
Right now, FindPartitionConditions uses position-based inputRefs (using BitSet 
dirs) to mark which inputRef should be pushed. But in the future, we may use a 
name-based policy to decide which one to push. 


---
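The position-based approach described above can be sketched with a plain BitSet; the class and method names below are hypothetical stand-ins for the Calcite-based originals (a RexInputRef's ordinal is reduced to a plain int):

```java
import java.util.BitSet;

public class PartitionRefs {
    // Bit i is set when input ref i refers to a partition (directory) column.
    private final BitSet dirColumns = new BitSet();

    void markPartitionColumn(int ordinal) {
        dirColumns.set(ordinal);
    }

    // Mirrors the spirit of inputRefToPush(RexInputRef): only refs flagged
    // as partition columns are eligible to push into the pruning filter.
    boolean inputRefToPush(int refIndex) {
        return dirColumns.get(refIndex);
    }
}
```

A name-based policy, as the comment suggests, would replace the BitSet lookup with a check against a set of column names while keeping the same protected hook.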


[GitHub] drill pull request #1001: JIRA DRILL-5879: Like operator performance improve...

2018-01-10 Thread sachouche
Github user sachouche closed the pull request at:

https://github.com/apache/drill/pull/1001


---


[GitHub] drill issue #1001: JIRA DRILL-5879: Like operator performance improvements

2018-01-10 Thread sachouche
Github user sachouche commented on the issue:

https://github.com/apache/drill/pull/1001
  
Created another pull request, #1072, to merge my changes with Padma's.


---


[GitHub] drill issue #1087: Attempt to fix memory leak in Parquet

2018-01-10 Thread sachouche
Github user sachouche commented on the issue:

https://github.com/apache/drill/pull/1087
  
Thank you Arina for catching this; I created the commit for QA before 
vacation so that they could verify the fix. At that time, I didn't have an 
Apache JIRA. I have now updated the comment to reflect the JIRA id DRILL-6079 
which is also the name of this remote branch.


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160717181
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java
 ---
@@ -250,20 +250,12 @@ private boolean metaDataFileExists(FileSystem fs, 
FileStatus dir) throws IOExcep
 }
 
 boolean isDirReadable(DrillFileSystem fs, FileStatus dir) {
-  Path p = new Path(dir.getPath(), 
ParquetFileWriter.PARQUET_METADATA_FILE);
   try {
-if (fs.exists(p)) {
-  return true;
-} else {
-
-  if (metaDataFileExists(fs, dir)) {
-return true;
-  }
-  List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
-  return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
-}
+// There should be at least one file, which is readable by Drill
+List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
+return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
--- End diff --

1. How do we know that there is a file in a nested directory if the filter 
checks with the recursive flag set to false?
2. The `isReadable` method uses the implicit assumption that if the metadata 
file exists, there is some data in the folder. Plus, a comment in this file 
points out that the same logic is used in `isDirReadable`. Since you are 
removing this logic from `isDirReadable`, please make sure the logic is synced 
between the two methods, or at least that the comment is updated to reflect the 
new approach.



---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160711830
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/SchemalessScan.java
 ---
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.base;
+
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.google.common.base.Preconditions;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.PhysicalOperatorSetupException;
+import org.apache.drill.exec.planner.logical.DynamicDrillTable;
+import org.apache.drill.exec.proto.CoordinationProtos;
+
+import java.util.List;
+
+/**
+ *  The type of scan operator, which allows to scan schemaless tables 
({@link DynamicDrillTable} with null selection)
+ */
+@JsonTypeName("schemaless-scan")
+public class SchemalessScan extends AbstractGroupScan implements SubScan {
+
+  private String selectionRoot;
--- End diff --

final


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160712431
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSelection.java 
---
@@ -424,6 +429,23 @@ public MetadataContext getMetaContext() {
 return metaContext;
   }
 
+  /**
+   * @return true if this file selectionRoot points to an empty directory, 
false otherwise
+   */
+  public boolean isEmptyDirectory() {
+return emptyDirectory;
+  }
+
+  /**
+   * Setting this as true allows to identify this as empty directory file 
selection
+   *
+   * @param emptyDirectory empty directory flag
+   */
+  public void setEmptyDirectory(boolean emptyDirectory) {
--- End diff --

Please use a `setEmptyDirectory()` setter without the flag parameter.


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160685002
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java
 ---
@@ -250,20 +250,12 @@ private boolean metaDataFileExists(FileSystem fs, 
FileStatus dir) throws IOExcep
 }
 
 boolean isDirReadable(DrillFileSystem fs, FileStatus dir) {
-  Path p = new Path(dir.getPath(), 
ParquetFileWriter.PARQUET_METADATA_FILE);
   try {
-if (fs.exists(p)) {
-  return true;
-} else {
-
-  if (metaDataFileExists(fs, dir)) {
-return true;
-  }
-  List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
-  return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
-}
+// There should be at least one file, which is readable by Drill
+List<FileStatus> statuses = DrillFileSystemUtil.listFiles(fs, dir.getPath(), false);
+return !statuses.isEmpty() && super.isFileReadable(fs, statuses.get(0));
--- End diff --

I did it on purpose. With the old logic of the isDirReadable() method, an empty 
directory that contains Parquet metadata files would be processed with 
ParquetGroupScan as a Parquet table, which leads to an exception:

https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java#L878

To process such a table with SchemalessScan, the isReadable method should 
return false for that case. In other words, it shouldn't check for the 
availability of metadata cache files, but only for files actually readable by 
Drill.


---
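The rule described above — a directory counts as readable only if it contains at least one file Drill can actually read, regardless of metadata cache files — can be sketched with plain java.nio standing in for DrillFileSystem. The dot-prefixed metadata-file name check is an illustrative assumption, not Drill's exact filter:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DirReadable {
    static boolean isDirReadable(Path dir) throws IOException {
        try (Stream<Path> s = Files.list(dir)) {
            List<Path> files = s.filter(Files::isRegularFile)
                    // skip metadata-cache-style files (illustrative name check)
                    .filter(p -> !p.getFileName().toString().startsWith("."))
                    .collect(Collectors.toList());
            // Readable only when at least one real data file is present.
            return !files.isEmpty() && isFileReadable(files.get(0));
        }
    }

    static boolean isFileReadable(Path file) {
        return Files.isReadable(file); // stand-in for a format-specific check
    }
}
```

With this shape, a directory holding only a metadata cache file reports as not readable, so it falls through to the SchemalessScan path instead of ParquetGroupScan.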


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160666020
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/BaseTestQuery.java ---
@@ -119,6 +125,15 @@ public static void setupDefaultTestCluster() throws 
Exception {
 // turns on the verbose errors in tests
 // sever side stacktraces are added to the message before sending back 
to the client
 test("ALTER SESSION SET `exec.errors.verbose` = true");
+emptyDirCreating();
--- End diff --

Agree.
I have changed it. The empty directory is created in the scope of a single 
test. But to avoid duplicating code, when a class has several test cases that 
use the empty dir, it is created once for the entire class. 


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160663414
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/SchemalessScan.java
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.base;
+
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.google.common.base.Preconditions;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.PhysicalOperatorSetupException;
+import org.apache.drill.exec.planner.logical.DynamicDrillTable;
+import org.apache.drill.exec.proto.CoordinationProtos;
+
+import java.util.List;
+
+/**
+ *  The type of scan operator, which allows to scan schemaless tables 
({@link DynamicDrillTable} with null selection)
+ */
+@JsonTypeName("schemaless-scan")
+public class SchemalessScan extends AbstractGroupScan implements SubScan {
+
+  public SchemalessScan(String userName) {
+super(userName);
+  }
+
+  public SchemalessScan(AbstractGroupScan that) {
+super(that);
+  }
+
+  @Override
+  public void applyAssignments(List<CoordinationProtos.DrillbitEndpoint> endpoints) throws PhysicalOperatorSetupException {
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) throws 
ExecutionSetupException {
+return this;
+  }
+
+  @Override
+  public int getMaxParallelizationWidth() {
+return 1;
+  }
+
+  @Override
+  public String getDigest() {
+return toString();
--- End diff --

Thanks. Forgot to override this.

For now, toString() is overridden. And to show the selectionRoot, I have 
replaced the null file selection with the current FileSelection (with an 
`emptyDirectory` flag, which indicates whether this FileSelection is an empty 
directory).

Profiles - > Physical Plan shows the following info:
`Scan(groupscan=[SchemalessScan [selectionRoot = 
file:/home/vitalii/Documents/parquet_for_union/folder2]]) : rowType = 
(DrillRecordRow[*, value]): rowcount = 1.0, cumulative cost = {0.0 rows, 0.0 
cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 334`


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160663989
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/TestEmptyInputSql.java ---
@@ -177,4 +177,33 @@ public void testQueryEmptyCsv() throws Exception {
 .run();
   }
 
+  @Test
+  public void testEmptyDirectory() throws Exception {
+final BatchSchema expectedSchema = new SchemaBuilder().build();
+
+testBuilder()
+.sqlQuery("select * from dfs.`%s`", EMPTY_DIR_NAME)
+.schemaBaseLine(expectedSchema)
+.build()
+.run();
+  }
+
+  @Test
+  public void testEmptyDirectoryAndFieldInQuery() throws Exception {
+final List<Pair<SchemaPath, TypeProtos.MajorType>> expectedSchema = Lists.newArrayList();
+final TypeProtos.MajorType majorType = TypeProtos.MajorType.newBuilder()
+.setMinorType(TypeProtos.MinorType.INT) // field "key" is absent in schema-less table
--- End diff --

Done


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160664191
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetGroupScan.java
 ---
@@ -56,65 +56,50 @@ private void prepareTables(final String tableName, 
boolean refreshMetadata) thro
   public void testFix4376() throws Exception {
 prepareTables("4376_1", true);
 
-testBuilder()
-  .sqlQuery("SELECT COUNT(*) AS `count` FROM dfs.tmp.`4376_1/60*`")
-  .ordered()
-  .baselineColumns("count").baselineValues(1984L)
-  .go();
+int actualRecordCount = testSql("SELECT * FROM dfs.tmp.`4376_1/60*`");
+int expectedRecordCount = 1984;
+assertEquals(String.format("Received unexpected number of rows in 
output: expected=%d, received=%s",
--- End diff --

Done


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160663507
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/RefreshMetadataHandler.java
 ---
@@ -78,19 +78,24 @@ public PhysicalPlan getPlan(SqlNode sqlNode) throws 
ValidationException, RelConv
 
   final Table table = schema.getTable(tableName);
 
-  if(table == null){
+  if(table == null) {
--- End diff --

Done


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160663899
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java 
---
@@ -1197,4 +1197,64 @@ public void testFieldWithDots() throws Exception {
   .baselineValues("1", "2", "1", null, "a")
   .go();
   }
-}
\ No newline at end of file
+
+  @Test
+  public void testUnionAllRightEmptyDir() throws Exception {
--- End diff --

1. Union works fine too. I have added similar tests to the 
TestUnionDistinct.java class. Thanks.
2. A test case is added. The result is the same as for querying a single empty 
dir.


---


[GitHub] drill pull request #1083: DRILL-4185: UNION ALL involving empty directory on...

2018-01-10 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1083#discussion_r160666274
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/BaseTestQuery.java ---
@@ -119,6 +125,15 @@ public static void setupDefaultTestCluster() throws 
Exception {
 // turns on the verbose errors in tests
 // sever side stacktraces are added to the message before sending back 
to the client
 test("ALTER SESSION SET `exec.errors.verbose` = true");
+emptyDirCreating();
+  }
+
+  /**
+   * Creates an empty directory under dfs.root schema.
+   */
+  private static void emptyDirCreating() {
--- End diff --

the method is deleted due to the previous comment


---


Re: Drill Hangout Jan 9, 2018

2018-01-10 Thread Arina Yelchiyeva
During the hangout it was decided to support JDK 8 by default in Apache Drill
1.13.0. We'll consider moving to Java 9 afterwards.
There is already an open Jira for this issue -
https://issues.apache.org/jira/browse/DRILL-1491. I have linked all other
related issues to it as well.
Also it seems there is a Jira with an open pull request to fix issues related
to JDK8 (https://issues.apache.org/jira/browse/DRILL-5730). The changes were
done by Tim.

Kind regards
Arina

On Tue, Jan 9, 2018 at 6:51 PM, Vlad Rozov  wrote:

> Should we also talk about Java 9 and Hadoop 3.x support?
>
> Thank you,
>
> Vlad
>
>
> On 1/9/18 08:36, Arina Yelchiyeva wrote:
>
>> Mark Java 8 support as critical for 1.13.0 release?
>> For example, in Calcite 1.16 support for Java 7 will be dropped.
>>
>>
>> On Tue, Jan 9, 2018 at 9:34 AM, Aditya Allamraju <
>> aditya.allamr...@gmail.com
>>
>>> wrote:
>>> Hi Vlad,
>>>
>>> If this is open to everyone in this group, i would like to join this
>>> hangout.
>>> Please let me know.
>>>
>>> Thanks
>>> Aditya
>>>
>>> On Tue, Jan 9, 2018 at 1:22 AM, Vlad Rozov  wrote:
>>>
>>> We will have our first bi-weekly hangout in the 2018 tomorrow at 10 am.
 Please reply to this post with proposed topics to discuss.

 Hangout link: https://plus.google.com/hangouts/_/event/

>>> ci4rdiju8bv04a64efj
>>>
 5fedd0lc

 Thank you,

 Vlad


>


[GitHub] drill issue #1074: DRILL-6025: Display execution time of a query in RUNNING ...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1074
  
@prasadns14 please resolve the conflicts.


---


[GitHub] drill pull request #1078: DRILL-6054: don't try to split the filter when it ...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1078#discussion_r160665449
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/partition/FindPartitionConditions.java
 ---
@@ -195,8 +195,16 @@ private void popOpStackAndBuildFilter() {
  * For all other operators we clear the children if one of the
  * children is a no push.
  */
-assert currentOp.getOp().getKind() == SqlKind.AND;
-newFilter = currentOp.getChildren().get(0);
+if (currentOp.getOp().getKind() == SqlKind.AND) {
+  newFilter = currentOp.getChildren().get(0);
+  for(OpState opState : opStack) {
--- End diff --

Please add space: `for (`
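
Beyond the spacing nit, the logic the diff introduces — replacing the `assert` with an explicit guard so that a filter whose top operator is not an AND is simply not split — can be sketched on a toy expression tree (all names below are illustrative, not Drill's actual `FindPartitionConditions` API):

```java
import java.util.ArrayList;
import java.util.List;

public class FilterSplitSketch {

  enum Kind { AND, OR, NOT, LEAF }

  static class Node {
    final Kind kind;
    final boolean pushable;            // meaningful for LEAF nodes only
    final List<Node> children = new ArrayList<>();
    Node(Kind kind, boolean pushable) { this.kind = kind; this.pushable = pushable; }
  }

  // Returns the pushable children of an AND; for any other operator the
  // filter cannot be split, so nothing is pushed (instead of asserting).
  static List<Node> splitPushable(Node op) {
    List<Node> pushed = new ArrayList<>();
    if (op.kind == Kind.AND) {         // the guard that replaces the assert
      for (Node child : op.children) {
        if (child.kind == Kind.LEAF && child.pushable) {
          pushed.add(child);
        }
      }
    }
    return pushed;
  }

  public static void main(String[] args) {
    Node and = new Node(Kind.AND, false);
    and.children.add(new Node(Kind.LEAF, true));
    and.children.add(new Node(Kind.LEAF, false));
    Node or = new Node(Kind.OR, false);
    or.children.add(new Node(Kind.LEAF, true));
    System.out.println(splitPushable(and).size()); // 1
    System.out.println(splitPushable(or).size());  // 0
  }
}
```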


---


[GitHub] drill pull request #1078: DRILL-6054: don't try to split the filter when it ...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1078#discussion_r160664549
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/partition/FindPartitionConditions.java
 ---
@@ -228,13 +236,16 @@ private boolean isHolisticExpression(RexCall call) {
 return false;
   }
 
+  protected boolean inputRefToPush(RexInputRef inputRef) {
--- End diff --

Can be made private. Should be placed under public methods.


---


[GitHub] drill issue #1087: Attempt to fix memory leak in Parquet

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1087
  
@sachouche do you have a Jira for this fix (if not, please create one)? Could 
you please include the Jira number in the PR name and description and in the 
PR commit message (it's a Drill standard) [1, 2, 3].

[1] https://drill.apache.org/docs/apache-drill-contribution-guidelines/
[2] https://github.com/apache/drill/commits/master
[3] https://github.com/apache/drill/pull/1065


---


[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1084#discussion_r160650090
  
--- Diff: 
exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js
 ---
@@ -0,0 +1,46 @@
+/**
+ * Drill SQL Syntax Snippets
+ */
+
+ace.define("ace/snippets/sql",["require","exports","module"], 
function(require, exports, module) {
+"use strict";
+
+exports.snippetText = "snippet info\n\
+   select * from INFORMATION_SCHEMA.${1:};\n\
+snippet sysmem\n\
--- End diff --

Turns out that in my Chrome it works when using Ctrl + Space.
Could you please update the Jira with the list of available snippets and how 
to enable them?


---


[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1084#discussion_r160650903
  
--- Diff: 
exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js
 ---
@@ -0,0 +1,46 @@
+/**
+ * Drill SQL Syntax Snippets
+ */
+
+ace.define("ace/snippets/sql",["require","exports","module"], 
function(require, exports, module) {
+"use strict";
+
+exports.snippetText = "snippet info\n\
+   select * from INFORMATION_SCHEMA.${1:};\n\
+snippet sysmem\n\
+   select * from sys.mem;\n\
+snippet sysopt\n\
+   select * from sys.opt;\n\
+snippet sysbit\n\
+   select * from sys.bit;\n\
+snippet sysconn\n\
+   select * from sys.conn;\n\
+snippet sysprof\n\
+   select * from sys.prof;\n\
+snippet cview\n\
--- End diff --

I am not sure that the snippets for the sys tables are correct. Please see 
below the list of tables currently present in the sys schema:
```
0: jdbc:drill:drillbit=localhost> show tables;
+---------------+-----------------------+
| TABLE_SCHEMA  |      TABLE_NAME       |
+---------------+-----------------------+
| sys           | profiles_json         |
| sys           | drillbits             |
| sys           | boot                  |
| sys           | internal_options      |
| sys           | threads               |
| sys           | options_val           |
| sys           | profiles              |
| sys           | connections           |
| sys           | internal_options_val  |
| sys           | memory                |
| sys           | version               |
| sys           | options               |
+---------------+-----------------------+
```


---


[GitHub] drill pull request #1084: DRILL-5868: Support SQL syntax highlighting of que...

2018-01-10 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1084#discussion_r160651815
  
--- Diff: 
exec/java-exec/src/main/resources/rest/static/js/ace-code-editor/snippets/sql.js
 ---
@@ -0,0 +1,46 @@
+/**
+ * Drill SQL Syntax Snippets
+ */
+
+ace.define("ace/snippets/sql",["require","exports","module"], 
function(require, exports, module) {
+"use strict";
+
+exports.snippetText = "snippet info\n\
+   select * from INFORMATION_SCHEMA.${1:};\n\
+snippet sysmem\n\
+   select * from sys.mem;\n\
+snippet sysopt\n\
+   select * from sys.opt;\n\
+snippet sysbit\n\
+   select * from sys.bit;\n\
+snippet sysconn\n\
+   select * from sys.conn;\n\
+snippet sysprof\n\
+   select * from sys.prof;\n\
+snippet cview\n\
+   create view ${1:[workspace]}.${2:} ( ${3:} )  as 
\n\
+   ${4:};\n\
+snippet ctas\n\
+   create table ${1:} ( ${2:} )  as \n\
+   ${3:};\n\
+snippet ctemp\n\
+   create temporary table ${1:} ( ${2:} )  as \n\
--- End diff --

Could you please also add a snippet for create function: `CREATE FUNCTION 
USING JAR '.jar';`?


---


[jira] [Resolved] (DRILL-5037) NPE in Parquet Decimal Converter with the complex parquet reader

2018-01-10 Thread Volodymyr Vysotskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi resolved DRILL-5037.

   Resolution: Fixed
Fix Version/s: 1.12.0

Fixed in 
[42fc11e|https://github.com/apache/drill/commit/42fc11e53557477ac01c7dd31c3aa93e22fb4384]

>  NPE in Parquet Decimal Converter with the complex parquet reader
> -
>
> Key: DRILL-5037
> URL: https://issues.apache.org/jira/browse/DRILL-5037
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.9.0
>Reporter: Rahul Challapalli
> Fix For: 1.12.0
>
> Attachments: 
> 0001-Fix-the-DecimalX-writer-invocation-in-DrillParquetGr.patch, 
> drill5037.parquet
>
>
> git.commit.id.abbrev=4b1902c
> The below query fails when we enable the new parquet reader
> Query :
> {code}
> alter session set `store.parquet.use_new_reader` = true;
>  select
>  count(*) as count_star,
>   sum(a.d18)  as sum_d18,
>   --round(avg(a.d18)) as round_avg_d18,
>   cast(avg(a.d18) as bigint)  as round_avg_d18,
>   --trunc(avg(a.d18)) as trunc_avg_d18,
>   cast(avg(a.d18) as bigint)  as trunc_avg_d18,
>   --sum(case when a.d18 = 0 then 100 else round(a.d18/12) end) as 
> case_in_sum_d18,
>   cast(sum(case when a.d18 = 0 then 100 else round(a.d18/12) end) 
> as bigint) as case_in_sum_d18,
>   --coalesce(sum(case when a.d18 = 0 then 100 else 
> round(a.d18/12) end), 0) as case_in_sum_d18
>   cast(coalesce(sum(case when a.d18 = 0 then 100 else 
> round(a.d18/12) end), 0) as bigint) as case_in_sum_d18
>  
> from
>   alltypes_with_nulls a
>   left outer join alltypes_with_nulls b on (a.c_integer = 
> b.c_integer)
>   left outer join alltypes_with_nulls c on (b.c_integer = 
> c.c_integer)
> group by
>   a.c_varchar
>   ,b.c_varchar
>   ,c.c_varchar
>   ,a.c_integer
>   ,b.c_integer
>   ,c.c_integer
>   ,a.d9
>   ,b.d9
>   ,c.d9
>   ,a.d18
>   ,b.d18
>   ,c.d18
>   ,a.d28
>   ,b.d28
>   ,c.d28
>   ,a.d38
>   ,b.d38
>   ,c.d38
>   ,a.c_date
>   ,b.c_date
>   ,c.c_date
>   ,a.c_date
>   ,b.c_date
>   ,c.c_date
>   ,a.c_time
>  order by
>   a.c_varchar
>   ,b.c_varchar
>   ,c.c_varchar
>   ,a.c_integer
>   ,b.c_integer
>   ,c.c_integer
>   ,a.d9
>   ,b.d9
>   ,c.d9
>   ,a.d18
>   ,b.d18
>   ,c.d18
>   ,a.d28
>   ,b.d28
>   ,c.d28
>   ,a.d38
>   ,b.d38
>   ,c.d38
>   ,a.c_date
>   ,b.c_date
>   ,c.c_date
>   ,a.c_date
>   ,b.c_date
>   ,c.c_date
>   ,a.c_time
> {code}
> I attached the data set and error from the log file



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)