[jira] [Created] (DRILL-6377) typeof() does not return DECIMAL scale, precision

2018-05-01 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-6377:
--

 Summary: typeof() does not return DECIMAL scale, precision
 Key: DRILL-6377
 URL: https://issues.apache.org/jira/browse/DRILL-6377
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.13.0
Reporter: Paul Rogers


The {{typeof()}} function returns the type of a column:

{noformat}
SELECT typeof(CAST(a AS DOUBLE)) FROM (VALUES (1)) AS T(a);
+---------+
| EXPR$0  |
+---------+
| FLOAT8  |
+---------+
{noformat}

In Drill, the {{DECIMAL}} type is parameterized with scale and precision. 
However, {{typeof()}} does not return this information:

{noformat}
ALTER SESSION SET `planner.enable_decimal_data_type` = true;

SELECT typeof(CAST(a AS DECIMAL)) FROM (VALUES (1)) AS T(a);
+------------------+
|      EXPR$0      |
+------------------+
| DECIMAL38SPARSE  |
+------------------+

SELECT typeof(CAST(a AS DECIMAL(6, 3))) FROM (VALUES (1)) AS T(a);
+-----------+
|  EXPR$0   |
+-----------+
| DECIMAL9  |
+-----------+
{noformat}

Expected something of the form {{DECIMAL(precision, scale)}}.
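
For illustration only, a sketch of the kind of type string that would be expected; the helper below is hypothetical (not part of Drill), but {{TypeProtos.MajorType}} does carry the precision and scale:

{noformat}
import org.apache.drill.common.types.TypeProtos.MajorType;

public class TypeNameSketch {
  /** Hypothetical helper: plain minor-type name, plus "(precision, scale)" for DECIMAL types. */
  public static String describe(MajorType type) {
    String name = type.getMinorType().name();          // e.g. FLOAT8, DECIMAL38SPARSE
    if (name.startsWith("DECIMAL")) {
      return String.format("DECIMAL(%d, %d)", type.getPrecision(), type.getScale());
    }
    return name;                                       // non-decimal types stay unchanged
  }
}
{noformat}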



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6376) Doc: Return type of ROUND(x, y), TRUNC(x, y), TO_NUMBER is wrong

2018-05-01 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-6376:
--

 Summary: Doc: Return type of ROUND(x, y), TRUNC(x, y), TO_NUMBER 
is wrong
 Key: DRILL-6376
 URL: https://issues.apache.org/jira/browse/DRILL-6376
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.13.0
Reporter: Paul Rogers
Assignee: Bridget Bevens


The documentation for [math 
functions|http://drill.apache.org/docs/math-and-trig/] claims that the return 
value of {{ROUND(x, y)}} and {{TRUNC(x, y)}} is {{DECIMAL}}. A test shows that 
this is not true:

{noformat}
SELECT typeof(ROUND(a, 2)) FROM (VALUES (1.2345)) AS T(a);
+---------+
| EXPR$0  |
+---------+
| FLOAT8  |
+---------+
SELECT typeof(TRUNC(a, 2)) FROM (VALUES (1.2345)) AS T(a);
+---------+
| EXPR$0  |
+---------+
| FLOAT8  |
+---------+
{noformat}

Maybe it is {{DECIMAL}} only if we enable decimal type? Let's try:

{noformat}
ALTER SESSION SET `planner.enable_decimal_data_type` = true;
SELECT typeof(TRUNC(a, 2)) FROM (VALUES (1.2345)) AS T(a);
+---------+
| EXPR$0  |
+---------+
| FLOAT8  |
+---------+
{noformat}

So, {{ROUND()}} and {{TRUNC()}} actually return {{DOUBLE}}.

The [type conversion|http://drill.apache.org/docs/data-type-conversion/] 
documentation says that {{TO_NUMBER(str, fmt)}} returns {{DECIMAL}}. Let's try:

{noformat}
ALTER SESSION SET `planner.enable_decimal_data_type` = true;
SELECT typeof(TO_NUMBER(a, '0')) FROM (VALUES ('1')) AS T(a);
+---------+
| EXPR$0  |
+---------+
| FLOAT8  |
+---------+
{noformat}

So, {{TO_NUMBER()}} actually returns {{DOUBLE}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1225#discussion_r185383432
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/udf/dynamic/JarBuilder.java 
---
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.udf.dynamic;
+
+import ch.qos.logback.classic.Level;
+import ch.qos.logback.classic.Logger;
+import ch.qos.logback.classic.LoggerContext;
+import ch.qos.logback.classic.spi.ILoggingEvent;
+import ch.qos.logback.core.ConsoleAppender;
+import org.apache.maven.cli.MavenCli;
+import org.apache.maven.cli.logging.Slf4jLogger;
+import org.codehaus.plexus.DefaultPlexusContainer;
+import org.codehaus.plexus.PlexusContainer;
+import org.codehaus.plexus.logging.BaseLoggerManager;
+import org.slf4j.LoggerFactory;
+
+import java.util.LinkedList;
+import java.util.List;
+
+public class JarBuilder {
+
+  private final MavenCli cli;
+
+  public JarBuilder() {
+    this.cli = new MavenCli() {
+      @Override
+      protected void customizeContainer(PlexusContainer container) {
+        ((DefaultPlexusContainer) container).setLoggerManager(new BaseLoggerManager() {
+          @Override
+          protected org.codehaus.plexus.logging.Logger createLogger(String s) {
+            return new Slf4jLogger(setupLogger(JarBuilder.class.getName(), Level.INFO));
+          }
+        });
+      }
+    };
+  }
+
+  /**
+   * Builds jars using embedded Maven. Includes files / resources based on the given pattern,
+   * otherwise uses the defaults provided in pom.xml.
+   *
+   * @param jarName jar name
+   * @param projectDir project dir
+   * @param includeFiles pattern indicating which files should be included
+   * @param includeResources pattern indicating which resources should be included
+   *
+   * @return build exit code, 0 if build was successful
+   */
+  public int build(String jarName, String projectDir, String includeFiles, String includeResources) {
+    System.setProperty("maven.multiModuleProjectDirectory", projectDir);
+    List<String> params = new LinkedList<>();
+    params.add("clean");
+    params.add("package");
+    params.add("-DskipTests");
+    params.add("-Djar.finalName=" + jarName);
+    if (includeFiles != null) {
+      params.add("-Dinclude.files=" + includeFiles);
+    }
+    if (includeResources != null) {
+      params.add("-Dinclude.resources=" + includeResources);
+    }
+    return cli.doMain(params.toArray(new String[params.size()]), projectDir, System.out, System.err);
+  }
+
+  private static Logger setupLogger(String string, Level logLevel) {
--- End diff --

Is this necessary?


---


[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1225#discussion_r185385028
  
--- Diff: exec/java-exec/src/test/resources/drill-udf/pom.xml ---
@@ -0,0 +1,90 @@
+
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <groupId>org.apache.drill.udf</groupId>
+  <artifactId>drill-udf</artifactId>
+  <version>1.0</version>
+
+  
+${project.name}
+1.13.0
--- End diff --

Is it OK to use an old version? Does Drill support semver API compatibility 
for UDFs? If yes, how is it enforced? If not, compilation may fail.


---


[GitHub] drill pull request #1248: DRILL-6027: Implement Spilling for the Hash-Join

2018-05-01 Thread Ben-Zvi
GitHub user Ben-Zvi opened a pull request:

https://github.com/apache/drill/pull/1248

DRILL-6027: Implement Spilling for the Hash-Join

This PR covers the work to enable the Hash-Join operator (*HJ*) to spill - 
when its limited memory becomes too small to hold the incoming data. 
 @ilooner is a co-contributor of this work.

Below is a high-level description of the main changes, to help the reviewers. 
More design detail is available in the design document 
(https://docs.google.com/document/d/1-c_oGQY4E5d58qJYv_zc7ka834hSaB3wDQwqKcMoSAI/).
Some of this work follows prior similar work done for the Hash-Aggregate 
(*HAG*) operator; some similarity to the HAG is pointed out to help reviewers 
familiar with those changes.

h2. Partitions:
Just like the HAG spilling, the main idea behind enabling spilling is to split 
the incoming rows into separate *Partitions*, such that the HJ can gradually 
adapt to memory pressure by picking an in-memory partition and spilling it as 
the need arises, thus freeing some memory.
Unlike the HAG, the HJ has two incoming sides - the build/inner/right and the 
probe/outer/left. The HJ partitions its Build side first, and if needed, may 
spill some of these partitions as data is read. Later the Probe side is read 
and partitioned the same way, where outer partitions matching spilled inner 
partitions are spilled as well - unconditionally.
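
A minimal sketch of the partitioning idea (illustration only; the class below is a simplified stand-in, not the actual {{HashPartition}} / {{HashJoinBatch}} code):

{noformat}
import java.util.ArrayList;
import java.util.List;

/** Illustration only: rows are routed to partitions by the low bits of their hash value. */
class PartitionSketch {
  static class Partition {
    final List<long[]> rows = new ArrayList<>();   // [hashValue, rowId] pairs, a stand-in for batches
    boolean spilled;
    void spill() { rows.clear(); spilled = true; } // pretend the in-memory batches went to disk
  }

  final Partition[] partitions;
  final int mask;

  PartitionSketch(int numPartitions) {             // numPartitions must be a power of two
    partitions = new Partition[numPartitions];
    for (int i = 0; i < numPartitions; i++) {
      partitions[i] = new Partition();
    }
    mask = numPartitions - 1;
  }

  void appendBuildRow(int hashValue, long rowId, boolean memoryIsTight) {
    if (memoryIsTight) {
      largestInMemory().spill();                   // free memory by spilling one in-memory partition
    }
    partitions[hashValue & mask].rows.add(new long[] {hashValue, rowId});
  }

  Partition largestInMemory() {
    Partition largest = partitions[0];
    for (Partition p : partitions) {
      if (!p.spilled && p.rows.size() >= largest.rows.size()) {
        largest = p;
      }
    }
    return largest;
  }
}
{noformat}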

h6. {{HashPartition}} class:
A new class {{HashPartition}} was created to encapsulate the work of each 
partition; this class handles the pair - the build-side partition and its 
matching probe-side partition. Most of its code was extracted from prior code 
in {{HashJoinBatch}}.

h4. Hash Values:
The hash values are computed once, then saved into a special column (named 
"Hash_Values"), which may be spilled, etc. This avoids recomputation (unlike 
the HAG, which recomputes). After a batch is read back from a spill file, this 
hash-values vector is separated out (into {{read_HV_vector}}) and used instead 
of recomputing the hash values.

h4. Build Hash Table:
Unlike the HAG, the hash table (and "helper") are built (per inner partition) 
only *after* that whole partition has been read into memory. (This avoids 
wasted work in case the partition needs to spill.) Another improvement: since 
the number of entries is known at that point (ignoring duplicates), the hash 
table can be sized correctly up front, avoiding the need for later costly 
resizings (see {{hashTable.updateInitialCapacity()}}).

h4. Same as the HAG:
* Same metrics (NUM_PARTITIONS, SPILLED_PARTITIONS, SPILL_MB, 
SPILL_CYCLE) 
* Using the {{SpillSet}} class.
* Recursive spilling. (Nearly the same code - see {{innerNext()}} in 
{{HashJoinBatch.java}}.) Except that the HJ may have duplicate entries, so 
when the spill cycles have consumed more than 20 bits of the hash value, an 
error is raised (see the arithmetic sketch after this list).
* Option controlling the number of partitions (and when that number is 1, 
spilling is disabled).
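
Rough arithmetic behind that 20-bit limit (illustration only, using the default of 32 partitions mentioned below):

{noformat}
/** Illustration only: how many recursive spill cycles fit into the hash value's bits. */
class SpillCycleSketch {
  public static void main(String[] args) {
    int numPartitions = 32;                                            // default, a power of two
    int bitsPerCycle = Integer.numberOfTrailingZeros(numPartitions);   // 5 bits consumed per cycle
    int bitLimit = 20;                                                 // limit mentioned above
    int maxCycles = bitLimit / bitsPerCycle;                           // 4 cycles, then an error is raised
    System.out.println("bits per cycle = " + bitsPerCycle + ", max spill cycles = " + maxCycles);
  }
}
{noformat}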

h6. Avoid copying:
Copying the incoming build data into the partitions' batches is a new extra 
step, adding some overhead. To match the performance of prior Drill versions, 
in the case of a single partition (no spilling, no memory checks) the incoming 
vectors are used as-is, without copying. Future work may extend this to the 
general case (involving memory checks, etc.)

h2. Memory Calculations:
h4. Initial memory allocation:
The HJ was made a "buffered" operator (see {{isBufferedOperator()}}, just 
like the HAG and the External Sort), hence gets assigned an equal memory share 
(out of the "memory per query per node"; see 
{{setupBufferedOpsMemoryAllocations()}}). Except when the number of partitions 
is forced to be 1, when it "falls back" to the "old uncontrolled" behavior 
(similar to what was done for the HAG).
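
Illustrative arithmetic only (the budget and operator count below are made-up numbers; the real logic is in {{setupBufferedOpsMemoryAllocations()}}):

{noformat}
/** Illustration only: each buffered operator gets an equal share of the per-query, per-node budget. */
class BufferedOpMemorySketch {
  public static void main(String[] args) {
    long memoryPerQueryPerNode = 2L << 30;   // e.g. a 2 GB per-query, per-node budget
    int bufferedOperators = 4;               // e.g. hash joins + hash aggregates + sorts in the plan
    long sharePerOperator = memoryPerQueryPerNode / bufferedOperators;
    System.out.println("each buffered operator gets " + sharePerOperator + " bytes");
  }
}
{noformat}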

h4. Memory Calculator:
The memory calculator is knowledgeable about the current and future memory 
needs (including the current memory usage of all the partitions, an outgoing 
batch, an incoming outer batch, and the hash tables and "helpers"). The 
calculator is used first to find an optimal number of partitions (starting 
from a number controlled by {{hashjoin_num_partitions}}, default 32, and 
lowering it if that number requires too much memory). The second use of the 
calculator is to determine whether a spill is needed, prior to allocating more 
memory (see {{shouldSpill()}}). This check is performed in two places: when 
reading the build side and about to allocate a new batch (see 
{{appendInnerRow()}}), and when hash tables (and helpers) are allocated for 
the in-memory partitions (in {{executeBuildPhase()}}).
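
A hedged sketch of the calculator's first use (the halving step and the per-partition estimate below are simplifications; the real logic lives in {{HashJoinMemoryCalculatorImpl}}):

{noformat}
/** Illustration only: lower the partition count until the projected memory need fits the limit. */
class PartitionCountSketch {
  static long projectedMemoryNeed(int numPartitions, long perPartitionOverhead) {
    return (long) numPartitions * perPartitionOverhead;   // stand-in for the real estimate
  }

  static int choosePartitionCount(int configured, long memoryLimit, long perPartitionOverhead) {
    int numPartitions = configured;                       // e.g. hashjoin_num_partitions = 32
    while (numPartitions > 1
        && projectedMemoryNeed(numPartitions, perPartitionOverhead) > memoryLimit) {
      numPartitions /= 2;                                 // shown here as halving; lower until it fits
    }
    return numPartitions;
  }
}
{noformat}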

h6. Implementation:
The {{HashJoinMemoryCalculator}} is an interface, implemented by 
{{HashJoinMemoryCalculatorImpl}} for regular work. For testing, we can limit 
the number of batches ({{max_batches_in_memory}}) - and then another 
implementation 

[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1225#discussion_r185353418
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/TestTpchDistributedConcurrent.java
 ---
@@ -177,7 +177,7 @@ public void run() {
 }
   }
 
-  //@Test
--- End diff --

What is the reason the test was disabled before?


---


[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1225#discussion_r185374827
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTTAS.java ---
@@ -164,121 +166,113 @@ public void 
testResolveTemporaryTableWithPartialSchema() throws Exception {
   @Test
   public void testPartitionByWithTemporaryTables() throws Exception {
 String temporaryTableName = "temporary_table_with_partitions";
-mockRandomUUID(UUID.nameUUIDFromBytes(temporaryTableName.getBytes()));
+cleanSessionDirectory();
 test("create TEMPORARY table %s partition by (c1) as select * from (" +
 "select 'A' as c1 from (values(1)) union all select 'B' as c1 from 
(values(1))) t", temporaryTableName);
-checkPermission(temporaryTableName);
+checkPermission();
   }
 
-  @Test(expected = UserRemoteException.class)
+  @Test
   public void testCreationOutsideOfDefaultTemporaryWorkspace() throws 
Exception {
-try {
-  String temporaryTableName = 
"temporary_table_outside_of_default_workspace";
-  test("create TEMPORARY table %s.%s as select 'A' as c1 from 
(values(1))", temp2_schema, temporaryTableName);
-} catch (UserRemoteException e) {
-  assertThat(e.getMessage(), containsString(String.format(
-  "VALIDATION ERROR: Temporary tables are not allowed to be 
created / dropped " +
-  "outside of default temporary workspace [%s].", 
DFS_TMP_SCHEMA)));
-  throw e;
-}
+String temporaryTableName = 
"temporary_table_outside_of_default_workspace";
+
+thrown.expect(UserRemoteException.class);
+thrown.expectMessage(containsString(String.format(
+"VALIDATION ERROR: Temporary tables are not allowed to be created 
/ dropped " +
+"outside of default temporary workspace [%s].", 
DFS_TMP_SCHEMA)));
+
+test("create TEMPORARY table %s.%s as select 'A' as c1 from 
(values(1))", temp2_schema, temporaryTableName);
   }
 
-  @Test(expected = UserRemoteException.class)
+  @Test
   public void testCreateWhenTemporaryTableExistsWithoutSchema() throws 
Exception {
 String temporaryTableName = "temporary_table_exists_without_schema";
-try {
-  test("create TEMPORARY table %s as select 'A' as c1 from 
(values(1))", temporaryTableName);
-  test("create TEMPORARY table %s as select 'A' as c1 from 
(values(1))", temporaryTableName);
-} catch (UserRemoteException e) {
-  assertThat(e.getMessage(), containsString(String.format(
- "VALIDATION ERROR: A table or view with given name [%s]" +
- " already exists in schema [%s]", temporaryTableName, 
DFS_TMP_SCHEMA)));
-  throw e;
-}
+
+thrown.expect(UserRemoteException.class);
+thrown.expectMessage(containsString(String.format(
+"VALIDATION ERROR: A table or view with given name [%s]" +
+" already exists in schema [%s]", temporaryTableName, 
DFS_TMP_SCHEMA)));
+
+test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", 
temporaryTableName);
+test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", 
temporaryTableName);
   }
 
-  @Test(expected = UserRemoteException.class)
+  @Test
   public void testCreateWhenTemporaryTableExistsCaseInsensitive() throws 
Exception {
 String temporaryTableName = "temporary_table_exists_without_schema";
-try {
-  test("create TEMPORARY table %s as select 'A' as c1 from 
(values(1))", temporaryTableName);
-  test("create TEMPORARY table %s as select 'A' as c1 from 
(values(1))", temporaryTableName.toUpperCase());
-} catch (UserRemoteException e) {
-  assertThat(e.getMessage(), containsString(String.format(
-  "VALIDATION ERROR: A table or view with given name [%s]" +
-  " already exists in schema [%s]", 
temporaryTableName.toUpperCase(), DFS_TMP_SCHEMA)));
-  throw e;
-}
+
+thrown.expect(UserRemoteException.class);
+thrown.expectMessage(containsString(String.format(
+"VALIDATION ERROR: A table or view with given name [%s]" +
--- End diff --

and possibly `expectUserRemoteExceptionWithTableExistsMessage(String 
tableName, String schemaName)`.
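
For example, something along these lines (a sketch only, assuming the JUnit `thrown` rule and Hamcrest `containsString` already used in this test class; the message text is taken from the assertions above):

{noformat}
private void expectUserRemoteExceptionWithTableExistsMessage(String tableName, String schemaName) {
  thrown.expect(UserRemoteException.class);
  thrown.expectMessage(containsString(String.format(
      "VALIDATION ERROR: A table or view with given name [%s]" +
      " already exists in schema [%s]", tableName, schemaName)));
}
{noformat}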


---


[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1225#discussion_r185375540
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTTAS.java ---
@@ -498,47 +489,50 @@ public void 
testDropTemporaryTableAsViewWithoutException() throws Exception {
 .go();
   }
 
-  @Test(expected = UserRemoteException.class)
+  @Test
   public void testDropTemporaryTableAsViewWithException() throws Exception 
{
 String temporaryTableName = 
"temporary_table_to_drop_like_view_with_exception";
 test("create TEMPORARY table %s as select 'A' as c1 from (values(1))", 
temporaryTableName);
 
-try {
-  test("drop view %s.%s", DFS_TMP_SCHEMA, temporaryTableName);
-} catch (UserRemoteException e) {
-  assertThat(e.getMessage(), containsString(String.format(
-  "VALIDATION ERROR: Unknown view [%s] in schema [%s]", 
temporaryTableName, DFS_TMP_SCHEMA)));
-  throw e;
+thrown.expect(UserRemoteException.class);
+thrown.expectMessage(containsString(String.format(
+"VALIDATION ERROR: Unknown view [%s] in schema [%s]", 
temporaryTableName, DFS_TMP_SCHEMA)));
+
+test("drop view %s.%s", DFS_TMP_SCHEMA, temporaryTableName);
+  }
+
+  private static String getSessionId() throws Exception {
--- End diff --

Consider mocking getSessionId() in the `UserSession`. This method needs to 
be tested by itself.
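
A sketch of what that might look like with Mockito (the `getSessionId()` accessor on `UserSession` is an assumption, not confirmed API):

{noformat}
// Sketch only; assumes UserSession exposes (or gains) a getSessionId() accessor.
UserSession session = Mockito.mock(UserSession.class);
Mockito.when(session.getSessionId()).thenReturn("test-session-id");   // fixed id for the test
{noformat}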


---


[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1225#discussion_r185369981
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/coord/zk/TestZookeeperClient.java
 ---
@@ -125,7 +125,7 @@ public void testHasPathThrowsDrillRuntimeException() {
 
 Mockito
 .when(client.getCache().getCurrentData(absPath))
-.thenThrow(Exception.class);
+.thenThrow(RuntimeException.class);
--- End diff --

OK, but I am not sure what this method tests. 
`ZookeeperClient.hasPath(String path)` is not used in production.


---


[GitHub] drill pull request #1225: DRILL-6272: Refactor dynamic UDFs and function ini...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1225#discussion_r185374205
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTTAS.java ---
@@ -164,121 +166,113 @@ public void 
testResolveTemporaryTableWithPartialSchema() throws Exception {
   @Test
   public void testPartitionByWithTemporaryTables() throws Exception {
 String temporaryTableName = "temporary_table_with_partitions";
-mockRandomUUID(UUID.nameUUIDFromBytes(temporaryTableName.getBytes()));
+cleanSessionDirectory();
 test("create TEMPORARY table %s partition by (c1) as select * from (" +
 "select 'A' as c1 from (values(1)) union all select 'B' as c1 from 
(values(1))) t", temporaryTableName);
-checkPermission(temporaryTableName);
+checkPermission();
   }
 
-  @Test(expected = UserRemoteException.class)
+  @Test
   public void testCreationOutsideOfDefaultTemporaryWorkspace() throws 
Exception {
-try {
-  String temporaryTableName = 
"temporary_table_outside_of_default_workspace";
-  test("create TEMPORARY table %s.%s as select 'A' as c1 from 
(values(1))", temp2_schema, temporaryTableName);
-} catch (UserRemoteException e) {
-  assertThat(e.getMessage(), containsString(String.format(
-  "VALIDATION ERROR: Temporary tables are not allowed to be 
created / dropped " +
-  "outside of default temporary workspace [%s].", 
DFS_TMP_SCHEMA)));
-  throw e;
-}
+String temporaryTableName = 
"temporary_table_outside_of_default_workspace";
+
+thrown.expect(UserRemoteException.class);
--- End diff --

Consider introducing a new method to set `thrown` and message, something 
like `void expectUserRemoteExceptionWithMessage(String message)`.
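
For example (a sketch only, assuming the `thrown` rule and Hamcrest `containsString` already used in this test class):

{noformat}
private void expectUserRemoteExceptionWithMessage(String message) {
  thrown.expect(UserRemoteException.class);
  thrown.expectMessage(containsString(message));
}
{noformat}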


---


[jira] [Created] (DRILL-6375) ANY_VALUE aggregate function

2018-05-01 Thread Gautam Kumar Parai (JIRA)
Gautam Kumar Parai created DRILL-6375:
-

 Summary: ANY_VALUE aggregate function
 Key: DRILL-6375
 URL: https://issues.apache.org/jira/browse/DRILL-6375
 Project: Apache Drill
  Issue Type: New Feature
  Components: Functions - Drill
Affects Versions: 1.13.0
Reporter: Gautam Kumar Parai
Assignee: Gautam Kumar Parai
 Fix For: 1.14.0


We had discussions on the Apache Calcite [1] and Apache Drill [2] mailing lists 
regarding an equivalent for DISTINCT ON. The community seems to prefer 
ANY_VALUE. This Jira is a placeholder for implementing the ANY_VALUE aggregate 
function in Apache Drill. We should also eventually contribute it to Apache 
Calcite.

[1]https://lists.apache.org/thread.html/f2007a489d3a5741875bcc8a1edd8d5c3715e5114ac45058c3b3a42d@%3Cdev.calcite.apache.org%3E

[2]https://lists.apache.org/thread.html/2517eef7410aed4e88b9515f7e4256335215c1ad39a2676a08d21cb9@%3Cdev.drill.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] drill issue #1247: DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Dr...

2018-05-01 Thread jiang-wu
Github user jiang-wu commented on the issue:

https://github.com/apache/drill/pull/1247
  
@parthchandra @vdiravka A new pull request that uses 
java.time.Local{Date|Time|DateTime}.


---


[GitHub] drill issue #1184: DRILL-6242 - Use java.sql.[Date|Time|Timestamp] classes t...

2018-05-01 Thread jiang-wu
Github user jiang-wu commented on the issue:

https://github.com/apache/drill/pull/1184
  
@parthchandra @vdiravka I finally completed the changes on using 
Local{Date|Time|DateTime}.  I made a new clean pull request for that here: 
https://github.com/apache/drill/pull/1247 



---


[GitHub] drill issue #1247: DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Dr...

2018-05-01 Thread jiang-wu
Github user jiang-wu commented on the issue:

https://github.com/apache/drill/pull/1247
  
Please see 
https://issues.apache.org/jira/browse/DRILL-6242?focusedCommentId=16459369=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16459369
 on the results of this change.  The behavior is the same as the current Drill 
behavior, except returning Local{Date|Time|DateTime} upon reading from the 
vectors.

Notice the differences in Drill behavior when handling date-time data 
from different data sources.  We can separately decide how to make those 
consistent.  Fixing those differences is out of scope for this pull request.


---


[GitHub] drill pull request #1247: DRILL-6242 Use java.time.Local{Date|Time|DateTime}...

2018-05-01 Thread jiang-wu
Github user jiang-wu commented on a diff in the pull request:

https://github.com/apache/drill/pull/1247#discussion_r185358628
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/expr/fn/impl/DateUtility.java 
---
@@ -639,29 +648,95 @@ public static String getTimeZone(int index) {
 return timezoneList[index];
   }
 
+  /**
+   * Parse given string into a LocalDate
+   */
+  public static LocalDate parseLocalDate(final String value) {
+  return LocalDate.parse(value, formatDate);
+  }
+
+  /**
+   * Parse given string into a LocalTime
+   */
+  public static LocalTime parseLocalTime(final String value) {
+  return LocalTime.parse(value, formatTime);
+  }
+
+  /**
+   * Parse the given string into a LocalDateTime.
+   */
+  public static LocalDateTime parseLocalDateTime(final String value) {
+  return LocalDateTime.parse(value, formatTimeStamp);
+  }
+
   // Returns the date time formatter used to parse date strings
   public static DateTimeFormatter getDateTimeFormatter() {
 
 if (dateTimeTZFormat == null) {
-  DateTimeFormatter dateFormatter = 
DateTimeFormat.forPattern("-MM-dd");
-  DateTimeParser optionalTime = DateTimeFormat.forPattern(" 
HH:mm:ss").getParser();
-  DateTimeParser optionalSec = 
DateTimeFormat.forPattern(".SSS").getParser();
-  DateTimeParser optionalZone = DateTimeFormat.forPattern(" 
ZZZ").getParser();
+  DateTimeFormatter dateFormatter = 
DateTimeFormatter.ofPattern("-MM-dd");
+  DateTimeFormatter optionalTime = DateTimeFormatter.ofPattern(" 
HH:mm:ss");
+  DateTimeFormatter optionalSec = DateTimeFormatter.ofPattern(".SSS");
+  DateTimeFormatter optionalZone = DateTimeFormatter.ofPattern(" ZZZ");
 
-  dateTimeTZFormat = new 
DateTimeFormatterBuilder().append(dateFormatter).appendOptional(optionalTime).appendOptional(optionalSec).appendOptional(optionalZone).toFormatter();
+  dateTimeTZFormat = new DateTimeFormatterBuilder().parseLenient()
+  .append(dateFormatter)
+  .appendOptional(optionalTime)
+  .appendOptional(optionalSec)
+  .appendOptional(optionalZone)
+  .toFormatter();
 }
 
 return dateTimeTZFormat;
   }
 
+  /**
--- End diff --

parseBest is used only by JUnit tests when the string value is not very 
strict, for example "2018-1-1 12:1" instead of "2018-01-01 12:01". This method 
is more lenient, tolerating missing parts when parsing a date time. 

---


[GitHub] drill pull request #1247: DRILL-6242 Use java.time.Local{Date|Time|DateTime}...

2018-05-01 Thread jiang-wu
Github user jiang-wu commented on a diff in the pull request:

https://github.com/apache/drill/pull/1247#discussion_r185358343
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/expr/fn/impl/DateUtility.java 
---
@@ -639,29 +648,95 @@ public static String getTimeZone(int index) {
 return timezoneList[index];
   }
 
+  /**
--- End diff --

The "parseLocalDate", "parseLocalTime", "parseLocalDateTime" are used by 
various junit tests.  These parsers are strict in that if the input string 
doesn't have all the specified fields, it will fail to parse.


---


[GitHub] drill pull request #1247: DRILL-6242 Use java.time.Local{Date|Time|DateTime}...

2018-05-01 Thread jiang-wu
GitHub user jiang-wu opened a pull request:

https://github.com/apache/drill/pull/1247

DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, 
and Timestamp types

* DRILL-6242 - Use java.time.Local{Date|Time|DateTime} classes to hold 
values from corresponding Drill date, time, and timestamp types.
* See 
https://issues.apache.org/jira/browse/DRILL-6242?focusedCommentId=16459369=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16459369
* This is a revised version of https://github.com/apache/drill/pull/1184

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiang-wu/drill DRILL-6242-LocalDateTime

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1247


commit b00638da507e6211d57c9ea7d6308f323aad9519
Author: jiang-wu 
Date:   2018-05-01T21:48:26Z

DRILL-6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, 
Timestamp types. (#3)

* DRILL-6242 - Use java.time.Local{Date|Time|DateTime} classes to hold 
values from corresponding Drill date, time, and timestamp types.




---


[GitHub] drill pull request #1224: DRILL-6321: Customize Drill's conformance. Allow s...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1224#discussion_r185351651
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillConformance.java
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import org.apache.calcite.sql.validate.SqlConformanceEnum;
+import org.apache.calcite.sql.validate.SqlDelegatingConformance;
+
+/**
+ * Drill's SQL conformance is SqlConformanceEnum.DEFAULT except for the method isApplyAllowed().
+ * Drill will allow OUTER APPLY and CROSS APPLY so that each row from the left child of a Join
+ * can join with the output of the right side (a sub-query or table function that will be
+ * invoked for each row).
+ * Refer to DRILL-5999 for more information.
+ */
+public class DrillConformance extends SqlDelegatingConformance {
--- End diff --

Why not introduce a top-level class only when it is needed? To override the 
behavior of a single method, an anonymous class is more than sufficient.
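
A sketch of that alternative (illustration only; the field name and placement are hypothetical, and it assumes the imports from the diff plus Calcite's `SqlConformance`):

{noformat}
private static final SqlConformance DRILL_CONFORMANCE =
    new SqlDelegatingConformance(SqlConformanceEnum.DEFAULT) {
      @Override
      public boolean isApplyAllowed() {
        return true;   // enable OUTER APPLY / CROSS APPLY (DRILL-5999)
      }
    };
{noformat}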


---


[jira] [Created] (DRILL-6374) TPCH Queries regressed and OOM when run concurrency test

2018-05-01 Thread Dechang Gu (JIRA)
Dechang Gu created DRILL-6374:
-

 Summary: TPCH Queries regressed and OOM when run concurrency test
 Key: DRILL-6374
 URL: https://issues.apache.org/jira/browse/DRILL-6374
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.14.0
 Environment: RHEL 7
Reporter: Dechang Gu
Assignee: Vitalii Diravka
 Fix For: 1.14.0
 Attachments: TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json, 
TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json

Running the TPCH regression test on Apache Drill 1.14.0 master commit 
6fcaf4268eddcb09010b5d9c5dfb3b3be5c3f903 (DRILL-6173), most of the queries 
regressed. In particular, Query 9 takes about 4x the time (36 sec vs 8.6 sec) 
compared to the run against the parent commit 
(9173308710c3decf8ff745493ad3e85ccdaf7c37). Further, in the concurrency test 
for the commit, with 48 clients each running 16 TPCH queries (so 768 queries 
are executed in total) and planner.width.max_per_node=5, some queries hit OOM, 
causing 266 queries to fail, while for the parent commit all 768 queries 
completed successfully. Profiles for TPCH_09 in the regression tests are 
uploaded (the failing commit file name: 
TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json, and the parent commit 
file name: TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json).




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] drill issue #1224: DRILL-6321: Customize Drill's conformance. Allow support ...

2018-05-01 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1224
  
+1 Overall. Note that this is needed for implementing Lateral join and 
Unnest support.


---


[GitHub] drill pull request #1246: Drill 6242 Use java.time.Local{Date|Time|DateTime}...

2018-05-01 Thread jiang-wu
Github user jiang-wu closed the pull request at:

https://github.com/apache/drill/pull/1246


---


[GitHub] drill pull request #1241: DRILL-6364: Handle Cluster Info in WebUI when exis...

2018-05-01 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1241#discussion_r185339738
  
--- Diff: exec/java-exec/src/main/resources/rest/index.ftl ---
@@ -252,33 +255,129 @@
   timeout = setTimeout(reloadStatus, refreshTime);
   }
 
-  function fillStatus(data,size) {
-  var status_map = (data.responseJSON);
-  for (i = 1; i <= size; i++) {
-var address = 
$("#row-"+i).find("#address").contents().get(0).nodeValue;
-address = address.trim();
-var port = $("#row-"+i).find("#port").html();
-var key = address+"-"+port;
+  function fillStatus(dataResponse,size) {
+  var status_map = (dataResponse.responseJSON);
+  //In case localhost has gone down (i.e. we don't know status 
from ZK)
+  if (typeof status_map == 'undefined') {
+//Query other nodes for state details
+for (j = 1; j <= size; j++) {
+  if ($("#row-"+j).find("#current").html() == "Current") {
+continue; //Skip LocalHost
+  }
+  var address = 
$("#row-"+j).find("#address").contents().get(0).nodeValue.trim();
+  var restPort = 
$("#row-"+j).find("#httpPort").contents().get(0).nodeValue.trim();
+  var altStateUrl = location.protocol + "//" + 
address+":"+restPort + "/state";
+  var goatResponse = $.getJSON(altStateUrl)
+.done(function(stateDataJson) {
+//Update Status & Buttons for alternate stateData
+if (typeof status_map == 'undefined') {
+  status_map = (stateDataJson); //Update
+  updateStatusAndShutdown(stateDataJson);
+}
+  });
+  //Don't loop any more
+  if (typeof status_map != 'undefined') {
+break;
+  }
+}
+  } else {
+updateStatusAndShutdown(status_map);
+  }
+  }
+
+  function updateStatusAndShutdown(status_map) {
+let bitMap = {};
+if (typeof status_map != 'undefined') {
+for (var k in status_map) {
+  bitMap[k] = status_map[k];
+}
+}
+for (i = 1; i <= size; i++) {
+let key = "";
+if ($("#row-"+i).find("#stateKey").length > 0) { //Check if 
newBit that has no stateKey
--- End diff --

Made a switch to `currentRow`. The stateKey is irrelevant and not needed any 
more. We actually weren't injecting it anymore, so the block was never 
executed.


---


[GitHub] drill pull request #1246: Drill 6242 Use java.time.Local{Date|Time|DateTime}...

2018-05-01 Thread jiang-wu
GitHub user jiang-wu opened a pull request:

https://github.com/apache/drill/pull/1246

Drill 6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Ti…

* DRILL-6242 - Use java.time.Local[Date|Time|DateTime] classes to hold 
values from corresponding Drill date, time, and timestamp types.
* See 
https://issues.apache.org/jira/browse/DRILL-6242?focusedCommentId=16459369=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16459369
* This is a revised version of https://github.com/apache/drill/pull/1184

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiang-wu/drill DRILL-6242-LocalDateTime

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1246.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1246


commit b7f5938fa65d7b54b407c244ffe9c28613bcfa0f
Author: jiang-wu 
Date:   2018-05-01T21:10:08Z

Drill 6242 Use java.time.Local{Date|Time|DateTime} for Drill Date, Time, 
Timestamp types (#2)

* DRILL-6242 - Use java.time.Local[Date|Time|DateTime] classes to hold 
values from corresponding Drill date, time, and timestamp types.




---


[GitHub] drill pull request #1245: Drill 6242 - Use Java.time.Local{Date|Time|DateTim...

2018-05-01 Thread jiang-wu
GitHub user jiang-wu opened a pull request:

https://github.com/apache/drill/pull/1245

Drill 6242 - Use Java.time.Local{Date|Time|DateTime} classes for values 
from Drill Date, Time, and Timestamp vectors

* DRILL-6242 - Use java.time.Local{Date|Time|DateTime} classes to hold 
values from corresponding Drill date, time, and timestamp types.
* See 
https://issues.apache.org/jira/browse/DRILL-6242?focusedCommentId=16459369=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16459369
* This is a revised version of https://github.com/apache/drill/pull/1184

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiang-wu/drill DRILL-6242

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1245.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1245


commit acd97a5f512bf06871f12a064c867f443da8bd6f
Author: jiang-wu 
Date:   2018-05-01T20:54:28Z

Drill 6242 master local (#1)

* DRILL-6242 - Use java.time.Local{Date|Time|DateTime} classes to hold 
values from corresponding Drill date, time, and timestamp types.




---


[GitHub] drill pull request #1224: DRILL-6321: Customize Drill's conformance. Allow s...

2018-05-01 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1224#discussion_r185225979
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillConformance.java
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import org.apache.calcite.sql.validate.SqlConformanceEnum;
+import org.apache.calcite.sql.validate.SqlDelegatingConformance;
+
+/**
+ * Drill's SQL conformance is SqlConformanceEnum.DEFAULT except for the method isApplyAllowed().
+ * Drill will allow OUTER APPLY and CROSS APPLY so that each row from the left child of a Join
+ * can join with the output of the right side (a sub-query or table function that will be
+ * invoked for each row).
+ * Refer to DRILL-5999 for more information.
+ */
+public class DrillConformance extends SqlDelegatingConformance {
--- End diff --

If changes are introduced to `DrillConformance`, it can be refactored later 
to be a top-level class. For now, I suggest avoid optimizing for future 
requirements that may never materialize.


---


ApacheCon North America 2018 schedule is now live.

2018-05-01 Thread Rich Bowen

Dear Apache Enthusiast,

We are pleased to announce our schedule for ApacheCon North America 
2018. ApacheCon will be held September 23-27 at the Montreal Marriott 
Chateau Champlain in Montreal, Canada.


Registration is open! The early bird rate of $575 lasts until July 21, 
at which time it goes up to $800. And the room block at the Marriott 
($225 CAD per night, including wifi) closes on August 24th.


We will be featuring more than 100 sessions on Apache projects. The 
schedule is now online at https://apachecon.com/acna18/


The schedule includes full tracks of content from Cloudstack[1], 
Tomcat[2], and our GeoSpatial community[3].


We will have 4 keynote speakers, two of whom are Apache members, and two 
from the wider community.


On Tuesday, Apache member and former board member Cliff Schmidt will be 
speaking about how Amplio uses technology to educate and improve the 
quality of life of people living in very difficult parts of the 
world[4]. And Apache Fineract VP Myrle Krantz will speak about how Open 
Source banking is helping the global fight against poverty[5].


Then, on Wednesday, we’ll hear from Bridget Kromhout, Principal Cloud 
Developer Advocate from Microsoft, about the really hard problem in 
software - the people[6]. And Euan McLeod, VP VIPER at Comcast, will 
show us the many ways that Apache software delivers your favorite shows 
to your living room[7].


ApacheCon will also feature old favorites like the Lightning Talks, the 
Hackathon (running the duration of the event), PGP key signing, and lots 
of hallway-track time to get to know your project community better.


Follow us on Twitter, @ApacheCon, and join the disc...@apachecon.com 
mailing list (send email to discuss-subscr...@apachecon.com) to stay up 
to date with developments. And if your company wants to sponsor this 
event, get in touch at h...@apachecon.com for opportunities that are 
still available.


See you in Montreal!

Rich Bowen
VP Conferences, The Apache Software Foundation
h...@apachecon.com
@ApacheCon

[1] http://cloudstackcollab.org/
[2] http://tomcat.apache.org/conference.html
[3] http://apachecon.dukecon.org/acna/2018/#/schedule?search=geospatial
[4] 
http://apachecon.dukecon.org/acna/2018/#/scheduledEvent/df977fd305a31b903
[5] 
http://apachecon.dukecon.org/acna/2018/#/scheduledEvent/22c6c30412a3828d6
[6] 
http://apachecon.dukecon.org/acna/2018/#/scheduledEvent/fbbb2384fa91ebc6b
[7] 
http://apachecon.dukecon.org/acna/2018/#/scheduledEvent/88d50c3613852c2de


[GitHub] drill issue #1042: DRILL-5261: Expose REST endpoint in zookeeper

2018-05-01 Thread xhochy
Github user xhochy commented on the issue:

https://github.com/apache/drill/pull/1042
  
@kkhatua Thanks, I'll have a look at that PR.


---


[GitHub] drill pull request #1042: DRILL-5261: Expose REST endpoint in zookeeper

2018-05-01 Thread xhochy
Github user xhochy closed the pull request at:

https://github.com/apache/drill/pull/1042


---