[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r166196043

--- Diff: distribution/src/resources/auto-setup.sh ---
@@ -0,0 +1,222 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# This file is invoked by drill-config.sh during a Drillbit startup and provides
+# default checks and autoconfiguration.
+# Distributions should not put anything in this file. Checks can be
+# specified in ${DRILL_HOME}/conf/distrib-setup.sh
+# Users should not put anything in this file. Additional checks can be defined
+# and put in ${DRILL_CONF_DIR}/drill-setup.sh instead.
+# To FAIL any check, return with a non-zero return code
+# e.g.
+# if [ $status == "FAILED" ]; then return 1; fi
+
+###==
+# FEATURES
+# 1. Provides checks and auto-configuration for memory settings
+###==
+
+# Convert Java memory value to MB
+function valueInMB() {
+  if [ -z "$1" ]; then echo ""; return; fi
+  local inputTxt=`echo $1 | tr '[A-Z]' '[a-z]'`
+  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
+  # Extracting Numeric Value
+  if [[ "$inputTxt" == *g ]]; then
+    let valueInMB=$inputValue*1024
+  elif [[ "$inputTxt" == *k ]]; then
+    let valueInMB=$inputValue/1024
+  elif [[ "$inputTxt" == *m ]]; then
+    let valueInMB=$inputValue
+  elif [[ "$inputTxt" == *% ]]; then
+    # TotalRAM_inMB * percentage [Works on Linux]
+    let valueInMB=$inputValue*$totalRAM_inMB/100;
+  else
+    echo error;
+    return 1;
+  fi
+  echo "$valueInMB"
+  return
+}
+
+# Convert Java memory value to GB
+function valueInGB() {
+  if [ -z "$1" ]; then echo ""; return; fi
+  local inputTxt=`echo $1 | tr '[A-Z]' '[a-z]'`
+  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
+  # Extracting Numeric Value
+  if [[ "$inputTxt" == *g ]]; then
+    let valueInGB=$inputValue
+  elif [[ "$inputTxt" == *k ]]; then
+    let valueInGB=$inputValue/1024/1024
+  elif [[ "$inputTxt" == *m ]]; then
+    let valueInGB=$inputValue/1024
+  elif [[ "$inputTxt" == *% ]]; then
+    # TotalRAM * percentage [Works on Linux]
+    let valueInGB=$inputValue*`cat /proc/meminfo | grep MemTotal | tr ' ' '\n' | grep '[0-9]'`/1024/1024/100;
+  else
+    echo error;
+    return 1;
+  fi
+  echo "$valueInGB"
+  return
+}
+
+# Estimates code cache based on total heap and direct
+function estCodeCacheInMB() {
+  local totalHeapAndDirect=$1
+  if [ $totalHeapAndDirect -le 4096 ]; then echo 512;
+  elif [ $totalHeapAndDirect -le 10240 ]; then echo 768;
+  else echo 1024;
+  fi
+}
+
+# Print Current Allocation
+function printCurrAllocation()
+{
+  if [ -n "$DRILLBIT_MAX_PROC_MEM" ]; then echo -e "\tDRILLBIT_MAX_PROC_MEM=$DRILLBIT_MAX_PROC_MEM"; fi
+  if [ -n "$DRILL_HEAP" ]; then echo -e "\tDRILL_HEAP=$DRILL_HEAP"; fi
+  if [ -n "$DRILL_MAX_DIRECT_MEMORY" ]; then echo -e "\tDRILL_MAX_DIRECT_MEMORY=$DRILL_MAX_DIRECT_MEMORY"; fi
+  if [ -n "$DRILLBIT_CODE_CACHE_SIZE" ]; then
+    echo -e "\tDRILLBIT_CODE_CACHE_SIZE=$DRILLBIT_CODE_CACHE_SIZE "
+    echo -e "\t*NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE, as this will be auto-computed based on the heap size and will not exceed 1GB"
+  fi
+}
+
+#
+# Check and auto-configuration for memory settings
+#
+# Default (Track status of this check: "" => Continue checking ; "PASSED" => no more check required)
+AutoMemConfigStatus=""
+
+# Computing existing system information
+# Tested on Linux (CentOS/RHEL/Ubuntu); Cygwin (Win10Pro-64bit)
+if [[ "$OSTYPE" == *linux* ]] || [[
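The unit-conversion helpers quoted in this diff are easy to sanity-check in isolation. Below is a runnable sketch of `valueInMB` under two stated assumptions: the `$DbitMaxProcMem` reference in the kilobyte branch is treated as a copy-paste slip for `$inputTxt` (the variable every other branch tests), and `totalRAM_inMB`, which auto-setup.sh derives from `/proc/meminfo`, is stubbed to a fixed value.

```shell
#!/usr/bin/env bash
# Sketch of the suffix-to-MB conversion from auto-setup.sh.
# Assumptions: the kilobyte branch tests $inputTxt (not $DbitMaxProcMem),
# and totalRAM_inMB is stubbed rather than read from /proc/meminfo.
totalRAM_inMB=8192   # stand-in for the value derived from /proc/meminfo

valueInMB() {
  [ -z "$1" ] && { echo ""; return; }
  local inputTxt
  inputTxt=$(echo "$1" | tr '[A-Z]' '[a-z]')
  local inputValue=${inputTxt:0:${#inputTxt}-1}   # strip the trailing unit
  local result
  if [[ "$inputTxt" == *g ]]; then
    result=$(( inputValue * 1024 ))
  elif [[ "$inputTxt" == *k ]]; then
    result=$(( inputValue / 1024 ))
  elif [[ "$inputTxt" == *m ]]; then
    result=$inputValue
  elif [[ "$inputTxt" == *% ]]; then
    result=$(( inputValue * totalRAM_inMB / 100 ))
  else
    echo error; return 1
  fi
  echo "$result"
}

valueInMB 4G     # 4096
valueInMB 512m   # 512
valueInMB 25%    # 2048 with the stubbed 8 GB total
```

With an 8 GB stub, `25%` resolves to 2048 MB; an unrecognized suffix echoes `error` and returns 1, which is what lets callers fail the startup check.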
[jira] [Created] (DRILL-6139) Travis CI hangs on TestVariableWidthWriter#testRestartRow
Boaz Ben-Zvi created DRILL-6139:
---
Summary: Travis CI hangs on TestVariableWidthWriter#testRestartRow
Key: DRILL-6139
URL: https://issues.apache.org/jira/browse/DRILL-6139
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.12.0
Reporter: Boaz Ben-Zvi

The Travis CI fails (probably hangs, then times out) in the following test:

{code:java}
Running org.apache.drill.test.rowSet.test.DummyWriterTest
Running org.apache.drill.test.rowSet.test.DummyWriterTest#testDummyScalar
Running org.apache.drill.test.rowSet.test.DummyWriterTest#testDummyMap
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.109 sec - in org.apache.drill.test.rowSet.test.DummyWriterTest
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testSkipNulls
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testWrite
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testFillEmpties
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testRollover
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testSizeLimit
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testRolloverWithEmpties
Running org.apache.drill.test.rowSet.test.TestVariableWidthWriter#testRestartRow
Killed

Results :

Tests run: 1554, Failures: 0, Errors: 0, Skipped: 66{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] drill issue #1113: DRILL-5902: Regression: Queries encounter random failure ...
Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1113 @arina-ielchiieva Please review ---
[GitHub] drill pull request #1113: DRILL-5902: Regression: Queries encounter random f...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1113

DRILL-5902: Regression: Queries encounter random failure due to RPC connection timed out

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vrozov/drill DRILL-5902

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1113.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1113

commit fe329c2517710bb9fdec273de24321717fc954e6
Author: Vlad Rozov
Date: 2018-02-06T03:15:56Z

    DRILL-5902: Regression: Queries encounter random failure due to RPC connection timed out

---
[jira] [Created] (DRILL-6138) Move RecordBatchSizer to org.apache.drill.exec.record package
Padma Penumarthy created DRILL-6138: --- Summary: Move RecordBatchSizer to org.apache.drill.exec.record package Key: DRILL-6138 URL: https://issues.apache.org/jira/browse/DRILL-6138 Project: Apache Drill Issue Type: Task Components: Execution - Flow Affects Versions: 1.12.0 Reporter: Padma Penumarthy Assignee: Padma Penumarthy Fix For: 1.13.0 Move RecordBatchSizer from org.apache.drill.exec.physical.impl.spill package to org.apache.drill.exec.record package. Minor refactoring - change columnSizes from list to map. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] drill pull request #1112: DRILL-6114: Metadata revisions
GitHub user paul-rogers opened a pull request: https://github.com/apache/drill/pull/1112

DRILL-6114: Metadata revisions

This PR is part of the "batch handling upgrades" project (https://github.com/paul-rogers/drill/wiki/Batch-Handling-Upgrades). It completes the new internal metadata system by including support for the remaining vector types: unions, lists and repeated lists. The metadata code was refactored. Previously, it was small enough to fit into a single file (with nested classes). With the added complexity, the metadata classes were split out into separate classes and grouped into their own Java package. A few fixes were made here and there to ensure the unit tests pass. @ppadma or @bitblender, can one of you run the pre-commit tests and send me the details of any failures?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-6114B

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1112.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1112

commit e0f11f923ea48070f829a859395ee66b81b9ba18
Author: Paul Rogers
Date: 2018-02-06T04:18:18Z

    DRILL-6114: Metadata revisions

    Support for union vectors, list vectors, repeated list vectors. Refactored metadata classes.

---
Re: Apache Drill with Azure Data Lake Store
Hi Kamal,

My understanding was that the file system running on top of Azure Data Lake Store was still HDFS? Is that true? If that is the case, the DFS plugin should work. It is worth a test.

Thanks,
Saurabh

Sent from my iPhone

> On Feb 3, 2018, at 6:02 PM, Kamal Baig wrote:
>
> Hi
>
> I am looking for some help around connecting and processing data stored in
> Azure Data Lake Store (Not the Azure Blob)
>
> using Apache Drill
>
> Any help and suggestion would be highly appreciated. I am a beginner with
> Apache Drill so any docs or steps would be great to get started
>
> Thanks
Apache Drill with Azure Data Lake Store
Hi I am looking for some help around connecting and processing data stored in Azure Data lake store (Not the Azure Blob) using Apache Drill Any help and suggestion would be highly appreciated. I am a beginner with Apache Drill so any docs or steps would be great to get started Thanks
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r166158426

--- Diff: distribution/src/resources/drill-config.sh ---
@@ -180,18 +251,61 @@ else
   fi
 fi

-# Default memory settings if none provided by the environment or
+# Execute distrib-setup.sh for any distribution-specific setup (e.g. checks).
+# distrib-setup.sh is optional; it is created by some distribution installers
+# that need additional distribution-specific setup to be done.
+# Because installers will have site-specific steps, the file
+# should be moved into the site directory, if the user employs one.
+
+# Checking if being executed in context of Drillbit and not SQLLine
+if [ "$DRILLBIT_CONTEXT" == "1" ]; then
+  # Check whether to run exclusively distrib-setup.sh OR auto-setup.sh
+  distribSetup="$DRILL_CONF_DIR/distrib-setup.sh" ; # Site-based distrib-setup.sh
+  if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
+    distribSetup="$DRILL_HOME/conf/distrib-setup.sh" ; # Install-based distrib-setup.sh
+    if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then
+      # Run Default Auto Setup
+      distribSetup="$DRILL_HOME/bin/auto-setup.sh"
+    fi
+  fi
+  # Check and run additional setup defined by user
+  drillSetup="$DRILL_CONF_DIR/drill-setup.sh" ; # Site-based drill-setup.sh
+  if [ $(checkExecutableLineCount $drillSetup) -eq 0 ]; then
+    drillSetup="$DRILL_HOME/conf/drill-setup.sh" ; # Install-based drill-setup.sh
+    if [ $(checkExecutableLineCount $drillSetup) -eq 0 ]; then drillSetup=""; fi
+  fi
+
+  # Enforcing checks in order (distrib-setup.sh , drill-setup.sh)
+  # (NOTE: A script is executed only if it has relevant executable lines)
+  # Both distribSetup & drillSetup are executed because the user might have introduced additional checks
+  if [ -n "$distribSetup" ]; then
+    . "$distribSetup"
+    if [ $? -gt 0 ]; then fatal_error "Aborting Drill Startup due to failed setup by $distribSetup"; fi

--- End diff --

The auto-configuration scripts do indeed do that. However, I thought it would be good to have a higher-level error message also indicating the source of the failure. This allows us to catch any non-zero exit codes that might be thrown and not handled cleanly. Other sections of `drill-config.sh` follow this principle.

---
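The pattern kkhatua describes (source each setup script, then surface a high-level fatal error on any non-zero exit code) can be sketched as below. `fatal_error` and `run_setup` here are illustrative stand-ins, not the actual drill-config.sh definitions.

```shell
#!/usr/bin/env bash
# Sketch of the wrap-and-abort pattern: source a setup script and turn
# any non-zero exit into a single, clearly attributed startup failure.
# (fatal_error and run_setup are stand-ins for illustration.)
fatal_error() { echo "ERROR: $*" >&2; exit 1; }

run_setup() {
  local setupScript="$1"
  [ -n "$setupScript" ] || return 0   # nothing to run
  . "$setupScript"                    # run checks in the current shell
  local rc=$?                         # $? expands before local executes
  if [ $rc -gt 0 ]; then
    fatal_error "Aborting Drill startup due to failed setup by $setupScript (exit code $rc)"
  fi
}
```

Because the script is sourced rather than executed, a check can simply `return 1` to veto startup, and the wrapper names which script vetoed it.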
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r166157864

--- Diff: distribution/src/resources/drill-config.sh ---
@@ -180,18 +251,61 @@ else
   fi
 fi

-# Default memory settings if none provided by the environment or
+# Execute distrib-setup.sh for any distribution-specific setup (e.g. checks).
+# distrib-setup.sh is optional; it is created by some distribution installers
+# that need additional distribution-specific setup to be done.
+# Because installers will have site-specific steps, the file
+# should be moved into the site directory, if the user employs one.
+
+# Checking if being executed in context of Drillbit and not SQLLine
+if [ "$DRILLBIT_CONTEXT" == "1" ]; then
+  # Check whether to run exclusively distrib-setup.sh OR auto-setup.sh
+  distribSetup="$DRILL_CONF_DIR/distrib-setup.sh" ; # Site-based distrib-setup.sh
+  if [ $(checkExecutableLineCount $distribSetup) -eq 0 ]; then

--- End diff --

I'd have liked the KISS principle, but I thought there was a need for a placeholder `distrib-setup.sh` file. Based on that, I need to figure out whether there is a distribution-specific setup, or whether we should fall back to executing `auto-setup.sh`. Unlike sourcing environment files, where an unset variable can be set, for auto-setup the choice of execution has to be mutually exclusive. This block looks complicated (and verbose with the comments), but it is only identifying *what* setup script needs to execute. Hence, all we do here is an assignment of the variables.

---
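`checkExecutableLineCount` itself is not shown in this hunk. A plausible stand-in (an assumption for illustration, not the real drill-config.sh definition) counts lines that are neither blank nor comments, which is what lets an all-comment placeholder `distrib-setup.sh` fall through to the next candidate and, ultimately, to `auto-setup.sh`:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for checkExecutableLineCount: count lines that
# are neither blank nor comment-only, so a placeholder script counts as 0.
checkExecutableLineCount() {
  if [ -f "$1" ]; then
    grep -cv -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
  else
    echo 0
  fi
}

# The mutually exclusive selection sketched in the diff: prefer the
# site distrib-setup.sh, fall back to the install copy, else the
# bundled auto-setup.sh. (pick_setup is an illustrative name.)
pick_setup() {
  local confDir="$1" homeDir="$2"
  local distribSetup="$confDir/distrib-setup.sh"
  if [ "$(checkExecutableLineCount "$distribSetup")" -eq 0 ]; then
    distribSetup="$homeDir/conf/distrib-setup.sh"
    if [ "$(checkExecutableLineCount "$distribSetup")" -eq 0 ]; then
      distribSetup="$homeDir/bin/auto-setup.sh"
    fi
  fi
  echo "$distribSetup"
}
```

This keeps the selection to pure variable assignment, matching the comment above: exactly one of the three scripts ends up chosen.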
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r166157561

--- Diff: distribution/src/resources/auto-setup.sh ---
@@ -0,0 +1,222 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# This file is invoked by drill-config.sh during a Drillbit startup and provides
+# default checks and autoconfiguration.
+# Distributions should not put anything in this file. Checks can be
+# specified in ${DRILL_HOME}/conf/distrib-setup.sh
+# Users should not put anything in this file. Additional checks can be defined
+# and put in ${DRILL_CONF_DIR}/drill-setup.sh instead.
+# To FAIL any check, return with a non-zero return code
+# e.g.
+# if [ $status == "FAILED" ]; then return 1; fi
+
+###==
+# FEATURES
+# 1. Provides checks and auto-configuration for memory settings
+###==
+
+# Convert Java memory value to MB
+function valueInMB() {
+  if [ -z "$1" ]; then echo ""; return; fi
+  local inputTxt=`echo $1 | tr '[A-Z]' '[a-z]'`
+  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
+  # Extracting Numeric Value
+  if [[ "$inputTxt" == *g ]]; then
+    let valueInMB=$inputValue*1024
+  elif [[ "$inputTxt" == *k ]]; then
+    let valueInMB=$inputValue/1024
+  elif [[ "$inputTxt" == *m ]]; then
+    let valueInMB=$inputValue
+  elif [[ "$inputTxt" == *% ]]; then
+    # TotalRAM_inMB * percentage [Works on Linux]
+    let valueInMB=$inputValue*$totalRAM_inMB/100;
+  else
+    echo error;
+    return 1;
+  fi
+  echo "$valueInMB"
+  return
+}
+
+# Convert Java memory value to GB
+function valueInGB() {
+  if [ -z "$1" ]; then echo ""; return; fi
+  local inputTxt=`echo $1 | tr '[A-Z]' '[a-z]'`
+  local inputValue=`echo ${inputTxt:0:${#inputTxt}-1}`;
+  # Extracting Numeric Value
+  if [[ "$inputTxt" == *g ]]; then
+    let valueInGB=$inputValue
+  elif [[ "$inputTxt" == *k ]]; then
+    let valueInGB=$inputValue/1024/1024
+  elif [[ "$inputTxt" == *m ]]; then
+    let valueInGB=$inputValue/1024
+  elif [[ "$inputTxt" == *% ]]; then
+    # TotalRAM * percentage [Works on Linux]
+    let valueInGB=$inputValue*`cat /proc/meminfo | grep MemTotal | tr ' ' '\n' | grep '[0-9]'`/1024/1024/100;
+  else
+    echo error;
+    return 1;
+  fi
+  echo "$valueInGB"
+  return
+}
+
+# Estimates code cache based on total heap and direct
+function estCodeCacheInMB() {
+  local totalHeapAndDirect=$1
+  if [ $totalHeapAndDirect -le 4096 ]; then echo 512;
+  elif [ $totalHeapAndDirect -le 10240 ]; then echo 768;
+  else echo 1024;
+  fi
+}
+
+# Print Current Allocation
+function printCurrAllocation()
+{
+  if [ -n "$DRILLBIT_MAX_PROC_MEM" ]; then echo -e "\tDRILLBIT_MAX_PROC_MEM=$DRILLBIT_MAX_PROC_MEM"; fi
+  if [ -n "$DRILL_HEAP" ]; then echo -e "\tDRILL_HEAP=$DRILL_HEAP"; fi
+  if [ -n "$DRILL_MAX_DIRECT_MEMORY" ]; then echo -e "\tDRILL_MAX_DIRECT_MEMORY=$DRILL_MAX_DIRECT_MEMORY"; fi
+  if [ -n "$DRILLBIT_CODE_CACHE_SIZE" ]; then
+    echo -e "\tDRILLBIT_CODE_CACHE_SIZE=$DRILLBIT_CODE_CACHE_SIZE "
+    echo -e "\t*NOTE: It is recommended not to specify DRILLBIT_CODE_CACHE_SIZE, as this will be auto-computed based on the heap size and will not exceed 1GB"
+  fi
+}
+
+#
+# Check and auto-configuration for memory settings
+#
+# Default (Track status of this check: "" => Continue checking ; "PASSED" => no more check required)
+AutoMemConfigStatus=""
+
+# Computing existing system information
+# Tested on Linux (CentOS/RHEL/Ubuntu); Cygwin (Win10Pro-64bit)
+if [[ "$OSTYPE" == *linux* ]] || [[
[GitHub] drill pull request #1082: DRILL-5741: Automatically manage memory allocation...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1082#discussion_r166157369 --- Diff: distribution/src/assemble/bin.xml --- @@ -345,6 +345,21 @@ 0755 conf + + src/resources/auto-setup.sh + 0755 + bin + + + src/resources/drill-setup.sh + 0755 + conf + + + src/resources/distrib-setup.sh --- End diff -- The `distrib-setup.sh` file is empty, but provided the placeholder to indicate where distributions should make the change. This is identical to the intent of having `distrib-env.sh` in the Apache distribution, which is also empty but serves the same purpose. https://github.com/apache/drill/blob/master/distribution/src/resources/distrib-env.sh Just following the same convention. ---
[GitHub] drill issue #1011: Drill 1170: Drill-on-YARN
Github user sachouche commented on the issue: https://github.com/apache/drill/pull/1011

+1. I have reviewed the code and overall it looks good. My main feedback is that the implementation doesn't currently support secure clusters (at least I didn't see any logic associated with that). YARN applications have issues staying up for a long time because of ticket renewal limitations. We might want to create another enhancement JIRA to support such use cases.

---
[GitHub] drill issue #1107: DRILL-6123: Limit batch size for Merge Join based on memo...
Github user ppadma commented on the issue: https://github.com/apache/drill/pull/1107 @sachouche @ilooner @paul-rogers Can one of you review this PR for me ? ---
[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1101#discussion_r166096630

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
@@ -215,6 +206,7 @@ public BatchHolder() {
         MaterializedField outputField = materializedValueFields[i];
         // Create a type-specific ValueVector for this value
         vector = TypeHelper.getNewVector(outputField, allocator);
+        int columnSize = new RecordBatchSizer.ColumnSize(vector).estSize;

--- End diff --

There is already stdSize, which is kind of doing the same thing. Can we use that instead of knownSize?

---
[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1101#discussion_r166142178

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
@@ -65,6 +70,14 @@
     public int stdSize;

+    /**
+     * If we can determine the exact width of a vector's row upfront,
+     * the row width is saved here. If we cannot determine the exact width
+     * (for example for VarChar or Repeated vectors), then this stays -1.
+     */
+    private int knownSize = -1;

--- End diff --

Like I mentioned in the other comment, it seems like we can just use stdSize.

---
[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1101#discussion_r166098364

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
@@ -140,6 +131,9 @@
   private OperatorContext oContext;
   private BufferAllocator allocator;

+  private Map<Integer, Integer> keySizes;
+  // The size estimates for varchar value columns. The keys are the indexes of the varchar value columns.
+  private Map<Integer, Integer> varcharValueSizes;

--- End diff --

Don't you need to adjust size estimates for repeated types also?

---
[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1101#discussion_r166141274

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
@@ -733,28 +780,32 @@ private void restoreReservedMemory() {
    * @param records
    */
   private void allocateOutgoing(int records) {
-    // Skip the keys and only allocate for outputting the workspace values
-    // (keys will be output through splitAndTransfer)
-    Iterator<VectorWrapper<?>> outgoingIter = outContainer.iterator();
-    for (int i = 0; i < numGroupByOutFields; i++) {
-      outgoingIter.next();
-    }
-
     // try to preempt an OOM by using the reserved memory
     useReservedOutgoingMemory();
     long allocatedBefore = allocator.getAllocatedMemory();
-    while (outgoingIter.hasNext()) {
+    for (int columnIndex = numGroupByOutFields; columnIndex < outContainer.getNumberOfColumns(); columnIndex++) {
+      final VectorWrapper<?> wrapper = outContainer.getValueVector(columnIndex);
       @SuppressWarnings("resource")
-      ValueVector vv = outgoingIter.next().getValueVector();
+      final ValueVector vv = wrapper.getValueVector();

-      AllocationHelper.allocatePrecomputedChildCount(vv, records, maxColumnWidth, 0);
+      final RecordBatchSizer.ColumnSize columnSizer = new RecordBatchSizer.ColumnSize(wrapper.getValueVector());
+      int columnSize;
+
+      if (columnSizer.hasKnownSize()) {
+        // For fixed width vectors we know the size of each record
+        columnSize = columnSizer.getKnownSize();
+      } else {
+        // For var chars we need to use the input estimate
+        columnSize = varcharValueSizes.get(columnIndex);
+      }
+
+      AllocationHelper.allocatePrecomputedChildCount(vv, records, columnSize, 0);

--- End diff --

I think we should also get elementCount from the sizer and use that instead of passing 0.

---
[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1101#discussion_r166142507

--- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java ---
@@ -298,6 +298,11 @@
   public int getPayloadByteCount(int valueCount) {
     return valueCount * ${type.width};
   }

+  @Override
+  public int getValueWidth() {

--- End diff --

If we are using stdSize and getting the value from TypeHelper, we don't need all value vectors to have this new function.

---
[GitHub] drill pull request #1101: DRILL-6032: Made the batch sizing for HashAgg more...
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/1101#discussion_r166136279

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
@@ -226,7 +221,7 @@ public BatchHolder() {
         ((FixedWidthVector) vector).allocateNew(HashTable.BATCH_SIZE);
       } else if (vector instanceof VariableWidthVector) {
         // This case is never used; a varchar falls under ObjectVector, which is allocated on the heap!
-        ((VariableWidthVector) vector).allocateNew(maxColumnWidth, HashTable.BATCH_SIZE);
+        ((VariableWidthVector) vector).allocateNew(columnSize, HashTable.BATCH_SIZE);

--- End diff --

For a just-allocated vector, estSize will return 0. How can we use that for allocation?

---
[GitHub] drill issue #1011: Drill 1170: Drill-on-YARN
Github user priteshm commented on the issue: https://github.com/apache/drill/pull/1011 @sachouche @vrozov @arina-ielchiieva please review ---
[jira] [Created] (DRILL-6137) Join Failure When Some Json File Partitions Empty
Timothy Farkas created DRILL-6137:
---
Summary: Join Failure When Some Json File Partitions Empty
Key: DRILL-6137
URL: https://issues.apache.org/jira/browse/DRILL-6137
Project: Apache Drill
Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas

The following exception can occur when this query is executed:

{code}
select t.p_partkey, t1.ps_suppkey from dfs.`join/empty_part/part` as t
RIGHT JOIN dfs.`join/empty_part/partsupp` as t1 ON t.p_partkey = t1.ps_partkey
where t1.ps_partkey > 1
{code}

* part has one nonempty file 0_0_0.json
* partsupp has one nonempty file 0_0_0.json and one empty file 0_0_1.json

{code}
(java.lang.IllegalStateException) next() [on #10, RemovingRecordBatch] called again after it returned NONE. Caller should not have called next() again.
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():220
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.test.generated.HashJoinProbeGen2.executeProbePhase():119
org.apache.drill.exec.test.generated.HashJoinProbeGen2.probeAndProject():227
org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():222
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():228
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():228
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():228
org.apache.drill.exec.physical.impl.BaseRootExec.next():105
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():93
org.apache.drill.exec.physical.impl.BaseRootExec.next():95
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():233
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():415
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745
{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
Google Hangouts: Lateral Join High Level Design Presentation
Hi All, Aman and Sorabh will be talking about the high level design of lateral join in the next hangout session tomorrow. Since lateral join is a big topic they'll talk about more of the details of the design after Parth comes back in another hangout session. Thanks, Tim
[GitHub] drill issue #1111: Upgrade drill-hive libraries to 2.1.1 version.
Github user priteshm commented on the issue: https://github.com/apache/drill/pull/ @vrozov can you please review this change? ---
RE: PCAP files with Apache Drill and Sergeant R
I don’t think you can (or even want to) directly access them, assuming that the HTTP link you shared is your intended way of accessing the data. Bringing them into Amazon S3 will make it easier to spin up Drill and access the data, and you could even use the 'tmp' workspace or create temporary tables within a Drill session to work on the data without having to repeatedly pull in the raw data from S3.

-----Original Message-----
From: Houssem Hosni [mailto:houssem.ho...@lip6.fr]
Sent: Monday, February 05, 2018 9:44 AM
To: dev@drill.apache.org
Subject: PCAP files with Apache Drill and Sergeant R

Hi,

I am sending this mail with a hope to get some help from you. I am working on making some analysis and prediction models on large pcap files. Can Apache Drill with the R sergeant library help me in this context? Actually the pcap files are so large (MAWI) and they are available on the web (http://mawi.wide.ad.jp/mawi/samplepoint-F/2018/). I want to access them via Apache Drill and then make some analysis using the sergeant package (R), which works well with Drill. Should I bring those large MAWI pcap files to Amazon S3 and then access them with Drill, or is it possible to access them directly without Amazon storage? What steps should I start with?

Special THANKS in advance for considering my request.

Best regards,
Houssem Hosni
LIP6 - Sorbonne University
houssem.ho...@lip6.fr
Place Jussieu, 75005 Paris. Tel: (+0033)0644087200
[GitHub] drill pull request #1104: DRILL-6118: Handle item star columns during projec...
Github user chunhui-shi commented on a diff in the pull request: https://github.com/apache/drill/pull/1104#discussion_r166066830

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/project/ProjectRecordBatch.java ---
@@ -596,10 +596,10 @@ private void classifyExpr(final NamedExpression ex, final RecordBatch incoming,
     final NameSegment ref = ex.getRef().getRootSegment();
     final boolean exprHasPrefix = expr.getPath().contains(StarColumnHelper.PREFIX_DELIMITER);
     final boolean refHasPrefix = ref.getPath().contains(StarColumnHelper.PREFIX_DELIMITER);
-    final boolean exprIsStar = expr.getPath().equals(SchemaPath.WILDCARD);
-    final boolean refContainsStar = ref.getPath().contains(SchemaPath.WILDCARD);
-    final boolean exprContainsStar = expr.getPath().contains(SchemaPath.WILDCARD);
-    final boolean refEndsWithStar = ref.getPath().endsWith(SchemaPath.WILDCARD);
+    final boolean exprIsStar = expr.getPath().equals(SchemaPath.DYNAMIC_STAR);

--- End diff --

Why don't we need to handle the WILDCARD case anymore?

---
[GitHub] drill pull request #1104: DRILL-6118: Handle item star columns during projec...
Github user chunhui-shi commented on a diff in the pull request: https://github.com/apache/drill/pull/1104#discussion_r166094020

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterItemStarReWriterRule.java ---
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableSet;
+import org.apache.calcite.adapter.enumerable.EnumerableTableScan;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.RelOptTable;
+import org.apache.calcite.prepare.RelOptTableImpl;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.CorrelationId;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.core.TableScan;
+import org.apache.calcite.rel.logical.LogicalFilter;
+import org.apache.calcite.rel.logical.LogicalProject;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelDataTypeFactory;
+import org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexInputRef;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexVisitorImpl;
+import org.apache.calcite.schema.Table;
+import org.apache.drill.exec.planner.types.RelDataTypeDrillImpl;
+import org.apache.drill.exec.planner.types.RelDataTypeHolder;
+import org.apache.drill.exec.util.Utilities;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.apache.drill.exec.planner.logical.FieldsReWriterUtil.DesiredField;
+import static org.apache.drill.exec.planner.logical.FieldsReWriterUtil.FieldsReWriter;
+
+/**
+ * Rule will transform filter -> project -> scan call with item star fields in filter
+ * into project -> filter -> project -> scan where item star fields are pushed into scan
+ * and replaced with actual field references.
+ *
+ * This will help partition pruning and push down rules to detect fields that can be pruned or pushed down.
+ * Item star operator appears when sub-select or cte with star are used as source.
+ */
+public class DrillFilterItemStarReWriterRule extends RelOptRule {
+
+  public static final DrillFilterItemStarReWriterRule INSTANCE = new DrillFilterItemStarReWriterRule(
+      RelOptHelper.some(Filter.class, RelOptHelper.some(Project.class, RelOptHelper.any(TableScan.class))),
+      "DrillFilterItemStarReWriterRule");
+
+  private DrillFilterItemStarReWriterRule(RelOptRuleOperand operand, String id) {
+    super(operand, id);
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    Filter filterRel = call.rel(0);
+    Project projectRel = call.rel(1);
+    TableScan scanRel = call.rel(2);
+
+    ItemStarFieldsVisitor itemStarFieldsVisitor = new ItemStarFieldsVisitor(filterRel.getRowType().getFieldNames());

--- End diff --

Other test cases that should be covered: nested field names; references to two different fields under the same parent, e.g. a.b and a.c; and array types referenced in filters and projects.

---
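For illustration, the rewrite described in the javadoc above can be triggered by a query of the following shape, where the star in the sub-select means the outer filter reaches the scan only through an item star call (the workspace, table, and column names here are hypothetical, not taken from the PR's tests):

```sql
-- Hypothetical example: the filter on dir0 goes through the sub-select's
-- star, so it appears in the logical plan as an item star call (roughly
-- ITEM($0, 'dir0')) until the rule rewrites it into a direct field
-- reference that partition pruning can recognize.
SELECT * FROM (SELECT * FROM dfs.tmp.`logs`) WHERE dir0 = '2018'
```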
[GitHub] drill pull request #1111: Upgrade drill-hive libraries to 2.1.1 version.
GitHub user vdiravka opened a pull request: https://github.com/apache/drill/pull/

Upgrade drill-hive libraries to 2.1.1 version.

Updating hive properties for tests and resolving dependency and API conflicts:
* Allow hive-exec to use Hive's own calcite-core and avatica versions. The Calcite version is removed from the root POM Dependency Management.
* Fix for "hive.metastore.schema.verification", MetaException(message: Version information not found in metastore), see https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool. The METASTORE_SCHEMA_VERIFICATION="false" property is added.
* Fix for JSONException class not found (excluded the banned org.json dependency).
* Added the METASTORE_AUTO_CREATE_ALL="true" property to tests, because some additional tables are necessary in the Hive metastore.
* Disabled Calcite CBO (Hive's CalcitePlanner) for tests, because it conflicts with Drill's Calcite version in Drill unit tests: HIVE_CBO_ENABLED="false" property.
* jackson and parquet libraries are relocated in the hive-exec-shade module.
* The Drill version of org.apache.parquet:parquet-column is added to "hive-exec" to allow using a Parquet empty group on the MessageType level (PARQUET-278).
* Removed the commons-codec exclusion from hive core. This dependency is necessary for hive-exec and hive-metastore.
* Set Hive's internal properties for transactional scan (HiveConf.HIVE_TRANSACTIONAL_TABLE_SCAN) and for schema evolution (HiveConf.HIVE_SCHEMA_EVOLUTION, IOConstants.SCHEMA_EVOLUTION_COLUMNS, IOConstants.SCHEMA_EVOLUTION_COLUMNS_TYPES).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vdiravka/drill DRILL-5978

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #

commit 476c44cce7a38ff818c5e328e3db61773bab18a3
Author: Vitalii Diravka
Date: 2017-11-13T16:04:03Z

    Upgrade drill-hive libraries to 2.1.1 version.

---
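The relocation of the jackson and parquet libraries mentioned above is done with the maven-shade-plugin in the hive-exec-shade module; a minimal sketch of what such a relocation declaration looks like (the pattern and shadedPattern values are illustrative assumptions, not copied from the PR):

```xml
<!-- Sketch of maven-shade-plugin relocations; the shaded prefixes
     shown are illustrative, not the PR's actual values. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <pattern>com.fasterxml.jackson</pattern>
        <shadedPattern>hive.com.fasterxml.jackson</shadedPattern>
      </relocation>
      <relocation>
        <pattern>org.apache.parquet</pattern>
        <shadedPattern>hive.org.apache.parquet</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

Relocating the packages lets hive-exec carry its own copies of these libraries without clashing with the versions Drill itself depends on.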
PCAP files with Apache Drill and Sergeant R
Hi,

I am sending this mail in the hope of getting some help from you. I am working on analysis and prediction models over large pcap files. Can Apache Drill with the R Sergeant library help me in this context? The pcap files are very large (MAWI) and they are available on the web (http://mawi.wide.ad.jp/mawi/samplepoint-F/2018/). I want to access them via Apache Drill and then do some analysis using the Sergeant package (R), which works well with Drill. Should I bring those large MAWI pcap files to Amazon S3 and then access them with Drill, or is it possible to access them directly without Amazon storage? What steps should I start with?

Special thanks in advance for considering my request.

Best regards,
Houssem Hosni
LIP6 - Sorbonne University
houssem.ho...@lip6.fr
Place Jussieu, 75005 Paris.
Tel: (+0033)0644087200
[jira] [Created] (DRILL-6136) drill-jdbc-all jar missing dependencies
Craig Foote created DRILL-6136:
--
Summary: drill-jdbc-all jar missing dependencies
Key: DRILL-6136
URL: https://issues.apache.org/jira/browse/DRILL-6136
Project: Apache Drill
Issue Type: Bug
Components: Client - JDBC
Affects Versions: 1.12.0
Reporter: Craig Foote

Using drill-jdbc-all-1.12.0.jar with logstash (elasticsearch ingester) returns NoClassDefFoundError for oadd.org.apache.drill.exec.store.StoragePluginRegistry.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6135) New Feature: SHOW CREATE VIEW command
Hari Sekhon created DRILL-6135:
--
Summary: New Feature: SHOW CREATE VIEW command
Key: DRILL-6135
URL: https://issues.apache.org/jira/browse/DRILL-6135
Project: Apache Drill
Issue Type: New Feature
Components: Metadata, Storage - Information Schema
Affects Versions: 1.10.0
Environment: MapR 5.2 + Kerberos
Reporter: Hari Sekhon

Feature Request to implement
{code:java}
SHOW CREATE VIEW ;{code}
A colleague and I just had to cat the view file, which is unformatted JSON; a large view creation statement is hard to read that way, when it could have been presented formatted in the Drill shell.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
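Until such a command exists, one workaround is to pretty-print the stored view definition, since Drill persists a view as single-line JSON in a <view_name>.view.drill file in the workspace directory. A minimal sketch (the file path and view contents below are illustrative, not taken from a real cluster):

```shell
# Simulate a stored Drill view definition (single-line JSON, as Drill
# writes it into <view_name>.view.drill) and pretty-print it with the
# Python stdlib JSON formatter. Path and contents are illustrative.
printf '{"name":"my_view","sql":"SELECT c1, c2 FROM dfs.tmp.`t`","workspaceSchemaPath":["dfs","tmp"]}' > /tmp/my_view.view.drill
python3 -m json.tool < /tmp/my_view.view.drill
```

The same pipe works against a real view file on the cluster's filesystem, turning the one-line JSON into an indented, readable form.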
[jira] [Created] (DRILL-6134) Many Drill queries fail when using JDBC Driver from Simba
Robert Hou created DRILL-6134:
--
Summary: Many Drill queries fail when using JDBC Driver from Simba
Key: DRILL-6134
URL: https://issues.apache.org/jira/browse/DRILL-6134
Project: Apache Drill
Issue Type: Bug
Reporter: Robert Hou
Assignee: Pritesh Maker

Here is an example. Query: /root/drillAutomation/framework-master/framework/resources/Functional/limit0/union/data/union_51.q
{noformat}
(SELECT c2 FROM `union_01_v` ORDER BY c5 DESC nulls first)
UNION
(SELECT c2 FROM `union_02_v` ORDER BY c5 ASC nulls first)
{noformat}
This is the error:
{noformat}
Exception: java.sql.SQLException: [JDBC Driver]The field c2(BIGINT:OPTIONAL) [$bits$(UINT1:REQUIRED), $values$(BIGINT:OPTIONAL)] doesn't match the provided metadata major_type { minor_type: BIGINT mode: OPTIONAL } name_part { name: "$values$" } value_count: 18 buffer_length: 144 .
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:145)
    at org.apache.drill.exec.vector.BigIntVector.load(BigIntVector.java:287)
    at org.apache.drill.exec.vector.NullableBigIntVector.load(NullableBigIntVector.java:274)
    at org.apache.drill.exec.record.RecordBatchLoader.load(RecordBatchLoader.java:131)
    at com.mapr.drill.drill.dataengine.DRJDBCResultSet.doLoadRecordBatchData(Unknown Source)
    at com.mapr.drill.drill.dataengine.DRJDBCResultSet.hasMoreRows(Unknown Source)
    at com.mapr.drill.drill.dataengine.DRJDBCResultSet.doMoveToNextRow(Unknown Source)
    at com.mapr.drill.jdbc.common.CommonResultSet.moveToNextRow(Unknown Source)
    at com.mapr.drill.jdbc.common.SForwardResultSet.next(Unknown Source)
    at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:255)
    at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: The field c2(BIGINT:OPTIONAL) [$bits$(UINT1:REQUIRED), $values$(BIGINT:OPTIONAL)] doesn't match the provided metadata major_type { minor_type: BIGINT mode: OPTIONAL } name_part { name: "$values$" } value_count: 18 buffer_length: 144 .
    ... 16 more
{noformat}
The commit that causes these errors to occur is:
{noformat}
https://issues.apache.org/jira/browse/DRILL-6049 Rollup of hygiene changes from "batch size" project
commit ID e791ed62b1c91c39676c4adef438c689fd84fd4b
{noformat}
--
This message was sent by Atlassian JIRA (v7.6.3#76005)