[jira] [Commented] (DRILL-4530) Improve metadata cache performance for queries with single partition

2016-06-16 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335374#comment-15335374
 ] 

Aman Sinha commented on DRILL-4530:
---

I got the performance numbers for the query2 scenario (see description), 
where the table contains 400K files. 
{noformat}
query2:
  current master branch (without the patch): 60 seconds
  with the patch: 3 seconds  (this time is comparable to that of query1)
{noformat}

  

> Improve metadata cache performance for queries with single partition 
> -
>
> Key: DRILL-4530
> URL: https://issues.apache.org/jira/browse/DRILL-4530
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 1.6.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
> Fix For: 1.7.0
>
>
> Consider two types of queries which are run with Parquet metadata caching: 
> {noformat}
> query 1:
> SELECT col FROM  `A/B/C`;
> query 2:
> SELECT col FROM `A` WHERE dir0 = 'B' AND dir1 = 'C';
> {noformat}
> For a certain dataset, the query1 elapsed time is 1 sec whereas query2 
> elapsed time is 9 sec even though both are accessing the same amount of data. 
>  The user expectation is that they should perform roughly the same.  The main 
> difference comes from reading the bigger metadata cache file at the root 
> level 'A' for query2 and then applying the partitioning filter.  query1 reads 
> a much smaller metadata cache file at the subdirectory level. 
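The improvement described here reads the metadata cache file of the deepest subdirectory selected by the partition filter, instead of the big root-level file. A minimal sketch of that path-resolution idea (class and method names are hypothetical, not Drill's actual code):

```java
import java.util.List;

// Hypothetical sketch (not Drill's actual implementation): given a table root
// and the directory values pinned by equality filters (dir0 = 'B',
// dir1 = 'C', ...), resolve the deepest subdirectory whose smaller metadata
// cache file can be read instead of the root-level one.
class SinglePartitionCacheResolver {
  static final String CACHE_FILE = ".drill.parquet_metadata";

  // dirValues holds the pinned values in order (dir0, dir1, ...); a null
  // entry means that level is not pinned by the filter, so descent stops.
  static String resolveCachePath(String tableRoot, List<String> dirValues) {
    StringBuilder path = new StringBuilder(tableRoot);
    for (String v : dirValues) {
      if (v == null) {
        break; // the filter no longer selects a single partition
      }
      path.append('/').append(v);
    }
    return path + "/" + CACHE_FILE;
  }
}
```

With the filter `dir0 = 'B' AND dir1 = 'C'`, the resolver returns `A/B/C/.drill.parquet_metadata`, which matches the observation that query1 and query2 should then read the same small cache file.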



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335370#comment-15335370
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user ssriniva123 commented on the issue:

https://github.com/apache/drill/pull/518
  
I have also squashed my commits


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
> Fix For: 1.7.0
>
>
> Currently a Drill query terminates upon the first encounter of an invalid 
> JSON line. Drill should continue progressing after ignoring the bad records. 
> Something similar to an (ignore.malformed.json) setting would help.
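The requested behavior — skip a bad record when a flag such as ignore.malformed.json is set, instead of failing the whole query — boils down to a per-record try/catch. The sketch below uses a toy parser and hypothetical names; it stands in for Drill's JSON reader rather than reproducing it:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of skip-on-malformed-record reading: each line is parsed
// independently; a parse failure aborts processing unless skipping is
// enabled, in which case the bad line is dropped and reading continues.
class SkippingJsonReader {
  static List<String> readAll(List<String> lines, boolean skipInvalid) {
    List<String> records = new ArrayList<>();
    for (String line : lines) {
      try {
        records.add(parse(line));
      } catch (IllegalArgumentException e) {
        if (!skipInvalid) {
          throw e; // old behavior: the first bad line kills the query
        }
        // skipInvalid == true: ignore this record and keep going
      }
    }
    return records;
  }

  // Toy "parser": accepts only lines that look like a JSON object.
  private static String parse(String line) {
    String t = line.trim();
    if (!t.startsWith("{") || !t.endsWith("}")) {
      throw new IllegalArgumentException("malformed JSON: " + line);
    }
    return t;
  }
}
```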



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4525) Query with BETWEEN clause on Date and Timestamp values fails with Validation Error

2016-06-16 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-4525:
--
Fix Version/s: (was: 1.7.0)
   1.8.0

> Query with BETWEEN clause on Date and Timestamp values fails with Validation 
> Error
> --
>
> Key: DRILL-4525
> URL: https://issues.apache.org/jira/browse/DRILL-4525
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Abhishek Girish
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.8.0
>
>
> Query: (simplified variant of TPC-DS Query37)
> {code}
> SELECT
>*
> FROM   
>date_dim
> WHERE   
>d_date BETWEEN Cast('1999-03-06' AS DATE) AND  (
>   Cast('1999-03-06' AS DATE) + INTERVAL '60' day)
> LIMIT 10;
> {code}
> Error:
> {code}
> Error: VALIDATION ERROR: From line 6, column 8 to line 7, column 64: Cannot 
> apply 'BETWEEN ASYMMETRIC' to arguments of type '<DATE> BETWEEN ASYMMETRIC 
> <DATE> AND <TIMESTAMP(0)>'. Supported form(s): '<COMPARABLE_TYPE> BETWEEN 
> <COMPARABLE_TYPE> AND <COMPARABLE_TYPE>'
> SQL Query null
> [Error Id: 223fb37c-f561-4a37-9283-871dc6f4d6d0 on abhi2:31010] 
> (state=,code=0)
> {code}
> This is a regression from 1.6.0. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4525) Query with BETWEEN clause on Date and Timestamp values fails with Validation Error

2016-06-16 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335365#comment-15335365
 ] 

Aman Sinha commented on DRILL-4525:
---

I had a discussion with [~jni] and agree that the preferred fix should be in 
Calcite; given that, I am removing the 1.7.0 tag. 

> Query with BETWEEN clause on Date and Timestamp values fails with Validation 
> Error
> --
>
> Key: DRILL-4525
> URL: https://issues.apache.org/jira/browse/DRILL-4525
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Abhishek Girish
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.7.0
>
>
> Query: (simplified variant of TPC-DS Query37)
> {code}
> SELECT
>*
> FROM   
>date_dim
> WHERE   
>d_date BETWEEN Cast('1999-03-06' AS DATE) AND  (
>   Cast('1999-03-06' AS DATE) + INTERVAL '60' day)
> LIMIT 10;
> {code}
> Error:
> {code}
> Error: VALIDATION ERROR: From line 6, column 8 to line 7, column 64: Cannot 
> apply 'BETWEEN ASYMMETRIC' to arguments of type '<DATE> BETWEEN ASYMMETRIC 
> <DATE> AND <TIMESTAMP(0)>'. Supported form(s): '<COMPARABLE_TYPE> BETWEEN 
> <COMPARABLE_TYPE> AND <COMPARABLE_TYPE>'
> SQL Query null
> [Error Id: 223fb37c-f561-4a37-9283-871dc6f4d6d0 on abhi2:31010] 
> (state=,code=0)
> {code}
> This is a regression from 1.6.0. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4525) Query with BETWEEN clause on Date and Timestamp values fails with Validation Error

2016-06-16 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335128#comment-15335128
 ] 

Jinfeng Ni commented on DRILL-4525:
---

I agree with what [~julianhyde] suggested: the change should be in Calcite. 
Opened a Calcite JIRA: https://issues.apache.org/jira/browse/CALCITE-1296

If we adopt the proposal in this PR, Drill has to add a new wrapper and 
maintain that code going forward. Fixing this in Calcite seems to be the 
better approach.

Given this, we may not be able to fix this issue in 1.7.0. However, there is 
a workaround: add an explicit cast so that the query passes the check 
enforced by Calcite.
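For reference, the workaround amounts to casting the upper bound back to DATE so that both BETWEEN operands have the same type. A sketch of the rewritten query (untested, for illustration only):

{code}
SELECT *
FROM   date_dim
WHERE  d_date BETWEEN Cast('1999-03-06' AS DATE)
              AND     Cast(Cast('1999-03-06' AS DATE) + INTERVAL '60' day AS DATE)
LIMIT 10;
{code}

Here the DATE + INTERVAL expression, which would otherwise be typed as TIMESTAMP, is explicitly cast back to DATE so the operand types agree.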



> Query with BETWEEN clause on Date and Timestamp values fails with Validation 
> Error
> --
>
> Key: DRILL-4525
> URL: https://issues.apache.org/jira/browse/DRILL-4525
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Abhishek Girish
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.7.0
>
>
> Query: (simplified variant of TPC-DS Query37)
> {code}
> SELECT
>*
> FROM   
>date_dim
> WHERE   
>d_date BETWEEN Cast('1999-03-06' AS DATE) AND  (
>   Cast('1999-03-06' AS DATE) + INTERVAL '60' day)
> LIMIT 10;
> {code}
> Error:
> {code}
> Error: VALIDATION ERROR: From line 6, column 8 to line 7, column 64: Cannot 
> apply 'BETWEEN ASYMMETRIC' to arguments of type '<DATE> BETWEEN ASYMMETRIC 
> <DATE> AND <TIMESTAMP(0)>'. Supported form(s): '<COMPARABLE_TYPE> BETWEEN 
> <COMPARABLE_TYPE> AND <COMPARABLE_TYPE>'
> SQL Query null
> [Error Id: 223fb37c-f561-4a37-9283-871dc6f4d6d0 on abhi2:31010] 
> (state=,code=0)
> {code}
> This is a regression from 1.6.0. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4715) Java compilation error for a query with large number of expressions

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335075#comment-15335075
 ] 

ASF GitHub Bot commented on DRILL-4715:
---

Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/521
  
Just to be sure, I confirmed with @jinfengni that the new generated code 
would look like the following (assume 100 expressions in the Project and a 
limit of 50 expressions per block):

{noformat}
doEval() {
  {
    generated code for 50 expressions (by default)
  }
  doEval1();
}
doEval1() {
  {
    50 expressions
  }
  // doEval2();  // if needed... each doEval_N() calls doEval_N+1()
}
{noformat}
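The chaining scheme discussed in this comment can be written as a small source emitter. Everything below (class name, `emit` signature) is illustrative, not Drill's actual codegen:

```java
import java.util.List;

// Sketch of the splitting scheme: expressions are emitted into doEval(),
// doEval1(), doEval2(), ... with at most maxPerBlock expressions each, and
// each method ends by calling the next one, so no single generated method
// can exceed the JVM's 64KB per-method bytecode limit.
class ChainedMethodEmitter {
  static String emit(List<String> exprs, int maxPerBlock) {
    StringBuilder src = new StringBuilder();
    int methods = Math.max(1, (exprs.size() + maxPerBlock - 1) / maxPerBlock);
    for (int m = 0; m < methods; m++) {
      src.append(m == 0 ? "void doEval() {\n" : "void doEval" + m + "() {\n");
      int from = m * maxPerBlock;
      int to = Math.min(exprs.size(), from + maxPerBlock);
      for (int i = from; i < to; i++) {
        src.append("  ").append(exprs.get(i)).append(";\n");
      }
      if (m + 1 < methods) {
        // chain to the next block so evaluation order is preserved
        src.append("  doEval").append(m + 1).append("();\n");
      }
      src.append("}\n");
    }
    return src.toString();
  }
}
```

With 3 expressions and a limit of 2 per block, this emits `doEval()` holding two expressions plus a call to `doEval1()`, which holds the remaining one.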


> Java compilation error for a query with large number of expressions
> ---
>
> Key: DRILL-4715
> URL: https://issues.apache.org/jira/browse/DRILL-4715
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Reporter: Jinfeng Ni
>
> The following query would hit compilation error, when Drill generates and 
> compiles the run-time code. 
> Q1 :
> {code}
> select expr1, expr2, expr3, ..., exprN
> from table
> {code} 
> In Q1, expr1, expr2, ..., exprN are non-trivial expressions (instead of 
> simple column references). When N is big enough, the run-time generated 
> code may have a method which goes beyond the 64KB limit imposed by the JVM. 
> This seems to be a regression from DRILL-3912 (Common subexpression 
> elimination). CSE tries to put as many expressions in one block as possible, 
> to detect and eliminate as many common subexpressions as possible. However, 
> this implies we may end up with a big method that fails to compile.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4715) Java compilation error for a query with large number of expressions

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335061#comment-15335061
 ] 

ASF GitHub Bot commented on DRILL-4715:
---

Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/521
  
I had couple of minor comments.  Changes LGTM.  +1.


> Java compilation error for a query with large number of expressions
> ---
>
> Key: DRILL-4715
> URL: https://issues.apache.org/jira/browse/DRILL-4715
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Reporter: Jinfeng Ni
>
> The following query would hit compilation error, when Drill generates and 
> compiles the run-time code. 
> Q1 :
> {code}
> select expr1, expr2, expr3, ..., exprN
> from table
> {code} 
> In Q1, expr1, expr2, ..., exprN are non-trivial expressions (instead of 
> simple column references). When N is big enough, the run-time generated 
> code may have a method which goes beyond the 64KB limit imposed by the JVM. 
> This seems to be a regression from DRILL-3912 (Common subexpression 
> elimination). CSE tries to put as many expressions in one block as possible, 
> to detect and eliminate as many common subexpressions as possible. However, 
> this implies we may end up with a big method that fails to compile.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4715) Java compilation error for a query with large number of expressions

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334986#comment-15334986
 ] 

ASF GitHub Bot commented on DRILL-4715:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/521#discussion_r67442811
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/SizedJBlock.java ---
@@ -0,0 +1,48 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.expr;
+
+import com.sun.codemodel.JBlock;
+
+/**
+ * Uses this class to keep track # of Drill Logical Expressions that are
+ * put to JBlock.
+ */
+public class SizedJBlock {
+  private final JBlock block;
--- End diff --

Can you add a comment about why the JBlock is a member variable rather than 
SizedJBlock being a subclass of JBlock? (JBlock is a final class, but that is 
not obvious when looking at this code.) 


> Java compilation error for a query with large number of expressions
> ---
>
> Key: DRILL-4715
> URL: https://issues.apache.org/jira/browse/DRILL-4715
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Reporter: Jinfeng Ni
>
> The following query would hit compilation error, when Drill generates and 
> compiles the run-time code. 
> Q1 :
> {code}
> select expr1, expr2, expr3, ..., exprN
> from table
> {code} 
> In Q1, expr1, expr2, ..., exprN are non-trivial expressions (instead of 
> simple column references). When N is big enough, the run-time generated 
> code may have a method which goes beyond the 64KB limit imposed by the JVM. 
> This seems to be a regression from DRILL-3912 (Common subexpression 
> elimination). CSE tries to put as many expressions in one block as possible, 
> to detect and eliminate as many common subexpressions as possible. However, 
> this implies we may end up with a big method that fails to compile.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-4236) ExternalSort should use the new allocator functionality to better manage it's memory usage

2016-06-16 Thread Kunal Khatua (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua closed DRILL-4236.
---
Assignee: Kunal Khatua  (was: Deneche A. Hakim)

Verified on Drill 1.6.0.

> ExternalSort should use the new allocator functionality to better manage it's 
> memory usage
> --
>
> Key: DRILL-4236
> URL: https://issues.apache.org/jira/browse/DRILL-4236
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.5.0
>Reporter: Deneche A. Hakim
>Assignee: Kunal Khatua
>  Labels: documentation
> Fix For: 1.5.0
>
>
> Before DRILL-4215, the sort operator wasn't able to correctly compute its 
> memory usage, and so it jumped through a bunch of hoops to try to figure out 
> when it should spill to disk.
> With the transfer accounting in place, this code can be greatly simplified to 
> just use the current operator memory allocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4236) ExternalSort should use the new allocator functionality to better manage it's memory usage

2016-06-16 Thread Kunal Khatua (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334890#comment-15334890
 ] 

Kunal Khatua commented on DRILL-4236:
-

I set direct memory to 10GB and altered the session by setting 
`planner.memory.max_query_memory_per_node` = 21474836 (~20MB).
The data begins to spill to disk with no significant increase in the Drillbit 
memory footprint. 
Verified on Drill 1.6.0.
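The allocator-driven spill decision being verified here can be illustrated with a toy buffer. The accounting and all names below are hypothetical, not the ExternalSort implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of allocator-driven spilling: instead of heuristics, the operator
// compares its current allocation against its memory limit and spills the
// in-memory batches when the next batch would exceed it.
class SpillingBuffer {
  private final long memoryLimitBytes;
  private long allocatedBytes = 0;
  private final List<long[]> inMemory = new ArrayList<>();
  private int spillCount = 0;

  SpillingBuffer(long memoryLimitBytes) {
    this.memoryLimitBytes = memoryLimitBytes;
  }

  void add(long[] batch) {
    long size = batch.length * 8L; // 8 bytes per long
    if (allocatedBytes + size > memoryLimitBytes) {
      spillToDisk();
    }
    inMemory.add(batch);
    allocatedBytes += size;
  }

  private void spillToDisk() {
    // Real code would sort and write the batches out; here we just drop
    // them and reset the accounting.
    inMemory.clear();
    allocatedBytes = 0;
    spillCount++;
  }

  int spillCount() { return spillCount; }
}
```

Lowering the memory limit (as done above with `planner.memory.max_query_memory_per_node`) simply makes the spill condition trigger earlier, without growing the process footprint.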

> ExternalSort should use the new allocator functionality to better manage it's 
> memory usage
> --
>
> Key: DRILL-4236
> URL: https://issues.apache.org/jira/browse/DRILL-4236
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.5.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>  Labels: documentation
> Fix For: 1.5.0
>
>
> Before DRILL-4215, the sort operator wasn't able to correctly compute its 
> memory usage, and so it jumped through a bunch of hoops to try to figure out 
> when it should spill to disk.
> With the transfer accounting in place, this code can be greatly simplified to 
> just use the current operator memory allocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334765#comment-15334765
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user ssriniva123 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67431369
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +191,33 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
-//Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
-//  logger.debug("Wrote record.");
-  recordCount++;
-}else{
-//  logger.debug("Exiting.");
-  break outside;
-}
-
+outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+try
+  {
+writer.setPosition(recordCount);
--- End diff --

Aman,
maven checkstyle:checkstyle did not report any errors before I did my last
check-in. I have changed the code to use 2 spaces for indentation.

On Thu, Jun 16, 2016 at 2:22 PM, Aman Sinha 
wrote:

> In
> 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
> :
>
> > -  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
> > -writer.setPosition(recordCount);
> > -write = jsonReader.write(writer);
> > -
> > -if(write == ReadState.WRITE_SUCCEED) {
> > -//  logger.debug("Wrote record.");
> > -  recordCount++;
> > -}else{
> > -//  logger.debug("Exiting.");
> > -  break outside;
> > -}
> > -
> > +outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
> > +try
> > +  {
> > +writer.setPosition(recordCount);
>
> seems this is still doing indent of 4. We use 2 spaces (see
> https://drill.apache.org/docs/apache-drill-contribution-guidelines/
> scroll down to Step 2). Did it pass the mvn command line build without
> checkstyle violations ?
>



> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
> Fix For: 1.7.0
>
>
> Currently a Drill query terminates upon the first encounter of an invalid 
> JSON line. Drill should continue progressing after ignoring the bad records. 
> Something similar to an (ignore.malformed.json) setting would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-3845) PartitionSender doesn't send last batch for receivers that already terminated

2016-06-16 Thread Kunal Khatua (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334737#comment-15334737
 ] 

Kunal Khatua edited comment on DRILL-3845 at 6/16/16 9:36 PM:
--

Marking as closed based on DRILL-2274 status


was (Author: kkhatua):
Marking as closed based on https://issues.apache.org/jira/browse/DRILL-2274 
status

> PartitionSender doesn't send last batch for receivers that already terminated
> -
>
> Key: DRILL-3845
> URL: https://issues.apache.org/jira/browse/DRILL-3845
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Kunal Khatua
> Fix For: 1.5.0
>
> Attachments: 29c45a5b-e2b9-72d6-89f2-d49ba88e2939.sys.drill
>
>
> Even if a receiver has finished and informed the corresponding partition 
> sender, the sender will still try to send a "last batch" to the receiver when 
> it's done. In most cases this is fine, as those batches will be silently 
> dropped by the receiving DataServer, but if a receiver finished more than 10 
> minutes ago, DataServer will throw an exception because it cannot find the 
> corresponding FragmentManager (WorkEventBus has a 10-minute recentlyFinished 
> cache).
> DRILL-2274 is a reproduction for this case (after the corresponding fix is 
> applied).
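The 10-minute recentlyFinished window described in this issue can be sketched as a simple time-stamped lookup (a simplified model with hypothetical names, not the actual WorkEventBus code):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the behavior above: a fragment that finished recently
// is remembered for a window (10 minutes in Drill), so a late "last batch"
// addressed to it can be dropped silently; after the window expires the
// lookup fails and the receiving server raises an error instead.
class RecentlyFinishedCache {
  static final long WINDOW_MILLIS = 10 * 60 * 1000L;
  private final Map<String, Long> finishedAt = new HashMap<>();

  void markFinished(String fragmentId, long nowMillis) {
    finishedAt.put(fragmentId, nowMillis);
  }

  // true  -> drop the late batch silently (finished inside the window)
  // false -> no record of the fragment; the caller reports an error
  boolean shouldDropSilently(String fragmentId, long nowMillis) {
    Long t = finishedAt.get(fragmentId);
    return t != null && nowMillis - t <= WINDOW_MILLIS;
  }
}
```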



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3845) PartitionSender doesn't send last batch for receivers that already terminated

2016-06-16 Thread Kunal Khatua (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334737#comment-15334737
 ] 

Kunal Khatua commented on DRILL-3845:
-

Marking as closed based on https://issues.apache.org/jira/browse/DRILL-2274 
status

> PartitionSender doesn't send last batch for receivers that already terminated
> -
>
> Key: DRILL-3845
> URL: https://issues.apache.org/jira/browse/DRILL-3845
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Kunal Khatua
> Fix For: 1.5.0
>
> Attachments: 29c45a5b-e2b9-72d6-89f2-d49ba88e2939.sys.drill
>
>
> Even if a receiver has finished and informed the corresponding partition 
> sender, the sender will still try to send a "last batch" to the receiver when 
> it's done. In most cases this is fine, as those batches will be silently 
> dropped by the receiving DataServer, but if a receiver finished more than 10 
> minutes ago, DataServer will throw an exception because it cannot find the 
> corresponding FragmentManager (WorkEventBus has a 10-minute recentlyFinished 
> cache).
> DRILL-2274 is a reproduction for this case (after the corresponding fix is 
> applied).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (DRILL-3845) PartitionSender doesn't send last batch for receivers that already terminated

2016-06-16 Thread Kunal Khatua (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua closed DRILL-3845.
---

Marking as closed based on DRILL-2274 status

> PartitionSender doesn't send last batch for receivers that already terminated
> -
>
> Key: DRILL-3845
> URL: https://issues.apache.org/jira/browse/DRILL-3845
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Kunal Khatua
> Fix For: 1.5.0
>
> Attachments: 29c45a5b-e2b9-72d6-89f2-d49ba88e2939.sys.drill
>
>
> Even if a receiver has finished and informed the corresponding partition 
> sender, the sender will still try to send a "last batch" to the receiver when 
> it's done. In most cases this is fine, as those batches will be silently 
> dropped by the receiving DataServer, but if a receiver finished more than 10 
> minutes ago, DataServer will throw an exception because it cannot find the 
> corresponding FragmentManager (WorkEventBus has a 10-minute recentlyFinished 
> cache).
> DRILL-2274 is a reproduction for this case (after the corresponding fix is 
> applied).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3845) PartitionSender doesn't send last batch for receivers that already terminated

2016-06-16 Thread Kunal Khatua (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua reassigned DRILL-3845:
---

Assignee: Kunal Khatua  (was: Deneche A. Hakim)

> PartitionSender doesn't send last batch for receivers that already terminated
> -
>
> Key: DRILL-3845
> URL: https://issues.apache.org/jira/browse/DRILL-3845
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Deneche A. Hakim
>Assignee: Kunal Khatua
> Fix For: 1.5.0
>
> Attachments: 29c45a5b-e2b9-72d6-89f2-d49ba88e2939.sys.drill
>
>
> Even if a receiver has finished and informed the corresponding partition 
> sender, the sender will still try to send a "last batch" to the receiver when 
> it's done. In most cases this is fine as those batches will be silently 
> dropped by the receiving DataServer, but if a receiver has finished +10 
> minutes ago, DataServer will throw an exception as it couldn't find the 
> corresponding FragmentManager (WorkEventBus has a 10 minutes recentlyFinished 
> cache).
> DRILL-2274 is a reproduction for this case (after the corresponding fix is 
> applied).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-4653:
--
Fix Version/s: 1.7.0

> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
> Fix For: 1.7.0
>
>
> Currently a Drill query terminates upon the first encounter of an invalid 
> JSON line. Drill should continue progressing after ignoring the bad records. 
> Something similar to an (ignore.malformed.json) setting would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334716#comment-15334716
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67426795
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -179,4 +180,43 @@ public void testNestedFilter() throws Exception {
 .sqlBaselineQuery(baselineQuery)
 .go();
   }
+
+ @Test // See DRILL-4653
+public void testSkippingInvalidJSONRecords() throws Exception {
+try
+{
+String set = "alter session set `" + 
ExecConstants.JSON_READER_SKIP_INVALID_RECORDS_FLAG+ "` = true";
--- End diff --

these should be indented inside the try block with 2 spaces.   It is best 
to set the indent level in your IDE (I can help with Eclipse if you are using 
it;  if you are using IntelliJ I can find out from other developers using 
IntelliJ). 


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid 
> JSON line. Drill should continue progressing after ignoring the bad records. 
> Something similar to an (ignore.malformed.json) setting would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334713#comment-15334713
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67426381
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +191,33 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
-//Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
-//  logger.debug("Wrote record.");
-  recordCount++;
-}else{
-//  logger.debug("Exiting.");
-  break outside;
-}
-
+outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+try
+  {
+writer.setPosition(recordCount);
--- End diff --

seems this is still doing indent of 4.  We use 2 spaces (see 
https://drill.apache.org/docs/apache-drill-contribution-guidelines/   scroll 
down to Step 2).   Did it pass the mvn command line build without checkstyle 
violations ? 


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid 
> JSON line. Drill should continue progressing after ignoring the bad records. 
> Something similar to an (ignore.malformed.json) setting would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334706#comment-15334706
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on the issue:

https://github.com/apache/drill/pull/518
  
Looks much better.  Sorry for the nitpick, but I still have a couple more 
comments related to the coding conventions. :)  Also, could you squash the 
commits into 1 and use the DRILL-: format for the commit message?  
Thanks!


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid 
> JSON line. Drill should continue progressing after ignoring the bad records. 
> Something similar to an (ignore.malformed.json) setting would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4525) Query with BETWEEN clause on Date and Timestamp values fails with Validation Error

2016-06-16 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-4525:
--
Fix Version/s: 1.7.0

> Query with BETWEEN clause on Date and Timestamp values fails with Validation 
> Error
> --
>
> Key: DRILL-4525
> URL: https://issues.apache.org/jira/browse/DRILL-4525
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Abhishek Girish
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.7.0
>
>
> Query: (simplified variant of TPC-DS Query37)
> {code}
> SELECT
>*
> FROM   
>date_dim
> WHERE   
>d_date BETWEEN Cast('1999-03-06' AS DATE) AND  (
>   Cast('1999-03-06' AS DATE) + INTERVAL '60' day)
> LIMIT 10;
> {code}
> Error:
> {code}
> Error: VALIDATION ERROR: From line 6, column 8 to line 7, column 64: Cannot 
> apply 'BETWEEN ASYMMETRIC' to arguments of type '<DATE> BETWEEN ASYMMETRIC 
> <DATE> AND <TIMESTAMP(0)>'. Supported form(s): '<COMPARABLE_TYPE> BETWEEN 
> <COMPARABLE_TYPE> AND <COMPARABLE_TYPE>'
> SQL Query null
> [Error Id: 223fb37c-f561-4a37-9283-871dc6f4d6d0 on abhi2:31010] 
> (state=,code=0)
> {code}
> This is a regression from 1.6.0. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4727) Exclude netty from HBase Client's transitive dependencies

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334668#comment-15334668
 ] 

ASF GitHub Bot commented on DRILL-4727:
---

GitHub user adityakishore opened a pull request:

https://github.com/apache/drill/pull/525

DRILL-4727: Exclude netty from HBase Client's transitive dependencies

Excluded `netty-all` from the list of transitive dependencies pulled by 
`hbase-client`.
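A Maven exclusion of this shape is the standard way to drop a transitive dependency; the snippet below is a sketch of the idea, not the exact diff in the pull request (the `${hbase.version}` property is assumed):

```xml
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>${hbase.version}</version>
  <exclusions>
    <!-- keep hbase-client from pulling in the netty uber jar,
         which conflicts with Drill's individual netty modules -->
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```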

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adityakishore/drill DRILL-4727

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/525.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #525


commit 3512043873a5b2a26f48a4005496afedd8da03b4
Author: Aditya Kishore 
Date:   2016-06-16T20:39:21Z

DRILL-4727: Exclude netty from HBase Client's transitive dependencies




> Exclude netty from HBase Client's transitive dependencies
> -
>
> Key: DRILL-4727
> URL: https://issues.apache.org/jira/browse/DRILL-4727
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.7.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
>
> Reported on dev/user list after moving to HBase 1.1
> {noformat}
> Hi Aditya,
> I tested the latest version and got this exception and the drillbit fail to 
> startup .
> Exception in thread "main" java.lang.NoSuchMethodError: 
> io.netty.util.UniqueName.(Ljava/lang/String;)V
> at io.netty.channel.ChannelOption.(ChannelOption.java:136)
> at io.netty.channel.ChannelOption.valueOf(ChannelOption.java:99)
> at io.netty.channel.ChannelOption.(ChannelOption.java:42)
> at org.apache.drill.exec.rpc.BasicServer.(BasicServer.java:63)
> at 
> org.apache.drill.exec.rpc.user.UserServer.(UserServer.java:74)
> at 
> org.apache.drill.exec.service.ServiceEngine.(ServiceEngine.java:78)
> at org.apache.drill.exec.server.Drillbit.(Drillbit.java:108)
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:285)
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:271)
> at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:267)
> It will working if I remove jars/3rdparty/netty-all-4.0.23.Final.jar, the 
> drill can startup. I think there have some package dependency version issue, 
> do you think so ?
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334666#comment-15334666
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user ssriniva123 commented on the issue:

https://github.com/apache/drill/pull/518
  
I have made the changes recommended by the reviewers:

- Changed the JSON_READER_SKIP_INVALID_RECORDS_FLAG constant
- Modified the unit test to use the builder framework
- Code indentation changes



> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4727) Exclude netty from HBase Client's transitive dependencies

2016-06-16 Thread Aditya Kishore (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334658#comment-15334658
 ] 

Aditya Kishore commented on DRILL-4727:
---

The problem is that Drill uses {{netty}} in its various bits and pieces 
{{(netty-buffer, netty-codec, netty-common, etc...)}} instead of using the uber 
netty library, i.e. {{netty-all}}.

So if a class, {{io.netty.channel.ChannelOption}} in this case, is loaded from 
one version of one of the bits (netty-transport) while a dependent class 
(io.netty.util.UniqueName) is loaded from the uber jar, conflicts like this can happen.

The right fix would be to move Drill to use the uber jar so that its version 
can override any other components'.

For now, since we are so close to the 1.7 release, I'd like to just exclude 
the {{netty-all}} jar from {{hbase-client}}'s list of dependencies.

> Exclude netty from HBase Client's transitive dependencies
> -
>
> Key: DRILL-4727
> URL: https://issues.apache.org/jira/browse/DRILL-4727
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.7.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
>
> Reported on dev/user list after moving to HBase 1.1
> {noformat}
> Hi Aditya,
> I tested the latest version and got this exception and the drillbit fail to 
> startup .
> Exception in thread "main" java.lang.NoSuchMethodError: 
> io.netty.util.UniqueName.(Ljava/lang/String;)V
> at io.netty.channel.ChannelOption.(ChannelOption.java:136)
> at io.netty.channel.ChannelOption.valueOf(ChannelOption.java:99)
> at io.netty.channel.ChannelOption.(ChannelOption.java:42)
> at org.apache.drill.exec.rpc.BasicServer.(BasicServer.java:63)
> at 
> org.apache.drill.exec.rpc.user.UserServer.(UserServer.java:74)
> at 
> org.apache.drill.exec.service.ServiceEngine.(ServiceEngine.java:78)
> at org.apache.drill.exec.server.Drillbit.(Drillbit.java:108)
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:285)
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:271)
> at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:267)
> It will working if I remove jars/3rdparty/netty-all-4.0.23.Final.jar, the 
> drill can startup. I think there have some package dependency version issue, 
> do you think so ?
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4727) Exclude netty from HBase Client's transitive dependencies

2016-06-16 Thread Aditya Kishore (JIRA)
Aditya Kishore created DRILL-4727:
-

 Summary: Exclude netty from HBase Client's transitive dependencies
 Key: DRILL-4727
 URL: https://issues.apache.org/jira/browse/DRILL-4727
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Affects Versions: 1.7.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
Priority: Critical


Reported on dev/user list after moving to HBase 1.1

{noformat}
Hi Aditya,

I tested the latest version and got this exception and the drillbit fail to 
startup .

Exception in thread "main" java.lang.NoSuchMethodError: 
io.netty.util.UniqueName.(Ljava/lang/String;)V
at io.netty.channel.ChannelOption.(ChannelOption.java:136)
at io.netty.channel.ChannelOption.valueOf(ChannelOption.java:99)
at io.netty.channel.ChannelOption.(ChannelOption.java:42)
at org.apache.drill.exec.rpc.BasicServer.(BasicServer.java:63)
at org.apache.drill.exec.rpc.user.UserServer.(UserServer.java:74)
at 
org.apache.drill.exec.service.ServiceEngine.(ServiceEngine.java:78)
at org.apache.drill.exec.server.Drillbit.(Drillbit.java:108)
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:285)
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:271)
at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:267)

It will working if I remove jars/3rdparty/netty-all-4.0.23.Final.jar, the drill 
can startup. I think there have some package dependency version issue, do you 
think so ?
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4725) Improvements to InfoSchema RecordGenerator needed for DRILL-4714

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334628#comment-15334628
 ] 

ASF GitHub Bot commented on DRILL-4725:
---

Github user vkorukanti commented on the issue:

https://github.com/apache/drill/pull/524
  
Updated patch addresses the review comment. Separated the binary and 
ternary into separate cases with proper variable names.


> Improvements to InfoSchema RecordGenerator needed for DRILL-4714
> 
>
> Key: DRILL-4725
> URL: https://issues.apache.org/jira/browse/DRILL-4725
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>
> 1. Add support for pushing the filter on following fields into 
> InfoSchemaRecordGenerator:
>- CATALOG_NAME
>- COLUMN_NAME
> 2. Pushdown LIKE with ESCAPE. Add test.
> 3. Add a method visitCatalog() to InfoSchemaRecordGenerator to decide whether 
> to explore the catalog or not
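Item 2 above, pushing down LIKE with an ESCAPE character, amounts to translating a SQL LIKE pattern into a matcher. The sketch below shows the semantics only; it is not Drill's InfoSchemaFilterBuilder code, and the function name is illustrative:

```python
import re

def like_to_regex(pattern, escape=None):
    """Translate a SQL LIKE pattern (optional ESCAPE char) to an anchored regex.

    '%' matches any sequence, '_' matches one character; a character
    preceded by the escape character is taken literally.
    """
    out, i = [], 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape and i + 1 < len(pattern):
            out.append(re.escape(pattern[i + 1]))  # escaped literal, e.g. % or _
            i += 2
            continue
        if ch == '%':
            out.append('.*')
        elif ch == '_':
            out.append('.')
        else:
            out.append(re.escape(ch))
        i += 1
    return '^' + ''.join(out) + '$'
```

For example, `like_to_regex('COL%')` yields `^COL.*$`, while `like_to_regex(r'100\%', escape='\\')` treats the `%` as a literal. The escape argument being optional is exactly why the builder code needs a third, possibly absent, argument.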



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4725) Improvements to InfoSchema RecordGenerator needed for DRILL-4714

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334594#comment-15334594
 ] 

ASF GitHub Bot commented on DRILL-4725:
---

Github user jaltekruse commented on the issue:

https://github.com/apache/drill/pull/524
  
+1 one small comment, otherwise LGTM


> Improvements to InfoSchema RecordGenerator needed for DRILL-4714
> 
>
> Key: DRILL-4725
> URL: https://issues.apache.org/jira/browse/DRILL-4725
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>
> 1. Add support for pushing the filter on following fields into 
> InfoSchemaRecordGenerator:
>- CATALOG_NAME
>- COLUMN_NAME
> 2. Pushdown LIKE with ESCAPE. Add test.
> 3. Add a method visitCatalog() to InfoSchemaRecordGenerator to decide whether 
> to explore the catalog or not



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4725) Improvements to InfoSchema RecordGenerator needed for DRILL-4714

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334587#comment-15334587
 ] 

ASF GitHub Bot commented on DRILL-4725:
---

Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/524#discussion_r67415245
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ischema/InfoSchemaFilterBuilder.java
 ---
@@ -73,9 +73,12 @@ public ExprNode visitFunctionCall(FunctionCall call, 
Void value) throws RuntimeE
   case "like": {
 ExprNode arg0 = call.args.get(0).accept(this, value);
 ExprNode arg1 = call.args.get(1).accept(this, value);
+ExprNode arg2 = call.args.size() > 2 ? 
call.args.get(2).accept(this, value) : null;
--- End diff --

Do you want to update these variables to have meaningful names indicating what 
each argument is used for, like you did above? That would make it a little 
clearer why the ternary is only needed in the last case, since the escape is 
the only optional parameter.


> Improvements to InfoSchema RecordGenerator needed for DRILL-4714
> 
>
> Key: DRILL-4725
> URL: https://issues.apache.org/jira/browse/DRILL-4725
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>
> 1. Add support for pushing the filter on following fields into 
> InfoSchemaRecordGenerator:
>- CATALOG_NAME
>- COLUMN_NAME
> 2. Pushdown LIKE with ESCAPE. Add test.
> 3. Add a method visitCatalog() to InfoSchemaRecordGenerator to decide whether 
> to explore the catalog or not



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-4723) GC Memory Allocation Issues

2016-06-16 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334422#comment-15334422
 ] 

Deneche A. Hakim edited comment on DRILL-4723 at 6/16/16 6:51 PM:
--

Looking at the heap analysis I see the following:
||Class Name||Objects||Shallow Heap||Retained Heap||
|byte[] | 129,386 | 6,984,838,792 | >= 6,984,838,792|
|java.lang.Object[] | 86,392 | 13,846,488 | >= 5,561,720,280|
|java.util.ArrayList | 57,382 | 1,377,168 | >= 5,549,808,040|
|A | 4,056 | 292,032 | >= 5,303,197,648|
|B | 4,056 | 97,344 | >= 5,289,404,432|

A: org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter
B: org.apache.parquet.bytes.ConcatenatingByteArrayCollector


was (Author: adeneche):
Looking at the heap analysis I see the following:
||Class Name||  Objects||   Shallow Heap||  Retained Heap||
|byte[] |   129,386|6,984,838,792|  >= 6,984,838,792|
|java.lang.Object[] | 86,392|   13,846,488| >= 5,561,720,280|
|java.util.ArrayList|57,382|1,377,168|  >= 5,549,808,040|
|org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter| 
4,056|  292,032|>= 5,303,197,648|
|org.apache.parquet.bytes.ConcatenatingByteArrayCollector|4,056|97,344| 
>= 5,289,404,432|


> GC Memory Allocation Issues
> ---
>
> Key: DRILL-4723
> URL: https://issues.apache.org/jira/browse/DRILL-4723
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.6.0
> Environment: 5 Drill bits, MapR 1.6 release. 84GB of Direct Memory, 
> 24GB of Heap
>Reporter: John Omernik
>  Labels: memory, stability
> Fix For: Future
>
> Attachments: node3.log
>
>
> This issue is reposted from the Drill User List.  More logs available on 
> request in comments (please specify which logs you may want)
> High level: Drill Cluster gets into a bad state when this occurs. 
> This is what I have thus far... I can provide more complete logs on a one on 
> one basis. 
> The cluster was completely mine, with fresh logs. I ran a CTAS query on a 
> large table that has over 100 fields. This query works well in other cases, 
> however I was working with the Block size, both in MapR FS and Drill Parquet. 
> I had successfully tested 512m on each, this case was different. Here are the 
> facts in this setup:
> - No Compression in MapRFS - Using Standard Parquet Snappy Compression
> - MapR Block Size 1024m
> - Parquet Block size 1024m
> - Query  ends up disappearing in the profiles
> - The UI page listing bits only show 4 bits however 5 are running (node 03 
> process is running, but no longer in the bit)
> - Error (copied below)  from rest API
> - No output in STD out or STD error on node3. Only two nodes actually had 
> "Parquet Writing" logs. The other three on Stdout, did not have any 
> issues/errors/
> - I have standard log files, gclogs, the profile.json (before it 
> disappeared), and the physical plan.  Only some components that looked 
> possibly at issue included here
> - The Node 3 GC log shows a bunch of "Full GC Allocation Failures"  that take 
> 4 seconds or more (When I've seen this in other cases, this time can balloon 
> to 8 secs or more)
> - The node 3 output log show some issues with really long RPC issues. Perhaps 
> the GCs prevent RPC communication and create a snowball loop effect?
> Other logs if people are interested can be provided upon request. I just 
> didn't want to flood the whole list with all the logs. 
> Thanks!
> John
> Rest Error:
> ./load_day.py 2016-05-09
> Drill Rest Endpoint: https://drillprod.marathonprod.zeta:2
> Day: 2016-05-09
> /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:769: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> https://urllib3.readthedocs.org/en/latest/security.html
>   InsecureRequestWarning)
> Authentication successful
> Error encountered: 500
> {
>   "errorMessage" : "SYSTEM ERROR: ForemanException: One more more nodes lost 
> connectivity during query.  Identified nodes were 
> [atl1ctuzeta03:20001].\n\n\n[Error Id: d7dd0120-f7c0-44ef-ac54-29c746b26354 
> on atl1ctuzeta01:20001"
> }
> Possible issue in Node3 Log:
> 2016-06-14 17:25:27,860 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:90] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:90: State to report: RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 

[jira] [Commented] (DRILL-4723) GC Memory Allocation Issues

2016-06-16 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334429#comment-15334429
 ] 

Deneche A. Hakim commented on DRILL-4723:
-

This confirms what [~jacq...@dremio.com] says. Looks like the parquet library 
is doing a lot of the writing work on the heap.

> GC Memory Allocation Issues
> ---
>
> Key: DRILL-4723
> URL: https://issues.apache.org/jira/browse/DRILL-4723
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.6.0
> Environment: 5 Drill bits, MapR 1.6 release. 84GB of Direct Memory, 
> 24GB of Heap
>Reporter: John Omernik
>  Labels: memory, stability
> Fix For: Future
>
> Attachments: node3.log
>
>
> This issue is reposted from the Drill User List.  More logs available on 
> request in comments (please specify which logs you may want)
> High level: Drill Cluster gets into a bad state when this occurs. 
> This is what I have thus far... I can provide more complete logs on a one on 
> one basis. 
> The cluster was completely mine, with fresh logs. I ran a CTAS query on a 
> large table that has over 100 fields. This query works well in other cases, 
> however I was working with the Block size, both in MapR FS and Drill Parquet. 
> I had successfully tested 512m on each, this case was different. Here are the 
> facts in this setup:
> - No Compression in MapRFS - Using Standard Parquet Snappy Compression
> - MapR Block Size 1024m
> - Parquet Block size 1024m
> - Query  ends up disappearing in the profiles
> - The UI page listing bits only show 4 bits however 5 are running (node 03 
> process is running, but no longer in the bit)
> - Error (copied below)  from rest API
> - No output in STD out or STD error on node3. Only two nodes actually had 
> "Parquet Writing" logs. The other three on Stdout, did not have any 
> issues/errors/
> - I have standard log files, gclogs, the profile.json (before it 
> disappeared), and the physical plan.  Only some components that looked 
> possibly at issue included here
> - The Node 3 GC log shows a bunch of "Full GC Allocation Failures"  that take 
> 4 seconds or more (When I've seen this in other cases, this time can balloon 
> to 8 secs or more)
> - The node 3 output log show some issues with really long RPC issues. Perhaps 
> the GCs prevent RPC communication and create a snowball loop effect?
> Other logs if people are interested can be provided upon request. I just 
> didn't want to flood the whole list with all the logs. 
> Thanks!
> John
> Rest Error:
> ./load_day.py 2016-05-09
> Drill Rest Endpoint: https://drillprod.marathonprod.zeta:2
> Day: 2016-05-09
> /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:769: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> https://urllib3.readthedocs.org/en/latest/security.html
>   InsecureRequestWarning)
> Authentication successful
> Error encountered: 500
> {
>   "errorMessage" : "SYSTEM ERROR: ForemanException: One more more nodes lost 
> connectivity during query.  Identified nodes were 
> [atl1ctuzeta03:20001].\n\n\n[Error Id: d7dd0120-f7c0-44ef-ac54-29c746b26354 
> on atl1ctuzeta01:20001"
> }
> Possible issue in Node3 Log:
> 2016-06-14 17:25:27,860 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:90] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:90: State to report: RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State to report: RUNNING
> 2016-06-14 17:43:41,869 [BitServer-4] WARN  
> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc type 1 
> took longer than 500ms.  Actual duration was 4192ms.
> 2016-06-14 17:45:36,720 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State change requested RUNNING --> 
> CANCELLATION_REQUESTED
> 2016-06-14 17:45:45,740 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State to report: 
> CANCELLATION_REQUESTED
> 2016-06-14 17:46:15,318 [BitServer-3] WARN  
> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc type 6 
> took longer than 500ms.  Actual duration was 55328ms.
> 2016-06-14 17:46:36,057 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:5: State change requested RUNNING --> 
> 

[jira] [Commented] (DRILL-4723) GC Memory Allocation Issues

2016-06-16 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334422#comment-15334422
 ] 

Deneche A. Hakim commented on DRILL-4723:
-

Looking at the heap analysis I see the following:
||Class Name||Objects||Shallow Heap||Retained Heap||
|byte[] | 129,386 | 6,984,838,792 | >= 6,984,838,792|
|java.lang.Object[] | 86,392 | 13,846,488 | >= 5,561,720,280|
|java.util.ArrayList | 57,382 | 1,377,168 | >= 5,549,808,040|
|org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter | 4,056 | 292,032 | >= 5,303,197,648|
|org.apache.parquet.bytes.ConcatenatingByteArrayCollector | 4,056 | 97,344 | >= 5,289,404,432|


> GC Memory Allocation Issues
> ---
>
> Key: DRILL-4723
> URL: https://issues.apache.org/jira/browse/DRILL-4723
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.6.0
> Environment: 5 Drill bits, MapR 1.6 release. 84GB of Direct Memory, 
> 24GB of Heap
>Reporter: John Omernik
>  Labels: memory, stability
> Fix For: Future
>
> Attachments: node3.log
>
>
> This issue is reposted from the Drill User List.  More logs available on 
> request in comments (please specify which logs you may want)
> High level: Drill Cluster gets into a bad state when this occurs. 
> This is what I have thus far... I can provide more complete logs on a one on 
> one basis. 
> The cluster was completely mine, with fresh logs. I ran a CTAS query on a 
> large table that over 100 fields. This query works well in other cases, 
> however I was working with the Block size, both in MapR FS and Drill Parquet. 
> I had successfully tested 512m on each, this case was different. Here are the 
> facts in this setup:
> - No Compression in MapRFS - Using Standard Parquet Snappy Compression
> - MapR Block Size 1024m
> - Parquet Block size 1024m
> - Query  ends up disappearing in the profiles
> - The UI page listing bits only show 4 bits however 5 are running (node 03 
> process is running, but no longer in the bit)
> - Error (copied below)  from rest API
> - No output in STD out or STD error on node3. Only two nodes actually had 
> "Parquet Writing" logs. The other three on Stdout, did not have any 
> issues/errors/
> - I have standard log files, gclogs, the profile.json (before it 
> disappeared), and the physical plan.  Only some components that looked 
> possibly at issue included here
> - The Node 3 GC log shows a bunch of "Full GC Allocation Failures"  that take 
> 4 seconds or more (When I've seen this in other cases, this time can balloon 
> to 8 secs or more)
> - The node 3 output log show some issues with really long RPC issues. Perhaps 
> the GCs prevent RPC communication and create a snowball loop effect?
> Other logs if people are interested can be provided upon request. I just 
> didn't want to flood the whole list with all the logs. 
> Thanks!
> John
> Rest Error:
> ./load_day.py 2016-05-09
> Drill Rest Endpoint: https://drillprod.marathonprod.zeta:2
> Day: 2016-05-09
> /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:769: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> https://urllib3.readthedocs.org/en/latest/security.html
>   InsecureRequestWarning)
> Authentication successful
> Error encountered: 500
> {
>   "errorMessage" : "SYSTEM ERROR: ForemanException: One more more nodes lost 
> connectivity during query.  Identified nodes were 
> [atl1ctuzeta03:20001].\n\n\n[Error Id: d7dd0120-f7c0-44ef-ac54-29c746b26354 
> on atl1ctuzeta01:20001"
> }
> Possible issue in Node3 Log:
> 2016-06-14 17:25:27,860 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:90] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:90: State to report: RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State to report: RUNNING
> 2016-06-14 17:43:41,869 [BitServer-4] WARN  
> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc type 1 
> took longer than 500ms.  Actual duration was 4192ms.
> 2016-06-14 17:45:36,720 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State change requested RUNNING --> 
> CANCELLATION_REQUESTED
> 2016-06-14 17:45:45,740 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State 

[jira] [Issue Comment Deleted] (DRILL-4723) GC Memory Allocation Issues

2016-06-16 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-4723:

Comment: was deleted

(was: Here are the links to the output of the heap dump analysis:
[System Overview|https://db.tt/UjqfUzt4]
[Leak Suspects|https://db.tt/sqevHKix])

> GC Memory Allocation Issues
> ---
>
> Key: DRILL-4723
> URL: https://issues.apache.org/jira/browse/DRILL-4723
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.6.0
> Environment: 5 Drill bits, MapR 1.6 release. 84GB of Direct Memory, 
> 24GB of Heap
>Reporter: John Omernik
>  Labels: memory, stability
> Fix For: Future
>
> Attachments: node3.log
>
>
> This issue is reposted from the Drill User List.  More logs available on 
> request in comments (please specify which logs you may want)
> High level: Drill Cluster gets into a bad state when this occurs. 
> This is what I have thus far... I can provide more complete logs on a one on 
> one basis. 
> The cluster was completely mine, with fresh logs. I ran a CTAS query on a 
> large table that has over 100 fields. This query works well in other cases, 
> however I was working with the Block size, both in MapR FS and Drill Parquet. 
> I had successfully tested 512m on each, this case was different. Here are the 
> facts in this setup:
> - No Compression in MapRFS - Using Standard Parquet Snappy Compression
> - MapR Block Size 1024m
> - Parquet Block size 1024m
> - Query  ends up disappearing in the profiles
> - The UI page listing bits only show 4 bits however 5 are running (node 03 
> process is running, but no longer in the bit)
> - Error (copied below)  from rest API
> - No output in STD out or STD error on node3. Only two nodes actually had 
> "Parquet Writing" logs. The other three on Stdout, did not have any 
> issues/errors/
> - I have standard log files, gclogs, the profile.json (before it 
> disappeared), and the physical plan.  Only some components that looked 
> possibly at issue included here
> - The Node 3 GC log shows a bunch of "Full GC Allocation Failures"  that take 
> 4 seconds or more (When I've seen this in other cases, this time can balloon 
> to 8 secs or more)
> - The node 3 output log show some issues with really long RPC issues. Perhaps 
> the GCs prevent RPC communication and create a snowball loop effect?
> Other logs if people are interested can be provided upon request. I just 
> didn't want to flood the whole list with all the logs. 
> Thanks!
> John
> Rest Error:
> ./load_day.py 2016-05-09
> Drill Rest Endpoint: https://drillprod.marathonprod.zeta:2
> Day: 2016-05-09
> /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:769: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> https://urllib3.readthedocs.org/en/latest/security.html
>   InsecureRequestWarning)
> Authentication successful
> Error encountered: 500
> {
>   "errorMessage" : "SYSTEM ERROR: ForemanException: One more more nodes lost 
> connectivity during query.  Identified nodes were 
> [atl1ctuzeta03:20001].\n\n\n[Error Id: d7dd0120-f7c0-44ef-ac54-29c746b26354 
> on atl1ctuzeta01:20001"
> }
> Possible issue in Node3 Log:
> 2016-06-14 17:25:27,860 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:90] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:90: State to report: RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State to report: RUNNING
> 2016-06-14 17:43:41,869 [BitServer-4] WARN  
> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc type 1 
> took longer than 500ms.  Actual duration was 4192ms.
> 2016-06-14 17:45:36,720 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State change requested RUNNING --> 
> CANCELLATION_REQUESTED
> 2016-06-14 17:45:45,740 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State to report: 
> CANCELLATION_REQUESTED
> 2016-06-14 17:46:15,318 [BitServer-3] WARN  
> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc type 6 
> took longer than 500ms.  Actual duration was 55328ms.
> 2016-06-14 17:46:36,057 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:5: State change requested RUNNING --> 
> 

[jira] [Updated] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread Deneche A. Hakim (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deneche A. Hakim updated DRILL-4653:

Reviewer:   (was: Deneche A. Hakim)

> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4723) GC Memory Allocation Issues

2016-06-16 Thread Deneche A. Hakim (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334380#comment-15334380
 ] 

Deneche A. Hakim commented on DRILL-4723:
-

Here are the links to the output of the heap dump analysis:
[System Overview|https://db.tt/UjqfUzt4]
[Leak Suspects|https://db.tt/sqevHKix]

> GC Memory Allocation Issues
> ---
>
> Key: DRILL-4723
> URL: https://issues.apache.org/jira/browse/DRILL-4723
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.6.0
> Environment: 5 Drill bits, MapR 1.6 release. 84GB of Direct Memory, 
> 24GB of Heap
>Reporter: John Omernik
>  Labels: memory, stability
> Fix For: Future
>
> Attachments: node3.log
>
>
> This issue is reposted from the Drill User List.  More logs available on 
> request in comments (please specify which logs you may want)
> High level: Drill Cluster gets into a bad state when this occurs. 
> This is what I have thus far... I can provide more complete logs on a one on 
> one basis. 
> The cluster was completely mine, with fresh logs. I ran a CTAS query on a 
> large table that over 100 fields. This query works well in other cases, 
> however I was working with the Block size, both in MapR FS and Drill Parquet. 
> I had successfully tested 512m on each, this case was different. Here are the 
> facts in this setup:
> - No Compression in MapRFS - Using Standard Parquet Snappy Compression
> - MapR Block Size 1024m
> - Parquet Block size 1024m
> - Query  ends up disappearing in the profiles
> - The UI page listing bits only shows 4 bits; however, 5 are running (the 
> node 03 process is running, but no longer in the bit list)
> - Error (copied below)  from rest API
> - No output in stdout or stderr on node3. Only two nodes actually had 
> "Parquet Writing" logs. The other three, on stdout, did not have any 
> issues/errors.
> - I have standard log files, gclogs, the profile.json (before it 
> disappeared), and the physical plan. Only the components that looked 
> possibly at issue are included here
> - The Node 3 GC log shows a bunch of "Full GC Allocation Failures" that take 
> 4 seconds or more (when I've seen this in other cases, this time can balloon 
> to 8 secs or more)
> - The node 3 output log shows some really long RPC delays. Perhaps 
> the GCs prevent RPC communication and create a snowball loop effect?
> Other logs if people are interested can be provided upon request. I just 
> didn't want to flood the whole list with all the logs. 
> Thanks!
> John
> Rest Error:
> ./load_day.py 2016-05-09
> Drill Rest Endpoint: https://drillprod.marathonprod.zeta:2
> Day: 2016-05-09
> /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:769: 
> InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
> certificate verification is strongly advised. See: 
> https://urllib3.readthedocs.org/en/latest/security.html
>   InsecureRequestWarning)
> Authentication successful
> Error encountered: 500
> {
>   "errorMessage" : "SYSTEM ERROR: ForemanException: One more more nodes lost 
> connectivity during query.  Identified nodes were 
> [atl1ctuzeta03:20001].\n\n\n[Error Id: d7dd0120-f7c0-44ef-ac54-29c746b26354 
> on atl1ctuzeta01:20001"
> }
> Possible issue in Node3 Log:
> 2016-06-14 17:25:27,860 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:90] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:90: State to report: RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2016-06-14 17:25:27,871 [289fc208-7266-6a81-73a1-709efff6c412:frag:1:70] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:70: State to report: RUNNING
> 2016-06-14 17:43:41,869 [BitServer-4] WARN  
> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of rpc type 1 
> took longer than 500ms.  Actual duration was 4192ms.
> 2016-06-14 17:45:36,720 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State change requested RUNNING --> 
> CANCELLATION_REQUESTED
> 2016-06-14 17:45:45,740 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:0: State to report: 
> CANCELLATION_REQUESTED
> 2016-06-14 17:46:15,318 [BitServer-3] WARN  
> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of rpc type 6 
> took longer than 500ms.  Actual duration was 55328ms.
> 2016-06-14 17:46:36,057 [CONTROL-rpc-event-queue] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 289fc208-7266-6a81-73a1-709efff6c412:1:5: State change requested RUNNING 

[jira] [Commented] (DRILL-4530) Improve metadata cache performance for queries with single partition

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334300#comment-15334300
 ] 

ASF GitHub Bot commented on DRILL-4530:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/519#discussion_r67392462
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java
 ---
@@ -208,8 +209,18 @@ public DrillTable isReadable(DrillFileSystem fs, 
FileSelection selection,
 FileSystemPlugin fsPlugin, String storageEngineName, String 
userName)
 throws IOException {
   // TODO: we only check the first file for directory reading.
-  if(selection.containsDirectories(fs)){
-if(isDirReadable(fs, selection.getFirstPath(fs))){
+  if(selection.containsDirectories(fs)) {
+Path dirMetaPath = new Path(selection.getSelectionRoot(), 
Metadata.METADATA_DIRECTORIES_FILENAME);
+if (fs.exists(dirMetaPath)) {
+  ParquetTableMetadataDirs mDirs = Metadata.readMetadataDirs(fs, 
dirMetaPath.toString());
+  if (mDirs.getDirectories().size() > 0) {
+FileSelection dirSelection = 
FileSelection.createFromDirectories(mDirs.getDirectories(), selection);
+dirSelection.setExpandedPartial();
+return new DynamicDrillTable(fsPlugin, storageEngineName, 
userName,
--- End diff --

Makes sense. I will add a comment. Thanks for reviewing. 
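
The fast path under review here replaces a full directory walk at planning time with a single read of a small cache file at the selection root. The lookup order can be sketched as a standalone example using plain java.nio rather than Drill's DrillFileSystem; the cache file name mirrors the diff's Metadata.METADATA_DIRECTORIES_FILENAME, but its exact value and one-path-per-line format here are assumptions, as is everything else in this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DirCacheSketch {
  // Stand-in for Metadata.METADATA_DIRECTORIES_FILENAME; name and format assumed.
  static final String DIR_CACHE = ".drill.parquet_metadata_directories";

  /** Returns subdirectory paths from the cache file if present, else by walking. */
  static List<String> listDirs(Path root) throws IOException {
    Path cache = root.resolve(DIR_CACHE);
    if (Files.exists(cache)) {
      // Cheap path: one small file read, no filesystem traversal.
      return Files.readAllLines(cache);
    }
    // Expensive fallback: walk the whole tree under the root.
    try (Stream<Path> s = Files.walk(root)) {
      return s.filter(Files::isDirectory)
              .filter(p -> !p.equals(root))
              .map(Path::toString)
              .sorted()
              .collect(Collectors.toList());
    }
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("drill-sketch");
    Files.createDirectories(root.resolve("1995").resolve("Q1"));
    System.out.println(listDirs(root));   // fallback: full walk
    Files.write(root.resolve(DIR_CACHE), List.of("1995", "1995/Q1"));
    System.out.println(listDirs(root));   // cache hit: file read only
  }
}
```

With 400K files, avoiding the walk (and the large root-level metadata file) is what accounts for the 60s-to-3s improvement reported in the JIRA.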


> Improve metadata cache performance for queries with single partition 
> -
>
> Key: DRILL-4530
> URL: https://issues.apache.org/jira/browse/DRILL-4530
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 1.6.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
> Fix For: 1.7.0
>
>
> Consider two types of queries which are run with Parquet metadata caching: 
> {noformat}
> query 1:
> SELECT col FROM  `A/B/C`;
> query 2:
> SELECT col FROM `A` WHERE dir0 = 'B' AND dir1 = 'C';
> {noformat}
> For a certain dataset, the query1 elapsed time is 1 sec whereas query2 
> elapsed time is 9 sec even though both are accessing the same amount of data. 
>  The user expectation is that they should perform roughly the same.  The main 
> difference comes from reading the bigger metadata cache file at the root 
> level 'A' for query2 and then applying the partitioning filter.  query1 reads 
> a much smaller metadata cache file at the subdirectory level. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334270#comment-15334270
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67390934
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -116,6 +117,7 @@ public void testMixedNumberTypes() throws Exception {
   .jsonBaselineFile("jsoninput/mixed_number_types.json")
   .build().run();
 } catch (Exception ex) {
+  ex.printStackTrace();
--- End diff --

Not a good idea to print stack trace in unit tests. The output of our unit 
tests is already too verbose.
Use junit.fail with the message from the exception instead?



> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334261#comment-15334261
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67390506
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -179,4 +181,28 @@ public void testNestedFilter() throws Exception {
 .sqlBaselineQuery(baselineQuery)
 .go();
   }
+
+
+ @Test // See DRILL-4653
+  public void testSkippingInvalidJSONRecords() throws Exception {
--- End diff --

For both these tests could you please use the testBuilder() framework? This 
is the recommended way to write the unit tests; you can see one of the other 
tests in this file.  


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334254#comment-15334254
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389956
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/json/TestJsonRecordReader.java
 ---
@@ -116,6 +117,7 @@ public void testMixedNumberTypes() throws Exception {
   .jsonBaselineFile("jsoninput/mixed_number_types.json")
   .build().run();
 } catch (Exception ex) {
+  ex.printStackTrace();
--- End diff --

not sure why this printStackTrace was added in a different test from the 
ones that you added...


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334250#comment-15334250
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389846
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +194,37 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
 //Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
+   // try
+   // {
+  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+  try{
+writer.setPosition(recordCount);
+write = jsonReader.write(writer);
+if(write == ReadState.WRITE_SUCCEED) {
 //  logger.debug("Wrote record.");
-  recordCount++;
-}else{
+  recordCount++;
+}else{
 //  logger.debug("Exiting.");
-  break outside;
-}
-
+  break outside;
+}
   }
-
-  jsonReader.ensureAtLeastOneField(writer);
-
+  catch(Exception ex)
+  {
+   ++parseErrorCount;
--- End diff --

The indentation seems to be off here as well as in other places. Can you make 
sure the indentation matches the rest of the code?
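
For readers following along, the per-record skip pattern under review can be sketched as follows. This is an illustrative, self-contained sketch, not Drill's actual JSONRecordReader: the class name, the integer parser standing in for jsonReader.write(writer), and the skip flag are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipMalformedSketch {
  static final int DEFAULT_ROWS_PER_BATCH = 4096;

  /** Fills one batch; when skipMalformed is set, bad records are counted and skipped. */
  static List<Integer> readBatch(List<String> lines, boolean skipMalformed) {
    List<Integer> batch = new ArrayList<>();
    int parseErrorCount = 0;
    for (String line : lines) {
      if (batch.size() >= DEFAULT_ROWS_PER_BATCH) {
        break;  // batch is full
      }
      try {
        // Stand-in for parsing one JSON record into the batch writer.
        batch.add(Integer.parseInt(line.trim()));
      } catch (NumberFormatException ex) {
        if (!skipMalformed) {
          throw new RuntimeException("Malformed record: " + line, ex);
        }
        parseErrorCount++;  // record skipped; the batch keeps filling
      }
    }
    System.out.println("records=" + batch.size() + " skipped=" + parseErrorCount);
    return batch;
  }

  public static void main(String[] args) {
    readBatch(List.of("1", "oops", "3"), true);  // prints records=2 skipped=1
  }
}
```

The key design point in the review is moving the try/catch inside the loop, so a single bad record increments a counter instead of aborting the whole batch.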


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334248#comment-15334248
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389726
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +194,37 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
 //Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
+   // try
+   // {
+  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH){
+  try{
+writer.setPosition(recordCount);
+write = jsonReader.write(writer);
+if(write == ReadState.WRITE_SUCCEED) {
 //  logger.debug("Wrote record.");
-  recordCount++;
-}else{
+  recordCount++;
+}else{
 //  logger.debug("Exiting.");
-  break outside;
-}
-
+  break outside;
+}
   }
-
-  jsonReader.ensureAtLeastOneField(writer);
-
+  catch(Exception ex)
--- End diff --

minor style convention: can you put the catch() on the previous line to 
match the closing paren 


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334242#comment-15334242
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389362
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -189,39 +194,37 @@ private long currentRecordNumberInFile() {
   public int next() {
 writer.allocate();
 writer.reset();
-
 recordCount = 0;
 ReadState write = null;
 //Stopwatch p = new Stopwatch().start();
-try{
-  outside: while(recordCount < DEFAULT_ROWS_PER_BATCH) {
-writer.setPosition(recordCount);
-write = jsonReader.write(writer);
-
-if(write == ReadState.WRITE_SUCCEED) {
+   // try
--- End diff --

remove these commented lines


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334236#comment-15334236
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67389073
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -135,6 +135,9 @@
   BooleanValidator JSON_EXTENDED_TYPES = new 
BooleanValidator("store.json.extended_types", false);
   BooleanValidator JSON_WRITER_UGLIFY = new 
BooleanValidator("store.json.writer.uglify", false);
   BooleanValidator JSON_WRITER_SKIPNULLFIELDS = new 
BooleanValidator("store.json.writer.skip_null_fields", true);
+  String JSON_READER_SKIP_MALFORMED_RECORDS_FLAG = 
"store.json.reader.skip_malformed_records";
--- End diff --

Can you change this to 'skip_invalid_records' so that the name is 
somewhat consistent with the similar future option in DRILL-3764? In the 
future the json option would likely be subsumed by the new global option. 


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4653) Malformed JSON should not stop the entire query from progressing

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334227#comment-15334227
 ] 

ASF GitHub Bot commented on DRILL-4653:
---

Github user amansinha100 commented on a diff in the pull request:

https://github.com/apache/drill/pull/518#discussion_r67388118
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ---
@@ -39,6 +40,7 @@
 import org.apache.drill.exec.vector.complex.fn.JsonReader;
 import org.apache.drill.exec.vector.complex.impl.VectorContainerWriter;
 import org.apache.hadoop.fs.Path;
+import org.apache.parquet.Log;
--- End diff --

Not sure why the parquet.log is included in the json reader


> Malformed JSON should not stop the entire query from progressing
> 
>
> Key: DRILL-4653
> URL: https://issues.apache.org/jira/browse/DRILL-4653
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: subbu srinivasan
>
> Currently a Drill query terminates upon the first encounter of an invalid JSON line.
> Drill should continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4530) Improve metadata cache performance for queries with single partition

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334208#comment-15334208
 ] 

ASF GitHub Bot commented on DRILL-4530:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/519#discussion_r67387102
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetFormatPlugin.java
 ---
@@ -208,8 +209,18 @@ public DrillTable isReadable(DrillFileSystem fs, 
FileSelection selection,
 FileSystemPlugin fsPlugin, String storageEngineName, String 
userName)
 throws IOException {
   // TODO: we only check the first file for directory reading.
-  if(selection.containsDirectories(fs)){
-if(isDirReadable(fs, selection.getFirstPath(fs))){
+  if(selection.containsDirectories(fs)) {
+Path dirMetaPath = new Path(selection.getSelectionRoot(), 
Metadata.METADATA_DIRECTORIES_FILENAME);
+if (fs.exists(dirMetaPath)) {
+  ParquetTableMetadataDirs mDirs = Metadata.readMetadataDirs(fs, 
dirMetaPath.toString());
+  if (mDirs.getDirectories().size() > 0) {
+FileSelection dirSelection = 
FileSelection.createFromDirectories(mDirs.getDirectories(), selection);
+dirSelection.setExpandedPartial();
+return new DynamicDrillTable(fsPlugin, storageEngineName, 
userName,
--- End diff --

A comment then, perhaps.


> Improve metadata cache performance for queries with single partition 
> -
>
> Key: DRILL-4530
> URL: https://issues.apache.org/jira/browse/DRILL-4530
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 1.6.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
> Fix For: 1.7.0
>
>
> Consider two types of queries which are run with Parquet metadata caching: 
> {noformat}
> query 1:
> SELECT col FROM  `A/B/C`;
> query 2:
> SELECT col FROM `A` WHERE dir0 = 'B' AND dir1 = 'C';
> {noformat}
> For a certain dataset, the query1 elapsed time is 1 sec whereas query2 
> elapsed time is 9 sec even though both are accessing the same amount of data. 
>  The user expectation is that they should perform roughly the same.  The main 
> difference comes from reading the bigger metadata cache file at the root 
> level 'A' for query2 and then applying the partitioning filter.  query1 reads 
> a much smaller metadata cache file at the subdirectory level. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4724) convert_from(binary_string(expression),'INT_BE') results in Exception

2016-06-16 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-4724:
--
Priority: Minor  (was: Major)

> convert_from(binary_string(expression),'INT_BE') results in Exception
> -
>
> Key: DRILL-4724
> URL: https://issues.apache.org/jira/browse/DRILL-4724
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.7.0
>Reporter: Khurram Faraaz
>Assignee: Chunhui Shi
>Priority: Minor
>
> The below query that uses binary_string function results in Exception
> Drill git commit ID : 6286c0a4
> {noformat}
> 2016-06-15 09:20:43,623 [289ee213-8ada-808f-e59d-5a6b67c53732:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 289ee213-8ada-808f-e59d-5a6b67c53732: 
> values(convert_from(binary_string('\\x99\\x8c\\x2f\\x77'),'INT_BE'))
> 2016-06-15 09:20:43,666 [289ee213-8ada-808f-e59d-5a6b67c53732:foreman] ERROR 
> o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: IllegalArgumentException: 
> Wrong length 8(8-0) in the buffer '\x5C\x99\x5C\x8C\x5C/\x5Cw', expected 4.
> [Error Id: bb6968cd-44c2-4c48-bb12-865f8709167e on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalArgumentException: Wrong length 8(8-0) in the buffer 
> '\x5C\x99\x5C\x8C\x5C/\x5Cw', expected 4.
> [Error Id: bb6968cd-44c2-4c48-bb12-865f8709167e on centos-01.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>  ~[drill-common-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:791)
>  [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:901) 
> [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:271) 
> [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_101]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ReduceExpressionsRule_Project, args 
> [rel#1460:LogicalProject.NONE.ANY([]).[](input=rel#1459:Subset#0.NONE.ANY([]).[0],EXPR$0=CONVERT_FROMINT_BE(BINARY_STRING('\\x99\\x8c\\x2f\\x77')))]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ReduceExpressionsRule_Project, args 
> [rel#1460:LogicalProject.NONE.ANY([]).[](input=rel#1459:Subset#0.NONE.ANY([]).[0],EXPR$0=CONVERT_FROMINT_BE(BINARY_STRING('\\x99\\x8c\\x2f\\x77')))]
> at org.apache.calcite.util.Util.newInternal(Util.java:792) 
> ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
>  ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:400)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:339)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:237)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:286)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:94)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:978) 
> [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 

[jira] [Commented] (DRILL-4724) convert_from(binary_string(expression),'INT_BE') results in Exception

2016-06-16 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334178#comment-15334178
 ] 

Khurram Faraaz commented on DRILL-4724:
---

Agreed. If the error message can be improved, that will help the user.
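
The "expected 4" in the error comes from the fixed width of INT_BE: a big-endian INT occupies exactly 4 bytes, while the doubled backslashes in the query leave the \x escapes uninterpreted and hand the function an 8-character buffer. A minimal sketch of the 4-byte big-endian decode using standard java.nio (this is not Drill's convert_from implementation; the class and method names are hypothetical):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class IntBeDecodeSketch {
  /** Decodes exactly 4 bytes as a big-endian signed int; rejects other lengths. */
  static int decodeIntBE(byte[] buf) {
    if (buf.length != 4) {
      throw new IllegalArgumentException(
          "Wrong length " + buf.length + " in the buffer, expected 4.");
    }
    return ByteBuffer.wrap(buf).order(ByteOrder.BIG_ENDIAN).getInt();
  }

  public static void main(String[] args) {
    // The 4 bytes the query intended: 0x99 0x8c 0x2f 0x77.
    byte[] ok = {(byte) 0x99, (byte) 0x8c, 0x2f, 0x77};
    System.out.println(decodeIntBE(ok));
    // With doubled backslashes the literal stays as 8 uninterpreted
    // characters, so the length check above rejects it, as in the bug.
  }
}
```

A clearer user-facing message would name the conversion type and the expected width, e.g. "INT_BE requires a 4-byte buffer, got 8 bytes".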

> convert_from(binary_string(expression),'INT_BE') results in Exception
> -
>
> Key: DRILL-4724
> URL: https://issues.apache.org/jira/browse/DRILL-4724
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.7.0
>Reporter: Khurram Faraaz
>Assignee: Chunhui Shi
>
> The below query that uses binary_string function results in Exception
> Drill git commit ID : 6286c0a4
> {noformat}
> 2016-06-15 09:20:43,623 [289ee213-8ada-808f-e59d-5a6b67c53732:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 289ee213-8ada-808f-e59d-5a6b67c53732: 
> values(convert_from(binary_string('\\x99\\x8c\\x2f\\x77'),'INT_BE'))
> 2016-06-15 09:20:43,666 [289ee213-8ada-808f-e59d-5a6b67c53732:foreman] ERROR 
> o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: IllegalArgumentException: 
> Wrong length 8(8-0) in the buffer '\x5C\x99\x5C\x8C\x5C/\x5Cw', expected 4.
> [Error Id: bb6968cd-44c2-4c48-bb12-865f8709167e on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalArgumentException: Wrong length 8(8-0) in the buffer 
> '\x5C\x99\x5C\x8C\x5C/\x5Cw', expected 4.
> [Error Id: bb6968cd-44c2-4c48-bb12-865f8709167e on centos-01.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>  ~[drill-common-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:791)
>  [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:901) 
> [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:271) 
> [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_101]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: Internal error: Error while 
> applying rule ReduceExpressionsRule_Project, args 
> [rel#1460:LogicalProject.NONE.ANY([]).[](input=rel#1459:Subset#0.NONE.ANY([]).[0],EXPR$0=CONVERT_FROMINT_BE(BINARY_STRING('\\x99\\x8c\\x2f\\x77')))]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Internal error: Error while applying 
> rule ReduceExpressionsRule_Project, args 
> [rel#1460:LogicalProject.NONE.ANY([]).[](input=rel#1459:Subset#0.NONE.ANY([]).[0],EXPR$0=CONVERT_FROMINT_BE(BINARY_STRING('\\x99\\x8c\\x2f\\x77')))]
> at org.apache.calcite.util.Util.newInternal(Util.java:792) 
> ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251)
>  ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
>  ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> ~[calcite-core-1.4.0-drill-r11.jar:1.4.0-drill-r11]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:400)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:339)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:237)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:286)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:94)
>  ~[drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:978) 
> [drill-java-exec-1.7.0-SNAPSHOT.jar:1.7.0-SNAPSHOT]
> at 
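The failure above is CONVERT_FROM(..., 'INT_BE') receiving an 8-byte buffer where a 4-byte big-endian integer was expected; the reported buffer '\x5C\x99\x5C\x8C\x5C/\x5Cw' interleaves literal backslashes with the intended bytes, which suggests the escapes in the BINARY_STRING literal were doubled. A minimal Java sketch of the length check (convertFromIntBe is a hypothetical stand-in, not Drill's internal API):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class IntBeCheck {
    // Hypothetical stand-in for Drill's INT_BE conversion: decode exactly
    // four big-endian bytes into a signed int, rejecting any other length.
    static int convertFromIntBe(byte[] buf) {
        if (buf.length != 4) {
            throw new IllegalArgumentException(
                "Wrong length " + buf.length + " in the buffer, expected 4.");
        }
        return ByteBuffer.wrap(buf).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        // Four raw bytes decode fine.
        byte[] four = {(byte) 0x99, (byte) 0x8c, (byte) 0x2f, (byte) 0x77};
        System.out.println(convertFromIntBe(four));

        // "\\x99\\x8c" in Java source is the 8 characters \x99\x8c -- a
        // doubled-backslash literal like the one in the failing query --
        // so the buffer is 8 bytes long and the length check fires.
        byte[] eight = "\\x99\\x8c".getBytes();
        try {
            convertFromIntBe(eight);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the escapes producing exactly four raw bytes, the conversion succeeds.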

[jira] [Closed] (DRILL-4718) call to ResultSet.getObject(int columnIndex) results in InvalidCursorStateSqlException

2016-06-16 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4718.
-
Resolution: Invalid

This one was due to a user error; it is not an issue any more.

> call to ResultSet.getObject(int columnIndex) results in 
> InvalidCursorStateSqlException
> --
>
> Key: DRILL-4718
> URL: https://issues.apache.org/jira/browse/DRILL-4718
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.7.0
> Environment: CentOS 6.8
>Reporter: Khurram Faraaz
>
> The JDBC snippet below results in the following exception on Drill 1.7.0:
> Result set cursor is positioned before all rows.  Call next() first.
> org.apache.drill.jdbc.InvalidCursorStateSqlException: Result set cursor is 
> positioned before all rows.  Call next() first.
> Drill git.commit.id=6286c0a
> JDBC Driver : drill-jdbc-all-1.7.0-SNAPSHOT.jar
> {noformat}
> try {
> final String URL_STRING = 
> "jdbc:drill:schema=dfs.tmp;drillbit=";
> Class.forName("org.apache.drill.jdbc.Driver").newInstance();
> Connection conn = 
> DriverManager.getConnection(URL_STRING,"test","test");
> Statement stmt = conn.createStatement();
> String query = "select columns[0] from `t3.csv`";
> ResultSet rs = stmt.executeQuery(query);
> System.out.println("ResultSet.getObject(1) :" + rs.getObject(1));
> while (rs.next()) {
> }
> if (rs != null)
> rs.close();
> stmt.close();
> conn.close();
> } catch ( Exception e ) {
> System.out.println(e.getMessage());
> e.printStackTrace();
> }
> {noformat}
> Data used in the above query (t3.csv):
> {noformat}
> abcd
> efgh
> ijkl
> mnop
> qrst
> {noformat}
> Here is the exception:
> {noformat}
> ...
> row_count: 5
> def {
>   record_count: 5
>   field {
> major_type {
>   minor_type: VARCHAR
>   mode: OPTIONAL
> }
> name_part {
>   name: "EXPR$0"
> }
> child {
>   major_type {
> minor_type: UINT1
> mode: REQUIRED
>   }
>   name_part {
> name: "$bits$"
>   }
>   value_count: 5
>   buffer_length: 5
> }
> child {
>   major_type {
> minor_type: VARCHAR
> mode: OPTIONAL
>   }
>   name_part {
> name: "EXPR$0"
>   }
>   child {
> major_type {
>   minor_type: UINT4
>   mode: REQUIRED
> }
> name_part {
>   name: "$offsets$"
> }
> value_count: 6
> buffer_length: 24
>   }
>   value_count: 5
>   buffer_length: 44
> }
> value_count: 5
> buffer_length: 49
>   }
>   carries_two_byte_selection_vector: false
> }
> , data=DrillBuf[14], udle: [9 131..180]].
> 08:37:17.483 [USER-rpc-event-queue] DEBUG o.a.d.e.rpc.user.QueryResultHandler 
> - resultArrived: queryState: COMPLETED, queryId = 
> 28a18f42-410f-bbc4-6445-0c95423ca341
> 08:37:17.484 [USER-rpc-event-queue] DEBUG 
> o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#1] Received query 
> completion: COMPLETED.
> 08:37:17.485 [USER-rpc-event-queue] DEBUG o.a.drill.exec.rpc.user.UserClient 
> - Sending response with Sender 1404764360
> Result set cursor is positioned before all rows.  Call next() first.
> org.apache.drill.jdbc.InvalidCursorStateSqlException: Result set cursor is 
> positioned before all rows.  Call next() first.
>   at 
> org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getCurrentRecordNumber(AvaticaDrillSqlAccessor.java:73)
>   at 
> org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:179)
>   at 
> net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.getObject(DrillResultSetImpl.java:420)
>   at GetTblName.main(GetTblName.java:28)
> {noformat}
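The root cause is visible in the snippet: rs.getObject(1) is called while the cursor is still positioned before the first row; a ResultSet only has a current row after next() has returned true. A minimal sketch of that cursor contract with an in-memory stand-in (FakeResultSet is illustrative, not Drill's implementation):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CursorContract {
    static class FakeResultSet {
        private final Iterator<String> it;
        private String current;  // null until next() positions on a row

        FakeResultSet(List<String> rows) {
            this.it = rows.iterator();
        }

        boolean next() {
            current = it.hasNext() ? it.next() : null;
            return current != null;
        }

        String getObject() {
            if (current == null) {
                throw new IllegalStateException(
                    "Result set cursor is positioned before all rows. Call next() first.");
            }
            return current;
        }
    }

    public static void main(String[] args) {
        FakeResultSet rs = new FakeResultSet(Arrays.asList("abcd", "efgh"));
        try {
            rs.getObject();  // same mistake as in the snippet above
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        while (rs.next()) {  // correct order: advance, then read
            System.out.println(rs.getObject());
        }
    }
}
```

Moving the getObject call inside the while (rs.next()) loop in the original snippet avoids the InvalidCursorStateSqlException.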



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2593) 500 error when crc for a query profile is out of sync

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334143#comment-15334143
 ] 

ASF GitHub Bot commented on DRILL-2593:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/523
  
+1. LGTM


> 500 error when crc for a query profile is out of sync
> -
>
> Key: DRILL-2593
> URL: https://issues.apache.org/jira/browse/DRILL-2593
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 0.7.0
>Reporter: Jason Altekruse
>Assignee: Arina Ielchiieva
> Fix For: 1.7.0
>
> Attachments: warning1.JPG, warning2.JPG
>
>
> To reproduce, on a machine where an embedded drillbit has been run, edit one 
> of the profiles stored in /tmp/drill/profiles and try to navigate to the 
> profiles page on the Web UI.
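For context, hand-editing a profile changes its bytes while the stored checksum still describes the original content, so verification fails; the fix is to reject or skip the corrupt profile instead of returning a 500. A minimal Java sketch of such a mismatch using java.util.zip.CRC32 (the JSON layout here is invented for illustration; Drill's on-disk profile format is not modeled):

```java
import java.util.zip.CRC32;

public class CrcCheck {
    // Compute a CRC32 checksum over a byte array, as a stand-in for
    // whatever checksum the profile store recorded at write time.
    static long crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data);
        return c.getValue();
    }

    public static void main(String[] args) {
        byte[] original = "{\"queryId\":\"abc\"}".getBytes();
        long stored = crc(original);  // checksum recorded when written

        // Untouched file: checksum matches.
        System.out.println(stored == crc(original));

        // Hand-edited file: checksum no longer matches; the UI should
        // surface a warning rather than a 500.
        byte[] edited = "{\"queryId\":\"xyz\"}".getBytes();
        System.out.println(stored == crc(edited));
    }
}
```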





[jira] [Commented] (DRILL-3272) Hive : Using 'if' function in Hive results in an ExpressionParsingException

2016-06-16 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15334130#comment-15334130
 ] 

Vitalii Diravka commented on DRILL-3272:


[~rkins] The "IF" Hive UDF now works in Drill 1.5.0 and 1.6.0. Please check it; it looks like this was already fixed.

> Hive : Using 'if' function in Hive results in an ExpressionParsingException
> ---
>
> Key: DRILL-3272
> URL: https://issues.apache.org/jira/browse/DRILL-3272
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Hive
>Reporter: Rahul Challapalli
> Fix For: Future
>
> Attachments: error.log
>
>
> git.commit.id.abbrev=5f26b8b
> The query below fails in Drill; it works properly in Hive, however.
> {code}
> select if(1999 > 2000, 'latest', 'old') from lineitem limit 1;
> Error: SYSTEM ERROR: 
> org.apache.drill.common.exceptions.ExpressionParsingException: Expression has 
> syntax error! line 1:28:mismatched input ',' expecting CParen
> Fragment 1:1
> [Error Id: 007e7d7d-62dc-42fd-b526-07762c33719c on qa-node190.qa.lab:31010] 
> (state=,code=0)
> {code}
> I attached the error log. Let me know if you need anything else.





[jira] [Commented] (DRILL-2385) count on complex objects failed with missing function implementation

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333952#comment-15333952
 ] 

ASF GitHub Bot commented on DRILL-2385:
---

Github user vdiravka commented on the issue:

https://github.com/apache/drill/pull/501
  
Squashed and rebased onto master.


> count on complex objects failed with missing function implementation
> 
>
> Key: DRILL-2385
> URL: https://issues.apache.org/jira/browse/DRILL-2385
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 0.8.0
>Reporter: Chun Chang
>Assignee: Vitalii Diravka
>Priority: Minor
> Fix For: 1.7.0
>
>
> #Wed Mar 04 01:23:42 EST 2015
> git.commit.id.abbrev=71b6bfe
> Have a complex type looks like the following:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.sia from 
> `complex.json` t limit 1;
> ++
> |sia |
> ++
> | [1,11,101,1001] |
> ++
> {code}
> A count on the complex type will fail with missing function implementation:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.gbyi, count(t.sia) 
> countsia from `complex.json` t group by t.gbyi;
> Query failed: RemoteRpcException: Failure while running fragment., Schema is 
> currently null.  You must call buildSchema(SelectionVectorMode) before this 
> container can return a schema. [ 12856530-3133-45be-bdf4-ef8cc784f7b3 on 
> qa-node119.qa.lab:31010 ]
> [ 12856530-3133-45be-bdf4-ef8cc784f7b3 on qa-node119.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> drillbit.log
> {code}
> 2015-03-04 13:44:51,383 [2b08832b-9247-e90c-785d-751f02fc1548:frag:2:0] ERROR 
> o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
> org.apache.drill.exec.exception.SchemaChangeException: Failure while 
> materializing expression.
> Error in expression at index 0.  Error: Missing function implementation: 
> [count(BIGINT-REPEATED)].  Full expression: null.
> at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.createAggregatorInternal(HashAggBatch.java:210)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.createAggregator(HashAggBatch.java:158)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:101)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:130)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:114)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:121)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-03-04 13:44:51,383 [2b08832b-9247-e90c-785d-751f02fc1548:frag:2:0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
> fragment
> java.lang.NullPointerException: Schema is currently null.  You must call 
> buildSchema(SelectionVectorMode) before this container can return a schema.
> at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.drill.exec.record.VectorContainer.getSchema(VectorContainer.java:261)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> at 
> 

[jira] [Updated] (DRILL-4726) Dynamic UDFs support

2016-06-16 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4726:

Description: 
Allow registering UDFs without restarting Drillbits.
The design is described in the document below:

https://docs.google.com/document/d/1MluM17EKajvNP_x8U4aymcOihhUm8BMm8t_hM0jEFWk/edit

  was:
Allow registering UDFs without restarting Drillbits.
The approach is described in the document below:

https://docs.google.com/document/d/1MluM17EKajvNP_x8U4aymcOihhUm8BMm8t_hM0jEFWk/edit


> Dynamic UDFs support
> 
>
> Key: DRILL-4726
> URL: https://issues.apache.org/jira/browse/DRILL-4726
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> Allow registering UDFs without restarting Drillbits.
> The design is described in the document below:
> https://docs.google.com/document/d/1MluM17EKajvNP_x8U4aymcOihhUm8BMm8t_hM0jEFWk/edit





[jira] [Created] (DRILL-4726) Dynamic UDFs support

2016-06-16 Thread Arina Ielchiieva (JIRA)
Arina Ielchiieva created DRILL-4726:
---

 Summary: Dynamic UDFs support
 Key: DRILL-4726
 URL: https://issues.apache.org/jira/browse/DRILL-4726
 Project: Apache Drill
  Issue Type: New Feature
Affects Versions: 1.6.0
Reporter: Arina Ielchiieva
Assignee: Arina Ielchiieva
 Fix For: Future


Allow registering UDFs without restarting Drillbits.
The approach is described in the document below:

https://docs.google.com/document/d/1MluM17EKajvNP_x8U4aymcOihhUm8BMm8t_hM0jEFWk/edit





[jira] [Commented] (DRILL-3149) TextReader should support multibyte line delimiters

2016-06-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333438#comment-15333438
 ] 

ASF GitHub Bot commented on DRILL-3149:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/500
  
Done.


> TextReader should support multibyte line delimiters
> ---
>
> Key: DRILL-3149
> URL: https://issues.apache.org/jira/browse/DRILL-3149
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Text & CSV
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Jim Scott
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: Future
>
>
> lineDelimiter in the TextFormatConfig doesn't support \r\n for record 
> delimiters.
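Supporting a multibyte delimiter means the reader must compare delimiter-length bytes at each position rather than matching a single byte. A minimal Java sketch of such a scan (split is an illustrative helper, not Drill's TextReader API):

```java
import java.util.ArrayList;
import java.util.List;

public class MultiByteDelimiter {
    // Split data on a multi-byte record delimiter by checking all
    // delimiter bytes at each candidate position.
    static List<String> split(byte[] data, byte[] delim) {
        List<String> records = new ArrayList<>();
        int start = 0;
        for (int i = 0; i + delim.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < delim.length; j++) {
                if (data[i + j] != delim[j]) { match = false; break; }
            }
            if (match) {
                records.add(new String(data, start, i - start));
                start = i + delim.length;
                i += delim.length - 1;  // skip the rest of the delimiter
            }
        }
        records.add(new String(data, start, data.length - start));
        return records;
    }

    public static void main(String[] args) {
        byte[] input = "a,1\r\nb,2\r\nc,3".getBytes();
        System.out.println(split(input, "\r\n".getBytes()));
    }
}
```

A single-byte scanner that only matched '\n' would instead leave a stray '\r' attached to the end of every record.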


