[jira] [Updated] (DRILL-3178) csv reader should allow newlines inside quotes

2016-09-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/DRILL-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

F Méthot updated DRILL-3178:

Attachment: drill-3178.patch

> csv reader should allow newlines inside quotes 
> ---
>
> Key: DRILL-3178
> URL: https://issues.apache.org/jira/browse/DRILL-3178
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Text & CSV
>Affects Versions: 1.0.0
> Environment: Ubuntu Trusty 14.04.2 LTS
>Reporter: Neal McBurnett
>Assignee: F Méthot
> Fix For: Future
>
> Attachments: drill-3178.patch
>
>
> When reading a csv file which contains newlines within quoted strings, e.g. 
> via
> select * from dfs.`/tmp/q.csv`;
> Drill 1.0 says:
> Error: SYSTEM ERROR: com.univocity.parsers.common.TextParsingException:  
> Error processing input: Cannot use newline character within quoted string
> But many tools produce csv files with newlines in quoted strings.  Drill 
> should be able to handle them.
> Workaround: the csvquote program (https://github.com/dbro/csvquote) can 
> encode embedded commas and newlines, and even decode them later if desired.
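As context for the attached patch, here is a minimal sketch (an illustration only, not the patch itself) of the desired behavior using the univocity-parsers library that Drill's text reader builds on; it parses a quoted field containing an embedded newline:

{code}
// Standalone illustration using univocity-parsers directly; the class and
// variable names here are ours, not Drill's.
import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

import java.io.StringReader;
import java.util.List;

public class QuotedNewlineDemo {
  public static void main(String[] args) {
    CsvParserSettings settings = new CsvParserSettings();
    settings.setLineSeparatorDetectionEnabled(true);
    CsvParser parser = new CsvParser(settings);
    // The second field contains a quoted, embedded newline.
    List<String[]> rows = parser.parseAll(new StringReader("a,\"x\ny\",c\n"));
    System.out.println(rows.get(0)[1]);  // prints "x", a newline, then "y"
  }
}
{code}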



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3178) csv reader should allow newlines inside quotes

2016-09-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/DRILL-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

F Méthot reassigned DRILL-3178:
---

Assignee: F Méthot

> csv reader should allow newlines inside quotes 
> ---
>
> Key: DRILL-3178
> URL: https://issues.apache.org/jira/browse/DRILL-3178
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Text & CSV
>Affects Versions: 1.0.0
> Environment: Ubuntu Trusty 14.04.2 LTS
>Reporter: Neal McBurnett
>Assignee: F Méthot
> Fix For: Future
>
>
> When reading a csv file which contains newlines within quoted strings, e.g. 
> via
> select * from dfs.`/tmp/q.csv`;
> Drill 1.0 says:
> Error: SYSTEM ERROR: com.univocity.parsers.common.TextParsingException:  
> Error processing input: Cannot use newline character within quoted string
> But many tools produce csv files with newlines in quoted strings.  Drill 
> should be able to handle them.
> Workaround: the csvquote program (https://github.com/dbro/csvquote) can 
> encode embedded commas and newlines, and even decode them later if desired.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4895) StreamingAggBatch code generation issues

2016-09-16 Thread Gautam Kumar Parai (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497884#comment-15497884
 ] 

Gautam Kumar Parai commented on DRILL-4895:
---

This is a potential performance issue - hence not critical.

> StreamingAggBatch code generation issues
> 
>
> Key: DRILL-4895
> URL: https://issues.apache.org/jira/browse/DRILL-4895
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.7.0
>Reporter: Gautam Kumar Parai
>Assignee: Gautam Kumar Parai
>
> We unnecessarily re-generate the code for the StreamingAggBatch even without 
> schema changes. Also, we seem to generate more holder variables than may be 
> required. This also affects sub-classes. HashAggBatch does not have the 
> same issues. 
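As an illustration of the first point, a hypothetical guard that would skip code generation when the incoming schema is unchanged (field and method names below are illustrative, not the actual Drill code):

{code}
// Hypothetical sketch: regenerate the aggregator only on a schema change.
private BatchSchema previousSchema;
private StreamingAggregator cachedAggregator;

private StreamingAggregator getAggregator() {
  BatchSchema incomingSchema = incoming.getSchema();
  if (cachedAggregator != null && incomingSchema.equals(previousSchema)) {
    return cachedAggregator;                      // schema unchanged: reuse
  }
  previousSchema = incomingSchema;
  cachedAggregator = createAggregatorInternal();  // regenerate only on change
  return cachedAggregator;
}
{code}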



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4895) StreamingAggBatch code generation issues

2016-09-16 Thread Gautam Kumar Parai (JIRA)
Gautam Kumar Parai created DRILL-4895:
-

 Summary: StreamingAggBatch code generation issues
 Key: DRILL-4895
 URL: https://issues.apache.org/jira/browse/DRILL-4895
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.7.0
Reporter: Gautam Kumar Parai
Assignee: Gautam Kumar Parai


We unnecessarily re-generate the code for the StreamingAggBatch even without 
schema changes. Also, we seem to generate more holder variables than may be 
required. This also affects sub-classes. HashAggBatch does not have the same 
issues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-4771) Drill should avoid doing the same join twice if count(distinct) exists

2016-09-16 Thread Gautam Kumar Parai (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497847#comment-15497847
 ] 

Gautam Kumar Parai edited comment on DRILL-4771 at 9/17/16 12:51 AM:
-

I have created the pull request https://github.com/apache/drill/pull/588. 
[~amansinha100] [~jni] Can you please review the PR? Thanks!


was (Author: gparai):
I have created the pull request https://github.com/apache/drill/pull/588. 
[~amansinha100] Can you please review the PR? Thanks!

> Drill should avoid doing the same join twice if count(distinct) exists
> --
>
> Key: DRILL-4771
> URL: https://issues.apache.org/jira/browse/DRILL-4771
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Gautam Kumar Parai
>Assignee: Gautam Kumar Parai
>
> When the query has one distinct aggregate and one or more non-distinct 
> aggregates, the planner need not produce the join-based plan; we can 
> generate multi-phase aggregates instead. Another approach would be to use 
> grouping sets. However, Drill does not support grouping sets and instead 
> relies on the join-based plan (see the plan below):
> {code}
> select emp.empno, count(*), avg(distinct dept.deptno) 
> from sales.emp emp inner join sales.dept dept 
> on emp.deptno = dept.deptno 
> group by emp.empno
> LogicalProject(EMPNO=[$0], EXPR$1=[$1], EXPR$2=[$3])
>   LogicalJoin(condition=[IS NOT DISTINCT FROM($0, $2)], joinType=[inner])
> LogicalAggregate(group=[{0}], EXPR$1=[COUNT()])
>   LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
> LogicalJoin(condition=[=($7, $9)], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalAggregate(group=[{0}], EXPR$2=[AVG($1)])
>   LogicalAggregate(group=[{0, 1}])
> LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
>   LogicalJoin(condition=[=($7, $9)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> The more efficient form should look like this
> {code}
> select emp.empno, count(*), avg(distinct dept.deptno) 
> from sales.emp emp inner join sales.dept dept 
> on emp.deptno = dept.deptno 
> group by emp.empno
> LogicalAggregate(group=[{0}], EXPR$1=[SUM($2)], EXPR$2=[AVG($1)])
>   LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT()])
> LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
>   LogicalJoin(condition=[=($7, $9)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (DRILL-4771) Drill should avoid doing the same join twice if count(distinct) exists

2016-09-16 Thread Gautam Kumar Parai (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497847#comment-15497847
 ] 

Gautam Kumar Parai commented on DRILL-4771:
---

I have created the pull request https://github.com/apache/drill/pull/588. 
[~amansinha100] Can you please review the PR? Thanks!

> Drill should avoid doing the same join twice if count(distinct) exists
> --
>
> Key: DRILL-4771
> URL: https://issues.apache.org/jira/browse/DRILL-4771
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.2.0
>Reporter: Gautam Kumar Parai
>Assignee: Gautam Kumar Parai
>
> When the query has one distinct aggregate and one or more non-distinct 
> aggregates, the planner need not produce the join-based plan; we can 
> generate multi-phase aggregates instead. Another approach would be to use 
> grouping sets. However, Drill does not support grouping sets and instead 
> relies on the join-based plan (see the plan below):
> {code}
> select emp.empno, count(*), avg(distinct dept.deptno) 
> from sales.emp emp inner join sales.dept dept 
> on emp.deptno = dept.deptno 
> group by emp.empno
> LogicalProject(EMPNO=[$0], EXPR$1=[$1], EXPR$2=[$3])
>   LogicalJoin(condition=[IS NOT DISTINCT FROM($0, $2)], joinType=[inner])
> LogicalAggregate(group=[{0}], EXPR$1=[COUNT()])
>   LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
> LogicalJoin(condition=[=($7, $9)], joinType=[inner])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalAggregate(group=[{0}], EXPR$2=[AVG($1)])
>   LogicalAggregate(group=[{0, 1}])
> LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
>   LogicalJoin(condition=[=($7, $9)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> The more efficient form should look like this
> {code}
> select emp.empno, count(*), avg(distinct dept.deptno) 
> from sales.emp emp inner join sales.dept dept 
> on emp.deptno = dept.deptno 
> group by emp.empno
> LogicalAggregate(group=[{0}], EXPR$1=[SUM($2)], EXPR$2=[AVG($1)])
>   LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT()])
> LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
>   LogicalJoin(condition=[=($7, $9)], joinType=[inner])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4894) Fix unit test failure in 'storage-hive/core' module

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497811#comment-15497811
 ] 

ASF GitHub Bot commented on DRILL-4894:
---

Github user gparai commented on the issue:

https://github.com/apache/drill/pull/587
  
+1, unit tests passed.


> Fix unit test failure in 'storage-hive/core' module
> ---
>
> Key: DRILL-4894
> URL: https://issues.apache.org/jira/browse/DRILL-4894
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>
> As part of DRILL-4886, I added `hbase-server` as a dependency for 
> 'storage-hive/core', which pulled in an older version (2.5.1) of some Hadoop 
> jars that is incompatible with the other Hadoop jars used by Drill (2.7.1).
> This breaks unit tests in this module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497761#comment-15497761
 ] 

Boaz Ben-Zvi edited comment on DRILL-3898 at 9/17/16 12:04 AM:
---

The code in the above comment should be (Jira does not let me edit the comment):

{code}
public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}

{code}



was (Author: ben-zvi):
The code in the above comment should be (Jira does not let me edit the comment):

{quote}
public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}

{quote}


> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) 
> ~[drill-common-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Comment Edited] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497761#comment-15497761
 ] 

Boaz Ben-Zvi edited comment on DRILL-3898 at 9/17/16 12:02 AM:
---

The code in the above comment should be (Jira does not let me edit the comment):

{quote}
public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}

{quote}



was (Author: ben-zvi):
The code in the above comment should be (Jira does not let me edit the comment):

{quote}
public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}

{quote}


> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) 
> ~[drill-common-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Comment Edited] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497761#comment-15497761
 ] 

Boaz Ben-Zvi edited comment on DRILL-3898 at 9/17/16 12:01 AM:
---

The code in the above comment should be (Jira does not let me edit the comment):

{quote}
public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}

{quote}



was (Author: ben-zvi):
The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}



> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) 
> 

[jira] [Comment Edited] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497761#comment-15497761
 ] 

Boaz Ben-Zvi edited comment on DRILL-3898 at 9/16/16 11:56 PM:
---

The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}




was (Author: ben-zvi):
The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}



> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Comment Edited] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497761#comment-15497761
 ] 

Boaz Ben-Zvi edited comment on DRILL-3898 at 9/16/16 11:54 PM:
---

The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}




was (Author: ben-zvi):
The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}



> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) 
> ~[drill-common-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Comment Edited] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497761#comment-15497761
 ] 

Boaz Ben-Zvi edited comment on DRILL-3898 at 9/16/16 11:55 PM:
---

The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}




was (Author: ben-zvi):
The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}



> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) 
> ~[drill-common-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Commented] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497761#comment-15497761
 ] 

Boaz Ben-Zvi commented on DRILL-3898:
-

The code in the above comment should be (Jira does not let me edit the comment):

public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}



> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:400)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) 
> ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:152)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:44) 
> ~[drill-common-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:553)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:362)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:147)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
> at 
> 

[jira] [Commented] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497745#comment-15497745
 ] 

ASF GitHub Bot commented on DRILL-3898:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/585#discussion_r79267883
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java ---
@@ -592,11 +592,14 @@ public BatchGroup mergeAndSpill(LinkedList<BatchGroup> batchGroups) throws Schem
   }
   injector.injectChecked(context.getExecutionControls(), INTERRUPTION_WHILE_SPILLING, IOException.class);
   newGroup.closeOutputStream();
-} catch (Exception e) {
+} catch (Throwable e) {
   // we only need to cleanup newGroup if spill failed
-  AutoCloseables.close(e, newGroup);
+  try {
+    AutoCloseables.close(e, newGroup);
+  } catch (Throwable t) { /* close() may hit the same IO issue; just ignore */ }
--- End diff --

The root cause for the whole bug is in Hadoop's RawLocalFileSystem.java:

package org.apache.hadoop.fs;
...
public void write(byte[] b, int off, int len) throws IOException {
  try {
    fos.write(b, off, len);
  } catch (IOException e) { // unexpected exception
    throw new FSError(e);  // assume native fs error
  }
}

And FSError is not a subclass of IOException!!!

java.lang.Object
  java.lang.Throwable
    java.lang.Error
      org.apache.hadoop.fs.FSError

So the only common ancestor is Throwable. And any part of the Drill code 
that catches only IOException will not catch it!!
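A minimal illustration of the point (the stream and buffer names are hypothetical):

{code}
// FSError extends java.lang.Error, so the IOException handler below never
// sees it; only the Throwable handler does.
try {
  out.write(buf, 0, buf.length);   // may throw org.apache.hadoop.fs.FSError
} catch (IOException ioe) {
  // not reached when the filesystem throws FSError
} catch (Throwable t) {
  // broad enough to observe FSError (and everything else)
}
{code}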





> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as varchar(100)) end
> . . . . . . . . . . . . >  from 
> . . . . . . . . . . . . >   `store_sales.dat` ss 
> . . . . . . . . . . . . > ;
> Error: SYSTEM ERROR: NullPointerException
> Fragment 1:16
> [Error Id: 0ae9338d-d04f-4b4a-93aa-a80d13cedb29 on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> This exception in drillbit.log should have triggered query cancellation:
> {code}
> 2015-10-06 17:01:34,463 [WorkManager-2] ERROR 
> o.apache.drill.exec.work.WorkManager - 
> org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
> at 
> org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:226)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) 
> ~[na:1.7.0_71]
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) 
> ~[na:1.7.0_71]
> at java.io.FilterOutputStream.close(FilterOutputStream.java:157) 
> ~[na:1.7.0_71]
> at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>  ~[hadoop-common-2.5.1-mapr-1503.jar:na]
> at 
> 

[jira] [Commented] (DRILL-4894) Fix unit test failure in 'storage-hive/core' module

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497605#comment-15497605
 ] 

ASF GitHub Bot commented on DRILL-4894:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/587


> Fix unit test failure in 'storage-hive/core' module
> ---
>
> Key: DRILL-4894
> URL: https://issues.apache.org/jira/browse/DRILL-4894
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>
> As part of DRILL-4886, I added `hbase-server` as a dependency for 
> 'storage-hive/core', which pulled in an older version (2.5.1) of some Hadoop 
> jars that is incompatible with the other Hadoop jars used by Drill (2.7.1).
> This breaks unit tests in this module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4894) Fix unit test failure in 'storage-hive/core' module

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497582#comment-15497582
 ] 

ASF GitHub Bot commented on DRILL-4894:
---

Github user adityakishore commented on the issue:

https://github.com/apache/drill/pull/587
  
I have verified that this does not alter the content of the binary package.


> Fix unit test failure in 'storage-hive/core' module
> ---
>
> Key: DRILL-4894
> URL: https://issues.apache.org/jira/browse/DRILL-4894
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>
> As part of DRILL-4886, I added `hbase-server` as a dependency for 
> 'storage-hive/core', which pulled in an older version (2.5.1) of some Hadoop 
> jars that is incompatible with the other Hadoop jars used by Drill (2.7.1).
> This breaks unit tests in this module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4894) Fix unit test failure in 'storage-hive/core' module

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497490#comment-15497490
 ] 

ASF GitHub Bot commented on DRILL-4894:
---

Github user chunhui-shi commented on the issue:

https://github.com/apache/drill/pull/587
  
+1, unit test passed.


> Fix unit test failure in 'storage-hive/core' module
> ---
>
> Key: DRILL-4894
> URL: https://issues.apache.org/jira/browse/DRILL-4894
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>
> As part of DRILL-4886, I added `hbase-server` as a dependency for 
> 'storage-hive/core', which pulled in an older version (2.5.1) of some Hadoop 
> jars that is incompatible with the other Hadoop jars used by Drill (2.7.1).
> This breaks unit tests in this module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3898) No space error during external sort does not cancel the query

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497450#comment-15497450
 ] 

ASF GitHub Bot commented on DRILL-3898:
---

Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/585#discussion_r79255636
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java ---
@@ -592,11 +592,14 @@ public BatchGroup mergeAndSpill(LinkedList<BatchGroup> batchGroups) throws Schem
   }
   injector.injectChecked(context.getExecutionControls(), INTERRUPTION_WHILE_SPILLING, IOException.class);
   newGroup.closeOutputStream();
-} catch (Exception e) {
+} catch (Throwable e) {
   // we only need to cleanup newGroup if spill failed
-  AutoCloseables.close(e, newGroup);
+  try {
+    AutoCloseables.close(e, newGroup);
+  } catch (Throwable t) { /* close() may hit the same IO issue; just ignore */ }
--- End diff --

In the case of no disk space to spill, close() tries to clean up by calling 
flushBuffer(), which eventually throws the same exception, as there's still no 
space:

at java.io.FileOutputStream.write(FileOutputStream.java:326)
  at 
org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:246)
  at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
  at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
  - locked <0x24e5> (a java.io.BufferedOutputStream)
  at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
  at java.io.DataOutputStream.write(DataOutputStream.java:107)
  - locked <0x24e7> (a org.apache.hadoop.fs.FSDataOutputStream)
  at 
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.writeChunk(ChecksumFileSystem.java:419)
  at 
org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:206)
  at 
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
  - locked <0x24e8> (a 
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer)
  at 
org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
  at 
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:407)
  at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
  at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
  at 
org.apache.drill.exec.physical.impl.xsort.BatchGroup.close(BatchGroup.java:169)
  at 
org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76)
  at 
org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:53)
  at 
org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:43)
  at 
org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:598)



> No space error during external sort does not cancel the query
> -
>
> Key: DRILL-3898
> URL: https://issues.apache.org/jira/browse/DRILL-3898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0, 1.8.0
>Reporter: Victoria Markman
>Assignee: Boaz Ben-Zvi
> Fix For: Future
>
> Attachments: drillbit.log, sqlline_3898.ver_1_8.log
>
>
> While verifying DRILL-3732 I ran into a new problem.
> I think Drill somehow loses track of the out-of-disk exception and does not 
> cancel the rest of the query, which results in an NPE:
> Reproduction is the same as in DRILL-3732:
> {code}
> 0: jdbc:drill:schema=dfs> create table store_sales_20(ss_item_sk, 
> ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, s_sold_date_sk, ss_promo_sk) 
> partition by (ss_promo_sk) as
> . . . . . . . . . . . . >  select 
> . . . . . . . . . . . . >  case when columns[2] = '' then cast(null as 
> varchar(100)) else cast(columns[2] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[3] = '' then cast(null as 
> varchar(100)) else cast(columns[3] as varchar(100)) end,
> . . . . . . . . . . . . >  case when columns[4] = '' then cast(null as 
> varchar(100)) else cast(columns[4] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[5] = '' then cast(null as 
> varchar(100)) else cast(columns[5] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[0] = '' then cast(null as 
> varchar(100)) else cast(columns[0] as varchar(100)) end, 
> . . . . . . . . . . . . >  case when columns[8] = '' then cast(null as 
> varchar(100)) else cast(columns[8] as 

[jira] [Commented] (DRILL-4894) Fix unit test failure in 'storage-hive/core' module

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497289#comment-15497289
 ] 

ASF GitHub Bot commented on DRILL-4894:
---

GitHub user adityakishore opened a pull request:

https://github.com/apache/drill/pull/587

DRILL-4894: Fix unit test failure in 'storage-hive/core' module

Exclude 'hadoop-mapreduce-client-core' and 'hadoop-auth' as transitive 
dependencies from 'hbase-server'

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adityakishore/drill DRILL-4894

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/587.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #587


commit f3c26e34e3a72ef338c4dbca1a0204f342176972
Author: Aditya Kishore 
Date:   2016-09-16T19:14:35Z

DRILL-4894: Fix unit test failure in 'storage-hive/core' module

Exclude 'hadoop-mapreduce-client-core' and 'hadoop-auth' as transitive 
dependencies from 'hbase-server'
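For reference, the exclusion described in the commit message would look roughly like the following pom.xml fragment (a sketch; the exact version and placement in the Drill pom may differ):

{code}
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-core</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-auth</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}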




> Fix unit test failure in 'storage-hive/core' module
> ---
>
> Key: DRILL-4894
> URL: https://issues.apache.org/jira/browse/DRILL-4894
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>
> As part of DRILL-4886, I added `hbase-server` as a dependency for 
> 'storage-hive/core' which pulled older version (2.5.1) of some hadoop jars, 
> incompatible with other hadoop jars used by drill (2.7.1).
> This breaks unit tests in this module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4894) Fix unit test failure in 'storage-hive/core' module

2016-09-16 Thread Aditya Kishore (JIRA)
Aditya Kishore created DRILL-4894:
-

 Summary: Fix unit test failure in 'storage-hive/core' module
 Key: DRILL-4894
 URL: https://issues.apache.org/jira/browse/DRILL-4894
 Project: Apache Drill
  Issue Type: Bug
Reporter: Aditya Kishore
Assignee: Aditya Kishore


As part of DRILL-4886, I added `hbase-server` as a dependency for
'storage-hive/core', which pulled in an older version (2.5.1) of some Hadoop
jars, incompatible with the Hadoop jars (2.7.1) used by Drill.

This breaks unit tests in this module.
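For anyone hitting a similar clash, the offending transitive jars can be located with Maven's standard dependency tree (generic Maven usage, not specific to this patch):

{code}
# show which artifacts pull in org.apache.hadoop jars, and at which versions
$ mvn dependency:tree -Dincludes=org.apache.hadoop
{code}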



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4892) Swift Documentation

2016-09-16 Thread Sudheesh Katkam (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497034#comment-15497034
 ] 

Sudheesh Katkam commented on DRILL-4892:


From email:

{quote}
AFAIK, there is no documentation. I am not sure anyone has tried it before.
That said, from [1], Swift enables Apache Hadoop applications, including
MapReduce jobs, to read and write data to and from instances of the OpenStack
Swift object store. And Drill uses the HDFS client library. So using Swift
through Drill should be possible.

My guess: create a storage plugin named “swift” and copy the contents from the
“dfs” plugin. I am not sure what the contents of “swift” should be exactly; see
[1] and [2]. The parameters and values mentioned in the “Configuring” section
in [1] should be provided through the “config” map in the storage plugin (or
maybe through conf/core-site.xml in the Drill installation directory).

Something like:
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "workspaces": {
    ...
  },
  "formats": {
    ...
  },
  "config": {
    ...
  }
}

A roundabout way could be to use Swift through its S3 compatibility layer [3].
Again, I do not know the exact configuration details.

Once you get things to work, you can also add a section to the Drill docs based 
on your experience!

Thank you,
Sudheesh

[1] https://hadoop.apache.org/docs/stable2/hadoop-openstack/index.html
[2] http://drill.apache.org/docs/s3-storage-plugin/
[3] https://github.com/openstack/swift3
{quote}
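For anyone trying this, the “Configuring” section of [1] boils down to a handful of fs.swift.* properties in conf/core-site.xml. A minimal sketch, assuming the service name "privatecloud" from the example URI above and placeholder endpoint/credential values; [1] remains the authoritative property list:

{code}
<configuration>
  <!-- Hadoop OpenStack Swift filesystem implementation -->
  <property>
    <name>fs.swift.impl</name>
    <value>org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem</value>
  </property>
  <!-- one set of fs.swift.service.<name>.* properties per Swift endpoint;
       <name> must match the host suffix in swift://container.<name>/ URIs -->
  <property>
    <name>fs.swift.service.privatecloud.auth.url</name>
    <value>https://auth.example.com/v2.0/tokens</value>
  </property>
  <property>
    <name>fs.swift.service.privatecloud.username</name>
    <value>user</value>
  </property>
  <property>
    <name>fs.swift.service.privatecloud.password</name>
    <value>secret</value>
  </property>
  <property>
    <name>fs.swift.service.privatecloud.tenant</name>
    <value>tenant</value>
  </property>
</configuration>
{code}

With that in place, the swift:// connection in the storage plugin sketch above should resolve through the hadoop-openstack filesystem client.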

> Swift Documentation
> ---
>
> Key: DRILL-4892
> URL: https://issues.apache.org/jira/browse/DRILL-4892
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.6.0, 1.8.0
>Reporter: Matt Keranen
>
> The Drill FAQ (https://drill.apache.org/faq/) suggests Swift is a data source:
> "Cloud storage: Amazon S3, Google Cloud Storage, Azure Blob Storage, Swift"
> However, there appears to be no documentation.
> Swift-specific docs would be very useful. We have a large Swift installation
> and using Drill over files in it would be a valuable feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4726) Dynamic UDFs support

2016-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15496135#comment-15496135
 ] 

ASF GitHub Bot commented on DRILL-4726:
---

Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/574#discussion_r79155363
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -186,4 +226,105 @@ public boolean isFunctionComplexOutput(String name) {
     return false;
   }
 
+  public RemoteFunctionRegistry getRemoteFunctionRegistry() {
+    return remoteFunctionRegistry;
+  }
+
+  public List<String> validate(Path path) throws IOException {
+    URL url = path.toUri().toURL();
+    URL[] urls = {url};
+    ClassLoader classLoader = new URLClassLoader(urls);
+    return drillFuncRegistry.validate(path.getName(), scan(classLoader, path, urls));
+  }
+
+  public void register(String jarName, ScanResult classpathScan, ClassLoader classLoader) {
+    drillFuncRegistry.register(jarName, classpathScan, classLoader);
+  }
+
+  public void unregister(String jarName) {
+    drillFuncRegistry.unregister(jarName);
+  }
+
+  /**
+   * Loads all missing functions from the remote registry.
+   * Compares the list of already registered jars with the remote jars and
+   * loads the missing ones. Missing jars are stored in the local DRILL_UDF_DIR.
+   *
+   * @return true if functions from at least one jar were loaded
+   */
+  public boolean loadRemoteFunctions() {
+    List<String> missingJars = Lists.newArrayList();
+    Registry registry = remoteFunctionRegistry.getRegistry();
+
+    List<String> localJars = drillFuncRegistry.getAllJarNames();
+    for (Jar jar : registry.getJarList()) {
+      if (!localJars.contains(jar.getName())) {
+        missingJars.add(jar.getName());
+      }
+    }
+
+    for (String jarName : missingJars) {
+      try {
+        Path localUdfArea = new Path(new File(getUdfDir()).toURI());
--- End diff --

Agree, I have already moved the creation from the sh script into Drill.


> Dynamic UDFs support
> 
>
> Key: DRILL-4726
> URL: https://issues.apache.org/jira/browse/DRILL-4726
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> Allow registering UDFs without restarting Drillbits.
> The design is described in the document below:
> https://docs.google.com/document/d/1FfyJtWae5TLuyheHCfldYUpCdeIezR2RlNsrOTYyAB4/edit?usp=sharing
>  
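For context, the design in the linked document surfaces registration through plain SQL, so no Drillbit restart is needed. The intended shape (subject to change while the feature is under review) is:

{code}
-- register every UDF packaged in a jar that was copied to the DFS staging area
CREATE FUNCTION USING JAR 'simple_functions.jar';

-- unregister them again, still without restarting any Drillbit
DROP FUNCTION USING JAR 'simple_functions.jar';
{code}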



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)