date:20151022

[jira] [Commented] (DRILL-3944) Drill MAXDIR Unknown variable or type "FILE_SEPARATOR"

2015-10-22 Thread Jitendra (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969756#comment-14969756
 ] 

Jitendra commented on DRILL-3944:
-

0: jdbc:drill:drillbit=localhost> select dir0 from vspace. wspace.`freemate2` 
where dir0= maxdir('vspace. wspace', 'freemate2'); 
+-------+
| dir0  |
+-------+
+-------+
No rows selected (0.444 seconds)

Then I also tried the query below with dir1.

select dir1 from vspace. wspace.`freemate2` where dir1 = maxdir('vspace. 
wspace', 'freemate2');

2015-10-22 01:03:10,706 [29d7ca30-f074-d8fc-621a-e07adc8e66ff:foreman] INFO  o.a.d.e.store.mock.MockStorageEngine - Failure while attempting to check for Parquet metadata file.
java.io.IOException: Open failed for file: /vspace/wspace/freemate2/20151005, error: Invalid argument (22)
    at com.mapr.fs.MapRClientImpl.open(MapRClientImpl.java:212) ~[maprfs-4.1.0-mapr.jar:4.1.0-mapr]
    at com.mapr.fs.MapRFileSystem.open(MapRFileSystem.java:862) ~[maprfs-4.1.0-mapr.jar:4.1.0-mapr]
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:800) ~[hadoop-common-2.5.1-mapr-1503.jar:na]
    at org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:132) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:142) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.store.dfs.BasicFormatMatcher.isFileReadable(BasicFormatMatcher.java:112) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:256) [drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:210) [drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:326) [drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:153) [drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96) [drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90) [drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:276) [drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.calcite.jdbc.SimpleCalciteSchema.getTable(SimpleCalciteSchema.java:83) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom(CalciteCatalogReader.java:116) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:99) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:70) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.EmptyScope.getTableNamespace(EmptyScope.java:75) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace(DelegatingScope.java:124) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:104) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:877) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2777) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2762) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2985) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
    at

[jira] [Commented] (DRILL-3340) Add named metrics and named operators in OperatorProfile

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969592#comment-14969592
 ] 

ASF GitHub Bot commented on DRILL-3340:
---

Github user adeneche commented on a diff in the pull request:

https://github.com/apache/drill/pull/216#discussion_r42786354
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/OperatorWrapper.java
 ---
@@ -130,4 +136,32 @@ public void addSummary(TableBuilder tb) {
 tb.appendBytes(Math.round(memSum / size), null);
 tb.appendBytes(peakMem.getLeft().getPeakLocalMemoryAllocated(), null);
   }
+
+  public String getMetricsTable() {
+    if (!OperatorMetricRegistry.contains(operatorType.getNumber())) {
+      return "";
+    }
+    final ArrayList<String> metricNames = Lists.newArrayList("Minor Fragment");
+    for (final MetricValue metric : firstProfile.getMetricList()) {
+      metricNames.add(OperatorMetricRegistry.getMetricName(operatorType.getNumber(), metric.getMetricId()));
+    }
+
+    final String[] metricsTableColumnNames = new String[metricNames.size()];
+    final TableBuilder builder = new TableBuilder(metricNames.toArray(metricsTableColumnNames));
+    for (final ImmutablePair<OperatorProfile, Integer> ip : ops) {
+      final OperatorProfile op = ip.getLeft();
+
+      builder.appendCell(
+          new OperatorPathBuilder()
+              .setMajor(major)
+              .setMinor(ip.getRight())
+              .setOperator(op)
+              .build(),
+          null);
+      for (final MetricValue metric : op.getMetricList()) {
+        builder.appendInteger(metric.getLongValue(), null);
--- End diff --

The code builds the table header from the first profile's metric list, then 
assumes all minor fragments will report their metrics in the same order as the 
first profile.

Can you confirm that this assumption always holds?
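
For illustration only, here is a minimal sketch of one way to avoid depending on metric order: key each fragment's values by metric ID and emit the cells in header order. The `Metric` class and the `headerIds`/`rows` helpers are hypothetical stand-ins written for this note, not Drill's `MetricValue` or `TableBuilder`, and using null for a missing metric is just one possible choice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: align per-fragment metrics by ID instead of by position. */
final class MetricsTableSketch {

  /** Stand-in for a (metricId, value) pair; not Drill's MetricValue class. */
  static final class Metric {
    final int id;
    final long value;

    Metric(int id, long value) {
      this.id = id;
      this.value = value;
    }
  }

  /** Collects every metric ID seen in any fragment, in first-seen order. */
  static List<Integer> headerIds(List<List<Metric>> fragments) {
    Set<Integer> ids = new LinkedHashSet<>();
    for (List<Metric> fragment : fragments) {
      for (Metric m : fragment) {
        ids.add(m.id);
      }
    }
    return new ArrayList<>(ids);
  }

  /** Builds one row per fragment; a metric the fragment lacks becomes null. */
  static List<List<Long>> rows(List<List<Metric>> fragments, List<Integer> header) {
    List<List<Long>> table = new ArrayList<>();
    for (List<Metric> fragment : fragments) {
      Map<Integer, Long> byId = new HashMap<>();
      for (Metric m : fragment) {
        byId.put(m.id, m.value);
      }
      List<Long> row = new ArrayList<>();
      for (Integer id : header) {
        row.add(byId.get(id)); // null when this fragment did not report the metric
      }
      table.add(row);
    }
    return table;
  }
}
```

With this shape, two fragments that report the same metrics in a different order still end up with their values under the same column.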


> Add named metrics and named operators in OperatorProfile
> 
>
> Key: DRILL-3340
> URL: https://issues.apache.org/jira/browse/DRILL-3340
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: DRILL-3340.1.patch.txt, DRILL-3340.2.patch.txt, 
> DRILL-3340.3.patch.txt
>
>
> + Useful when reading JSON query profile.
> + Rename FragmentStats#getOperatorStats to FragmentStats#newOperatorStats



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (DRILL-3908) ResultSetMetadata.getColumnDisplaySize() does not return correct column length value

2015-10-22 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969629#comment-14969629
 ] 

Khurram Faraaz commented on DRILL-3908:
---

The getColumnDisplaySize method of the ResultSetMetaData interface should return 
the normal maximum number of characters allowed as the width of the designated 
column. In this case, since the column is defined as VARCHAR, the maximum 
allowed width is 4000 bytes.
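
As a rough illustration of the expectation above, the sketch below derives a display size from the JDBC type code and the declared length. The `DisplaySizeSketch` class and its fixed widths for numeric and boolean types are assumptions made for this example and are not taken from Drill's JDBC driver; only `java.sql.Types` is a real API here.

```java
import java.sql.Types;

/** Hypothetical sketch of deriving a column display size from its JDBC type. */
final class DisplaySizeSketch {

  /**
   * Returns an assumed display width in characters.
   * The fixed widths are illustrative choices, not values taken from Drill.
   */
  static int displaySize(int sqlType, int declaredPrecision) {
    switch (sqlType) {
      case Types.VARCHAR:
      case Types.CHAR:
        return declaredPrecision; // width follows the declared length
      case Types.INTEGER:
        return 11;                // sign plus 10 digits, e.g. -2147483648
      case Types.BIGINT:
        return 20;                // sign plus 19 digits
      case Types.BOOLEAN:
        return 5;                 // length of "false"
      default:
        throw new IllegalArgumentException("type not covered by this sketch: " + sqlType);
    }
  }

  public static void main(String[] args) {
    System.out.println(displaySize(Types.VARCHAR, 4000));
  }
}
```

The point of the sketch is only that the result should vary with the column type and declared length rather than being a constant.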

> ResultSetMetadata.getColumnDisplaySize() does not return correct column 
> length value
> 
>
> Key: DRILL-3908
> URL: https://issues.apache.org/jira/browse/DRILL-3908
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.1.0
> Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 
> 10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
>Reporter: Sergio Lob
>Assignee: Parth Chandra
>Priority: Critical
> Fix For: 1.3.0
>
>
> ResultSetMetadata.getColumnDisplaySize() does not return correct column 
> length value. In fact, invocation of this method seems to always return value 
> 10.  This causes our JDBC application to process data incorrectly.  I see 
> that this problem was referenced in jira DRILL-3151, but this bug has 
> apparently not been addressed, although status of DRILL-3151 is "resolved". 
> This is a critical bug for us. 




[jira] [Commented] (DRILL-3908) ResultSetMetadata.getColumnDisplaySize() does not return correct column length value

2015-10-22 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969553#comment-14969553
 ] 

Khurram Faraaz commented on DRILL-3908:
---

I ran the test using the Simba JDBC driver, and getColumnDisplaySize(1) returns 
1258, unlike Drill's JDBC driver, which returns 10.

Here is the definition of the parquet file over which the query is run from 
JDBC.
{code}
[root@centos-01 parquet-tools]# ./parquet-schema 0_0_0.parquet
message root {
  optional int32 col0;
  optional int64 col1;
  optional float col2;
  optional double col3;
  optional int32 col4 (TIME_MILLIS);
  optional int64 col5 (TIMESTAMP_MILLIS);
  optional int32 col6 (DATE);
  optional boolean col7;
  optional binary col8 (UTF8);
  optional binary col9 (UTF8);
}
{code}

[root@centos-01 ~]# javac DataFromDrill.java
[root@centos-01 ~]# java DataFromDrill
log4j:WARN No appenders could be found for logger (org.apache.drill.common.config.NestedConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
ResultSetMetadata.getColumnDisplaySize(1) :1258

I used the Simba JDBC driver to run the test, with the following classpath:

{code}
[root@centos-01 ~]# echo $CLASSPATH
.:/root/simbaJDBC/*
[root@centos-01 ~]# cd simbaJDBC/
[root@centos-01 simbaJDBC]# ls
antlr-runtime-3.5.2.jar        DrillJDBC41.jar                javassist-3.19.0-GA.jar         netty-transport-4.0.27.Final.jar
commons-lang3-3.3.2.jar        drill-protocol-1.1.0.jar       joda-time-2.7.jar               protobuf-java-2.6.1.jar
config-1.2.1.jar               guava-16.0.1.jar               log4j-1.2.17.jar                reflections-0.9.9.jar
curator-client-2.7.1.jar       hadoop-common-2.6.0.jar        metrics-core-3.0.2.jar          slf4j-api-1.7.12.jar
curator-framework-2.7.1.jar    jackson-annotations-2.5.2.jar  metrics-jvm-3.0.2.jar           slf4j-log4j12-1.7.12.jar
curator-recipes-2.7.1.jar      jackson-core-2.5.2.jar         netty-buffer-4.0.27.Final.jar   zookeeper-3.4.6.jar
curator-x-discovery-2.7.1.jar  jackson-core-asl-1.9.13.jar    netty-codec-4.0.27.Final.jar
drill-common-1.1.0.jar         jackson-databind-2.5.2.jar     netty-common-4.0.27.Final.jar
drill-java-exec-1.1.0.jar      jackson-mapper-asl-1.9.13.jar  netty-handler-4.0.27.Final.jar
{code}

Snippet to run the query
{code}
final String URL_STRING = "jdbc:drill:drillbit=10.10.100.201";

Class.forName("com.mapr.drill.jdbc41.Driver").newInstance();
Connection conn = DriverManager.getConnection(URL_STRING,"root","mapr");

Statement stmt = conn.createStatement();
String query = "select col9 from dfs.tmp.FEWRWSPQQ_101";
ResultSet rs = stmt.executeQuery(query);
ResultSetMetaData rsmd = rs.getMetaData();

System.out.println("ResultSetMetadata.getColumnDisplaySize(1) :"+rsmd.getColumnDisplaySize(1));
{code}





[jira] [Updated] (DRILL-3543) Add stats for external sort to a query profile

2015-10-22 Thread Victoria Markman (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3543:

Assignee: Sudheesh Katkam

> Add stats for external sort to a query profile
> --
>
> Key: DRILL-3543
> URL: https://issues.apache.org/jira/browse/DRILL-3543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.1.0
>Reporter: Victoria Markman
>Assignee: Sudheesh Katkam
> Fix For: Future
>
>
> Today the only indication that a sort spilled to disk is the info in 
> drillbit.log.
> It would be great if this information were displayed in the query profile.
> {code}
> 2015-07-22 23:47:29,907 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] INFO  
> o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to 
> /tmp/drill/spill/2a4fd46e-f8c3-6b96-b165-b665a41be311/major_fragment_0/minor_fragment_0/operator_7/92
> 2015-07-22 23:47:29,919 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] INFO  
> o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to 
> /tmp/drill/spill/2a4fd46e-f8c3-6b96-b165-b665a41be311/major_fragment_0/minor_fragment_0/operator_7/93
> 2015-07-22 23:47:29,919 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] INFO  
> o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to 
> /tmp/drill/spill/2a4fd46e-f8c3-6b96-b165-b665a41be311/major_fragment_0/minor_fragment_0/operator_7/93
> 2015-07-22 23:47:29,919 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] WARN  
> o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 7 batch groups. 
> Current allocated memory: 11566787
> {code}




[jira] [Commented] (DRILL-3340) Add named metrics and named operators in OperatorProfile

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969597#comment-14969597
 ] 

ASF GitHub Bot commented on DRILL-3340:
---

Github user adeneche commented on the pull request:

https://github.com/apache/drill/pull/216#issuecomment-150314514
  
I think the way metrics are displayed is fine, and it's definitely an 
improvement compared to no metrics at all.






[jira] [Commented] (DRILL-3749) Upgrade Hadoop dependency to latest version (2.7.1)

2015-10-22 Thread Andrew (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969547#comment-14969547
 ] 

Andrew commented on DRILL-3749:
---

[~adeneche] Yes, I am able to successfully run master on AWS all the way 
through to completion. 

> Upgrade Hadoop dependency to latest version (2.7.1)
> ---
>
> Key: DRILL-3749
> URL: https://issues.apache.org/jira/browse/DRILL-3749
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.1.0
>Reporter: Venki Korukanti
>Assignee: Jason Altekruse
> Fix For: Future
>
>
> Logging a JIRA to track and discuss upgrading Drill's Hadoop dependency 
> version. Drill currently depends on Hadoop 2.5.0. The newer version of 
> Hadoop (2.7.1) has the following features:
> 1) Better S3 support
> 2) Ability to check whether a user has certain permissions on a file/directory 
> without performing operations on the file/dir. Useful for cases like 
> DRILL-3467.
> As Drill is going to use a higher version of the Hadoop fileclient, there could 
> be potential issues when interacting with Hadoop services (such as HDFS) of a 
> lower version than the fileclient.




[jira] [Commented] (DRILL-3232) Modify existing vectors to allow type promotion

2015-10-22 Thread Steven Phillips (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969616#comment-14969616
 ] 

Steven Phillips commented on DRILL-3232:


Design document: https://gist.github.com/StevenMPhillips/41b4a1bd745943d508d2

> Modify existing vectors to allow type promotion
> ---
>
> Key: DRILL-3232
> URL: https://issues.apache.org/jira/browse/DRILL-3232
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Codegen, Execution - Data Types, Execution - 
> Relational Operators, Functions - Drill
>Reporter: Steven Phillips
>Assignee: Hanifi Gunes
> Fix For: 1.3.0
>
>
> Support the ability for existing vectors to be promoted similar to supported 
> implicit casting rules.
> For example:
> INT > DOUBLE > STRING > EMBEDDED
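
A minimal sketch of what a linear promotion order like the one quoted above could look like, assuming the chain is read as "INT promotes to DOUBLE, which promotes to STRING, which promotes to EMBEDDED". The `VectorType` enum and `promote` helper are hypothetical names invented for this note, not Drill classes; the linked design document is the authority on the actual design.

```java
/** Hypothetical sketch of a linear type-promotion order. */
final class PromotionSketch {

  /** Declared from narrowest to widest, mirroring INT > DOUBLE > STRING > EMBEDDED. */
  enum VectorType { INT, DOUBLE, STRING, EMBEDDED }

  /** Returns the wider of the two types, i.e. the type both can be promoted to. */
  static VectorType promote(VectorType a, VectorType b) {
    // Enum.compareTo orders constants by declaration position.
    return a.compareTo(b) >= 0 ? a : b;
  }

  public static void main(String[] args) {
    System.out.println(promote(VectorType.INT, VectorType.DOUBLE));
  }
}
```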




[jira] [Commented] (DRILL-3229) Create a new EmbeddedVector

2015-10-22 Thread Steven Phillips (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969613#comment-14969613
 ] 

Steven Phillips commented on DRILL-3229:


Design document: https://gist.github.com/StevenMPhillips/41b4a1bd745943d508d2

> Create a new EmbeddedVector
> ---
>
> Key: DRILL-3229
> URL: https://issues.apache.org/jira/browse/DRILL-3229
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Codegen, Execution - Data Types, Execution - 
> Relational Operators, Functions - Drill
>Reporter: Jacques Nadeau
>Assignee: Steven Phillips
> Fix For: Future
>
>
> Embedded Vector will leverage a binary encoding for holding information about 
> type for each individual field.




[jira] [Commented] (DRILL-3340) Add named metrics and named operators in OperatorProfile

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969914#comment-14969914
 ] 

ASF GitHub Bot commented on DRILL-3340:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/216#discussion_r42807912
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/OperatorWrapper.java
 ---
@@ -130,4 +136,32 @@ public void addSummary(TableBuilder tb) {
 tb.appendBytes(Math.round(memSum / size), null);
 tb.appendBytes(peakMem.getLeft().getPeakLocalMemoryAllocated(), null);
   }
+
+  public String getMetricsTable() {
+    if (!OperatorMetricRegistry.contains(operatorType.getNumber())) {
+      return "";
+    }
+    final ArrayList<String> metricNames = Lists.newArrayList("Minor Fragment");
+    for (final MetricValue metric : firstProfile.getMetricList()) {
+      metricNames.add(OperatorMetricRegistry.getMetricName(operatorType.getNumber(), metric.getMetricId()));
+    }
+
+    final String[] metricsTableColumnNames = new String[metricNames.size()];
+    final TableBuilder builder = new TableBuilder(metricNames.toArray(metricsTableColumnNames));
+    for (final ImmutablePair<OperatorProfile, Integer> ip : ops) {
+      final OperatorProfile op = ip.getLeft();
+
+      builder.appendCell(
+          new OperatorPathBuilder()
+              .setMajor(major)
+              .setMinor(ip.getRight())
+              .setOperator(op)
+              .build(),
+          null);
+      for (final MetricValue metric : op.getMetricList()) {
+        builder.appendInteger(metric.getLongValue(), null);
--- End diff --

It's a bad assumption that I made :)






[jira] [Created] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Mehant Baid (JIRA)

Mehant Baid created DRILL-3965:
--

 Summary: Index out of bounds exception in partition pruning
 Key: DRILL-3965
 URL: https://issues.apache.org/jira/browse/DRILL-3965
 Project: Apache Drill
  Issue Type: Bug
Reporter: Mehant Baid
Assignee: Mehant Baid


Hit an index-out-of-bounds exception (IOOB) while trying to perform partition 
pruning on a table that was created using CTAS auto partitioning, with the 
stack trace below.

Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -8
    at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
    at org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) ~[drill-java-exec-1.2.0.jar:1.2.0]
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
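
The trace only shows that String.substring was called with out-of-range indexes inside the DFSPartitionLocation constructor; it does not show why. As a generic illustration, and not a description of the actual Drill code or the attached patch, the sketch below shows one common way such an exception arises (stripping a root prefix that is longer than the path) and a guard against it. The `RelativePathSketch` class and the example paths are hypothetical.

```java
/** Hypothetical sketch: stripping a root prefix from a file path safely. */
final class RelativePathSketch {

  /**
   * Returns the part of filePath below root.
   * Calling filePath.substring(root.length()) directly throws
   * StringIndexOutOfBoundsException when root is longer than filePath,
   * so the prefix is checked first.
   */
  static String relativePath(String root, String filePath) {
    String prefix = root.endsWith("/") ? root : root + "/";
    if (!filePath.startsWith(prefix)) {
      throw new IllegalArgumentException(filePath + " is not under " + root);
    }
    return filePath.substring(prefix.length());
  }

  public static void main(String[] args) {
    System.out.println(relativePath("/data/t1", "/data/t1/0_0_1.parquet"));
  }
}
```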




[jira] [Commented] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Rahul Challapalli (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969978#comment-14969978
 ] 

Rahul Challapalli commented on DRILL-3965:
--

[~mehant] Does this also address DRILL-3376 
(https://issues.apache.org/jira/browse/DRILL-3376)?

> Index out of bounds exception in partition pruning
> --
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Aman Sinha
> Attachments: DRILL-3965.patch
>
>
> Hit IOOB while trying to perform partition pruning on a table that was 
> created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]




[jira] [Commented] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Mehant Baid (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969986#comment-14969986
 ] 

Mehant Baid commented on DRILL-3965:


I don't think so. Looking at the stack trace in DRILL-3376, it seems like a 
separate issue.





[jira] [Created] (DRILL-3966) Metadata Cache + Partition Pruning not hapenning when the partition column is of type boolean

2015-10-22 Thread Rahul Challapalli (JIRA)

Rahul Challapalli created DRILL-3966:


 Summary: Metadata Cache + Partition Pruning not hapenning when the 
partition column is of type boolean
 Key: DRILL-3966
 URL: https://issues.apache.org/jira/browse/DRILL-3966
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata, Query Planning & Optimization
Reporter: Rahul Challapalli


git.commit.id.abbrev=19b4b79

I have partitioned parquet files whose partition column is of type boolean.
The plan below suggests that pruning did not take place when the partition column 
is of type boolean and a metadata cache exists. However, if I remove the 
metadata cache, partition pruning seems to work fine.

Query :
{code}
explain plan for select * from fewtypes_boolpartition where bool_col = false;

00-00    Screen
00-01      Project(*=[$0])
00-02        Project(T11¦¦*=[$0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[=($1, false)])
00-05              Project(T11¦¦*=[$0], bool_col=[$1])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet], ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]], selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, numFiles=2, usedMetadataFile=true, columns=[`*`]]])

{code}


Error from the log :
{code}
WARN  o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
java.lang.UnsupportedOperationException: Unsupported type: BIT
    at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
{code}

I attached the data sets required. Let me know if you need anything else.
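
For readers unfamiliar with this kind of failure, the log above is consistent with a per-type switch that has no branch for BIT (Drill's boolean type) and falls through to a default that throws. The sketch below shows that general shape with a BIT branch added. It is a hypothetical, self-contained illustration: `MinorType` and `toPruningValue` are stand-ins written for this note and do not reproduce ParquetGroupScan.populatePruningVector.

```java
/** Hypothetical sketch of a type switch that produces a pruning value. */
final class PruningVectorSketch {

  /** Stand-in for a minor type enum; not Drill's own MinorType. */
  enum MinorType { INT, BIGINT, VARCHAR, BIT, FLOAT8 }

  /** Converts a raw partition value to the Java value stored for its type. */
  static Object toPruningValue(MinorType type, String raw) {
    switch (type) {
      case INT:
        return Integer.valueOf(raw);
      case BIGINT:
        return Long.valueOf(raw);
      case VARCHAR:
        return raw;
      case BIT:
        // Without this case, boolean partition columns fall through to default.
        return Boolean.valueOf(raw);
      default:
        throw new UnsupportedOperationException("Unsupported type: " + type);
    }
  }

  public static void main(String[] args) {
    System.out.println(toPruningValue(MinorType.BIT, "false"));
  }
}
```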




[jira] [Updated] (DRILL-3966) Metadata Cache + Partition Pruning not hapenning when the partition column is of type boolean

2015-10-22 Thread Rahul Challapalli (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-3966:
-
Attachment: 0_0_2.parquet
0_0_1.parquet

> Metadata Cache + Partition Pruning not hapenning when the partition column is 
> of type boolean
> -
>
> Key: DRILL-3966
> URL: https://issues.apache.org/jira/browse/DRILL-3966
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata, Query Planning & Optimization
>Reporter: Rahul Challapalli
> Attachments: 0_0_1.parquet, 0_0_2.parquet
>
>
> git.commit.id.abbrev=19b4b79
> I have partitioned parquet files whose partition column is of type boolean.
> The below plan suggests that pruning did not take place when partitioned 
> column is of type boolean and when metadata exists. However if I get rid of 
> the metadata cache, partition pruning seems to be working fine.
> Query :
> {code}
> explain plan for select * from fewtypes_boolpartition where bool_col = false;
> 00-00    Screen
> 00-01      Project(*=[$0])
> 00-02        Project(T11¦¦*=[$0])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($1, false)])
> 00-05              Project(T11¦¦*=[$0], bool_col=[$1])
> 00-06                Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath 
> [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet],
>  ReadEntryWithPath 
> [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]],
>  selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, 
> numFiles=2, usedMetadataFile=true, columns=[`*`]]])
> {code}
> Error from the log :
> {code}
> WARN  o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune 
> partition.
>  java.lang.UnsupportedOperationException: Unsupported type: BIT
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451)
>  ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96)
>  ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212)
>  ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808)
>  [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) 
> [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at 
> org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) 
> [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> {code}
> I attached the data sets required. Let me know if you need anything
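
To illustrate the "Unsupported type: BIT" failure above, here is a minimal Java sketch of a type dispatch that includes a boolean branch. It is not Drill's ParquetGroupScan code: the ColumnType enum, encode() and prune() are hypothetical names, and the true/false values assigned to the two attached files are assumptions, since the report does not say which file holds which value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BooleanPruningSketch {
  // Hypothetical subset of column types; BIT is the name the log uses for boolean.
  enum ColumnType { INT, VARCHAR, BIT }

  // Encodes one partition value for comparison. Without the BIT branch,
  // a boolean column would fall through to the "Unsupported type" error.
  static Object encode(ColumnType type, Object value) {
    switch (type) {
      case INT:
        return (Integer) value;
      case VARCHAR:
        return value.toString();
      case BIT:
        return ((Boolean) value) ? 1 : 0;
      default:
        throw new UnsupportedOperationException("Unsupported type: " + type);
    }
  }

  // Keeps only the files whose boolean partition value matches the filter literal.
  static List<String> prune(Map<String, Boolean> partitionValueByFile, boolean wanted) {
    Object target = encode(ColumnType.BIT, wanted);
    List<String> kept = new ArrayList<>();
    for (Map.Entry<String, Boolean> e : partitionValueByFile.entrySet()) {
      if (encode(ColumnType.BIT, e.getValue()).equals(target)) {
        kept.add(e.getKey());
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Map<String, Boolean> files = new LinkedHashMap<>();
    files.put("0_0_1.parquet", true);   // assumed value, not stated in the report
    files.put("0_0_2.parquet", false);  // assumed value, not stated in the report
    System.out.println(prune(files, false));
  }
}
```

With pruning working, the plan for "bool_col = false" would be expected to scan one file rather than numFiles=2.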




[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Mehant Baid (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3965:
---
Attachment: DRILL-3965.patch

[~amansinha100] can you please review.

> Index out of bounds exception in partition pruning
> --
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Mehant Baid
> Attachments: DRILL-3965.patch
>
>
> Hit IOOB while trying to perform partition pruning on a table that was 
> created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
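
One way a negative index like this can arise from String.substring() is a begin offset that is larger than the string itself, for example when a root path is stripped from a file path that does not start with it. The Java sketch below shows a guard for that pattern. It is an assumption about the general failure mode, not the contents of the attached patch, and relativeTo() is a hypothetical helper.

```java
final class RelativePathSketch {
  // Returns the part of 'file' below 'root'. Throws a descriptive error
  // instead of letting substring() fail with a bare negative index.
  static String relativeTo(String root, String file) {
    if (!file.startsWith(root)) {
      throw new IllegalArgumentException(
          "File '" + file + "' is not under selection root '" + root + "'");
    }
    return file.substring(root.length());
  }

  public static void main(String[] args) {
    // Hypothetical paths, chosen only to exercise the helper.
    System.out.println(relativeTo("/tmp/table", "/tmp/table/0_0_1.parquet"));
  }
}
```

The check costs one prefix comparison and turns an opaque index error into a message that names both paths.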




[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Mehant Baid (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3965:
---
Assignee: Aman Sinha  (was: Mehant Baid)

> Index out of bounds exception in partition pruning
> --
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Aman Sinha
> Attachments: DRILL-3965.patch
>
>
> Hit IOOB while trying to perform partition pruning on a table that was 
> created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]




[jira] [Created] (DRILL-3967) Broken Test: TestDrillbitResilience.cancelAfterEverythingIsCompleted()

2015-10-22 Thread Andrew (JIRA)

Andrew created DRILL-3967:
-

 Summary: Broken Test: 
TestDrillbitResilience.cancelAfterEverythingIsCompleted()
 Key: DRILL-3967
 URL: https://issues.apache.org/jira/browse/DRILL-3967
 Project: Apache Drill
  Issue Type: Test
  Components: Execution - Flow, Execution - RPC
Affects Versions: 1.2.0
Reporter: Andrew
Assignee: Sudheesh Katkam
Priority: Minor


TestDrillbitResilience.cancelAfterEverythingIsCompleted() can sometimes fail. 
I've noticed that running this test on an m2.xlarge on AWS causes a 
reproducible failure when running against the patch for 
https://issues.apache.org/jira/browse/DRILL-3749 (Upgraded Hadoop and Curator 
libraries).

When running this test with the same patch on my laptop, this test passes.




[jira] [Updated] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available

2015-10-22 Thread Kristine Hahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristine Hahn updated DRILL-3486:
-
Fix Version/s: 1.3.0

> Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's 
> available
> --
>
> Key: DRILL-3486
> URL: https://issues.apache.org/jira/browse/DRILL-3486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
> Fix For: Future, 1.3.0
>
>
> The Drill documentation site's JDBC pages should have a link to a copy of the 
> driver's generated Javadoc documentation once we start generating it.




[jira] [Resolved] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available

2015-10-22 Thread Kristine Hahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristine Hahn resolved DRILL-3486.
--
   Resolution: Fixed
Fix Version/s: (was: Future)

link has been published

> Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's 
> available
> --
>
> Key: DRILL-3486
> URL: https://issues.apache.org/jira/browse/DRILL-3486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
>
> The Drill documentation site's JDBC pages should have a link to a copy of the 
> driver's generated Javadoc documentation once we start generating it.




[jira] [Assigned] (DRILL-3827) Empty metadata file causes queries on the table to fail

2015-10-22 Thread Parth Chandra (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra reassigned DRILL-3827:


Assignee: Parth Chandra  (was: Jinfeng Ni)

> Empty metadata file causes queries on the table to fail
> ---
>
> Key: DRILL-3827
> URL: https://issues.apache.org/jira/browse/DRILL-3827
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Parth Chandra
>Priority: Critical
>
> I ran into a situation where Drill created an empty metadata file (which is a 
> separate issue that I will try to narrow down; my suspicion is that this 
> happens when "refresh table metadata x" fails with a "permission denied" error).
> However, we need to guard against the situation where the metadata file is 
> empty or corrupted. We should probably skip reading it if we encounter an 
> unexpected result and continue with query planning without that information, 
> in the same fashion as a partition pruning failure. It's also important to log 
> this information somewhere, with drillbit.log as a start. It would be really 
> nice to have a flag in the query profile that tells a user whether we used the 
> metadata file for planning. This will help in debugging performance issues.
> A very confusing exception is thrown if you have a zero-length metadata file in 
> the directory:
> {code}
> [Wed Sep 23 07:45:28] # ls -la
> total 2
> drwxr-xr-x  2 root root   2 Sep 10 14:55 .
> drwxr-xr-x 16 root root  35 Sep 15 12:54 ..
> -rwxr-xr-x  1 root root 483 Jul  1 11:29 0_0_0.parquet
> -rwxr-xr-x  1 root root   0 Sep 10 14:55 .drill.parquet_metadata
> 0: jdbc:drill:schema=dfs> select * from t1;
> Error: SYSTEM ERROR: JsonMappingException: No content to map due to 
> end-of-input
>  at [Source: com.mapr.fs.MapRFsDataInputStream@342bd88d; line: 1, column: 1]
> [Error Id: c97574f6-b3e8-4183-8557-c30df6ca675f on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> The workaround is trivial: remove the file. I am marking this as critical, since 
> we don't have any concurrency control in place and this file can get corrupted 
> as well.
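
The guard described above (skip an empty or unreadable cache file, log it, and continue planning without it) could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: MetadataCacheGuard and tryRead() are hypothetical names, the parser is passed in as a plain function rather than Drill's actual JSON mapping, the sketch uses java.nio on a local path rather than a Hadoop FileSystem, and it is written against Java 8 for brevity.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.function.Function;
import java.util.logging.Logger;

final class MetadataCacheGuard {
  private static final Logger LOG = Logger.getLogger(MetadataCacheGuard.class.getName());

  // Returns the parsed cache, or Optional.empty() when the file is missing,
  // zero-length, unreadable, or rejected by the parser. The caller is expected
  // to fall back to planning without cached metadata in the empty case.
  static <T> Optional<T> tryRead(Path cacheFile, Function<String, T> parser) {
    try {
      if (!Files.isRegularFile(cacheFile) || Files.size(cacheFile) == 0) {
        LOG.warning("Ignoring missing or empty metadata cache: " + cacheFile);
        return Optional.empty();
      }
      String text = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8);
      return Optional.ofNullable(parser.apply(text));
    } catch (IOException | RuntimeException e) {
      LOG.warning("Ignoring unreadable metadata cache " + cacheFile + ": " + e);
      return Optional.empty();
    }
  }

  public static void main(String[] args) throws IOException {
    // A zero-length temp file stands in for the empty .drill.parquet_metadata.
    Path empty = Files.createTempFile("parquet_metadata", ".json");
    System.out.println(tryRead(empty, text -> text).isPresent());
    Files.delete(empty);
  }
}
```

Returning an Optional makes the "was the metadata file used" decision explicit at the call site, which is also where a query-profile flag could be set.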




[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Mehant Baid (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3965:
---
Attachment: (was: DRILL-3965.patch)

> Index out of bounds exception in partition pruning
> --
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Mehant Baid
> Attachments: DRILL-3965.patch
>
>
> Hit IOOB while trying to perform partition pruning on a table that was 
> created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]




[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Mehant Baid (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3965:
---
Attachment: DRILL-3965.patch

Updated patch with minor changes

> Index out of bounds exception in partition pruning
> --
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Mehant Baid
> Attachments: DRILL-3965.patch
>
>
> Hit IOOB while trying to perform partition pruning on a table that was 
> created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]




[jira] [Closed] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source

2015-10-22 Thread Kristine Hahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristine Hahn closed DRILL-3485.


> Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
> --
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Kristine Hahn
> Fix For: Future, 1.3.0
>
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].




[jira] [Comment Edited] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source

2015-10-22 Thread Kristine Hahn (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970099#comment-14970099
 ] 

Kristine Hahn edited comment on DRILL-3485 at 10/22/15 11:32 PM:
-

Done--added javadoc url to Java Driver doc.


was (Author: krishahn):
Done

> Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
> --
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Kristine Hahn
> Fix For: Future, 1.3.0
>
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].




[jira] [Closed] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available

2015-10-22 Thread Kristine Hahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristine Hahn closed DRILL-3486.


> Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's 
> available
> --
>
> Key: DRILL-3486
> URL: https://issues.apache.org/jira/browse/DRILL-3486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
>
> The Drill documentation site's JDBC pages should have a link to a copy of the 
> driver's generated Javadoc documentation once we start generating it.




[jira] [Assigned] (DRILL-3820) Nested Directories : Metadata Cache in a directory stores information from sub-directories as well creating security issues

2015-10-22 Thread Parth Chandra (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra reassigned DRILL-3820:


Assignee: Parth Chandra  (was: Aman Sinha)

> Nested Directories : Metadata Cache in a directory stores information from 
> sub-directories as well creating security issues
> ---
>
> Key: DRILL-3820
> URL: https://issues.apache.org/jira/browse/DRILL-3820
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Rahul Challapalli
>Assignee: Parth Chandra
>Priority: Critical
> Fix For: 1.3.0
>
>
> git.commit.id.abbrev=3c89b30
> User A has access to the lineitem folder and its sub-folders.
> User B has access to the lineitem folder but not to its sub-folders.
> Now when User A runs the "refresh table metadata lineitem" command, the cache 
> file gets created under the lineitem folder. This file contains information 
> from the underlying sub-directories as well.
> User B can then download this file and gain access to information that he 
> should not be seeing in the first place.
> This is easily reproducible if impersonation is enabled on the 
> cluster.
> Let me know if you need more information to reproduce this issue.




[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Aman Sinha (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-3965:
--
Assignee: Mehant Baid  (was: Aman Sinha)

> Index out of bounds exception in partition pruning
> --
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Mehant Baid
> Attachments: DRILL-3965.patch
>
>
> Hit IOOB while trying to perform partition pruning on a table that was 
> created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]




[jira] [Updated] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source

2015-10-22 Thread Kristine Hahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristine Hahn updated DRILL-3485:
-
Fix Version/s: 1.3.0

> Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
> --
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Kristine Hahn
> Fix For: Future, 1.3.0
>
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].




[jira] [Resolved] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source

2015-10-22 Thread Kristine Hahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristine Hahn resolved DRILL-3485.
--
Resolution: Fixed

Done

> Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
> --
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Kristine Hahn
> Fix For: Future, 1.3.0
>
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].




[jira] [Updated] (DRILL-2868) Drill returning incorrect data when we have fields missing in some of the files

2015-10-22 Thread Parth Chandra (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-2868:
-
Assignee: (was: Chris Westin)

> Drill returning incorrect data when we have fields missing in some of the 
> files
> ---
>
> Key: DRILL-2868
> URL: https://issues.apache.org/jira/browse/DRILL-2868
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators, Storage - JSON, 
> Storage - Parquet
>Reporter: Rahul Challapalli
>Priority: Critical
> Fix For: Future
>
>
> git.commit.id.abbrev=5cd36c5
> Data File1 : a.json
> {code}
> { "c1" : 1, "m1" : {"m2" : {"m3" : {"c2" : 5} } } }
> { "c1" : 2, "m1" : {"m2" : {"m3" : {"c2" : 6} } } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> Data File2 : b.json
> {code}
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> Data File3 : c.json
> {code}
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> The below query reports incorrect data :
> {code}
> select t.m1.m2.m3 from `delme_repro` as `t`;
> +------------+
> |   EXPR$0   |
> +------------+
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> | null       |
> +------------+
> 9 rows selected (0.139 seconds)
> {code}
> However, if I run the same query on the specific file, I get the correct output:
> {code}
> select t.m1.m2.m3 from `delme_repro/a.json` as `t`;
> +------------+
> |   EXPR$0   |
> +------------+
> | {"c2":5}   |
> | {"c2":6}   |
> | {}         |
> +------------+
> 3 rows selected (0.113 seconds)
> {code}
> It looks like the file size plays a part in deciding the order in which Drill 
> reads the files. But there could be more to this than just the order, because 
> when I made sure that 'b.json' and 'c.json' only had one record each, Drill 
> correctly reported the data.
> Let me know if you have any questions.




[jira] [Commented] (DRILL-3965) Index out of bounds exception in partition pruning

2015-10-22 Thread Aman Sinha (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970207#comment-14970207
 ] 

Aman Sinha commented on DRILL-3965:
---

+1. Could you add Javadoc to the new class and also rename the unit test?

> Index out of bounds exception in partition pruning
> --
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Aman Sinha
> Attachments: DRILL-3965.patch
>
>
> Hit IOOB while trying to perform partition pruning on a table that was 
> created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
> range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at 
> org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87)
>  ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]




[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970367#comment-14970367
 ] 

ASF GitHub Bot commented on DRILL-3742:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/148#discussion_r42830561
  
--- Diff: 
common/src/main/java/org/apache/drill/common/scanner/RunTimeScan.java ---
@@ -0,0 +1,79 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.common.scanner;
+
+import static java.util.Collections.emptySet;
+
+import java.net.URL;
+import java.util.Collection;
+import java.util.Set;
+import java.util.TreeSet;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.common.scanner.persistence.ScanResult;
+
+/**
+ * Utility to scan classpath at runtime
+ *
+ */
+public class RunTimeScan {
+
+  // result of prescan
+  private static final ScanResult PRESCANNED = BuildTimeScan.load();
+
+  // urls of the locations (classes directory or jar) to scan that don't have a registry in them
+  private static final Collection NON_PRESCANNED_MARKED_PATHS = getNonPrescannedMarkedPaths();
+
+  // one element cache
+  private static Set SCANNED_PACKAGE_PREFIXES_FROM_CONFIG = emptySet();
--- End diff --

Since you're modifying interfaces anyway, can we please remove this static 
cache? At a minimum, I'd rather carry it on the DrillConfig object than in a 
static.
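
A minimal sketch of what carrying the cache on the config object could look like, rather than keeping it in a static field. SketchConfig and scannedPrefixes() are hypothetical names made up for illustration; this is not DrillConfig's actual API, just one way the idea might be arranged:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a config object; NOT Drill's DrillConfig. */
final class SketchConfig {
  private final List<String> packagePrefixes;

  // Cached per instance instead of in a static field; null until first use.
  private volatile Set<String> scannedPrefixes;

  SketchConfig(List<String> packagePrefixes) {
    this.packagePrefixes = new ArrayList<>(packagePrefixes);
  }

  /** Computes the sorted prefix set on first call and reuses it afterwards. */
  Set<String> scannedPrefixes() {
    Set<String> result = scannedPrefixes;
    if (result == null) {
      // Two threads may both compute this once; the results are equal, so the race is benign.
      result = Collections.unmodifiableSet(new TreeSet<>(packagePrefixes));
      scannedPrefixes = result;
    }
    return result;
  }
}
```

With this shape, two configs with different scan packages each keep their own result, and nothing outlives the config object that produced it.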


> Improve classpath scanning to reduce the time it takes
> --
>
> Key: DRILL-3742
> URL: https://issues.apache.org/jira/browse/DRILL-3742
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Julien Le Dem
> Fix For: Future
>
>
> classpath scanning and function registry take a long time (seconds every 
> time).
> We'd want to avoid loading the classes (use bytecode inspection instead) and 
> have a build time cache to avoid doing the scanning at startup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (DRILL-3229) Create a new EmbeddedVector

2015-10-22 Thread Parth Chandra (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970439#comment-14970439
 ] 

Parth Chandra commented on DRILL-3229:
--

Nice doc. I have a few quick questions:

In the list writer, when the map() method is called, I didn't quite follow the 
reason for tracking the current field name. What is it needed for? 

The type promotion proposal is excellent. But with type promotion we will 
update the underlying writer to a UnionWriter the moment a type change occurs. 
Is it possible for us to have a hierarchy of promotable types, so that we 
promote to a higher scalar type (e.g. Int gets promoted to a Varchar) as a 
first step, and to Union only if we encounter more than one type change or a 
change to a complex type? I'm OK if we think this is too complex to implement.

How will Screen handle a Union type? In general, a user-level tool (sqlline 
included) will not know how to handle this. Can we have Screen return a varchar 
representation of the Union type? During data exploration the user will then 
see that there are type changes and can use the type introspection and cast 
methods appropriately. 

What about metadata-only queries (i.e. select * ... limit 0)? What type would 
the user application get?

For function evaluation, my preference is to have code generation rather than 
UDFs that take a union parameter.

For case statements: if a case statement can output a Union type, the end user 
will presumably have to resolve the different types using type introspection 
and an outer case statement. Actually, I don't know enough about end-user 
use cases to choose which is more desirable. Should we leave it as choice #2 
and see what users ask for?

Jacques had mentioned that you have an idea for introducing an untyped null 
type. How would that fit in with this design?
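
To make the promotion-hierarchy question concrete, here is a rough sketch of the two-step rule: the first change between scalar types widens to the higher scalar, and a second change or any complex type falls back to Union. Everything here is an assumption for illustration: PromotionSketch, Kind, and the INT < FLOAT8 < VARCHAR ordering are invented names and choices, not Drill's writers or its type system:

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of two-step promotion; none of these names are Drill APIs. */
final class PromotionSketch {

  /** Illustrative type tags. Scalars are declared in an ASSUMED widening order. */
  enum Kind { INT, FLOAT8, VARCHAR, MAP, UNION }

  private static final Set<Kind> SCALARS = EnumSet.of(Kind.INT, Kind.FLOAT8, Kind.VARCHAR);

  private Kind current;
  private boolean promotedOnce = false;

  PromotionSketch(Kind initial) {
    this.current = initial;
  }

  /** Records the type of the next value and returns the writer type after any promotion. */
  Kind observe(Kind incoming) {
    if (incoming == current || current == Kind.UNION) {
      return current; // same type, or already as general as it gets
    }
    boolean bothScalar = SCALARS.contains(current) && SCALARS.contains(incoming);
    if (bothScalar && incoming.ordinal() < current.ordinal()) {
      return current; // a narrower scalar fits in the current (wider) scalar type
    }
    if (bothScalar && !promotedOnce) {
      promotedOnce = true;
      current = incoming; // first change: widen to the higher scalar type
      return current;
    }
    current = Kind.UNION; // second change, or a complex type is involved
    return current;
  }
}
```

Under these assumptions, a column that sees Int and then Varchar stays a plain Varchar column, while one that sees Int and then a map goes straight to Union.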


> Create a new EmbeddedVector
> ---
>
> Key: DRILL-3229
> URL: https://issues.apache.org/jira/browse/DRILL-3229
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Codegen, Execution - Data Types, Execution - 
> Relational Operators, Functions - Drill
>Reporter: Jacques Nadeau
>Assignee: Steven Phillips
> Fix For: Future
>
>
> Embedded Vector will leverage a binary encoding for holding information about 
> type for each individual field.




[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970364#comment-14970364
 ] 

ASF GitHub Bot commented on DRILL-3742:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/148#discussion_r42830256
  
--- Diff: fmpp/pom.xml ---
@@ -0,0 +1,58 @@
+
--- End diff --

Let's make tools part of the module hierarchy. 



[jira] [Commented] (DRILL-3749) Upgrade Hadoop dependency to latest version (2.7.1)

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970370#comment-14970370
 ] 

ASF GitHub Bot commented on DRILL-3749:
---

Github user jacques-n commented on the pull request:

https://github.com/apache/drill/pull/203#issuecomment-150451832
  
LGTM +1.

Let's get this merged.


> Upgrade Hadoop dependency to latest version (2.7.1)
> ---
>
> Key: DRILL-3749
> URL: https://issues.apache.org/jira/browse/DRILL-3749
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.1.0
>Reporter: Venki Korukanti
>Assignee: Jason Altekruse
> Fix For: Future
>
>
> Logging a JIRA to track and discuss upgrading Drill's Hadoop dependency 
> version. Drill currently depends on Hadoop 2.5.0. The newer Hadoop version 
> (2.7.1) has the following features:
> 1) Better S3 support
> 2) Ability to check if a user has certain permissions on a file/directory 
> without performing operations on the file/dir. Useful for cases like 
> DRILL-3467.
> As Drill is going to use a higher version of the Hadoop fileclient, there 
> could be potential issues when interacting with Hadoop services (such as 
> HDFS) of a lower version than the fileclient.




[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970396#comment-14970396
 ] 

ASF GitHub Bot commented on DRILL-3742:
---

Github user julienledem commented on the pull request:

https://github.com/apache/drill/pull/148#issuecomment-150460962
  
@jacques-n Thanks for the review!



[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970368#comment-14970368
 ] 

ASF GitHub Bot commented on DRILL-3742:
---

Github user jacques-n commented on a diff in the pull request:

https://github.com/apache/drill/pull/148#discussion_r42830643
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalOperatorUtil.java
 ---
@@ -33,14 +32,10 @@
 
   private PhysicalOperatorUtil() {}
 
-  public synchronized static Class[] getSubTypes(final DrillConfig config) {
-    final Class[] ops =
-        PathScanner.scanForImplementationsArr(PhysicalOperator.class,
-            config.getStringList(CommonConstants.PHYSICAL_OPERATOR_SCAN_PACKAGES));
-    final String lineBrokenList =
-        ops.length == 0 ? "" : "\n\t- " + Joiner.on("\n\t- ").join(ops);
-    logger.debug("Found {} physical operator classes: {}.", ops.length,
-                 lineBrokenList);
+  public synchronized static Set getSubTypes(ScanResult classpathScan) {
--- End diff --

Remove synchronized.
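
The comment gives no reasoning, so the following is only one plausible reading: once the method derives its result purely from the scan result passed in, there is no shared mutable state left for a lock to protect. A small sketch under that assumption, where SubTypeLookupSketch and the List<String> parameter are hypothetical stand-ins for the real ScanResult-based signature:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for PhysicalOperatorUtil; NOT the code under review. */
final class SubTypeLookupSketch {

  private SubTypeLookupSketch() {}

  // No 'synchronized': the method reads only its argument and builds a fresh
  // local set, so concurrent callers never touch shared mutable state.
  // Assumes the caller does not mutate the list while this runs.
  static Set<String> getSubTypes(List<String> scannedImplementations) {
    return Collections.unmodifiableSet(new LinkedHashSet<>(scannedImplementations));
  }
}
```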



[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes

2015-10-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970369#comment-14970369
 ] 

ASF GitHub Bot commented on DRILL-3742:
---

Github user jacques-n commented on the pull request:

https://github.com/apache/drill/pull/148#issuecomment-150451599
  
A few changes: move fmpp to the tools directory, remove the static cache, and 
make one other tweak. With those, this looks good to me.


