[jira] [Commented] (DRILL-3944) Drill MAXDIR Unknown variable or type "FILE_SEPARATOR"
[ https://issues.apache.org/jira/browse/DRILL-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969756#comment-14969756 ] Jitendra commented on DRILL-3944: - 0: jdbc:drill:drillbit=localhost> select dir0 from vspace. wspace.`freemate2` where dir0= maxdir('vspace. wspace', 'freemate2'); --- dir0 --- --- No rows selected (0.444 seconds) Then I tried below query also with dir1. select dir1 from vspace. wspace.`freemate2` where dir1 = maxdir('vspace. wspace', 'freemate2'); 2015-10-22 01:03:10,706 [29d7ca30-f074-d8fc-621a-e07adc8e66ff:foreman] INFO o.a.d.e.store.mock.MockStorageEngine - Failure while attempting to check for Parquet metadata file. java.io.IOException: Open failed for file: /vspace/wspace/freemate2/20151005, error: Invalid argument (22) at com.mapr.fs.MapRClientImpl.open(MapRClientImpl.java:212) ~[maprfs-4.1.0-mapr.jar:4.1.0-mapr] at com.mapr.fs.MapRFileSystem.open(MapRFileSystem.java:862) ~[maprfs-4.1.0-mapr.jar:4.1.0-mapr] at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:800) ~[hadoop-common-2.5.1-mapr-1503.jar:na] at org.apache.drill.exec.store.dfs.DrillFileSystem.open(DrillFileSystem.java:132) ~[drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:142) ~[drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.store.dfs.BasicFormatMatcher.isFileReadable(BasicFormatMatcher.java:112) ~[drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:256) [drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:210) [drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:326) [drill-java-exec-1.2.0.jar:1.2.0] at 
org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:153) [drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96) [drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90) [drill-java-exec-1.2.0.jar:1.2.0] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:276) [drill-java-exec-1.2.0.jar:1.2.0] at org.apache.calcite.jdbc.SimpleCalciteSchema.getTable(SimpleCalciteSchema.java:83) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.prepare.CalciteCatalogReader.getTableFrom(CalciteCatalogReader.java:116) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:99) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.prepare.CalciteCatalogReader.getTable(CalciteCatalogReader.java:70) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.EmptyScope.getTableNamespace(EmptyScope.java:75) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.DelegatingScope.getTableNamespace(DelegatingScope.java:124) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl(IdentifierNamespace.java:104) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:877) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2777) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2762) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2985) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) [calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5] at
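For readers unfamiliar with the function used in the queries above, here is a minimal sketch of the selection it is expected to perform. It assumes that maxdir picks the greatest subdirectory name under case-sensitive string comparison; the class and method names are hypothetical and this is not Drill's implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not Drill code.
final class MaxDirSketch {

  /** Returns the greatest name under case-sensitive String ordering, if any. */
  static Optional<String> maxDir(List<String> directoryNames) {
    return directoryNames.stream().max(String::compareTo);
  }

  public static void main(String[] args) {
    // Date-style names such as 20151005 sort chronologically as strings.
    List<String> names = Arrays.asList("20151003", "20151005", "20151004");
    System.out.println(maxDir(names).orElse("<none>"));
  }
}
```

With fixed-width, date-style names, string order and chronological order coincide, which is why such a function is typically used to pick the latest partition.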
[jira] [Commented] (DRILL-3340) Add named metrics and named operators in OperatorProfile
[ https://issues.apache.org/jira/browse/DRILL-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969592#comment-14969592 ]

ASF GitHub Bot commented on DRILL-3340:
---------------------------------------

Github user adeneche commented on a diff in the pull request:

    https://github.com/apache/drill/pull/216#discussion_r42786354

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/OperatorWrapper.java ---
    @@ -130,4 +136,32 @@ public void addSummary(TableBuilder tb) {
         tb.appendBytes(Math.round(memSum / size), null);
         tb.appendBytes(peakMem.getLeft().getPeakLocalMemoryAllocated(), null);
       }
    +
    +  public String getMetricsTable() {
    +    if (!OperatorMetricRegistry.contains(operatorType.getNumber())) {
    +      return "";
    +    }
    +    final ArrayList<String> metricNames = Lists.newArrayList("Minor Fragment");
    +    for (final MetricValue metric : firstProfile.getMetricList()) {
    +      metricNames.add(OperatorMetricRegistry.getMetricName(operatorType.getNumber(), metric.getMetricId()));
    +    }
    +
    +    final String[] metricsTableColumnNames = new String[metricNames.size()];
    +    final TableBuilder builder = new TableBuilder(metricNames.toArray(metricsTableColumnNames));
    +    for (final ImmutablePair<OperatorProfile, Integer> ip : ops) {
    +      final OperatorProfile op = ip.getLeft();
    +
    +      builder.appendCell(
    +          new OperatorPathBuilder()
    +              .setMajor(major)
    +              .setMinor(ip.getRight())
    +              .setOperator(op)
    +              .build(),
    +          null);
    +      for (final MetricValue metric : op.getMetricList()) {
    +        builder.appendInteger(metric.getLongValue(), null);
    --- End diff --

    The code builds the table header from the first profile's metric list, then assumes that all minor fragments will report their metrics in the same order as the first profile. Can you confirm that this assumption is always true?

> Add named metrics and named operators in OperatorProfile > > > Key: DRILL-3340 > URL: https://issues.apache.org/jira/browse/DRILL-3340 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sudheesh Katkam >Assignee: Sudheesh Katkam >Priority: Minor > Fix For: 1.3.0 > > Attachments: DRILL-3340.1.patch.txt, DRILL-3340.2.patch.txt, > DRILL-3340.3.patch.txt > > > + Useful when reading JSON query profile. > + Rename FragmentStats#getOperatorStats to FragmentStats#newOperatorStats -- This message was sent by Atlassian JIRA (v6.3.4#6332)
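The ordering concern raised in the review can be avoided by looking up each value by metric id instead of relying on list position. The following is a minimal sketch of that idea, not the code from the pull request: `MetricRowAligner`, `headerIds`, and `alignRow` are hypothetical names, and a plain `Map<Integer, Long>` stands in for a profile's metric list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the discussion above; not Drill's OperatorWrapper.
final class MetricRowAligner {

  /** Column order is taken from the metric ids of the first profile. */
  static List<Integer> headerIds(Map<Integer, Long> firstProfileMetrics) {
    return new ArrayList<>(firstProfileMetrics.keySet());
  }

  /** Builds one row by id lookup, so the order inside the profile does not matter. */
  static List<Long> alignRow(List<Integer> headerIds, Map<Integer, Long> profileMetrics) {
    final List<Long> row = new ArrayList<>(headerIds.size());
    for (final Integer id : headerIds) {
      row.add(profileMetrics.get(id)); // null marks a metric this fragment did not report
    }
    return row;
  }

  public static void main(String[] args) {
    final Map<Integer, Long> first = new LinkedHashMap<>();
    first.put(0, 10L);
    first.put(1, 20L);

    // The second fragment reports the same metrics in the opposite order.
    final Map<Integer, Long> second = new LinkedHashMap<>();
    second.put(1, 200L);
    second.put(0, 100L);

    System.out.println(alignRow(headerIds(first), second)); // prints [100, 200]
  }
}
```

A missing metric shows up as an explicit null cell rather than silently shifting the remaining values one column to the left.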
[jira] [Commented] (DRILL-3908) ResultSetMetadata.getColumnDisplaySize() does not return correct column length value
[ https://issues.apache.org/jira/browse/DRILL-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969629#comment-14969629 ]

Khurram Faraaz commented on DRILL-3908:
---------------------------------------

The getColumnDisplaySize method of the ResultSetMetaData interface should return the normal maximum number of characters allowed as the width of the designated column. In this case, since the column is defined as VARCHAR, the maximum allowed width is 4000 bytes.

> ResultSetMetadata.getColumnDisplaySize() does not return correct column length value
> ------------------------------------------------------------------------------------
>
>          Key: DRILL-3908
>          URL: https://issues.apache.org/jira/browse/DRILL-3908
>      Project: Apache Drill
>   Issue Type: Bug
>   Components: Client - JDBC
> Affects Versions: 1.1.0
>  Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
>     Reporter: Sergio Lob
>     Assignee: Parth Chandra
>     Priority: Critical
>      Fix For: 1.3.0
>
>
> ResultSetMetadata.getColumnDisplaySize() does not return the correct column length value. In fact, invoking this method seems to always return the value 10. This causes our JDBC application to process data incorrectly. I see that this problem was referenced in jira DRILL-3151, but this bug has apparently not been addressed, although the status of DRILL-3151 is "resolved".
> This is a critical bug for us.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
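To make the expectation in the comment above concrete, here is a minimal sketch of how a display size could be derived per SQL type. The mapping is an assumption for illustration (widest textual form for numeric types, declared length for VARCHAR); it is not Drill's or any driver's actual rule, and `DisplaySizeSketch` and `expectedDisplaySize` are hypothetical names.

```java
import java.sql.Types;

// Hypothetical illustration of a per-type display width; not driver code.
final class DisplaySizeSketch {

  /**
   * Returns an assumed "normal maximum width in characters" for a column.
   * declaredLength is only used for VARCHAR.
   */
  static int expectedDisplaySize(int sqlType, int declaredLength) {
    switch (sqlType) {
      case Types.INTEGER:
        return String.valueOf(Integer.MIN_VALUE).length(); // sign plus 10 digits
      case Types.BIGINT:
        return String.valueOf(Long.MIN_VALUE).length();    // sign plus 19 digits
      case Types.DATE:
        return "yyyy-MM-dd".length();
      case Types.VARCHAR:
        return declaredLength;
      default:
        throw new IllegalArgumentException("No display size rule for SQL type " + sqlType);
    }
  }

  public static void main(String[] args) {
    System.out.println(expectedDisplaySize(Types.INTEGER, 0));
    System.out.println(expectedDisplaySize(Types.VARCHAR, 4000));
  }
}
```

The point of the sketch is only that the value should vary with the column type and declared length, rather than being a constant for every column.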
[jira] [Commented] (DRILL-3908) ResultSetMetadata.getColumnDisplaySize() does not return correct column length value
[ https://issues.apache.org/jira/browse/DRILL-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969553#comment-14969553 ]

Khurram Faraaz commented on DRILL-3908:
---------------------------------------

I ran the test using simba JDBC driver, and getColumnDisplaySize(1) returns 1258, unlike Drill's JDBC driver where we return 10.

Here is the definition of the parquet file over which the query is run from JDBC.

{code}
[root@centos-01 parquet-tools]# ./parquet-schema 0_0_0.parquet
message root {
  optional int32 col0;
  optional int64 col1;
  optional float col2;
  optional double col3;
  optional int32 col4 (TIME_MILLIS);
  optional int64 col5 (TIMESTAMP_MILLIS);
  optional int32 col6 (DATE);
  optional boolean col7;
  optional binary col8 (UTF8);
  optional binary col9 (UTF8);
}
{code}

[root@centos-01 ~]# javac DataFromDrill.java
[root@centos-01 ~]# java DataFromDrill
log4j:WARN No appenders could be found for logger (org.apache.drill.common.config.NestedConfig).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
ResultSetMetadata.getColumnDisplaySize(1) :1258

I used simba JDBC driver to run the test

{code}
[root@centos-01 ~]# echo $CLASSPATH
.:/root/simbaJDBC/*
[root@centos-01 ~]# cd simbaJDBC/
[root@centos-01 simbaJDBC]# ls
antlr-runtime-3.5.2.jar        DrillJDBC41.jar                javassist-3.19.0-GA.jar        netty-transport-4.0.27.Final.jar
commons-lang3-3.3.2.jar        drill-protocol-1.1.0.jar       joda-time-2.7.jar              protobuf-java-2.6.1.jar
config-1.2.1.jar               guava-16.0.1.jar               log4j-1.2.17.jar               reflections-0.9.9.jar
curator-client-2.7.1.jar       hadoop-common-2.6.0.jar        metrics-core-3.0.2.jar         slf4j-api-1.7.12.jar
curator-framework-2.7.1.jar    jackson-annotations-2.5.2.jar  metrics-jvm-3.0.2.jar          slf4j-log4j12-1.7.12.jar
curator-recipes-2.7.1.jar      jackson-core-2.5.2.jar         netty-buffer-4.0.27.Final.jar  zookeeper-3.4.6.jar
curator-x-discovery-2.7.1.jar  jackson-core-asl-1.9.13.jar    netty-codec-4.0.27.Final.jar
drill-common-1.1.0.jar         jackson-databind-2.5.2.jar     netty-common-4.0.27.Final.jar
drill-java-exec-1.1.0.jar      jackson-mapper-asl-1.9.13.jar  netty-handler-4.0.27.Final.jar
{code}

Snippet to run the query

{code}
final String URL_STRING = "jdbc:drill:drillbit=10.10.100.201";
Class.forName("com.mapr.drill.jdbc41.Driver").newInstance();
Connection conn = DriverManager.getConnection(URL_STRING,"root","mapr");
Statement stmt = conn.createStatement();
String query = "select col9 from dfs.tmp.FEWRWSPQQ_101";
ResultSet rs = stmt.executeQuery(query);
ResultSetMetaData rsmd = rs.getMetaData();
System.out.println("ResultSetMetadata.getColumnDisplaySize(1) :"+rsmd.getColumnDisplaySize(1));
{code}

> ResultSetMetadata.getColumnDisplaySize() does not return correct column length value
> ------------------------------------------------------------------------------------
>
>          Key: DRILL-3908
>          URL: https://issues.apache.org/jira/browse/DRILL-3908
>      Project: Apache Drill
>   Issue Type: Bug
>   Components: Client - JDBC
> Affects Versions: 1.1.0
>  Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
>     Reporter: Sergio Lob
>     Assignee: Parth Chandra
>     Priority: Critical
>      Fix For: 1.3.0
>
>
> ResultSetMetadata.getColumnDisplaySize() does not return correct column length value. In fact, invocation of this method seems to always return value 10. This causes our JDBC application to process data incorrectly. I see that this problem was referenced in jira DRILL-3151, but the this bug has apparently not been addressed, although status of DRILL-3151 is "resolved".
> This is a critical bug for us.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (DRILL-3543) Add stats for external sort to a query profile
[ https://issues.apache.org/jira/browse/DRILL-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Victoria Markman updated DRILL-3543:
------------------------------------
    Assignee: Sudheesh Katkam

> Add stats for external sort to a query profile
> ----------------------------------------------
>
>          Key: DRILL-3543
>          URL: https://issues.apache.org/jira/browse/DRILL-3543
>      Project: Apache Drill
>   Issue Type: Improvement
>   Components: Execution - Relational Operators
> Affects Versions: 1.1.0
>     Reporter: Victoria Markman
>     Assignee: Sudheesh Katkam
>      Fix For: Future
>
>
> Today, the only indication that a sort spilled to disk is the output in drillbit.log.
> It would be great if this information were displayed in the query profile.
> {code}
> 2015-07-22 23:47:29,907 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] INFO o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to /tmp/drill/spill/2a4fd46e-f8c3-6b96-b165-b665a41be311/major_fragment_0/minor_fragment_0/operator_7/92
> 2015-07-22 23:47:29,919 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] INFO o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to /tmp/drill/spill/2a4fd46e-f8c3-6b96-b165-b665a41be311/major_fragment_0/minor_fragment_0/operator_7/93
> 2015-07-22 23:47:29,919 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] INFO o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to /tmp/drill/spill/2a4fd46e-f8c3-6b96-b165-b665a41be311/major_fragment_0/minor_fragment_0/operator_7/93
> 2015-07-22 23:47:29,919 [2a4fd46e-f8c3-6b96-b165-b665a41be311:frag:0:0] WARN o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 7 batch groups. Current allocated memory: 11566787
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
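Until such stats are available in the profile, the spill activity can be tallied from the log itself. The following is a minimal sketch that counts "Completed spilling to" messages per operator, assuming each log entry is on a single line and the spill path has the shape shown in the excerpt above; `SpillLogCounter` and `countSpills` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading drillbit.log lines; not part of Drill.
final class SpillLogCounter {

  // Captures the major fragment, minor fragment and operator ids from the spill path.
  private static final Pattern SPILL = Pattern.compile(
      "Completed spilling to \\S*/major_fragment_(\\d+)/minor_fragment_(\\d+)/operator_(\\d+)/\\d+");

  /** Maps "major-minor-operator" to the number of completed spills seen for it. */
  static Map<String, Integer> countSpills(List<String> logLines) {
    final Map<String, Integer> counts = new TreeMap<>();
    for (final String line : logLines) {
      final Matcher m = SPILL.matcher(line);
      if (m.find()) {
        final String key = m.group(1) + "-" + m.group(2) + "-" + m.group(3);
        counts.merge(key, 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    final String dir = "/tmp/drill/spill/2a4fd46e-f8c3-6b96-b165-b665a41be311"
        + "/major_fragment_0/minor_fragment_0/operator_7/";
    final List<String> lines = Arrays.asList(
        "INFO o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to " + dir + "92",
        "INFO o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to " + dir + "93",
        "INFO o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to " + dir + "93");
    System.out.println(countSpills(lines)); // prints {0-0-7=2}
  }
}
```

Only completed spills are counted; the "Merging and spilling to" line announces a spill that the following "Completed" line then confirms.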
[jira] [Commented] (DRILL-3340) Add named metrics and named operators in OperatorProfile
[ https://issues.apache.org/jira/browse/DRILL-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969597#comment-14969597 ] ASF GitHub Bot commented on DRILL-3340: --- Github user adeneche commented on the pull request: https://github.com/apache/drill/pull/216#issuecomment-150314514 I think the way metrics are displayed is fine, and it's definitely an improvement compared to no metrics at all. > Add named metrics and named operators in OperatorProfile > > > Key: DRILL-3340 > URL: https://issues.apache.org/jira/browse/DRILL-3340 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sudheesh Katkam >Assignee: Sudheesh Katkam >Priority: Minor > Fix For: 1.3.0 > > Attachments: DRILL-3340.1.patch.txt, DRILL-3340.2.patch.txt, > DRILL-3340.3.patch.txt > > > + Useful when reading JSON query profile. > + Rename FragmentStats#getOperatorStats to FragmentStats#newOperatorStats -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3749) Upgrade Hadoop dependency to latest version (2.7.1)
[ https://issues.apache.org/jira/browse/DRILL-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969547#comment-14969547 ]

Andrew commented on DRILL-3749:
-------------------------------

[~adeneche] Yes, I am able to successfully run master on AWS all the way through to completion.

> Upgrade Hadoop dependency to latest version (2.7.1)
> ---------------------------------------------------
>
>          Key: DRILL-3749
>          URL: https://issues.apache.org/jira/browse/DRILL-3749
>      Project: Apache Drill
>   Issue Type: New Feature
>   Components: Tools, Build & Test
> Affects Versions: 1.1.0
>     Reporter: Venki Korukanti
>     Assignee: Jason Altekruse
>      Fix For: Future
>
>
> Logging a JIRA to track and discuss upgrading Drill's Hadoop dependency version. Drill currently depends on Hadoop 2.5.0. The newer Hadoop version (2.7.1) has the following features:
> 1) Better S3 support
> 2) Ability to check whether a user has certain permissions on a file/directory without performing operations on the file/dir. Useful for cases like DRILL-3467.
> As Drill is going to use a higher version of the Hadoop fileclient, there could be potential issues when interacting with Hadoop services (such as HDFS) of a lower version than the fileclient.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (DRILL-3232) Modify existing vectors to allow type promotion
[ https://issues.apache.org/jira/browse/DRILL-3232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969616#comment-14969616 ] Steven Phillips commented on DRILL-3232: Design document: https://gist.github.com/StevenMPhillips/41b4a1bd745943d508d2 > Modify existing vectors to allow type promotion > --- > > Key: DRILL-3232 > URL: https://issues.apache.org/jira/browse/DRILL-3232 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Codegen, Execution - Data Types, Execution - > Relational Operators, Functions - Drill >Reporter: Steven Phillips >Assignee: Hanifi Gunes > Fix For: 1.3.0 > > > Support the ability for existing vectors to be promoted similar to supported > implicit casting rules. > For example: > INT > DOUBLE > STRING > EMBEDDED -- This message was sent by Atlassian JIRA (v6.3.4#6332)
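The promotion chain in the description can be read as an ordering in which a vector is promoted to the wider of two types. The following is a minimal sketch under that reading; it is an assumption about the intent, not code from the linked design document, and `TypePromotionSketch`, `VectorType`, and `promote` are hypothetical names.

```java
// Hypothetical model of the chain INT > DOUBLE > STRING > EMBEDDED; not Drill code.
final class TypePromotionSketch {

  /** Declared from narrowest to widest, so enum order encodes the promotion chain. */
  enum VectorType { INT, DOUBLE, STRING, EMBEDDED }

  /** Returns the type a vector must be promoted to in order to hold both inputs. */
  static VectorType promote(VectorType current, VectorType incoming) {
    return current.compareTo(incoming) >= 0 ? current : incoming;
  }

  public static void main(String[] args) {
    System.out.println(promote(VectorType.INT, VectorType.DOUBLE));    // prints DOUBLE
    System.out.println(promote(VectorType.STRING, VectorType.DOUBLE)); // prints STRING
  }
}
```

Because the chain is a total order, promotion is simply a maximum, and it never narrows an existing vector.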
[jira] [Commented] (DRILL-3229) Create a new EmbeddedVector
[ https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969613#comment-14969613 ] Steven Phillips commented on DRILL-3229: Design document: https://gist.github.com/StevenMPhillips/41b4a1bd745943d508d2 > Create a new EmbeddedVector > --- > > Key: DRILL-3229 > URL: https://issues.apache.org/jira/browse/DRILL-3229 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Codegen, Execution - Data Types, Execution - > Relational Operators, Functions - Drill >Reporter: Jacques Nadeau >Assignee: Steven Phillips > Fix For: Future > > > Embedded Vector will leverage a binary encoding for holding information about > type for each individual field. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3340) Add named metrics and named operators in OperatorProfile
[ https://issues.apache.org/jira/browse/DRILL-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969914#comment-14969914 ]

ASF GitHub Bot commented on DRILL-3340:
---------------------------------------

Github user sudheeshkatkam commented on a diff in the pull request:

    https://github.com/apache/drill/pull/216#discussion_r42807912

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/OperatorWrapper.java ---
    @@ -130,4 +136,32 @@ public void addSummary(TableBuilder tb) {
         tb.appendBytes(Math.round(memSum / size), null);
         tb.appendBytes(peakMem.getLeft().getPeakLocalMemoryAllocated(), null);
       }
    +
    +  public String getMetricsTable() {
    +    if (!OperatorMetricRegistry.contains(operatorType.getNumber())) {
    +      return "";
    +    }
    +    final ArrayList<String> metricNames = Lists.newArrayList("Minor Fragment");
    +    for (final MetricValue metric : firstProfile.getMetricList()) {
    +      metricNames.add(OperatorMetricRegistry.getMetricName(operatorType.getNumber(), metric.getMetricId()));
    +    }
    +
    +    final String[] metricsTableColumnNames = new String[metricNames.size()];
    +    final TableBuilder builder = new TableBuilder(metricNames.toArray(metricsTableColumnNames));
    +    for (final ImmutablePair<OperatorProfile, Integer> ip : ops) {
    +      final OperatorProfile op = ip.getLeft();
    +
    +      builder.appendCell(
    +          new OperatorPathBuilder()
    +              .setMajor(major)
    +              .setMinor(ip.getRight())
    +              .setOperator(op)
    +              .build(),
    +          null);
    +      for (final MetricValue metric : op.getMetricList()) {
    +        builder.appendInteger(metric.getLongValue(), null);
    --- End diff --

    It's a bad assumption that I made :)

> Add named metrics and named operators in OperatorProfile
> --------------------------------------------------------
>
>          Key: DRILL-3340
>          URL: https://issues.apache.org/jira/browse/DRILL-3340
>      Project: Apache Drill
>   Issue Type: Improvement
>     Reporter: Sudheesh Katkam
>     Assignee: Sudheesh Katkam
>     Priority: Minor
>      Fix For: 1.3.0
>
>  Attachments: DRILL-3340.1.patch.txt, DRILL-3340.2.patch.txt, DRILL-3340.3.patch.txt
>
>
> + Useful when reading JSON query profile.
> + Rename FragmentStats#getOperatorStats to FragmentStats#newOperatorStats

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (DRILL-3965) Index out of bounds exception in partition pruning
Mehant Baid created DRILL-3965:
----------------------------------

             Summary: Index out of bounds exception in partition pruning
                 Key: DRILL-3965
                 URL: https://issues.apache.org/jira/browse/DRILL-3965
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Mehant Baid
            Assignee: Mehant Baid

Hit IOOB while trying to perform partition pruning on a table that was created using CTAS auto partitioning with the below stack trace.

Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -8
        at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
        at org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31) ~[drill-java-exec-1.2.0.jar:1.2.0]
        at org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126) ~[drill-java-exec-1.2.0.jar:1.2.0]
        at org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53) ~[drill-java-exec-1.2.0.jar:1.2.0]
        at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190) ~[drill-java-exec-1.2.0.jar:1.2.0]
        at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) ~[drill-java-exec-1.2.0.jar:1.2.0]
        at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
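The trace shows only that a String.substring call inside the DFSPartitionLocation constructor received an index outside the string. One way such an exception can arise is by stripping a root prefix from a path that is shorter than the root. The sketch below illustrates that failure mode and a guarded variant; it is an assumption for illustration, not the code at DFSPartitionLocation.java:31, and all names in it are hypothetical.

```java
// Hypothetical illustration of a prefix-stripping failure; not Drill code.
final class RelativePathSketch {

  /** Strips the root by length alone; fails when the file path is shorter than the root. */
  static String relativeUnchecked(String root, String file) {
    return file.substring(root.length());
  }

  /** Verifies that the file really lives under the root before stripping the prefix. */
  static String relativeChecked(String root, String file) {
    if (!file.startsWith(root)) {
      throw new IllegalArgumentException("File '" + file + "' is not under root '" + root + "'");
    }
    return file.substring(root.length());
  }

  public static void main(String[] args) {
    final String root = "/warehouse/table_with_long_name";
    System.out.println(relativeChecked(root, root + "/0_0_1.parquet"));
    try {
      relativeUnchecked(root, "/warehouse/other");
    } catch (StringIndexOutOfBoundsException e) {
      // The exact message differs between Java versions, so it is only echoed here.
      System.out.println("unchecked variant failed: " + e.getMessage());
    }
  }
}
```

The checked variant turns an opaque index error into a message that names both paths, which makes a mismatch between the selection root and the file locations easier to diagnose.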
[jira] [Commented] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969978#comment-14969978 ]

Rahul Challapalli commented on DRILL-3965:
------------------------------------------

[~mehant] does this also address https://issues.apache.org/jira/browse/DRILL-3376

> Index out of bounds exception in partition pruning
> --------------------------------------------------
>
>          Key: DRILL-3965
>          URL: https://issues.apache.org/jira/browse/DRILL-3965
>      Project: Apache Drill
>   Issue Type: Bug
>     Reporter: Mehant Baid
>     Assignee: Aman Sinha
>  Attachments: DRILL-3965.patch
[jira] [Commented] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14969986#comment-14969986 ]

Mehant Baid commented on DRILL-3965:
------------------------------------

I don't think so, looking at the stack trace in the DRILL-3376 it seems like a separate issue.

> Index out of bounds exception in partition pruning
> --------------------------------------------------
>
>          Key: DRILL-3965
>          URL: https://issues.apache.org/jira/browse/DRILL-3965
>      Project: Apache Drill
>   Issue Type: Bug
>     Reporter: Mehant Baid
>     Assignee: Aman Sinha
>  Attachments: DRILL-3965.patch
[jira] [Created] (DRILL-3966) Metadata Cache + Partition Pruning not happening when the partition column is of type boolean
Rahul Challapalli created DRILL-3966:
----------------------------------------

             Summary: Metadata Cache + Partition Pruning not happening when the partition column is of type boolean
                 Key: DRILL-3966
                 URL: https://issues.apache.org/jira/browse/DRILL-3966
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata, Query Planning & Optimization
            Reporter: Rahul Challapalli

git.commit.id.abbrev=19b4b79

I have partitioned parquet files whose partition column is of type boolean. The plan below suggests that pruning did not take place when the partition column is of type boolean and a metadata cache exists. However, if I remove the metadata cache, partition pruning seems to work fine.

Query :
{code}
explain plan for select * from fewtypes_boolpartition where bool_col = false;
00-00    Screen
00-01      Project(*=[$0])
00-02        Project(T11¦¦*=[$0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[=($1, false)])
00-05              Project(T11¦¦*=[$0], bool_col=[$1])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet], ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]], selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, numFiles=2, usedMetadataFile=true, columns=[`*`]]])
{code}

Error from the log :
{code}
WARN o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
java.lang.UnsupportedOperationException: Unsupported type: BIT
        at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
        at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
        at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
        at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
{code}

I attached the data sets required. Let me know if you need anything.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
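The "Unsupported type: BIT" message is the kind of error produced when a per-type switch has no branch for boolean columns and falls through to a default case. The sketch below shows that pattern with a BIT branch added; it is an illustration under assumptions, not the code of ParquetGroupScan.populatePruningVector, and the enum, method, and conversion rules are all hypothetical.

```java
// Hypothetical per-type conversion with an explicit boolean branch; not Drill code.
final class PruningValueSketch {

  enum ColumnType { INT, BIGINT, VARCHAR, BIT, FLOAT4 }

  /** Converts a cached partition value into the form stored for pruning. */
  static Object toPruningValue(ColumnType type, Object cachedValue) {
    switch (type) {
      case INT:
        return ((Number) cachedValue).intValue();
      case BIGINT:
        return ((Number) cachedValue).longValue();
      case VARCHAR:
        return cachedValue.toString();
      case BIT:
        // Without this branch, boolean partition columns reach the default case.
        return ((Boolean) cachedValue) ? 1 : 0;
      default:
        throw new UnsupportedOperationException("Unsupported type: " + type);
    }
  }

  public static void main(String[] args) {
    System.out.println(toPruningValue(ColumnType.BIT, Boolean.FALSE)); // prints 0
    try {
      toPruningValue(ColumnType.FLOAT4, 1.5f);
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage()); // prints Unsupported type: FLOAT4
    }
  }
}
```

In the report above the exception is caught and logged as a warning, so the query still runs but without pruning; any type missing from such a switch silently loses the optimization in the same way.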
[jira] [Updated] (DRILL-3966) Metadata Cache + Partition Pruning not happening when the partition column is of type boolean
[ https://issues.apache.org/jira/browse/DRILL-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Challapalli updated DRILL-3966:

Attachment: 0_0_2.parquet
            0_0_1.parquet

> Metadata Cache + Partition Pruning not happening when the partition column is of type boolean
>
> Key: DRILL-3966
> URL: https://issues.apache.org/jira/browse/DRILL-3966
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata, Query Planning & Optimization
> Reporter: Rahul Challapalli
> Attachments: 0_0_1.parquet, 0_0_2.parquet
>
> git.commit.id.abbrev=19b4b79
> I have partitioned parquet files whose partition column is of type boolean.
> The plan below suggests that pruning did not take place when the partition column is of type boolean and metadata exists. However, if I get rid of the metadata cache, partition pruning seems to work fine.
> Query :
> {code}
> explain plan for select * from fewtypes_boolpartition where bool_col = false;
> 00-00    Screen
> 00-01      Project(*=[$0])
> 00-02        Project(T11¦¦*=[$0])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($1, false)])
> 00-05              Project(T11¦¦*=[$0], bool_col=[$1])
> 00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet], ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]], selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, numFiles=2, usedMetadataFile=true, columns=[`*`]]])
> {code}
> Error from the log :
> {code}
> WARN o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
> java.lang.UnsupportedOperationException: Unsupported type: BIT
>   at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
>   at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> {code}
> I attached the data sets required. Let me know if you need anything

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
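The "Unsupported type: BIT" exception in the DRILL-3966 trace suggests that the code filling the pruning vector dispatches on the partition column's type and falls through to a default branch for booleans. The following is only a sketch of that failure mode and one possible way to handle it. The class, enum, and method names are hypothetical and are not Drill's actual API, and storing booleans as 1/0 is an assumption made for illustration.

```java
public class PruningVectorSketch {

    // Hypothetical stand-in for the column types a pruning vector must handle.
    public enum MinorType { INT, VARCHAR, BIT }

    // Converts one partition value into the form the (hypothetical) vector stores.
    public static Object toVectorValue(MinorType type, Object value) {
        switch (type) {
            case INT:
                return (Integer) value;
            case VARCHAR:
                return value.toString();
            case BIT:
                // Without this branch a boolean partition column would reach the
                // default case and raise "Unsupported type: BIT".
                return ((Boolean) value) ? 1 : 0;
            default:
                throw new UnsupportedOperationException("Unsupported type: " + type);
        }
    }
}
```

With the BIT branch in place, a filter such as `bool_col = false` would have a value to compare against during pruning instead of aborting the rule.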
[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehant Baid updated DRILL-3965:

Attachment: DRILL-3965.patch

[~amansinha100] can you please review.

> Index out of bounds exception in partition pruning
>
> Key: DRILL-3965
> URL: https://issues.apache.org/jira/browse/DRILL-3965
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Mehant Baid
> Assignee: Mehant Baid
> Attachments: DRILL-3965.patch
>
> Hit IOOB while trying to perform partition pruning on a table that was created using CTAS auto partitioning with the below stack trace.
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -8
>   at java.lang.String.substring(String.java:1875) ~[na:1.7.0_79]
>   at org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31) ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126) ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53) ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190) ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) ~[drill-java-exec-1.2.0.jar:1.2.0]
>   at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) ~[calcite-core-1.4.0-drill-r5.jar:1.4.0-drill-r5]
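A negative range from `String.substring` means the requested start index was larger than the string itself. One way this can happen, assuming the partition location is derived by cutting a root prefix off a file path by length (an assumption, since the `DFSPartitionLocation` source is not shown in this message), is a file path that is shorter than the root. The sketch below uses hypothetical names and shows a guarded variant, not the actual patch.

```java
public class RelativePathSketch {

    // Returns the part of filePath below root, or filePath unchanged when it
    // is not located under root.
    public static String relativeTo(String root, String filePath) {
        String prefix = root.endsWith("/") ? root : root + "/";
        if (!filePath.startsWith(prefix)) {
            // Slicing by prefix.length() here could run past the end of
            // filePath and throw StringIndexOutOfBoundsException.
            return filePath;
        }
        return filePath.substring(prefix.length());
    }
}
```

Checking the prefix before slicing trades an exception for an unchanged path, which the caller can then treat as "no partition directories".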
[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehant Baid updated DRILL-3965:

Assignee: Aman Sinha (was: Mehant Baid)
[jira] [Created] (DRILL-3967) Broken Test: TestDrillbitResilience.cancelAfterEverythingIsCompleted()
Andrew created DRILL-3967:

Summary: Broken Test: TestDrillbitResilience.cancelAfterEverythingIsCompleted()
Key: DRILL-3967
URL: https://issues.apache.org/jira/browse/DRILL-3967
Project: Apache Drill
Issue Type: Test
Components: Execution - Flow, Execution - RPC
Affects Versions: 1.2.0
Reporter: Andrew
Assignee: Sudheesh Katkam
Priority: Minor

TestDrillbitResilience.cancelAfterEverythingIsCompleted() can sometimes fail. I've noticed that running this test on an m2.xlarge on AWS causes a reproducible failure when running against the patch for https://issues.apache.org/jira/browse/DRILL-3749 (Upgraded Hadoop and Curator libraries). When running this test with the same patch on my laptop, this test passes.
[jira] [Updated] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available
[ https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristine Hahn updated DRILL-3486: - Fix Version/s: 1.3.0 > Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's > available > -- > > Key: DRILL-3486 > URL: https://issues.apache.org/jira/browse/DRILL-3486 > Project: Apache Drill > Issue Type: Bug > Components: Documentation >Reporter: Daniel Barclay (Drill) > Fix For: Future, 1.3.0 > > > The Drill documentation site's JDBC pages should have a link to a copy of the > driver's generated Javadoc documentation once we start generating it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available
[ https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristine Hahn resolved DRILL-3486. -- Resolution: Fixed Fix Version/s: (was: Future) link has been published > Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's > available > -- > > Key: DRILL-3486 > URL: https://issues.apache.org/jira/browse/DRILL-3486 > Project: Apache Drill > Issue Type: Bug > Components: Documentation >Reporter: Daniel Barclay (Drill) > Fix For: 1.3.0 > > > The Drill documentation site's JDBC pages should have a link to a copy of the > driver's generated Javadoc documentation once we start generating it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (DRILL-3827) Empty metadata file causes queries on the table to fail
[ https://issues.apache.org/jira/browse/DRILL-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Parth Chandra reassigned DRILL-3827:

Assignee: Parth Chandra (was: Jinfeng Ni)

> Empty metadata file causes queries on the table to fail
>
> Key: DRILL-3827
> URL: https://issues.apache.org/jira/browse/DRILL-3827
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.2.0
> Reporter: Victoria Markman
> Assignee: Parth Chandra
> Priority: Critical
>
> I ran into a situation where Drill created an empty metadata file (which is a separate issue that I will try to narrow down; my suspicion is that this happens when "refresh table metadata x" fails with a "permission denied" error).
> However, we need to guard against the situation where the metadata file is empty or corrupted. We should probably skip reading it if we encounter an unexpected result and continue with query planning without that information, in the same fashion as a partition pruning failure. It's also important to log this information somewhere, drillbit.log as a start. It would be really nice to have a flag in the query profile that tells a user whether we used the metadata file for planning or not. That will help in debugging performance issues.
> A very confusing exception is thrown if you have a zero-length metadata file in the directory:
> {code}
> [Wed Sep 23 07:45:28] # ls -la
> total 2
> drwxr-xr-x 2 root root 2 Sep 10 14:55 .
> drwxr-xr-x 16 root root 35 Sep 15 12:54 ..
> -rwxr-xr-x 1 root root 483 Jul 1 11:29 0_0_0.parquet
> -rwxr-xr-x 1 root root 0 Sep 10 14:55 .drill.parquet_metadata
> 0: jdbc:drill:schema=dfs> select * from t1;
> Error: SYSTEM ERROR: JsonMappingException: No content to map due to end-of-input
> at [Source: com.mapr.fs.MapRFsDataInputStream@342bd88d; line: 1, column: 1]
> [Error Id: c97574f6-b3e8-4183-8557-c30df6ca675f on atsqa4-133.qa.lab:31010] (state=,code=0)
> {code}
> The workaround is trivial: remove the file. Marking it as critical, since we don't have any concurrency control in place and this file can get corrupted as well.
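The guard requested in DRILL-3827 (skip an empty or corrupted metadata file, log it, and keep planning) could look roughly like the sketch below. It is an illustration under stated assumptions: the class and method names are hypothetical, the "looks like a JSON object" test is a cheap stand-in for real parsing, and logging goes to java.util.logging rather than drillbit.log.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.logging.Logger;

public class MetadataCacheGuard {
    private static final Logger LOG = Logger.getLogger(MetadataCacheGuard.class.getName());

    // Returns the trimmed content when it is non-empty and shaped like a JSON
    // object, otherwise an empty Optional so the caller can plan without the cache.
    public static Optional<String> usableContent(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String content = raw.trim();
        if (content.isEmpty() || !content.startsWith("{") || !content.endsWith("}")) {
            return Optional.empty();
        }
        return Optional.of(content);
    }

    // Reads the cache file; any missing, empty, malformed, or unreadable file is
    // logged and reported as "no cache" instead of failing the query.
    public static Optional<String> readIfUsable(Path metadataFile) {
        try {
            if (!Files.isRegularFile(metadataFile)) {
                LOG.warning("Metadata file not found, planning without it: " + metadataFile);
                return Optional.empty();
            }
            String raw = new String(Files.readAllBytes(metadataFile), StandardCharsets.UTF_8);
            Optional<String> content = usableContent(raw);
            if (!content.isPresent()) {
                LOG.warning("Metadata file is empty or malformed, planning without it: " + metadataFile);
            }
            return content;
        } catch (IOException e) {
            LOG.warning("Could not read metadata file " + metadataFile + ": " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

Returning an `Optional` keeps the "cache was not used" outcome explicit, which is also the information a query-profile flag would need.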
[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehant Baid updated DRILL-3965:

Attachment: (was: DRILL-3965.patch)
[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehant Baid updated DRILL-3965:

Attachment: DRILL-3965.patch

Updated patch with minor changes
[jira] [Closed] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
[ https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristine Hahn closed DRILL-3485. > Doc. site JDBC page(s) should at least point to JDBC Javadoc in source > -- > > Key: DRILL-3485 > URL: https://issues.apache.org/jira/browse/DRILL-3485 > Project: Apache Drill > Issue Type: Bug > Components: Documentation >Reporter: Daniel Barclay (Drill) >Assignee: Kristine Hahn > Fix For: Future, 1.3.0 > > > We don't yet generate and publish Javadoc documentation for Drill's JDBC > driver, and therefore the Drill documentation site's JDBC pages can't yet > link to generated Javadoc documentation as they eventually should. > However, we have already written Javadoc source documentation for much of the > Drill-specific behavior and extensions in the JDBC interface. > Since that documentation already exists, we should point users to it somehow > (until we provide its information to the users normally, as generated Javadoc > documentation). > Therefore, in the interim, the Drill documentation site's JDBC pages should > at least point to the source code at > [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
[ https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970099#comment-14970099 ] Kristine Hahn edited comment on DRILL-3485 at 10/22/15 11:32 PM: - Done--added javadoc url to Java Driver doc. was (Author: krishahn): Done > Doc. site JDBC page(s) should at least point to JDBC Javadoc in source > -- > > Key: DRILL-3485 > URL: https://issues.apache.org/jira/browse/DRILL-3485 > Project: Apache Drill > Issue Type: Bug > Components: Documentation >Reporter: Daniel Barclay (Drill) >Assignee: Kristine Hahn > Fix For: Future, 1.3.0 > > > We don't yet generate and publish Javadoc documentation for Drill's JDBC > driver, and therefore the Drill documentation site's JDBC pages can't yet > link to generated Javadoc documentation as they eventually should. > However, we have already written Javadoc source documentation for much of the > Drill-specific behavior and extensions in the JDBC interface. > Since that documentation already exists, we should point users to it somehow > (until we provide its information to the users normally, as generated Javadoc > documentation). > Therefore, in the interim, the Drill documentation site's JDBC pages should > at least point to the source code at > [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available
[ https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristine Hahn closed DRILL-3486. > Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's > available > -- > > Key: DRILL-3486 > URL: https://issues.apache.org/jira/browse/DRILL-3486 > Project: Apache Drill > Issue Type: Bug > Components: Documentation >Reporter: Daniel Barclay (Drill) > Fix For: 1.3.0 > > > The Drill documentation site's JDBC pages should have a link to a copy of the > driver's generated Javadoc documentation once we start generating it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (DRILL-3820) Nested Directories : Metadata Cache in a directory stores information from sub-directories as well creating security issues
[ https://issues.apache.org/jira/browse/DRILL-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Chandra reassigned DRILL-3820: Assignee: Parth Chandra (was: Aman Sinha) > Nested Directories : Metadata Cache in a directory stores information from > sub-directories as well creating security issues > --- > > Key: DRILL-3820 > URL: https://issues.apache.org/jira/browse/DRILL-3820 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Reporter: Rahul Challapalli >Assignee: Parth Chandra >Priority: Critical > Fix For: 1.3.0 > > > git.commit.id.abbrev=3c89b30 > User A has access to lineitem folder and its subfolders > User B had access to lineitem folder but not its sub-folders. > Now when User A runs the "refresh table metadata lineitem" command, the cache > file gets created under lineitem folder. This file contains information from > the underlying sub-directories as well. > Now User B can download this file and get access to information which he > should not be seeing in the first place. > This can be very easily reproducible if impersonation is enabled on the > cluster. > Let me know if you need more information to reproduce this issue -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha updated DRILL-3965:

Assignee: Mehant Baid (was: Aman Sinha)
[jira] [Updated] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
[ https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristine Hahn updated DRILL-3485: - Fix Version/s: 1.3.0 > Doc. site JDBC page(s) should at least point to JDBC Javadoc in source > -- > > Key: DRILL-3485 > URL: https://issues.apache.org/jira/browse/DRILL-3485 > Project: Apache Drill > Issue Type: Bug > Components: Documentation >Reporter: Daniel Barclay (Drill) >Assignee: Kristine Hahn > Fix For: Future, 1.3.0 > > > We don't yet generate and publish Javadoc documentation for Drill's JDBC > driver, and therefore the Drill documentation site's JDBC pages can't yet > link to generated Javadoc documentation as they eventually should. > However, we have already written Javadoc source documentation for much of the > Drill-specific behavior and extensions in the JDBC interface. > Since that documentation already exists, we should point users to it somehow > (until we provide its information to the users normally, as generated Javadoc > documentation). > Therefore, in the interim, the Drill documentation site's JDBC pages should > at least point to the source code at > [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
[ https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kristine Hahn resolved DRILL-3485. -- Resolution: Fixed Done > Doc. site JDBC page(s) should at least point to JDBC Javadoc in source > -- > > Key: DRILL-3485 > URL: https://issues.apache.org/jira/browse/DRILL-3485 > Project: Apache Drill > Issue Type: Bug > Components: Documentation >Reporter: Daniel Barclay (Drill) >Assignee: Kristine Hahn > Fix For: Future, 1.3.0 > > > We don't yet generate and publish Javadoc documentation for Drill's JDBC > driver, and therefore the Drill documentation site's JDBC pages can't yet > link to generated Javadoc documentation as they eventually should. > However, we have already written Javadoc source documentation for much of the > Drill-specific behavior and extensions in the JDBC interface. > Since that documentation already exists, we should point users to it somehow > (until we provide its information to the users normally, as generated Javadoc > documentation). > Therefore, in the interim, the Drill documentation site's JDBC pages should > at least point to the source code at > [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2868) Drill returning incorrect data when we have fields missing in some of the files
[ https://issues.apache.org/jira/browse/DRILL-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Parth Chandra updated DRILL-2868:

Assignee: (was: Chris Westin)

> Drill returning incorrect data when we have fields missing in some of the files
>
> Key: DRILL-2868
> URL: https://issues.apache.org/jira/browse/DRILL-2868
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators, Storage - JSON, Storage - Parquet
> Reporter: Rahul Challapalli
> Priority: Critical
> Fix For: Future
>
> git.commit.id.abbrev=5cd36c5
> Data File1 : a.json
> {code}
> { "c1" : 1, "m1" : {"m2" : {"m3" : {"c2" : 5} } } }
> { "c1" : 2, "m1" : {"m2" : {"m3" : {"c2" : 6} } } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> Data File2 : b.json
> {code}
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> Data File3 : c.json
> {code}
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> The below query reports incorrect data :
> {code}
> select t.m1.m2.m3 from `delme_repro` as `t`;
> +---------+
> | EXPR$0  |
> +---------+
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> | null    |
> +---------+
> 9 rows selected (0.139 seconds)
> {code}
> However, if I run the same query on the specific file, I get the correct output:
> {code}
> select t.m1.m2.m3 from `delme_repro/a.json` as `t`;
> +-----------+
> | EXPR$0    |
> +-----------+
> | {"c2":5}  |
> | {"c2":6}  |
> | {}        |
> +-----------+
> 3 rows selected (0.113 seconds)
> {code}
> It looks like the file size plays a part in deciding the order in which Drill reads the files. But there could be more to this than just the order, because when I made sure that 'b.json' and 'c.json' only had one record, Drill correctly reported the data.
> Let me know if you have any questions
[jira] [Commented] (DRILL-3965) Index out of bounds exception in partition pruning
[ https://issues.apache.org/jira/browse/DRILL-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970207#comment-14970207 ]

Aman Sinha commented on DRILL-3965:

+1 . Could you add javadoc to the new class and also rename the unit test ?
[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes
[ https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970367#comment-14970367 ]

ASF GitHub Bot commented on DRILL-3742:

Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/148#discussion_r42830561

    --- Diff: common/src/main/java/org/apache/drill/common/scanner/RunTimeScan.java ---
    @@ -0,0 +1,79 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.common.scanner;
    +
    +import static java.util.Collections.emptySet;
    +
    +import java.net.URL;
    +import java.util.Collection;
    +import java.util.Set;
    +import java.util.TreeSet;
    +
    +import org.apache.drill.common.config.DrillConfig;
    +import org.apache.drill.common.scanner.persistence.ScanResult;
    +
    +/**
    + * Utility to scan classpath at runtime
    + *
    + */
    +public class RunTimeScan {
    +
    +  // result of prescan
    +  private static final ScanResult PRESCANNED = BuildTimeScan.load();
    +
    +  // urls of the locations (classes directory or jar) to scan that don't have a registry in them
    +  private static final Collection NON_PRESCANNED_MARKED_PATHS = getNonPrescannedMarkedPaths();
    +
    +  // one element cache
    +  private static Set SCANNED_PACKAGE_PREFIXES_FROM_CONFIG = emptySet();
    --- End diff --

    Since you're modifying interfaces anyway, can we please remove this static cache? At a minimum, I'd rather carry it on the DrillConfig object than in a static.

> Improve classpath scanning to reduce the time it takes
>
> Key: DRILL-3742
> URL: https://issues.apache.org/jira/browse/DRILL-3742
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Fix For: Future
>
> classpath scanning and function registry take a long time (seconds every time).
> We'd want to avoid loading the classes (use bytecode inspection instead) and have a build time cache to avoid doing the scanning at startup.
[jira] [Commented] (DRILL-3229) Create a new EmbeddedVector
[ https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970439#comment-14970439 ]

Parth Chandra commented on DRILL-3229:

Nice doc. I have a couple of quick questions.

In the list writer, when the map() method is called, I didn't quite follow the reason for tracking the current field name. What is it needed for?

The Type promotion proposal is excellent. But with type promotion we will update the underlying writer to a UnionWriter the moment a type change occurs. Is it possible for us to have a hierarchy of promotable types, so that we promote to a higher Scalar type (e.g. Int gets promoted to a Varchar) as a first step, and to Union if we encounter more than one type change or a change to a complex type? I'm OK if we think this is too complex to implement.

How will Screen handle a Union type? In general, a user level tool (sqlline included) will not know how to handle this. Can we have Screen return a varchar representation of the Union type? During data exploration the user will then see there are type changes and can then use the type introspection and cast methods appropriately.

What about metadata only queries (i.e. select * ... limit 0)? What type would the user application get?

For Function Evaluation my preference is to have code generation rather than have UDFs that take a union parameter.

For case statements: if a case statement can output a Union type, the end user will presumably have to resolve the different types using type introspection and an outer case statement. Actually I don't have enough of an idea about end user use cases to choose which is more desirable. Should we leave it as choice #2 and see what users ask for?

Jacques had mentioned that you have an idea for introducing an Untyped null type. How would that fit in with this design?

> Create a new EmbeddedVector
>
> Key: DRILL-3229
> URL: https://issues.apache.org/jira/browse/DRILL-3229
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Execution - Codegen, Execution - Data Types, Execution - Relational Operators, Functions - Drill
> Reporter: Jacques Nadeau
> Assignee: Steven Phillips
> Fix For: Future
>
> Embedded Vector will leverage a binary encoding for holding information about type for each individual field.
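The two-step promotion asked about in the comment on DRILL-3229 (first to a wider scalar type, then to Union on a second type change or on a complex type) can be sketched as a small state machine. Everything here is hypothetical: the class name, the `Kind` enum, and the INT < FLOAT8 < VARCHAR ordering are assumptions chosen for illustration, not the design from the proposal document.

```java
public class TypePromotionSketch {

    // Hypothetical ladder: scalars are ordered from narrowest to widest,
    // followed by a complex type and the catch-all UNION.
    public enum Kind { INT, FLOAT8, VARCHAR, MAP, UNION }

    private Kind current;
    private int changes = 0;

    public TypePromotionSketch(Kind initial) {
        this.current = initial;
    }

    private static boolean isScalar(Kind kind) {
        return kind != Kind.MAP && kind != Kind.UNION;
    }

    // Observes the type of the next value and returns the writer type in
    // effect after any promotion.
    public Kind observe(Kind incoming) {
        if (current == Kind.UNION) {
            return current; // already the most general type
        }
        boolean bothScalar = isScalar(current) && isScalar(incoming);
        if (bothScalar && incoming.ordinal() <= current.ordinal()) {
            return current; // value fits the current scalar type
        }
        changes++;
        // First scalar-to-scalar change: promote to the wider scalar.
        // Second change, or any complex type: fall back to UNION.
        current = (bothScalar && changes == 1) ? incoming : Kind.UNION;
        return current;
    }
}
```

For example, a column that starts as INT and then sees a VARCHAR would become VARCHAR, later INT values would still fit, and a MAP value would move it to UNION.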
[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes
[ https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970364#comment-14970364 ] ASF GitHub Bot commented on DRILL-3742: --- Github user jacques-n commented on a diff in the pull request: https://github.com/apache/drill/pull/148#discussion_r42830256 --- Diff: fmpp/pom.xml --- @@ -0,0 +1,58 @@ + --- End diff -- Let's make tools part of the module hierarchy. > Improve classpath scanning to reduce the time it takes > -- > > Key: DRILL-3742 > URL: https://issues.apache.org/jira/browse/DRILL-3742 > Project: Apache Drill > Issue Type: Improvement >Reporter: Julien Le Dem > Fix For: Future > > > classpath scanning and function registry take a long time (seconds every > time). > We'd want to avoid loading the classes (use bytecode inspection instead) and > have a build time cache to avoid doing the scanning at startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3749) Upgrade Hadoop dependency to latest version (2.7.1)
[ https://issues.apache.org/jira/browse/DRILL-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970370#comment-14970370 ]

ASF GitHub Bot commented on DRILL-3749:
---

Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/203#issuecomment-150451832

    LGTM +1. Let's get this merged.

> Upgrade Hadoop dependency to latest version (2.7.1)
> ---
>
> Key: DRILL-3749
> URL: https://issues.apache.org/jira/browse/DRILL-3749
> Project: Apache Drill
> Issue Type: New Feature
> Components: Tools, Build & Test
> Affects Versions: 1.1.0
> Reporter: Venki Korukanti
> Assignee: Jason Altekruse
> Fix For: Future
>
> Logging a JIRA to track and discuss upgrading Drill's Hadoop dependency
> version. Currently Drill depends on Hadoop 2.5.0 version. Newer version of
> Hadoop (2.7.1) has following features.
> 1) Better S3 support
> 2) Ability to check if a user has certain permissions on file/directory
> without performing operations on the file/dir. Useful for cases like
> DRILL-3467.
> As Drill is going to use higher version of Hadoop fileclient, there could be
> potential issues when interacting with Hadoop services (such as HDFS) of
> lower version than the fileclient.
[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes
[ https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970396#comment-14970396 ]

ASF GitHub Bot commented on DRILL-3742:
---

Github user julienledem commented on the pull request:

    https://github.com/apache/drill/pull/148#issuecomment-150460962

    @jacques-n Thanks for the review!

> Improve classpath scanning to reduce the time it takes
> --
>
> Key: DRILL-3742
> URL: https://issues.apache.org/jira/browse/DRILL-3742
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Fix For: Future
>
> classpath scanning and function registry take a long time (seconds every
> time).
> We'd want to avoid loading the classes (use bytecode inspection instead) and
> have a build time cache to avoid doing the scanning at startup.
[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes
[ https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970368#comment-14970368 ]

ASF GitHub Bot commented on DRILL-3742:
---

Github user jacques-n commented on a diff in the pull request:

    https://github.com/apache/drill/pull/148#discussion_r42830643

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/PhysicalOperatorUtil.java ---
    @@ -33,14 +32,10 @@
       private PhysicalOperatorUtil() {}

    -  public synchronized static Class[] getSubTypes(final DrillConfig config) {
    -    final Class[] ops =
    -        PathScanner.scanForImplementationsArr(PhysicalOperator.class,
    -            config.getStringList(CommonConstants.PHYSICAL_OPERATOR_SCAN_PACKAGES));
    -    final String lineBrokenList =
    -        ops.length == 0 ? "" : "\n\t- " + Joiner.on("\n\t- ").join(ops);
    -    logger.debug("Found {} physical operator classes: {}.", ops.length,
    -        lineBrokenList);
    +  public synchronized static Set<Class<? extends PhysicalOperator>> getSubTypes(ScanResult classpathScan) {
    --- End diff --

    Remove synchronized.

> Improve classpath scanning to reduce the time it takes
> --
>
> Key: DRILL-3742
> URL: https://issues.apache.org/jira/browse/DRILL-3742
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Fix For: Future
>
> classpath scanning and function registry take a long time (seconds every
> time).
> We'd want to avoid loading the classes (use bytecode inspection instead) and
> have a build time cache to avoid doing the scanning at startup.
[jira] [Commented] (DRILL-3742) Improve classpath scanning to reduce the time it takes
[ https://issues.apache.org/jira/browse/DRILL-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14970369#comment-14970369 ]

ASF GitHub Bot commented on DRILL-3742:
---

Github user jacques-n commented on the pull request:

    https://github.com/apache/drill/pull/148#issuecomment-150451599

    Few changes: move fmpp to tools directory, remove static cache and one other tweak and looks good to me.

> Improve classpath scanning to reduce the time it takes
> --
>
> Key: DRILL-3742
> URL: https://issues.apache.org/jira/browse/DRILL-3742
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Fix For: Future
>
> classpath scanning and function registry take a long time (seconds every
> time).
> We'd want to avoid loading the classes (use bytecode inspection instead) and
> have a build time cache to avoid doing the scanning at startup.