[jira] [Created] (DRILL-7130) IllegalStateException: Read batch count [0] should be greater than zero
salim achouche created DRILL-7130:
-------------------------------------

             Summary: IllegalStateException: Read batch count [0] should be greater than zero
                 Key: DRILL-7130
                 URL: https://issues.apache.org/jira/browse/DRILL-7130
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
    Affects Versions: 1.15.0
            Reporter: salim achouche
            Assignee: salim achouche
             Fix For: 1.17.0


The following exception is being hit when reading parquet data:

Caused by: java.lang.IllegalStateException: Read batch count [0] should be greater than zero
	at org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:509) ~[drill-shaded-guava-23.0.jar:23.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLenNullableFixedEntryReader.getEntry(VarLenNullableFixedEntryReader.java:49) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getFixedEntry(VarLenBulkPageReader.java:167) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getEntry(VarLenBulkPageReader.java:132) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:154) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:38) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:624) ~[vector-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(NullableVarCharVector.java:716) ~[vector-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$NullableVarCharColumn.setSafe(VarLengthColumnReaders.java:215) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.readRecordsInBulk(VarLengthValuesColumn.java:98) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readRecordsInBulk(VarLenBinaryReader.java:114) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readFields(VarLenBinaryReader.java:92) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.BatchReader$VariableWidthReader.readRecords(BatchReader.java:156) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:43) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:288) ~[drill-java-exec-1.15.0.0.jar:1.15.0.0]
	... 29 common frames omitted

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
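For orientation, the top frame of this trace is a Guava-style precondition on the batch count that the bulk page reader computed: getEntry is only supposed to be called when there is at least one value to read. A minimal, hypothetical sketch of the pattern that produces this exact message (illustrative names only, not Drill's actual reader code):

```java
public class ReadBatchGuard {
    // Stand-in for the shaded Guava Preconditions.checkState used by
    // VarLenNullableFixedEntryReader: throws IllegalStateException with a
    // formatted message when the invariant is violated.
    static void checkState(boolean condition, String template, Object... args) {
        if (!condition) {
            throw new IllegalStateException(String.format(template, args));
        }
    }

    // getEntry is expected to be invoked only with a positive batch count;
    // a zero count means the caller's batch-sizing logic went wrong upstream.
    static int getEntry(int readBatchCount) {
        checkState(readBatchCount > 0,
            "Read batch count [%d] should be greater than zero", readBatchCount);
        return readBatchCount;
    }

    public static void main(String[] args) {
        System.out.println(getEntry(64));  // fine: positive batch count
        try {
            getEntry(0);                   // violates the invariant
        } catch (IllegalStateException e) {
            // prints "Read batch count [0] should be greater than zero"
            System.out.println(e.getMessage());
        }
    }
}
```

The precondition itself is just a guard; the bug being reported is in whatever upstream sizing logic allowed a zero count to reach it.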
[jira] [Assigned] (DRILL-7129) Join with more than 1 condition is not using stats to compute row count estimate
[ https://issues.apache.org/jira/browse/DRILL-7129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gautam Parai reassigned DRILL-7129:
-----------------------------------

    Assignee: Gautam Parai

> Join with more than 1 condition is not using stats to compute row count estimate
> --------------------------------------------------------------------------------
>
>                 Key: DRILL-7129
>                 URL: https://issues.apache.org/jira/browse/DRILL-7129
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.16.0
>            Reporter: Anisha Reddy
>            Assignee: Gautam Parai
>            Priority: Major
>             Fix For: 1.17.0
>
>
> Below are the details:
>
> {code:java}
> 0: jdbc:drill:drillbit=10.10.101.108> select count(*) from `table_stats/Tpch0.01/parquet/lineitem`;
> +---------+
> | EXPR$0  |
> +---------+
> | 57068   |
> +---------+
> 1 row selected (0.179 seconds)
> 0: jdbc:drill:drillbit=10.10.101.108> select count(*) from `table_stats/Tpch0.01/parquet/partsupp`;
> +---------+
> | EXPR$0  |
> +---------+
> | 7474    |
> +---------+
> 1 row selected (0.171 seconds)
> 0: jdbc:drill:drillbit=10.10.101.108> select count(*) from `table_stats/Tpch0.01/parquet/lineitem` l, `table_stats/Tpch0.01/parquet/partsupp` ps where l.l_partkey = ps.ps_partkey and l.l_suppkey = ps.ps_suppkey;
> +---------+
> | EXPR$0  |
> +---------+
> | 53401   |
> +---------+
> 1 row selected (0.769 seconds)
> 0: jdbc:drill:drillbit=10.10.101.108> explain plan including all attributes for select * from `table_stats/Tpch0.01/parquet/lineitem` l, `table_stats/Tpch0.01/parquet/partsupp` ps where l.l_partkey = ps.ps_partkey and l.l_suppkey = ps.ps_suppkey;
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0): rowcount = 57068.0, cumulative cost = {313468.8 rows, 2110446.8 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107578
> 00-01   ProjectAllowDup(**=[$0], **0=[$1]) : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0): rowcount = 57068.0, cumulative cost = {307762.0 rows, 2104740.0 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107577
> 00-02     Project(T10¦¦**=[$0], T11¦¦**=[$3]) : rowType = RecordType(DYNAMIC_STAR T10¦¦**, DYNAMIC_STAR T11¦¦**): rowcount = 57068.0, cumulative cost = {250694.0 rows, 1990604.0 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107576
> 00-03       HashJoin(condition=[AND(=($1, $4), =($2, $5))], joinType=[inner], semi-join: =[false]) : rowType = RecordType(DYNAMIC_STAR T10¦¦**, ANY l_partkey, ANY l_suppkey, DYNAMIC_STAR T11¦¦**, ANY ps_partkey, ANY ps_suppkey): rowcount = 57068.0, cumulative cost = {193626.0 rows, 1876468.0 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107575
> 00-05         Project(T10¦¦**=[$0], l_partkey=[$1], l_suppkey=[$2]) : rowType = RecordType(DYNAMIC_STAR T10¦¦**, ANY l_partkey, ANY l_suppkey): rowcount = 57068.0, cumulative cost = {114136.0 rows, 342408.0 cpu, 171204.0 io, 0.0 network, 0.0 memory}, id = 107572
> 00-07           Scan(table=[[dfs, drilltestdir, table_stats/Tpch0.01/parquet/lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/table_stats/Tpch0.01/parquet/lineitem]], selectionRoot=maprfs:/drill/testdata/table_stats/Tpch0.01/parquet/lineitem, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`**`, `l_partkey`, `l_suppkey`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY l_partkey, ANY l_suppkey): rowcount = 57068.0, cumulative cost = {57068.0 rows, 171204.0 cpu, 171204.0 io, 0.0 network, 0.0 memory}, id = 107571
> 00-04         Project(T11¦¦**=[$0], ps_partkey=[$1], ps_suppkey=[$2]) : rowType = RecordType(DYNAMIC_STAR T11¦¦**, ANY ps_partkey, ANY ps_suppkey): rowcount = 7474.0, cumulative cost = {14948.0 rows, 44844.0 cpu, 22422.0 io, 0.0 network, 0.0 memory}, id = 107574
> 00-06           Scan(table=[[dfs, drilltestdir, table_stats/Tpch0.01/parquet/partsupp]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/table_stats/Tpch0.01/parquet/partsupp]], selectionRoot=maprfs:/drill/testdata/table_stats/Tpch0.01/parquet/partsupp, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`**`, `ps_partkey`, `ps_suppkey`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY ps_partkey, ANY ps_suppkey): rowcount = 7474.0, cumulative cost = {7474.0 rows, 22422.0 cpu, 22422.0 io, 0.0 network, 0.0 memory}, id = 107573
> {code}
> The ndv for
> l_partkey = 2000
> ps_partkey = 1817
> l_suppkey = 100
> ps_suppkey = 100
> We see that such joins are just taking the max of the left-side and right-side table row counts.
[jira] [Created] (DRILL-7129) Join with more than 1 condition is not using stats to compute row count estimate
Anisha Reddy created DRILL-7129:
-----------------------------------

             Summary: Join with more than 1 condition is not using stats to compute row count estimate
                 Key: DRILL-7129
                 URL: https://issues.apache.org/jira/browse/DRILL-7129
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.16.0
            Reporter: Anisha Reddy
             Fix For: 1.17.0


Below are the details:

{code:java}
0: jdbc:drill:drillbit=10.10.101.108> select count(*) from `table_stats/Tpch0.01/parquet/lineitem`;
+---------+
| EXPR$0  |
+---------+
| 57068   |
+---------+
1 row selected (0.179 seconds)
0: jdbc:drill:drillbit=10.10.101.108> select count(*) from `table_stats/Tpch0.01/parquet/partsupp`;
+---------+
| EXPR$0  |
+---------+
| 7474    |
+---------+
1 row selected (0.171 seconds)
0: jdbc:drill:drillbit=10.10.101.108> select count(*) from `table_stats/Tpch0.01/parquet/lineitem` l, `table_stats/Tpch0.01/parquet/partsupp` ps where l.l_partkey = ps.ps_partkey and l.l_suppkey = ps.ps_suppkey;
+---------+
| EXPR$0  |
+---------+
| 53401   |
+---------+
1 row selected (0.769 seconds)
0: jdbc:drill:drillbit=10.10.101.108> explain plan including all attributes for select * from `table_stats/Tpch0.01/parquet/lineitem` l, `table_stats/Tpch0.01/parquet/partsupp` ps where l.l_partkey = ps.ps_partkey and l.l_suppkey = ps.ps_suppkey;
+------+------+
| text | json |
+------+------+
| 00-00 Screen : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0): rowcount = 57068.0, cumulative cost = {313468.8 rows, 2110446.8 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107578
00-01   ProjectAllowDup(**=[$0], **0=[$1]) : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0): rowcount = 57068.0, cumulative cost = {307762.0 rows, 2104740.0 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107577
00-02     Project(T10¦¦**=[$0], T11¦¦**=[$3]) : rowType = RecordType(DYNAMIC_STAR T10¦¦**, DYNAMIC_STAR T11¦¦**): rowcount = 57068.0, cumulative cost = {250694.0 rows, 1990604.0 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107576
00-03       HashJoin(condition=[AND(=($1, $4), =($2, $5))], joinType=[inner], semi-join: =[false]) : rowType = RecordType(DYNAMIC_STAR T10¦¦**, ANY l_partkey, ANY l_suppkey, DYNAMIC_STAR T11¦¦**, ANY ps_partkey, ANY ps_suppkey): rowcount = 57068.0, cumulative cost = {193626.0 rows, 1876468.0 cpu, 193626.0 io, 0.0 network, 197313.6 memory}, id = 107575
00-05         Project(T10¦¦**=[$0], l_partkey=[$1], l_suppkey=[$2]) : rowType = RecordType(DYNAMIC_STAR T10¦¦**, ANY l_partkey, ANY l_suppkey): rowcount = 57068.0, cumulative cost = {114136.0 rows, 342408.0 cpu, 171204.0 io, 0.0 network, 0.0 memory}, id = 107572
00-07           Scan(table=[[dfs, drilltestdir, table_stats/Tpch0.01/parquet/lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/table_stats/Tpch0.01/parquet/lineitem]], selectionRoot=maprfs:/drill/testdata/table_stats/Tpch0.01/parquet/lineitem, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`**`, `l_partkey`, `l_suppkey`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY l_partkey, ANY l_suppkey): rowcount = 57068.0, cumulative cost = {57068.0 rows, 171204.0 cpu, 171204.0 io, 0.0 network, 0.0 memory}, id = 107571
00-04         Project(T11¦¦**=[$0], ps_partkey=[$1], ps_suppkey=[$2]) : rowType = RecordType(DYNAMIC_STAR T11¦¦**, ANY ps_partkey, ANY ps_suppkey): rowcount = 7474.0, cumulative cost = {14948.0 rows, 44844.0 cpu, 22422.0 io, 0.0 network, 0.0 memory}, id = 107574
00-06           Scan(table=[[dfs, drilltestdir, table_stats/Tpch0.01/parquet/partsupp]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/table_stats/Tpch0.01/parquet/partsupp]], selectionRoot=maprfs:/drill/testdata/table_stats/Tpch0.01/parquet/partsupp, numFiles=1, numRowGroups=1, usedMetadataFile=false, columns=[`**`, `ps_partkey`, `ps_suppkey`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY ps_partkey, ANY ps_suppkey): rowcount = 7474.0, cumulative cost = {7474.0 rows, 22422.0 cpu, 22422.0 io, 0.0 network, 0.0 memory}, id = 107573
{code}

The ndv for
l_partkey = 2000
ps_partkey = 1817
l_suppkey = 100
ps_suppkey = 100

We see that such joins are just taking the max of the left-side and right-side table row counts.
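For context, the stats-driven estimate the reporter expects would come from the textbook multi-key join-cardinality formula |L| * |R| / prod over keys of max(ndv(L.key), ndv(R.key)), rather than max(|L|, |R|). A hedged sketch using the numbers from this report (hypothetical helper, not Drill's actual planner code):

```java
public class JoinEstimate {
    // Textbook multi-key inner-join cardinality estimate assuming
    // independent, uniformly distributed join keys:
    //   |L| * |R| / product over keys of max(ndv(L.key), ndv(R.key))
    static double estimate(double leftRows, double rightRows,
                           double[] leftNdv, double[] rightNdv) {
        double denominator = 1.0;
        for (int i = 0; i < leftNdv.length; i++) {
            denominator *= Math.max(leftNdv[i], rightNdv[i]);
        }
        return leftRows * rightRows / denominator;
    }

    public static void main(String[] args) {
        // Stats from the report: lineitem = 57068 rows, partsupp = 7474 rows,
        // ndv(l_partkey) = 2000, ndv(ps_partkey) = 1817, both suppkey ndvs = 100.
        double est = estimate(57068, 7474,
                new double[] {2000, 100}, new double[] {1817, 100});
        // Yields roughly 2132.6. The per-key independence assumption
        // underestimates here because partkey and suppkey are correlated
        // (the actual count is 53401) -- but either way the estimate should
        // be stats-driven, not simply max(57068, 7474) = 57068.
        System.out.println(est);
    }
}
```

Whether the planner should also apply a correlation correction is a separate question; the bug as filed is only that the NDV stats are ignored when there is more than one join condition.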
[jira] [Closed] (DRILL-7126) Contrib format-ltsv is not being included in distribution
[ https://issues.apache.org/jira/browse/DRILL-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Girish closed DRILL-7126.
----------------------------------

> Contrib format-ltsv is not being included in distribution
> ---------------------------------------------------------
>
>                 Key: DRILL-7126
>                 URL: https://issues.apache.org/jira/browse/DRILL-7126
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Tools, Build & Test
>    Affects Versions: 1.16.0
>            Reporter: Abhishek Girish
>            Assignee: Abhishek Girish
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.16.0
>
>
> Unable to add the ltsv format in the dfs storage plugin. Looks like it's a build distribution issue.
[jira] [Updated] (DRILL-7110) Skip writing profile when an ALTER SESSION is executed
[ https://issues.apache.org/jira/browse/DRILL-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-7110:
------------------------------------
    Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Skip writing profile when an ALTER SESSION is executed
> ------------------------------------------------------
>
>                 Key: DRILL-7110
>                 URL: https://issues.apache.org/jira/browse/DRILL-7110
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Monitoring
>    Affects Versions: 1.16.0
>            Reporter: Kunal Khatua
>            Assignee: Kunal Khatua
>            Priority: Minor
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.16.0
>
>
> Currently, any {{ALTER }} query will be logged. While this is useful, it can potentially add up to a lot of profiles being written unnecessarily, since those changes are also reflected in the queries that follow.
> This JIRA proposes an option to skip writing such profiles to the profile store.
[jira] [Created] (DRILL-7128) IllegalStateException: Read batch count [0] should be greater than zero
Khurram Faraaz created DRILL-7128:
-------------------------------------

             Summary: IllegalStateException: Read batch count [0] should be greater than zero
                 Key: DRILL-7128
                 URL: https://issues.apache.org/jira/browse/DRILL-7128
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
    Affects Versions: 1.15.0
            Reporter: Khurram Faraaz


Source table is a Hive table stored as parquet.
Issue is seen only when querying the datacapturekey column, which is of VARCHAR type.

Hive 2.3
MapR Drill : 1.15.0.0-mapr
commit id : 951ef156fb1025677a2ca2dcf84e11002bf4b513

{noformat}
0: jdbc:drill:drillbit=test.a.node1> describe bt_br_cc_invalid_leads ;
+-------------------------------------+--------------------+--------------+
| COLUMN_NAME                         | DATA_TYPE          | IS_NULLABLE  |
+-------------------------------------+--------------------+--------------+
| wrapup                              | CHARACTER VARYING  | YES          |
| datacapturekey                      | CHARACTER VARYING  | YES          |
| leadgendate                         | CHARACTER VARYING  | YES          |
| crla1                               | CHARACTER VARYING  | YES          |
| crla2                               | CHARACTER VARYING  | YES          |
| invalid_lead                        | INTEGER            | YES          |
| destination_advertiser_vendor_name  | CHARACTER VARYING  | YES          |
| source_program_key                  | CHARACTER VARYING  | YES          |
| publisher_publisher                 | CHARACTER VARYING  | YES          |
| areaname                            | CHARACTER VARYING  | YES          |
| data_abertura_ficha                 | CHARACTER VARYING  | YES          |
+-------------------------------------+--------------------+--------------+
11 rows selected (1.85 seconds)
0: jdbc:drill:drillbit=test.a.node1>

// from the view definition, note that column datacapturekey is of type VARCHAR with precision 2000
{
  "name" : "bt_br_cc_invalid_leads",
  "sql" : "SELECT CAST(`wrapup` AS VARCHAR(2000)) AS `wrapup`, CAST(`datacapturekey` AS VARCHAR(2000)) AS `datacapturekey`, CAST(`leadgendate` AS VARCHAR(2000)) AS `leadgendate`, CAST(`crla1` AS VARCHAR(2000)) AS `crla1`, CAST(`crla2` AS VARCHAR(2000)) AS `crla2`, CAST(`invalid_lead` AS INTEGER) AS `invalid_lead`, CAST(`destination_advertiser_vendor_name` AS VARCHAR(2000)) AS `destination_advertiser_vendor_name`, CAST(`source_program_key` AS VARCHAR(2000)) AS `source_program_key`, CAST(`publisher_publisher` AS VARCHAR(2000)) AS `publisher_publisher`, CAST(`areaname` AS VARCHAR(2000)) AS `areaname`, CAST(`data_abertura_ficha` AS VARCHAR(2000)) AS `data_abertura_ficha`\nFROM `dfs`.`root`.`/user/bigtable/logs/hive/warehouse/bt_br_cc_invalid_leads`",
  "fields" : [ {
    "name" : "wrapup",
    "type" : "VARCHAR",
    "precision" : 2000,
    "isNullable" : true
  }, {
    "name" : "datacapturekey",
    "type" : "VARCHAR",
    "precision" : 2000,
    "isNullable" : true
...
...

// total number of rows in bt_br_cc_invalid_leads
0: jdbc:drill:drillbit=test.a.node1> select count(*) from bt_br_cc_invalid_leads ;
+---------+
| EXPR$0  |
+---------+
| 20599   |
+---------+
1 row selected (0.173 seconds)
{noformat}

Stack trace from drillbit.log

{noformat}
2019-03-18 12:19:01,610 [237010da-6eda-a913-0424-32f63fbe01be:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query with id 237010da-6eda-a913-0424-32f63fbe01be issued by bigtable:
SELECT `bt_br_cc_invalid_leads`.`datacapturekey` AS `datacapturekey`
FROM `dfs.drill_views`.`bt_br_cc_invalid_leads` `bt_br_cc_invalid_leads`
GROUP BY `bt_br_cc_invalid_leads`.`datacapturekey`
2019-03-18 12:19:02,495 [237010da-6eda-a913-0424-32f63fbe01be:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 237010da-6eda-a913-0424-32f63fbe01be:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
2019-03-18 12:19:02,495 [237010da-6eda-a913-0424-32f63fbe01be:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 237010da-6eda-a913-0424-32f63fbe01be:0:0: State to report: RUNNING
2019-03-18 12:19:02,502 [237010da-6eda-a913-0424-32f63fbe01be:frag:0:0] INFO o.a.d.exec.physical.impl.ScanBatch - User Error Occurred: Error in parquet record reader.
Message:
Hadoop path: /user/bigtable/logs/hive/warehouse/bt_br_cc_invalid_leads/08_0
Total records read: 0
Row group index: 0
Records in row group: 1551
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message hive_schema {
  optional binary wrapup (UTF8);
  optional binary datacapturekey (UTF8);
  optional binary leadgendate (UTF8);
  optional binary crla1 (UTF8);
  optional binary crla2 (UTF8);
  optional binary invalid_lead (UTF8);
  optional binary destination_advertiser_vendor_name (UTF8);
  optional binary source_program_key (UTF8);
  optional binary publisher_publisher (UTF8);
  optional binary areaname (UTF8);
  optional binary data_abertura_ficha (UTF8);
}
, metadata: {}}, blocks: [BlockMetaData{1551, 139906 [ColumnMetaData{UNCOMPRESSED [wrapup] optional binary wrapup (UTF8) [PLAIN_DICTIONARY, RLE, BIT_PACKED], 4}, ColumnMetaData{UNCOMPRESSED [datacapturekey] optional binary datacapturekey (UTF8) [RLE, PLAIN, BIT_PACKED], 656}, ColumnMetaData{UNCOMPRESSED [leadgendate] optional binary leadgendate (UTF8) [PLAIN_DICTIONARY, RLE, BIT_PACKED], 23978},
[jira] [Assigned] (DRILL-7128) IllegalStateException: Read batch count [0] should be greater than zero
[ https://issues.apache.org/jira/browse/DRILL-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz reassigned DRILL-7128:
-------------------------------------

    Assignee: salim achouche

> IllegalStateException: Read batch count [0] should be greater than zero
> -----------------------------------------------------------------------
>
>                 Key: DRILL-7128
>                 URL: https://issues.apache.org/jira/browse/DRILL-7128
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.15.0
>            Reporter: Khurram Faraaz
>            Assignee: salim achouche
>            Priority: Major
>
>
> Source table is a Hive table stored as parquet.
> Issue is seen only when querying the datacapturekey column, which is of VARCHAR type.
> Hive 2.3
> MapR Drill : 1.15.0.0-mapr
> commit id : 951ef156fb1025677a2ca2dcf84e11002bf4b513
> {noformat}
> 0: jdbc:drill:drillbit=test.a.node1> describe bt_br_cc_invalid_leads ;
> +-------------------------------------+--------------------+--------------+
> | COLUMN_NAME                         | DATA_TYPE          | IS_NULLABLE  |
> +-------------------------------------+--------------------+--------------+
> | wrapup                              | CHARACTER VARYING  | YES          |
> | datacapturekey                      | CHARACTER VARYING  | YES          |
> | leadgendate                         | CHARACTER VARYING  | YES          |
> | crla1                               | CHARACTER VARYING  | YES          |
> | crla2                               | CHARACTER VARYING  | YES          |
> | invalid_lead                        | INTEGER            | YES          |
> | destination_advertiser_vendor_name  | CHARACTER VARYING  | YES          |
> | source_program_key                  | CHARACTER VARYING  | YES          |
> | publisher_publisher                 | CHARACTER VARYING  | YES          |
> | areaname                            | CHARACTER VARYING  | YES          |
> | data_abertura_ficha                 | CHARACTER VARYING  | YES          |
> +-------------------------------------+--------------------+--------------+
> 11 rows selected (1.85 seconds)
> 0: jdbc:drill:drillbit=test.a.node1>
> // from the view definition, note that column datacapturekey is of type VARCHAR with precision 2000
> {
> "name" : "bt_br_cc_invalid_leads",
> "sql" : "SELECT CAST(`wrapup` AS VARCHAR(2000)) AS `wrapup`, CAST(`datacapturekey` AS VARCHAR(2000)) AS `datacapturekey`, CAST(`leadgendate` AS VARCHAR(2000)) AS `leadgendate`, CAST(`crla1` AS VARCHAR(2000)) AS `crla1`, CAST(`crla2` AS VARCHAR(2000)) AS `crla2`, CAST(`invalid_lead` AS INTEGER) AS `invalid_lead`, CAST(`destination_advertiser_vendor_name` AS VARCHAR(2000)) AS `destination_advertiser_vendor_name`, CAST(`source_program_key` AS VARCHAR(2000)) AS `source_program_key`, CAST(`publisher_publisher` AS VARCHAR(2000)) AS `publisher_publisher`, CAST(`areaname` AS VARCHAR(2000)) AS `areaname`, CAST(`data_abertura_ficha` AS VARCHAR(2000)) AS `data_abertura_ficha`\nFROM `dfs`.`root`.`/user/bigtable/logs/hive/warehouse/bt_br_cc_invalid_leads`",
> "fields" : [ {
> "name" : "wrapup",
> "type" : "VARCHAR",
> "precision" : 2000,
> "isNullable" : true
> }, {
> "name" : "datacapturekey",
> "type" : "VARCHAR",
> "precision" : 2000,
> "isNullable" : true
> ...
> ...
> // total number of rows in bt_br_cc_invalid_leads
> 0: jdbc:drill:drillbit=test.a.node1> select count(*) from bt_br_cc_invalid_leads ;
> +---------+
> | EXPR$0  |
> +---------+
> | 20599   |
> +---------+
> 1 row selected (0.173 seconds)
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2019-03-18 12:19:01,610 [237010da-6eda-a913-0424-32f63fbe01be:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query with id 237010da-6eda-a913-0424-32f63fbe01be issued by bigtable: SELECT `bt_br_cc_invalid_leads`.`datacapturekey` AS `datacapturekey`
> FROM `dfs.drill_views`.`bt_br_cc_invalid_leads` `bt_br_cc_invalid_leads`
> GROUP BY `bt_br_cc_invalid_leads`.`datacapturekey`
> 2019-03-18 12:19:02,495 [237010da-6eda-a913-0424-32f63fbe01be:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 237010da-6eda-a913-0424-32f63fbe01be:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
> 2019-03-18 12:19:02,495 [237010da-6eda-a913-0424-32f63fbe01be:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 237010da-6eda-a913-0424-32f63fbe01be:0:0: State to report: RUNNING
> 2019-03-18 12:19:02,502 [237010da-6eda-a913-0424-32f63fbe01be:frag:0:0] INFO o.a.d.exec.physical.impl.ScanBatch - User Error Occurred: Error in parquet record reader.
> Message:
> Hadoop path: /user/bigtable/logs/hive/warehouse/bt_br_cc_invalid_leads/08_0
> Total records read: 0
> Row group index: 0
> Records in row group: 1551
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message hive_schema {
> optional binary wrapup (UTF8);
> optional binary datacapturekey (UTF8);
> optional binary leadgendate (UTF8);
> optional binary crla1 (UTF8);
> optional binary crla2 (UTF8);
> optional binary invalid_lead (UTF8);
> optional binary destination_advertiser_vendor_name (UTF8);
> optional binary source_program_key (UTF8);
> optional binary publisher_publisher (UTF8);
> optional binary
[jira] [Updated] (DRILL-7118) Filter not getting pushed down on MapR-DB tables.
[ https://issues.apache.org/jira/browse/DRILL-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-7118:
---------------------------------
    Reviewer: Aman Sinha

> Filter not getting pushed down on MapR-DB tables.
> -------------------------------------------------
>
>                 Key: DRILL-7118
>                 URL: https://issues.apache.org/jira/browse/DRILL-7118
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.15.0
>            Reporter: Hanumath Rao Maduri
>            Assignee: Hanumath Rao Maduri
>            Priority: Major
>             Fix For: 1.16.0
>
>
> A simple is null filter is not being pushed down for the mapr-db tables. Here is the repro for the same.
> {code:java}
> 0: jdbc:drill:zk=local> explain plan for select * from dfs.`/tmp/js` where b is null;
> ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
> ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
> ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
> ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
> +------+------+
> | text | json |
> +------+------+
> | 00-00 Screen
> 00-01   Project(**=[$0])
> 00-02     Project(T0¦¦**=[$0])
> 00-03       SelectionVectorRemover
> 00-04         Filter(condition=[IS NULL($1)])
> 00-05           Project(T0¦¦**=[$0], b=[$1])
> 00-06             Scan(table=[[dfs, /tmp/js]], groupscan=[JsonTableGroupScan [ScanSpec=JsonScanSpec [tableName=/tmp/js, condition=null], columns=[`**`, `b`], maxwidth=1]])
> {code}
[jira] [Updated] (DRILL-6562) Plugin Management improvements
[ https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-6562:
------------------------------------
    Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Plugin Management improvements
> ------------------------------
>
>                 Key: DRILL-6562
>                 URL: https://issues.apache.org/jira/browse/DRILL-6562
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Client - HTTP, Web Server
>    Affects Versions: 1.14.0
>            Reporter: Abhishek Girish
>            Assignee: Vitalii Diravka
>            Priority: Major
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.16.0
>
>         Attachments: Export.png, ExportAll.png, Screenshot from 2019-03-21 01-18-17.png, Screenshot from 2019-03-21 02-52-50.png, Storage.png, UpdateExport.png, create.png, image-2018-07-23-02-55-02-024.png, image-2018-10-22-20-20-24-658.png, image-2018-10-22-20-20-59-105.png
>
>
> Follow-up to DRILL-4580.
> Provide the ability to export all storage plugin configurations at once, with a new "Export All" option on the Storage page of the Drill web UI.
[jira] [Updated] (DRILL-7127) Update hbase version for mapr profile
[ https://issues.apache.org/jira/browse/DRILL-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sorabh Hamirwasia updated DRILL-7127:
-------------------------------------
    Labels: ready-to-commit  (was: )

> Update hbase version for mapr profile
> -------------------------------------
>
>                 Key: DRILL-7127
>                 URL: https://issues.apache.org/jira/browse/DRILL-7127
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - HBase, Tools, Build & Test
>    Affects Versions: 1.16.0
>            Reporter: Abhishek Girish
>            Assignee: Abhishek Girish
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.16.0
>
>
> The current hbase version for the mapr profile is {{1.1.1-mapr-1602-m7-5.2.0}}, which is over 3 years old. It needs to be updated to {{1.1.8-mapr-1808}}.
[jira] [Updated] (DRILL-7126) Contrib format-ltsv is not being included in distribution
[ https://issues.apache.org/jira/browse/DRILL-7126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sorabh Hamirwasia updated DRILL-7126:
-------------------------------------
    Labels: ready-to-commit  (was: )

> Contrib format-ltsv is not being included in distribution
> ---------------------------------------------------------
>
>                 Key: DRILL-7126
>                 URL: https://issues.apache.org/jira/browse/DRILL-7126
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Tools, Build & Test
>    Affects Versions: 1.16.0
>            Reporter: Abhishek Girish
>            Assignee: Abhishek Girish
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.16.0
>
>
> Unable to add the ltsv format in the dfs storage plugin. Looks like it's a build distribution issue.