[jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT
[ https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515741#comment-16515741 ] David Lee commented on DRILL-4764: -- https://jira.apache.org/jira/browse/DRILL-5983 > Parquet file with INT_16, etc. logical types not supported by simple SELECT > --- > > Key: DRILL-4764 > URL: https://issues.apache.org/jira/browse/DRILL-4764 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.6.0 >Reporter: Paul Rogers >Assignee: Serhii Harnyk >Priority: Major > Fix For: 1.10.0 > > Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, > uint_32.parquet, uint_8.parquet > > > Create a Parquet file with the following schema: > message int16Data { required int32 index; required int32 value (INT_16); } > Store it as int_16.parquet in the local file system. Query it with: > SELECT * from `local`.`root`.`int_16.parquet`; > The result, in the web UI, is this error: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > UnsupportedOperationException: unsupported type: INT32 INT_16 Fragment 0:0 > [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4 on 172.30.1.28:31010] > The INT_16 logical (or "original") type simply tells consumers of the file > that the data is actually a 16-bit signed int. Presumably, this should tell > Drill to use the SmallIntVector (or NullableSmallIntVector) class for > storage. Without supporting this annotation, even 16-bit integers must be > stored as 32-bits within Drill. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT
[ https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515739#comment-16515739 ] David Lee commented on DRILL-4764: -- This bug has resurfaced in Drill 1.12 Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: ExecutionSetupException: Unsupported nullable converted type INT_16 for primitive type INT32 Fragment 1:3 > Parquet file with INT_16, etc. logical types not supported by simple SELECT > --- > > Key: DRILL-4764 > URL: https://issues.apache.org/jira/browse/DRILL-4764 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.6.0 >Reporter: Paul Rogers >Assignee: Serhii Harnyk >Priority: Major > Fix For: 1.10.0 > > Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, > uint_32.parquet, uint_8.parquet > > > Create a Parquet file with the following schema: > message int16Data { required int32 index; required int32 value (INT_16); } > Store it as int_16.parquet in the local file system. Query it with: > SELECT * from `local`.`root`.`int_16.parquet`; > The result, in the web UI, is this error: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > UnsupportedOperationException: unsupported type: INT32 INT_16 Fragment 0:0 > [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4 on 172.30.1.28:31010] > The INT_16 logical (or "original") type simply tells consumers of the file > that the data is actually a 16-bit signed int. Presumably, this should tell > Drill to use the SmallIntVector (or NullableSmallIntVector) class for > storage. Without supporting this annotation, even 16-bit integers must be > stored as 32-bits within Drill. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
[ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267136#comment-16267136 ] David Lee commented on DRILL-5769: -- I believe my other issue is a completely different problem. it looks like Drill is sampling the first xx number of json records to learn the schema and if it encounters a schema change down the road it throws a Index Out of Bounds. Can anyone confirm this? In my second case the error happens on record 10,200+ after the first 349 megs of the JSON file. > IndexOutOfBoundsException when querying JSON files > -- > > Key: DRILL-5769 > URL: https://issues.apache.org/jira/browse/DRILL-5769 > Project: Apache Drill > Issue Type: Bug > Components: Server, Storage - JSON >Affects Versions: 1.10.0 > Environment: *jdk_8u45_x64* > *single drillbit running on zookeeper* > *Following options set to TRUE:* > drill.exec.functions.cast_empty_string_to_null > store.json.all_text_mode > store.parquet.enable_dictionary_encoding > store.parquet.use_new_reader >Reporter: David Lee >Assignee: Jinfeng Ni > Fix For: Future > > Attachments: 001.json, 100.json, 111.json > > > *Running the following SQL on these three JSON files fail: * > 001.json 100.json 111.json > select t.id > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Error:* > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) > Fragment 0:0 [Error Id: .... > *However running the same SQL on two out of three files works:* > select t.id > from dfs.`/tmp/1??.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/?1?.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/??1.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Changing the selected column from t.id to t.* also works: * > select * > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5427) SQL Execution Syntax incorrect for Sybase RDBMS
[ https://issues.apache.org/jira/browse/DRILL-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225849#comment-16225849 ] David Lee commented on DRILL-5427: -- Here's a real example include the Drill query plan. Table Setup: use tempdb go create table my_table (column_a int, column_b varchar(20)) go insert my_table values (1, 'abc') insert my_table values (2, 'xyz') go All the following statements work in Native Transact-SQL: select * from my_table go select * from tempdb..my_table go select * from tempdb.guest.my_table go column_acolumn_b --- 1 abc 2 xyz Here's what happens if you run this in Drill: The JDBC SQL executed does not match any of the Transact-SQL statements above. select * from Sybase.tempdb.my_table 00-00Screen : rowType = RecordType(INTEGER column_a, VARCHAR(20) column_b): rowcount = 100.0, cumulative cost = {110.0 rows, 110.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2041 00-01 Project(column_a=[$0], column_b=[$1]) : rowType = RecordType(INTEGER column_a, VARCHAR(20) column_b): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2040 00-02Jdbc(sql=[SELECT * FROM "tempdb"."my_table"]) : rowType = RecordType(INTEGER column_a, VARCHAR(20) column_b): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2007 This is not a valid SQL statement in Sybase: SELECT * FROM "tempdb"."my_table" You need to either a) omit the schema, b) add an extra "." for the schema owner or c) add the schema owner which is "guest" for tempdb. > SQL Execution Syntax incorrect for Sybase RDBMS > --- > > Key: DRILL-5427 > URL: https://issues.apache.org/jira/browse/DRILL-5427 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC, SQL Parser, Storage - JDBC >Affects Versions: 1.10.0 > Environment: Windows Linux >Reporter: David Lee > Fix For: Future > > > The Sybase table syntax should be "database"."owner"."tablename", but "owner" > is not being added which produces incorrect SQL when executed. > The default owner is "DBO" for most tables. If DBO is omitted then the syntax > should be just "database".."tablename" > When SYBASE sees "database".."tablename" it: > A. Checks if there is a tablename owned by the logged in user. > B. If there is no tablename owned by the logged in user it uses DBO by default > This is what I'm seeing using a JDBC plug-in connection to Sybase ASE: > The following statements work fine: > A. show schemas > SCHEMA_NAME > cp.default > dfs.default > dfs.root > dfs.tmp > INFORMATION_SCHEMA > SYB1U > SYB1U.tempdb > B. use SYB1U.tempdb > oksummary > true Default schema changed to [SYB1U.tempdb] > C. show tables > TABLE_SCHEMA TABLE_NAME > SYB1U.tempdb sysalternates > SYB1U.tempdb sysattributes > SYB1U.tempdb syscolumns > SYB1U.tempdb syscomments > SYB1U.tempdb sysconstraints > etc.. etc.. etc.. > D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` > where TABLE_SCHEMA = 'SYB1U.tempdb' > and TABLE_NAME = 'syscolumns' > TABLE_CATALOG TABLE_SCHEMATABLE_NAME COLUMN_NAME > ORDINAL_POSITIONCOLUMN_DEFAULT IS_NULLABLE DATA_TYPE > DRILL SYB1U.tempdbsyscolumns id 1 (null) NO INTEGER > DRILL SYB1U.tempdbsyscolumns number 2 (null) NO SMALLINT > DRILL SYB1U.tempdbsyscolumns colid 3 (null) NO SMALLINT > DRILL SYB1U.tempdbsyscolumns status 4 (null) NO TINYINT > DRILL SYB1U.tempdbsyscolumns type5 (null) NO TINYINT > DRILL SYB1U.tempdbsyscolumns length 6 (null) NO INTEGER > DRILL SYB1U.tempdbsyscolumns offset 7 (null) NO SMALLINT > DRILL SYB1U.tempdbsyscolumns usertype8 (null) NO > SMALLINT > DRILL SYB1U.tempdbsyscolumns cdefault9 (null) NO > INTEGER > DRILL SYB1U.tempdbsyscolumns domain 10 (null) NO INTEGER > etc.. etc.. etc.. > However, the following statements fail: > A. select * from SYB1U.tempdb.syscolumns > DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL > query. > sql SELECT * > FROM "tempdb"."syscolumns" > plugin SYB1U > Fragment 0:0 > B. select * from SYB1U.tempdb.dbo.syscolumns > VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table > 'SYB1U.tempdb.dbo.syscolumns' not found > C. select * from SYB1U.tempdb..syscolumns > PARSE ERROR: Encountered ".." at line 1, column 27. > In A, the execution engine doesn't include the "owner" portion. > In B, adding dbo fails validation > In C, the default behavior in Sybase for ".." isn't recognized > I'm not sure if this is a Drill issue or a
[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
[ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216005#comment-16216005 ] David Lee commented on DRILL-5769: -- I finally debugged which json key value was causing my IOBE exception by trimming my json file down to 349 megs with around 10,200 json records. Unfortunately I couldn't generate a smaller test file which duplicates the problem. 349939049 Oct 23 16:50 test.json The JSON file contains multiple nested DealingSchedule objects like: "DealingSchedule": {"DealingTime": {"CutOffTimeDetail": [{"CutOffTimeDetailTimeZone": "1", "CutOffTimeDetail_CountryId": "CU$AUS", "CutOffTime": "15:00", "DealingType": "3"}]}} "DealingSchedule": {"DealingTime": {"CutOffTimeDetail": [{"CutOffTimeDetailTimeZone": "1", "CutOffTimeDetail_CountryId": "CU$AUS", "CutOffTime": "11:00", "DealingType": "3"}]}} "DealingSchedule": {"ValuationTimeTimeZone": "1", "ValuationTime_CountryId": "CU$AUS", "ValuationTime": "12:00"} "DealingSchedule": {"ValuationTimeTimeZone": "3", "ValuationTime_CountryId": "CU$CAN", "ValuationTime": "16:00","DealingTime": {"CutOffTimeDetail": [{"CutOffTimeDetailTimeZone": "3", "CutOffTimeDetail_CountryId": "CU$CAN", "CutOffTime": "16:00", "DealingType": "3"}]}} Near the end of the file this flavor of DealingSchedule appears which contains a DealingTimeDetail array of one record. This is the first time this DealingSchedule.DealingTime.DealingTimeDetail key appears in the JSON file: "DealingSchedule": {"ValuationTime_CountryId": "CU$GBR", "ValuationTime": "08:00", "DealingTime": {"DealingTimeDetail": [{"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": "17:00"}]}} It produces the following error: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384)) Fragment 0:0 [Error Id: 4d7e60fb-b7d0-49bd-9cf7-244dc4d7341d on ... 1. If I remove the array[] brackets and turn it into keys it works: [{"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": "17:00"}] to {"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": "17:00"} 2. I tried creating a smaller JSON file with just DealingSchedule objects, but it would read the file without errors. 3. If add extra records to the array it also returns an IOBE. {"DealingTimeDetail": [ {"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": "17:00"}, {"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": "17:00"} ]} > IndexOutOfBoundsException when querying JSON files > -- > > Key: DRILL-5769 > URL: https://issues.apache.org/jira/browse/DRILL-5769 > Project: Apache Drill > Issue Type: Bug > Components: Server, Storage - JSON >Affects Versions: 1.10.0 > Environment: *jdk_8u45_x64* > *single drillbit running on zookeeper* > *Following options set to TRUE:* > drill.exec.functions.cast_empty_string_to_null > store.json.all_text_mode > store.parquet.enable_dictionary_encoding > store.parquet.use_new_reader >Reporter: David Lee >Assignee: Jinfeng Ni > Fix For: 1.10.0, 1.11.0, 1.12.0 > > Attachments: 001.json, 100.json, 111.json > > > *Running the following SQL on these three JSON files fail: * > 001.json 100.json 111.json > select t.id > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Error:* > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) > Fragment 0:0 [Error Id: .... > *However running the same SQL on two out of three files works:* > select t.id > from dfs.`/tmp/1??.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/?1?.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/??1.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Changing the selected column from t.id to t.* also works: * > select * > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
[ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157935#comment-16157935 ] David Lee commented on DRILL-5769: -- My original problem was encountered when running a select * on a single 5 gig JSON file with a mix of nested keys and arrays to convert it to parquet.. Splitting that file into 10 smaller files worked to create 10 parquet files but then the same technique failed on a different 6 gig JSON since I have no control when an empty array may show up. > IndexOutOfBoundsException when querying JSON files > -- > > Key: DRILL-5769 > URL: https://issues.apache.org/jira/browse/DRILL-5769 > Project: Apache Drill > Issue Type: Bug > Components: Server, Storage - JSON >Affects Versions: 1.10.0 > Environment: *jdk_8u45_x64* > *single drillbit running on zookeeper* > *Following options set to TRUE:* > drill.exec.functions.cast_empty_string_to_null > store.json.all_text_mode > store.parquet.enable_dictionary_encoding > store.parquet.use_new_reader >Reporter: David Lee >Assignee: Jinfeng Ni > Fix For: 1.10.0, 1.11.0, 1.12.0 > > Attachments: 001.json, 100.json, 111.json > > > *Running the following SQL on these three JSON files fail: * > 001.json 100.json 111.json > select t.id > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Error:* > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) > Fragment 0:0 [Error Id: .... > *However running the same SQL on two out of three files works:* > select t.id > from dfs.`/tmp/1??.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/?1?.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/??1.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Changing the selected column from t.id to t.* also works: * > select * > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
[ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155710#comment-16155710 ] David Lee commented on DRILL-5769: -- I got this statement to work passing in [0] for the array, but the SQL parser should default it to [0] if it isn't specified correct? select t.id from dfs.`/tmp/???.json` t where t.assetData.debt.couponPaymentFeature[0].interestBasis = '5' > IndexOutOfBoundsException when querying JSON files > -- > > Key: DRILL-5769 > URL: https://issues.apache.org/jira/browse/DRILL-5769 > Project: Apache Drill > Issue Type: Bug > Components: Server, Storage - JSON >Affects Versions: 1.10.0 > Environment: *jdk_8u45_x64* > *single drillbit running on zookeeper* > *Following options set to TRUE:* > drill.exec.functions.cast_empty_string_to_null > store.json.all_text_mode > store.parquet.enable_dictionary_encoding > store.parquet.use_new_reader >Reporter: David Lee >Assignee: Jinfeng Ni > Fix For: 1.10.0, 1.11.0, 1.12.0 > > Attachments: 001.json, 100.json, 111.json > > > *Running the following SQL on these three JSON files fail: * > 001.json 100.json 111.json > select t.id > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Error:* > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) > Fragment 0:0 [Error Id: .... > *However running the same SQL on two out of three files works:* > select t.id > from dfs.`/tmp/1??.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/?1?.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/??1.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Changing the selected column from t.id to t.* also works: * > select * > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
[ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Lee updated DRILL-5769: - Attachment: 001.json 100.json 111.json Test files included > IndexOutOfBoundsException when querying JSON files > -- > > Key: DRILL-5769 > URL: https://issues.apache.org/jira/browse/DRILL-5769 > Project: Apache Drill > Issue Type: Bug > Components: Server, Storage - JSON >Affects Versions: 1.10.0 > Environment: *jdk_8u45_x64* > *single drillbit running on zookeeper* > *Following options set to TRUE:* > drill.exec.functions.cast_empty_string_to_null > store.json.all_text_mode > store.parquet.enable_dictionary_encoding > store.parquet.use_new_reader >Reporter: David Lee > Fix For: 1.10.0, 1.11.0, 1.12.0 > > Attachments: 001.json, 100.json, 111.json > > > *Running the following SQL on these three JSON files fail: * > 001.json 100.json 111.json > select t.id > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Error:* > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) > Fragment 0:0 [Error Id: .... > *However running the same SQL on two out of three files works:* > select t.id > from dfs.`/tmp/1??.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/?1?.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/??1.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Changing the selected column from t.id to t.* also works: * > select * > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4885) WHERE clause causing a ClassCastException on HBase tables
[ https://issues.apache.org/jira/browse/DRILL-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051229#comment-16051229 ] David Lee commented on DRILL-4885: -- I think there's a problem with casting table columns when the table is generated using functions.. ClassCastException: org.apache.drill.common.expression.FunctionCall cannot be cast to org.apache.drill.common.expression.SchemaPath is happening for me in Drill 1.10 when executing the following: select iv.Invoice.Invoice_Id, iv.Invoice from (select flatten(a.TodaysInvoices.Invoice) as Invoice from dfs.`/tmp/Invoices_20170605` a ) iv where iv.Invoice.Invoice_Id = 'ABC123' However, the following runs fine if I add a limit on the inner subquery which probably forces the inner query to materialize a table first. select iv.Invoice.Invoice_Id, iv.Invoice from (select flatten(a.Invoices.Invoice) as Invoice from dfs.`/tmp/Invoices_20170605` a limit 10 ) iv where iv.Invoice.Invoice_Id = 'ABC123' > WHERE clause causing a ClassCastException on HBase tables > - > > Key: DRILL-4885 > URL: https://issues.apache.org/jira/browse/DRILL-4885 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.5.0 >Reporter: Ki Kang > > I am trying to figure out why I am getting a ClassCastException when I do the > following query. If I change the “FROM” clause to just “FROM (VALUES(0))” it > works just fine, but whenever I have “FROM” to an HBase table, I get the > error. I know that the HBase table is valid because if I remove the WHERE > clause, the query does not throw an error. > SELECT b.`date` FROM ( > SELECT TO_DATE(CONCAT(a.`jArray`[0], '-', a.`jArray`[1], '-', > a.`jArray`[2]), '-MM-dd') `date` FROM ( > SELECT CONVERT_FROM(REGEXP_REPLACE('["2016":"08":"03"]', ':', ','), > 'JSON') `jArray` > --FROM (VALUES(0)) > FROM `hbase`.`SomeValidTable` > ) a > ) b > WHERE b.`date` = '2016-08-03' > LIMIT 1 > SYSTEM ERROR: ClassCastException: > org.apache.drill.common.expression.FunctionCall cannot be cast to > org.apache.drill.common.expression.SchemaPath > <> > From: rahul challapalli> Date: Thu, Sep 1, 2016 at 11:09 AM > Subject: Re: WHERE clause causing a ClassCastException on HBase tables > To: dev > This is a bug. The query is failing at the planning state itself. Can you > raise a jira for the same with the details you posted here? > - Rahul -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (DRILL-4908) Unable to setup Sybase JDBC Plugin with access to multiple databases
[ https://issues.apache.org/jira/browse/DRILL-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Lee closed DRILL-4908. Resolution: Duplicate Closing this issue and reopening it as part of Drill 1.10 with new examples.. The behavior in Drill 1.10 is different now and this use case cannot be reproduced with the example included. > Unable to setup Sybase JDBC Plugin with access to multiple databases > > > Key: DRILL-4908 > URL: https://issues.apache.org/jira/browse/DRILL-4908 > Project: Apache Drill > Issue Type: Improvement > Components: SQL Parser >Affects Versions: 1.8.0 > Environment: linux, sybase ase, sybase iq, windows >Reporter: David Lee > Fix For: Future > > > This may also be a problem with Microsoft SQL Server which uses the same SQL > Syntax. > I am unable to setup a single JDBC plugin which allows me to query tables on > different databases on the server. > I can setup multiple JDBC plugins for each database on the server and join > data across multiple JDBC connections, but this is extremely inefficient and > SQL queries > just hang. > Test Case: Create two tables on two different databases and write a single > SQL statement to join them together. Try to replicate the results in Apache > Drill. > A. Temp tables in Sybase: > use tempdb > go > create table phone_book > ( > first_name varchar(10), > last_name varchar(20), > phone_number varchar(12) > ) > go > insert phone_book values ('Bob','Marley','555-555-') > insert phone_book values ('Mary','Jane','111-111-') > insert phone_book values ('Bat','Man','911-911-') > go > use tempdb_adhoc > go > create table cities > ( > first_name varchar(10), > last_name varchar(20), > city varchar(20) > ) > go > insert cities values ('Bob','Marley','San Francisco') > insert cities values ('Mary','Jane','New York') > insert cities values ('Bat','Man','Gotham') > go > select a.first_name, a.last_name, a.phone_number, b.city > from tempdb.guest.phone_book a > join tempdb_adhoc.guest.cities b > on b.first_name = a.first_name > and b.last_name = a.last_name > go > Returns Back in SYBASE ISQL: > first_name last_namephone_number city > -- > BobMarley 555-555- San Francisco > Mary Jane 111-111- New York > BatMan 911-911- Gotham > B. Drill JDBC Plugin Setups: > DEV: > { > "type": "jdbc", > "driver": "com.sybase.jdbc4.jdbc.SybDriver", > "url": "jdbc:sybase:Tds:my_server:4100", > "username": "my_login", > "password": "my_password", > "enabled": true > } > DEV_TEMPDB: > { > "type": "jdbc", > "driver": "com.sybase.jdbc4.jdbc.SybDriver", > "url": "jdbc:sybase:Tds:my_server:4100/tempdb", > "username": "my_login", > "password": "my_password", > "enabled": true > } > DEV_TEMPDB_ADHOC: > { > "type": "jdbc", > "driver": "com.sybase.jdbc4.jdbc.SybDriver", > "url": "jdbc:sybase:Tds:my_server:4100/tempdb_adhoc", > "username": "my_login", > "password": "my_password", > "enabled": true > } > C. Examples of Drill Statements which work and don't work. > 1. Returns back redundant schemas for each JDBC plugin: > 0: jdbc:drill:zk=local> show schemas; > +--+ > | SCHEMA_NAME | > +--+ > | DEV.tempdb | > | DEV.tempdb_adhoc | > | DEV_TEMPDB.tempdb| > | DEV_TEMPDB.tempdb_adhoc | > | DEV_TEMPDB_ADHOC.tempdb | > | DEV_TEMPDB_ADHOC.tempdb_adhoc| > +--+ > 2. SQL selects work within schemas and joins across schemas: > 0: jdbc:drill:zk=local> select * from DEV_TEMPDB.tempdb.guest.phone_book; > +-++---+ > | first_name | last_name | phone_number | > +-++---+ > | Bob | Marley | 555-555- | > | Mary| Jane | 111-111- | > | Bat | Man| 911-911- | > +-++---+ > 3 rows selected (1.585 seconds) > 0: jdbc:drill:zk=local> select * from > DEV_TEMPDB_ADHOC.tempdb_adhoc.guest.cities; > ; > +-+++ > | first_name | last_name | city | > +-+++ > | Bob | Marley | San Francisco | > | Mary| Jane | New York | > | Bat | Man| Gotham | > +-+++ > 3 rows selected (1.173 seconds) > 0: jdbc:drill:zk=local> select a.first_name, a.last_name, a.phone_number, > b.city > . . . . . . . . . . .
[jira] [Updated] (DRILL-5427) SQL Execution Syntax incorrect for Sybase RDBMS
[ https://issues.apache.org/jira/browse/DRILL-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Lee updated DRILL-5427: - Description: The Sybase table syntax should be "database"."owner"."tablename", but "owner" is not being added which produces incorrect SQL when executed. The default owner is "DBO" for most tables. If DBO is omitted then the syntax should be just "database".."tablename" When SYBASE sees "database".."tablename" it: A. Checks if there is a tablename owned by the logged in user. B. If there is no tablename owned by the logged in user it uses DBO by default This is what I'm seeing using a JDBC plug-in connection to Sybase ASE: The following statements work fine: A. show schemas SCHEMA_NAME cp.default dfs.default dfs.root dfs.tmp INFORMATION_SCHEMA SYB1U SYB1U.tempdb B. use SYB1U.tempdb ok summary trueDefault schema changed to [SYB1U.tempdb] C. show tables TABLE_SCHEMATABLE_NAME SYB1U.tempdbsysalternates SYB1U.tempdbsysattributes SYB1U.tempdbsyscolumns SYB1U.tempdbsyscomments SYB1U.tempdbsysconstraints etc.. etc.. etc.. D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` where TABLE_SCHEMA = 'SYB1U.tempdb' and TABLE_NAME = 'syscolumns' TABLE_CATALOG TABLE_SCHEMATABLE_NAME COLUMN_NAME ORDINAL_POSITIONCOLUMN_DEFAULT IS_NULLABLE DATA_TYPE DRILL SYB1U.tempdbsyscolumns id 1 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns number 2 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns colid 3 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns status 4 (null) NO TINYINT DRILL SYB1U.tempdbsyscolumns type5 (null) NO TINYINT DRILL SYB1U.tempdbsyscolumns length 6 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns offset 7 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns usertype8 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns cdefault9 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns domain 10 (null) NO INTEGER etc.. etc.. etc.. However, the following statements fail: A. select * from SYB1U.tempdb.syscolumns DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL query. sql SELECT * FROM "tempdb"."syscolumns" plugin SYB1U Fragment 0:0 B. select * from SYB1U.tempdb.dbo.syscolumns VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 'SYB1U.tempdb.dbo.syscolumns' not found C. select * from SYB1U.tempdb..syscolumns PARSE ERROR: Encountered ".." at line 1, column 27. In A, the execution engine doesn't include the "owner" portion. In B, adding dbo fails validation In C, the default behavior in Sybase for ".." isn't recognized I'm not sure if this is a Drill issue or a Calcite issue. was: The Sybase table syntax should be "database"."owner"."tablename", but "owner" is not being added which produces incorrect SQL when executed. The default owner is "DBO" for most tables. If DBO is omitted then the syntax should be just "database".."tablename" When SYBASE sees "database".."tablename" it: A. Checks if there is a tablename owned by the logged in user. B. If there is no tablename owned by the logged in user it uses DBO by default This is what I'm seeing using a JDBC plug-in connection to Sybase ASE: The following statements work fine: A. show schemas SCHEMA_NAME cp.default dfs.default dfs.root dfs.tmp INFORMATION_SCHEMA SYB1U SYB1U.tempdb B. use SYB1U.tempdb ok summary trueDefault schema changed to [SYB1U.tempdb] C. show tables TABLE_SCHEMATABLE_NAME SYB1U.tempdbsysalternates SYB1U.tempdbsysattributes SYB1U.tempdbsyscolumns SYB1U.tempdbsyscomments SYB1U.tempdbsysconstraints etc.. etc.. etc.. D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` where TABLE_SCHEMA = 'SYB1U.tempdb' and TABLE_NAME = 'syscolumns' TABLE_CATALOG TABLE_SCHEMATABLE_NAME COLUMN_NAME ORDINAL_POSITIONCOLUMN_DEFAULT IS_NULLABLE DATA_TYPE DRILL SYB1U.tempdbsyscolumns id 1 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns number 2 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns colid 3 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns status 4 (null) NO TINYINT DRILL SYB1U.tempdbsyscolumns type5 (null) NO TINYINT DRILL SYB1U.tempdbsyscolumns length 6 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns offset 7 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns usertype8 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns cdefault9 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns domain 10 (null) NO INTEGER
[jira] [Created] (DRILL-5427) SQL Execution Syntax incorrect for Sybase RDBMS
David Lee created DRILL-5427: Summary: SQL Execution Syntax incorrect for Sybase RDBMS Key: DRILL-5427 URL: https://issues.apache.org/jira/browse/DRILL-5427 Project: Apache Drill Issue Type: Bug Components: Client - JDBC, SQL Parser, Storage - JDBC Affects Versions: 1.10.0 Environment: Windows Linux Reporter: David Lee Fix For: Future, 1.10.0 The Sybase table syntax should be "database"."owner"."tablename", but "owner" is not being added which produces incorrect SQL when executed. The default owner is "DBO" for most tables. If DBO is omitted then the syntax should be just "database".."tablename" When SYBASE sees "database".."tablename" it: A. Checks if there is a tablename owned by the logged in user. B. If there is no tablename owned by the logged in user it uses DBO by default This is what I'm seeing using a JDBC plug-in connection to Sybase ASE: The following statements work fine: A. show schemas SCHEMA_NAME cp.default dfs.default dfs.root dfs.tmp INFORMATION_SCHEMA SYB1U SYB1U.tempdb B. use SYB1U.tempdb ok summary trueDefault schema changed to [SYB1U.tempdb] C. show tables TABLE_SCHEMATABLE_NAME SYB1U.tempdbsysalternates SYB1U.tempdbsysattributes SYB1U.tempdbsyscolumns SYB1U.tempdbsyscomments SYB1U.tempdbsysconstraints etc.. etc.. etc.. D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` where TABLE_SCHEMA = 'SYB1U.tempdb' and TABLE_NAME = 'syscolumns' TABLE_CATALOG TABLE_SCHEMATABLE_NAME COLUMN_NAME ORDINAL_POSITIONCOLUMN_DEFAULT IS_NULLABLE DATA_TYPE DRILL SYB1U.tempdbsyscolumns id 1 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns number 2 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns colid 3 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns status 4 (null) NO TINYINT DRILL SYB1U.tempdbsyscolumns type5 (null) NO TINYINT DRILL SYB1U.tempdbsyscolumns length 6 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns offset 7 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns usertype8 (null) NO SMALLINT DRILL SYB1U.tempdbsyscolumns cdefault9 (null) NO INTEGER DRILL SYB1U.tempdbsyscolumns domain 10 (null) NO INTEGER etc.. etc.. etc.. However, the following statements fail: A. select * from SYB1U.tempdb.syscolumns DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL query. sql SELECT * FROM "tempdb"."syscolumns" plugin SYB1U Fragment 0:0 B. select * from SYB1U.tempdb.dbo.syscolumns VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 'SYB1U.tempdb.dbo.syscolumns' not found C. select * from SYB1U.tempdb..syscolumns PARSE ERROR: Encountered ".." at line 1, column 27. In A, the execution engine doesn't include the "owner" portion. In B, adding dbo fails validation In C, the default behavior in Sybase for ".." isn't recognized use SYB1U.tempdb show tables SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` where TABLE_SCHEMA = 'SYB1U.tempdb' and TABLE_NAME = 'syscolumns' I'm not sure if this is a Drill issue or a Calcite issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (DRILL-4908) Unable to setup Sybase JDBC Plugin with access to multiple databases
David Lee created DRILL-4908: Summary: Unable to setup Sybase JDBC Plugin with access to multiple databases Key: DRILL-4908 URL: https://issues.apache.org/jira/browse/DRILL-4908 Project: Apache Drill Issue Type: Improvement Components: SQL Parser Affects Versions: 1.8.0 Environment: linux, sybase ase, sybase iq, windows Reporter: David Lee Fix For: Future This may also be a problem with Microsoft SQL Server which uses the same SQL Syntax. I am unable to setup a single JDBC plugin which allows me to query tables on different databases on the server. I can setup multiple JDBC plugins for each database on the server and join data across multiple JDBC connections, but this is extremely inefficient and SQL queries just hang. Test Case: Create two tables on two different databases and write a single SQL statement to join them together. Try to replicate the results in Apache Drill. A. Temp tables in Sybase: use tempdb go create table phone_book ( first_name varchar(10), last_name varchar(20), phone_number varchar(12) ) go insert phone_book values ('Bob','Marley','555-555-') insert phone_book values ('Mary','Jane','111-111-') insert phone_book values ('Bat','Man','911-911-') go use tempdb_adhoc go create table cities ( first_name varchar(10), last_name varchar(20), city varchar(20) ) go insert cities values ('Bob','Marley','San Francisco') insert cities values ('Mary','Jane','New York') insert cities values ('Bat','Man','Gotham') go select a.first_name, a.last_name, a.phone_number, b.city from tempdb.guest.phone_book a join tempdb_adhoc.guest.cities b on b.first_name = a.first_name and b.last_name = a.last_name go Returns Back in SYBASE ISQL: first_name last_namephone_number city -- BobMarley 555-555- San Francisco Mary Jane 111-111- New York BatMan 911-911- Gotham B. Drill JDBC Plugin Setups: DEV: { "type": "jdbc", "driver": "com.sybase.jdbc4.jdbc.SybDriver", "url": "jdbc:sybase:Tds:my_server:4100", "username": "my_login", "password": "my_password", "enabled": true } DEV_TEMPDB: { "type": "jdbc", "driver": "com.sybase.jdbc4.jdbc.SybDriver", "url": "jdbc:sybase:Tds:my_server:4100/tempdb", "username": "my_login", "password": "my_password", "enabled": true } DEV_TEMPDB_ADHOC: { "type": "jdbc", "driver": "com.sybase.jdbc4.jdbc.SybDriver", "url": "jdbc:sybase:Tds:my_server:4100/tempdb_adhoc", "username": "my_login", "password": "my_password", "enabled": true } C. Examples of Drill Statements which work and don't work. 1. Returns back redundant schemas for each JDBC plugin: 0: jdbc:drill:zk=local> show schemas; +--+ | SCHEMA_NAME | +--+ | DEV.tempdb | | DEV.tempdb_adhoc | | DEV_TEMPDB.tempdb| | DEV_TEMPDB.tempdb_adhoc | | DEV_TEMPDB_ADHOC.tempdb | | DEV_TEMPDB_ADHOC.tempdb_adhoc| +--+ 2. SQL selects work within schemas and joins across schemas: 0: jdbc:drill:zk=local> select * from DEV_TEMPDB.tempdb.guest.phone_book; +-++---+ | first_name | last_name | phone_number | +-++---+ | Bob | Marley | 555-555- | | Mary| Jane | 111-111- | | Bat | Man| 911-911- | +-++---+ 3 rows selected (1.585 seconds) 0: jdbc:drill:zk=local> select * from DEV_TEMPDB_ADHOC.tempdb_adhoc.guest.cities; ; +-+++ | first_name | last_name | city | +-+++ | Bob | Marley | San Francisco | | Mary| Jane | New York | | Bat | Man| Gotham | +-+++ 3 rows selected (1.173 seconds) 0: jdbc:drill:zk=local> select a.first_name, a.last_name, a.phone_number, b.city . . . . . . . . . . . > from DEV_TEMPDB.tempdb.guest.phone_book a . . . . . . . . . . . > join DEV_TEMPDB_ADHOC.tempdb_adhoc.guest.cities b . . . . . . . . . . . > on b.first_name = a.first_name . . . . . . . . . . . > and b.last_name = a.last_name; +-++---++ | first_name | last_name | phone_number | city | +-++---++ | Bob | Marley | 555-555- | San Francisco | | Mary| Jane | 111-111- | New York | | Bat | Man| 911-911- | Gotham |