[jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT

2018-06-18 Thread David Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515741#comment-16515741
 ] 

David Lee commented on DRILL-4764:
--

https://jira.apache.org/jira/browse/DRILL-5983


> Parquet file with INT_16, etc. logical types not supported by simple SELECT
> ---
>
> Key: DRILL-4764
> URL: https://issues.apache.org/jira/browse/DRILL-4764
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.6.0
>Reporter: Paul Rogers
>Assignee: Serhii Harnyk
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, 
> uint_32.parquet, uint_8.parquet
>
>
> Create a Parquet file with the following schema:
> message int16Data { required int32 index; required int32 value (INT_16); }
> Store it as int_16.parquet in the local file system. Query it with:
> SELECT * from `local`.`root`.`int_16.parquet`;
> The result, in the web UI, is this error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> UnsupportedOperationException: unsupported type: INT32 INT_16 Fragment 0:0 
> [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4 on 172.30.1.28:31010]
> The INT_16 logical (or "original") type simply tells consumers of the file 
> that the data is actually a 16-bit signed int. Presumably, this should tell 
> Drill to use the SmallIntVector (or NullableSmallIntVector) class for 
> storage. Without supporting this annotation, even 16-bit integers must be 
> stored as 32-bits within Drill.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT

2018-06-18 Thread David Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515739#comment-16515739
 ] 

David Lee commented on DRILL-4764:
--

This bug has resurfaced in Drill 1.12

Query Failed: An Error Occurred
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
ExecutionSetupException: Unsupported nullable converted type INT_16 for 
primitive type INT32 Fragment 1:3

> Parquet file with INT_16, etc. logical types not supported by simple SELECT
> ---
>
> Key: DRILL-4764
> URL: https://issues.apache.org/jira/browse/DRILL-4764
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.6.0
>Reporter: Paul Rogers
>Assignee: Serhii Harnyk
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, 
> uint_32.parquet, uint_8.parquet
>
>
> Create a Parquet file with the following schema:
> message int16Data { required int32 index; required int32 value (INT_16); }
> Store it as int_16.parquet in the local file system. Query it with:
> SELECT * from `local`.`root`.`int_16.parquet`;
> The result, in the web UI, is this error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> UnsupportedOperationException: unsupported type: INT32 INT_16 Fragment 0:0 
> [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4 on 172.30.1.28:31010]
> The INT_16 logical (or "original") type simply tells consumers of the file 
> that the data is actually a 16-bit signed int. Presumably, this should tell 
> Drill to use the SmallIntVector (or NullableSmallIntVector) class for 
> storage. Without supporting this annotation, even 16-bit integers must be 
> stored as 32-bits within Drill.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files

2017-11-27 Thread David Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267136#comment-16267136
 ] 

David Lee commented on DRILL-5769:
--

I believe my other issue is a completely different problem. it looks like Drill 
is sampling the first xx number of json records to learn the schema and if it 
encounters a schema change down the road it throws a Index Out of Bounds. Can 
anyone confirm this? In my second case the error happens on record 10,200+ 
after the first 349 megs of the JSON file.

> IndexOutOfBoundsException when querying JSON files
> --
>
> Key: DRILL-5769
> URL: https://issues.apache.org/jira/browse/DRILL-5769
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Storage - JSON
>Affects Versions: 1.10.0
> Environment: *jdk_8u45_x64*
> *single drillbit running on zookeeper*
> *Following options set to TRUE:*
> drill.exec.functions.cast_empty_string_to_null
> store.json.all_text_mode
> store.parquet.enable_dictionary_encoding
> store.parquet.use_new_reader
>Reporter: David Lee
>Assignee: Jinfeng Ni
> Fix For: Future
>
> Attachments: 001.json, 100.json, 111.json
>
>
> *Running the following SQL on these three JSON files fail: *
> 001.json 100.json 111.json
> select t.id
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Error:*
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) 
> Fragment 0:0 [Error Id: ....
> *However running the same SQL on two out of three files works:*
> select t.id
> from dfs.`/tmp/1??.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/?1?.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/??1.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Changing the selected column from t.id to t.* also works: *
> select *
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5427) SQL Execution Syntax incorrect for Sybase RDBMS

2017-10-30 Thread David Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16225849#comment-16225849
 ] 

David Lee commented on DRILL-5427:
--

Here's a real example include the Drill query plan.

Table Setup:

use tempdb
go

create table my_table
(column_a int, column_b varchar(20))
go

insert my_table values (1, 'abc')
insert my_table values (2, 'xyz')
go

All the following statements work in Native Transact-SQL:

select * from my_table
go

select * from tempdb..my_table
go

select * from tempdb.guest.my_table
go

 column_acolumn_b
 --- 
   1 abc
   2 xyz

Here's what happens if you run this in Drill: The JDBC SQL executed does not 
match any of the Transact-SQL statements above.

select * from Sybase.tempdb.my_table

00-00Screen : rowType = RecordType(INTEGER column_a, VARCHAR(20) column_b): 
rowcount = 100.0, cumulative cost = {110.0 rows, 110.0 cpu, 0.0 io, 0.0 
network, 0.0 memory}, id = 2041
00-01  Project(column_a=[$0], column_b=[$1]) : rowType = RecordType(INTEGER 
column_a, VARCHAR(20) column_b): rowcount = 100.0, cumulative cost = {100.0 
rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2040
00-02Jdbc(sql=[SELECT *
FROM "tempdb"."my_table"]) : rowType = RecordType(INTEGER column_a, VARCHAR(20) 
column_b): rowcount = 100.0, cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 
0.0 network, 0.0 memory}, id = 2007

This is not a valid SQL statement in Sybase:

SELECT * FROM "tempdb"."my_table"

You need to either a) omit the schema, b) add an extra "." for the schema owner 
or c) add the schema owner which is "guest" for tempdb.

> SQL Execution Syntax incorrect for Sybase RDBMS
> ---
>
> Key: DRILL-5427
> URL: https://issues.apache.org/jira/browse/DRILL-5427
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, SQL Parser, Storage - JDBC
>Affects Versions: 1.10.0
> Environment: Windows Linux
>Reporter: David Lee
> Fix For: Future
>
>
> The Sybase table syntax should be "database"."owner"."tablename", but "owner" 
> is not being added which produces incorrect SQL when executed.
> The default owner is "DBO" for most tables. If DBO is omitted then the syntax 
> should be just "database".."tablename" 
> When SYBASE sees  "database".."tablename" it:
> A. Checks if there is a tablename owned by the logged in user.
> B. If there is no tablename owned by the logged in user it uses DBO by default
> This is what I'm seeing using a JDBC plug-in connection to Sybase ASE:
> The following statements work fine:
> A. show schemas
> SCHEMA_NAME
> cp.default
> dfs.default
> dfs.root
> dfs.tmp
> INFORMATION_SCHEMA
> SYB1U
> SYB1U.tempdb
> B. use SYB1U.tempdb
> oksummary
> true  Default schema changed to [SYB1U.tempdb]
> C. show tables
> TABLE_SCHEMA  TABLE_NAME
> SYB1U.tempdb  sysalternates
> SYB1U.tempdb  sysattributes
> SYB1U.tempdb  syscolumns
> SYB1U.tempdb  syscomments
> SYB1U.tempdb  sysconstraints
> etc.. etc.. etc..
> D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS`
> where TABLE_SCHEMA = 'SYB1U.tempdb'
> and TABLE_NAME = 'syscolumns'
> TABLE_CATALOG TABLE_SCHEMATABLE_NAME  COLUMN_NAME 
> ORDINAL_POSITIONCOLUMN_DEFAULT  IS_NULLABLE DATA_TYPE
> DRILL SYB1U.tempdbsyscolumns  id  1   (null)  NO  INTEGER
> DRILL SYB1U.tempdbsyscolumns  number  2   (null)  NO  SMALLINT
> DRILL SYB1U.tempdbsyscolumns  colid   3   (null)  NO  SMALLINT
> DRILL SYB1U.tempdbsyscolumns  status  4   (null)  NO  TINYINT
> DRILL SYB1U.tempdbsyscolumns  type5   (null)  NO  TINYINT
> DRILL SYB1U.tempdbsyscolumns  length  6   (null)  NO  INTEGER
> DRILL SYB1U.tempdbsyscolumns  offset  7   (null)  NO  SMALLINT
> DRILL SYB1U.tempdbsyscolumns  usertype8   (null)  NO  
> SMALLINT
> DRILL SYB1U.tempdbsyscolumns  cdefault9   (null)  NO  
> INTEGER
> DRILL SYB1U.tempdbsyscolumns  domain  10  (null)  NO  INTEGER
> etc.. etc.. etc..
> However, the following statements fail:
> A. select * from SYB1U.tempdb.syscolumns
> DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL 
> query. 
> sql SELECT *
> FROM "tempdb"."syscolumns"
> plugin SYB1U
> Fragment 0:0
> B. select * from SYB1U.tempdb.dbo.syscolumns
> VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 
> 'SYB1U.tempdb.dbo.syscolumns' not found
> C. select * from SYB1U.tempdb..syscolumns
> PARSE ERROR: Encountered ".." at line 1, column 27.
> In A, the execution engine doesn't include the "owner" portion.
> In B, adding dbo fails validation
> In C, the default behavior in Sybase for ".." isn't recognized
> I'm not sure if this is a Drill issue or a 

[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files

2017-10-23 Thread David Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16216005#comment-16216005
 ] 

David Lee commented on DRILL-5769:
--

I finally debugged which json key value was causing my IOBE exception by 
trimming my json file down to 
 349 megs with around 10,200 json records. Unfortunately I couldn't generate a 
smaller test file which duplicates the problem.

349939049 Oct 23 16:50 test.json

The JSON file contains multiple nested DealingSchedule objects like:

"DealingSchedule": {"DealingTime": {"CutOffTimeDetail": 
[{"CutOffTimeDetailTimeZone": "1", "CutOffTimeDetail_CountryId": "CU$AUS", 
"CutOffTime": "15:00", "DealingType": "3"}]}}

"DealingSchedule": {"DealingTime": {"CutOffTimeDetail": 
[{"CutOffTimeDetailTimeZone": "1", "CutOffTimeDetail_CountryId": "CU$AUS", 
"CutOffTime": "11:00", "DealingType": "3"}]}}

"DealingSchedule": {"ValuationTimeTimeZone": "1", "ValuationTime_CountryId": 
"CU$AUS", "ValuationTime": "12:00"}

"DealingSchedule": {"ValuationTimeTimeZone": "3", "ValuationTime_CountryId": 
"CU$CAN", "ValuationTime": "16:00","DealingTime": {"CutOffTimeDetail": 
[{"CutOffTimeDetailTimeZone": "3", "CutOffTimeDetail_CountryId": "CU$CAN", 
"CutOffTime": "16:00", "DealingType": "3"}]}}

Near the end of the file this flavor of DealingSchedule appears which contains 
a DealingTimeDetail array of one record. This is the first time this 
DealingSchedule.DealingTime.DealingTimeDetail key appears in the JSON file:

"DealingSchedule": {"ValuationTime_CountryId": "CU$GBR", "ValuationTime": 
"08:00", "DealingTime": {"DealingTimeDetail": [{"DealingTimeDetail_CountryId": 
"CU$GBR", "StartTime": "09:00", "EndTime": "17:00"}]}}

It produces the following error:

org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
IndexOutOfBoundsException: index: 16384, length: 4 (expected: range(0, 16384)) 
Fragment 0:0 [Error Id: 4d7e60fb-b7d0-49bd-9cf7-244dc4d7341d on ...

1. If I remove the array[] brackets and turn it into keys it works:
[{"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": 
"17:00"}]
to
{"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": 
"17:00"}

2. I tried creating a smaller JSON file with just DealingSchedule objects, but 
it would read the file without errors.

3. If add extra records to the array it also returns an IOBE.
{"DealingTimeDetail": [
{"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": 
"17:00"},
{"DealingTimeDetail_CountryId": "CU$GBR", "StartTime": "09:00", "EndTime": 
"17:00"}
]}


> IndexOutOfBoundsException when querying JSON files
> --
>
> Key: DRILL-5769
> URL: https://issues.apache.org/jira/browse/DRILL-5769
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Storage - JSON
>Affects Versions: 1.10.0
> Environment: *jdk_8u45_x64*
> *single drillbit running on zookeeper*
> *Following options set to TRUE:*
> drill.exec.functions.cast_empty_string_to_null
> store.json.all_text_mode
> store.parquet.enable_dictionary_encoding
> store.parquet.use_new_reader
>Reporter: David Lee
>Assignee: Jinfeng Ni
> Fix For: 1.10.0, 1.11.0, 1.12.0
>
> Attachments: 001.json, 100.json, 111.json
>
>
> *Running the following SQL on these three JSON files fail: *
> 001.json 100.json 111.json
> select t.id
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Error:*
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) 
> Fragment 0:0 [Error Id: ....
> *However running the same SQL on two out of three files works:*
> select t.id
> from dfs.`/tmp/1??.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/?1?.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/??1.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Changing the selected column from t.id to t.* also works: *
> select *
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files

2017-09-07 Thread David Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157935#comment-16157935
 ] 

David Lee commented on DRILL-5769:
--

My original problem was encountered when running a select * on a single 5 gig 
JSON file with a mix of nested keys and arrays to convert it to parquet.. 
Splitting that file into 10 smaller files worked to create 10 parquet files but 
then the same technique failed on a different 6 gig JSON since I have no 
control when an empty array may show up.

> IndexOutOfBoundsException when querying JSON files
> --
>
> Key: DRILL-5769
> URL: https://issues.apache.org/jira/browse/DRILL-5769
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Storage - JSON
>Affects Versions: 1.10.0
> Environment: *jdk_8u45_x64*
> *single drillbit running on zookeeper*
> *Following options set to TRUE:*
> drill.exec.functions.cast_empty_string_to_null
> store.json.all_text_mode
> store.parquet.enable_dictionary_encoding
> store.parquet.use_new_reader
>Reporter: David Lee
>Assignee: Jinfeng Ni
> Fix For: 1.10.0, 1.11.0, 1.12.0
>
> Attachments: 001.json, 100.json, 111.json
>
>
> *Running the following SQL on these three JSON files fail: *
> 001.json 100.json 111.json
> select t.id
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Error:*
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) 
> Fragment 0:0 [Error Id: ....
> *However running the same SQL on two out of three files works:*
> select t.id
> from dfs.`/tmp/1??.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/?1?.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/??1.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Changing the selected column from t.id to t.* also works: *
> select *
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5769) IndexOutOfBoundsException when querying JSON files

2017-09-06 Thread David Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155710#comment-16155710
 ] 

David Lee commented on DRILL-5769:
--

I got this statement to work passing in [0] for the array, but the SQL parser 
should default it to [0] if it isn't specified correct?

select t.id
from dfs.`/tmp/???.json` t
where t.assetData.debt.couponPaymentFeature[0].interestBasis = '5'

> IndexOutOfBoundsException when querying JSON files
> --
>
> Key: DRILL-5769
> URL: https://issues.apache.org/jira/browse/DRILL-5769
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Storage - JSON
>Affects Versions: 1.10.0
> Environment: *jdk_8u45_x64*
> *single drillbit running on zookeeper*
> *Following options set to TRUE:*
> drill.exec.functions.cast_empty_string_to_null
> store.json.all_text_mode
> store.parquet.enable_dictionary_encoding
> store.parquet.use_new_reader
>Reporter: David Lee
>Assignee: Jinfeng Ni
> Fix For: 1.10.0, 1.11.0, 1.12.0
>
> Attachments: 001.json, 100.json, 111.json
>
>
> *Running the following SQL on these three JSON files fail: *
> 001.json 100.json 111.json
> select t.id
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Error:*
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) 
> Fragment 0:0 [Error Id: ....
> *However running the same SQL on two out of three files works:*
> select t.id
> from dfs.`/tmp/1??.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/?1?.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/??1.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Changing the selected column from t.id to t.* also works: *
> select *
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5769) IndexOutOfBoundsException when querying JSON files

2017-09-05 Thread David Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Lee updated DRILL-5769:
-
Attachment: 001.json
100.json
111.json

Test files included

> IndexOutOfBoundsException when querying JSON files
> --
>
> Key: DRILL-5769
> URL: https://issues.apache.org/jira/browse/DRILL-5769
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Storage - JSON
>Affects Versions: 1.10.0
> Environment: *jdk_8u45_x64*
> *single drillbit running on zookeeper*
> *Following options set to TRUE:*
> drill.exec.functions.cast_empty_string_to_null
> store.json.all_text_mode
> store.parquet.enable_dictionary_encoding
> store.parquet.use_new_reader
>Reporter: David Lee
> Fix For: 1.10.0, 1.11.0, 1.12.0
>
> Attachments: 001.json, 100.json, 111.json
>
>
> *Running the following SQL on these three JSON files fail: *
> 001.json 100.json 111.json
> select t.id
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Error:*
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) 
> Fragment 0:0 [Error Id: ....
> *However running the same SQL on two out of three files works:*
> select t.id
> from dfs.`/tmp/1??.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/?1?.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> select t.id
> from dfs.`/tmp/??1.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'
> *Changing the selected column from t.id to t.* also works: *
> select *
> from dfs.`/tmp/???.json` t
> where t.assetData.debt.couponPaymentFeature.interestBasis = '5'



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-4885) WHERE clause causing a ClassCastException on HBase tables

2017-06-15 Thread David Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051229#comment-16051229
 ] 

David Lee commented on DRILL-4885:
--

I think there's a problem with casting table columns when the table is 
generated using functions..

ClassCastException: org.apache.drill.common.expression.FunctionCall cannot be 
cast to org.apache.drill.common.expression.SchemaPath

is happening for me in Drill 1.10 when executing the following:

select iv.Invoice.Invoice_Id, iv.Invoice
from
(select flatten(a.TodaysInvoices.Invoice) as Invoice
from dfs.`/tmp/Invoices_20170605` a
) iv
where iv.Invoice.Invoice_Id = 'ABC123'

However, the following runs fine if I add a limit on the inner subquery which 
probably forces the inner query to materialize a table first.

select iv.Invoice.Invoice_Id, iv.Invoice
from
(select flatten(a.Invoices.Invoice) as Invoice
from dfs.`/tmp/Invoices_20170605` a limit 10
) iv
where iv.Invoice.Invoice_Id = 'ABC123'


> WHERE clause causing a ClassCastException on HBase tables
> -
>
> Key: DRILL-4885
> URL: https://issues.apache.org/jira/browse/DRILL-4885
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Ki Kang
>
> I am trying to figure out why I am getting a ClassCastException when I do the 
> following query.  If I change the “FROM” clause to just “FROM (VALUES(0))” it 
> works just fine, but whenever I have “FROM” to an HBase table, I get the 
> error.  I know that the HBase table is valid because if I remove the WHERE 
> clause, the query does not throw an error.
> SELECT b.`date` FROM (
>   SELECT TO_DATE(CONCAT(a.`jArray`[0], '-', a.`jArray`[1], '-', 
> a.`jArray`[2]), '-MM-dd') `date` FROM (
> SELECT CONVERT_FROM(REGEXP_REPLACE('["2016":"08":"03"]', ':', ','), 
> 'JSON') `jArray`
> --FROM (VALUES(0))
> FROM `hbase`.`SomeValidTable`
>   ) a
> ) b
> WHERE b.`date` = '2016-08-03'
> LIMIT 1
> SYSTEM ERROR: ClassCastException: 
> org.apache.drill.common.expression.FunctionCall cannot be cast to 
> org.apache.drill.common.expression.SchemaPath
> <>
> From: rahul challapalli 
> Date: Thu, Sep 1, 2016 at 11:09 AM
> Subject: Re: WHERE clause causing a ClassCastException on HBase tables
> To: dev 
> This is a bug. The query is failing at the planning state itself. Can you 
> raise a jira for the same with the details you posted here?
> - Rahul



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (DRILL-4908) Unable to setup Sybase JDBC Plugin with access to multiple databases

2017-04-10 Thread David Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Lee closed DRILL-4908.

Resolution: Duplicate

Closing this issue and reopening it as part of Drill 1.10 with new examples.. 
The behavior in Drill 1.10 is different now and this use case cannot be 
reproduced with the example included.

> Unable to setup Sybase JDBC Plugin with access to multiple databases
> 
>
> Key: DRILL-4908
> URL: https://issues.apache.org/jira/browse/DRILL-4908
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: SQL Parser
>Affects Versions: 1.8.0
> Environment: linux, sybase ase, sybase iq, windows
>Reporter: David Lee
> Fix For: Future
>
>
> This may also be a problem with Microsoft SQL Server which uses the same SQL 
> Syntax.
> I am unable to setup a single JDBC plugin which allows me to query tables on 
> different databases on the server.
> I can setup multiple JDBC plugins for each database on the server and join 
> data across multiple JDBC connections, but this is extremely inefficient and 
> SQL queries 
> just hang.
> Test Case: Create two tables on two different databases and write a single 
> SQL statement to join them together. Try to replicate the results in Apache 
> Drill.
> A. Temp tables in Sybase:
> use tempdb
> go
> create table phone_book
> (
> first_name varchar(10),
> last_name varchar(20),
> phone_number varchar(12)
> )
> go
> insert phone_book values ('Bob','Marley','555-555-')
> insert phone_book values ('Mary','Jane','111-111-')
> insert phone_book values ('Bat','Man','911-911-')
> go
> use tempdb_adhoc
> go
> create table cities
> (
> first_name varchar(10),
> last_name varchar(20),
> city varchar(20)
> )
> go
> insert cities values ('Bob','Marley','San Francisco')
> insert cities values ('Mary','Jane','New York')
> insert cities values ('Bat','Man','Gotham')
> go
> select a.first_name, a.last_name, a.phone_number, b.city
> from tempdb.guest.phone_book a
> join tempdb_adhoc.guest.cities b
> on b.first_name = a.first_name
> and b.last_name = a.last_name
> go
> Returns Back in SYBASE ISQL:
>  first_name last_namephone_number city   
>  --   
>  BobMarley   555-555- San Francisco
>  Mary   Jane 111-111- New York
>  BatMan  911-911- Gotham
> B. Drill JDBC Plugin Setups:
> DEV:
> {
>   "type": "jdbc",
>   "driver": "com.sybase.jdbc4.jdbc.SybDriver",
>   "url": "jdbc:sybase:Tds:my_server:4100",
>   "username": "my_login",
>   "password": "my_password",
>   "enabled": true
> }
> DEV_TEMPDB:
> {
>   "type": "jdbc",
>   "driver": "com.sybase.jdbc4.jdbc.SybDriver",
>   "url": "jdbc:sybase:Tds:my_server:4100/tempdb",
>   "username": "my_login",
>   "password": "my_password",
>   "enabled": true
> }
> DEV_TEMPDB_ADHOC:
> {
>   "type": "jdbc",
>   "driver": "com.sybase.jdbc4.jdbc.SybDriver",
>   "url": "jdbc:sybase:Tds:my_server:4100/tempdb_adhoc",
>   "username": "my_login",
>   "password": "my_password",
>   "enabled": true
> }
> C. Examples of Drill Statements which work and don't work.
> 1. Returns back redundant schemas for each JDBC plugin:
> 0: jdbc:drill:zk=local> show schemas;
> +--+
> | SCHEMA_NAME  |
> +--+
> | DEV.tempdb   |
> | DEV.tempdb_adhoc |
> | DEV_TEMPDB.tempdb|
> | DEV_TEMPDB.tempdb_adhoc  |
> | DEV_TEMPDB_ADHOC.tempdb  |
> | DEV_TEMPDB_ADHOC.tempdb_adhoc|
> +--+
> 2. SQL selects work within schemas and joins across schemas:
> 0: jdbc:drill:zk=local> select * from DEV_TEMPDB.tempdb.guest.phone_book;
> +-++---+
> | first_name  | last_name  | phone_number  |
> +-++---+
> | Bob | Marley | 555-555-  |
> | Mary| Jane   | 111-111-  |
> | Bat | Man| 911-911-  |
> +-++---+
> 3 rows selected (1.585 seconds)
> 0: jdbc:drill:zk=local> select * from 
> DEV_TEMPDB_ADHOC.tempdb_adhoc.guest.cities;
> ;
> +-+++
> | first_name  | last_name  |  city  |
> +-+++
> | Bob | Marley | San Francisco  |
> | Mary| Jane   | New York   |
> | Bat | Man| Gotham |
> +-+++
> 3 rows selected (1.173 seconds)
> 0: jdbc:drill:zk=local> select a.first_name, a.last_name, a.phone_number, 
> b.city
> . . . . . . . . . . . 

[jira] [Updated] (DRILL-5427) SQL Execution Syntax incorrect for Sybase RDBMS

2017-04-10 Thread David Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Lee updated DRILL-5427:
-
Description: 
The Sybase table syntax should be "database"."owner"."tablename", but "owner" 
is not being added which produces incorrect SQL when executed.

The default owner is "DBO" for most tables. If DBO is omitted then the syntax 
should be just "database".."tablename" 

When SYBASE sees  "database".."tablename" it:

A. Checks if there is a tablename owned by the logged in user.
B. If there is no tablename owned by the logged in user it uses DBO by default

This is what I'm seeing using a JDBC plug-in connection to Sybase ASE:

The following statements work fine:

A. show schemas

SCHEMA_NAME
cp.default
dfs.default
dfs.root
dfs.tmp
INFORMATION_SCHEMA
SYB1U
SYB1U.tempdb

B. use SYB1U.tempdb

ok  summary
trueDefault schema changed to [SYB1U.tempdb]

C. show tables

TABLE_SCHEMATABLE_NAME
SYB1U.tempdbsysalternates
SYB1U.tempdbsysattributes
SYB1U.tempdbsyscolumns
SYB1U.tempdbsyscomments
SYB1U.tempdbsysconstraints
etc.. etc.. etc..

D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS`
where TABLE_SCHEMA = 'SYB1U.tempdb'
and TABLE_NAME = 'syscolumns'

TABLE_CATALOG   TABLE_SCHEMATABLE_NAME  COLUMN_NAME 
ORDINAL_POSITIONCOLUMN_DEFAULT  IS_NULLABLE DATA_TYPE
DRILL   SYB1U.tempdbsyscolumns  id  1   (null)  NO  INTEGER
DRILL   SYB1U.tempdbsyscolumns  number  2   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  colid   3   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  status  4   (null)  NO  TINYINT
DRILL   SYB1U.tempdbsyscolumns  type5   (null)  NO  TINYINT
DRILL   SYB1U.tempdbsyscolumns  length  6   (null)  NO  INTEGER
DRILL   SYB1U.tempdbsyscolumns  offset  7   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  usertype8   (null)  NO  
SMALLINT
DRILL   SYB1U.tempdbsyscolumns  cdefault9   (null)  NO  
INTEGER
DRILL   SYB1U.tempdbsyscolumns  domain  10  (null)  NO  INTEGER
etc.. etc.. etc..

However, the following statements fail:

A. select * from SYB1U.tempdb.syscolumns
DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL 
query. 
sql SELECT *
FROM "tempdb"."syscolumns"
plugin SYB1U
Fragment 0:0

B. select * from SYB1U.tempdb.dbo.syscolumns
VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 
'SYB1U.tempdb.dbo.syscolumns' not found

C. select * from SYB1U.tempdb..syscolumns
PARSE ERROR: Encountered ".." at line 1, column 27.

In A, the execution engine doesn't include the "owner" portion.
In B, adding dbo fails validation
In C, the default behavior in Sybase for ".." isn't recognized

I'm not sure if this is a Drill issue or a Calcite issue.

  was:
The Sybase table syntax should be "database"."owner"."tablename", but "owner" 
is not being added which produces incorrect SQL when executed.

The default owner is "DBO" for most tables. If DBO is omitted then the syntax 
should be just "database".."tablename" 

When SYBASE sees  "database".."tablename" it:

A. Checks if there is a tablename owned by the logged in user.
B. If there is no tablename owned by the logged in user it uses DBO by default

This is what I'm seeing using a JDBC plug-in connection to Sybase ASE:

The following statements work fine:

A. show schemas

SCHEMA_NAME
cp.default
dfs.default
dfs.root
dfs.tmp
INFORMATION_SCHEMA
SYB1U
SYB1U.tempdb

B. use SYB1U.tempdb

ok  summary
trueDefault schema changed to [SYB1U.tempdb]

C. show tables

TABLE_SCHEMATABLE_NAME
SYB1U.tempdbsysalternates
SYB1U.tempdbsysattributes
SYB1U.tempdbsyscolumns
SYB1U.tempdbsyscomments
SYB1U.tempdbsysconstraints
etc.. etc.. etc..

D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS`
where TABLE_SCHEMA = 'SYB1U.tempdb'
and TABLE_NAME = 'syscolumns'

TABLE_CATALOG   TABLE_SCHEMATABLE_NAME  COLUMN_NAME 
ORDINAL_POSITIONCOLUMN_DEFAULT  IS_NULLABLE DATA_TYPE
DRILL   SYB1U.tempdbsyscolumns  id  1   (null)  NO  INTEGER
DRILL   SYB1U.tempdbsyscolumns  number  2   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  colid   3   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  status  4   (null)  NO  TINYINT
DRILL   SYB1U.tempdbsyscolumns  type5   (null)  NO  TINYINT
DRILL   SYB1U.tempdbsyscolumns  length  6   (null)  NO  INTEGER
DRILL   SYB1U.tempdbsyscolumns  offset  7   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  usertype8   (null)  NO  
SMALLINT
DRILL   SYB1U.tempdbsyscolumns  cdefault9   (null)  NO  
INTEGER
DRILL   SYB1U.tempdbsyscolumns  domain  10  (null)  NO  INTEGER

[jira] [Created] (DRILL-5427) SQL Execution Syntax incorrect for Sybase RDBMS

2017-04-10 Thread David Lee (JIRA)
David Lee created DRILL-5427:


 Summary: SQL Execution Syntax incorrect for Sybase RDBMS
 Key: DRILL-5427
 URL: https://issues.apache.org/jira/browse/DRILL-5427
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC, SQL Parser, Storage - JDBC
Affects Versions: 1.10.0
 Environment: Windows Linux
Reporter: David Lee
 Fix For: Future, 1.10.0


The Sybase table syntax should be "database"."owner"."tablename", but "owner" 
is not being added which produces incorrect SQL when executed.

The default owner is "DBO" for most tables. If DBO is omitted then the syntax 
should be just "database".."tablename" 

When SYBASE sees  "database".."tablename" it:

A. Checks if there is a tablename owned by the logged in user.
B. If there is no tablename owned by the logged in user it uses DBO by default

This is what I'm seeing using a JDBC plug-in connection to Sybase ASE:

The following statements work fine:

A. show schemas

SCHEMA_NAME
cp.default
dfs.default
dfs.root
dfs.tmp
INFORMATION_SCHEMA
SYB1U
SYB1U.tempdb

B. use SYB1U.tempdb

ok  summary
trueDefault schema changed to [SYB1U.tempdb]

C. show tables

TABLE_SCHEMATABLE_NAME
SYB1U.tempdbsysalternates
SYB1U.tempdbsysattributes
SYB1U.tempdbsyscolumns
SYB1U.tempdbsyscomments
SYB1U.tempdbsysconstraints
etc.. etc.. etc..

D. SELECT * FROM INFORMATION_SCHEMA.`COLUMNS`
where TABLE_SCHEMA = 'SYB1U.tempdb'
and TABLE_NAME = 'syscolumns'

TABLE_CATALOG   TABLE_SCHEMATABLE_NAME  COLUMN_NAME 
ORDINAL_POSITIONCOLUMN_DEFAULT  IS_NULLABLE DATA_TYPE
DRILL   SYB1U.tempdbsyscolumns  id  1   (null)  NO  INTEGER
DRILL   SYB1U.tempdbsyscolumns  number  2   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  colid   3   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  status  4   (null)  NO  TINYINT
DRILL   SYB1U.tempdbsyscolumns  type5   (null)  NO  TINYINT
DRILL   SYB1U.tempdbsyscolumns  length  6   (null)  NO  INTEGER
DRILL   SYB1U.tempdbsyscolumns  offset  7   (null)  NO  SMALLINT
DRILL   SYB1U.tempdbsyscolumns  usertype8   (null)  NO  
SMALLINT
DRILL   SYB1U.tempdbsyscolumns  cdefault9   (null)  NO  
INTEGER
DRILL   SYB1U.tempdbsyscolumns  domain  10  (null)  NO  INTEGER
etc.. etc.. etc..

However, the following statements fail:

A. select * from SYB1U.tempdb.syscolumns
DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL 
query. 
sql SELECT *
FROM "tempdb"."syscolumns"
plugin SYB1U
Fragment 0:0

B. select * from SYB1U.tempdb.dbo.syscolumns
VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 
'SYB1U.tempdb.dbo.syscolumns' not found

C. select * from SYB1U.tempdb..syscolumns
PARSE ERROR: Encountered ".." at line 1, column 27.

In A, the execution engine doesn't include the "owner" portion.
In B, adding dbo fails validation
In C, the default behavior in Sybase for ".." isn't recognized




use SYB1U.tempdb

show tables

SELECT *
FROM INFORMATION_SCHEMA.`COLUMNS`
where TABLE_SCHEMA = 'SYB1U.tempdb'
and TABLE_NAME = 'syscolumns'

I'm not sure if this is a Drill issue or a Calcite issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-4908) Unable to setup Sybase JDBC Plugin with access to multiple databases

2016-09-27 Thread David Lee (JIRA)
David Lee created DRILL-4908:


 Summary: Unable to setup Sybase JDBC Plugin with access to 
multiple databases
 Key: DRILL-4908
 URL: https://issues.apache.org/jira/browse/DRILL-4908
 Project: Apache Drill
  Issue Type: Improvement
  Components: SQL Parser
Affects Versions: 1.8.0
 Environment: linux, sybase ase, sybase iq, windows
Reporter: David Lee
 Fix For: Future


This may also be a problem with Microsoft SQL Server which uses the same SQL 
Syntax.

I am unable to setup a single JDBC plugin which allows me to query tables on 
different databases on the server.

I can setup multiple JDBC plugins for each database on the server and join data 
across multiple JDBC connections, but this is extremely inefficient and SQL 
queries 

just hang.

Test Case: Create two tables on two different databases and write a single SQL 
statement to join them together. Try to replicate the results in Apache Drill.

A. Temp tables in Sybase:

use tempdb
go

create table phone_book
(
first_name varchar(10),
last_name varchar(20),
phone_number varchar(12)
)
go

insert phone_book values ('Bob','Marley','555-555-')
insert phone_book values ('Mary','Jane','111-111-')
insert phone_book values ('Bat','Man','911-911-')
go


use tempdb_adhoc
go

create table cities
(
first_name varchar(10),
last_name varchar(20),
city varchar(20)
)
go

insert cities values ('Bob','Marley','San Francisco')
insert cities values ('Mary','Jane','New York')
insert cities values ('Bat','Man','Gotham')
go


select a.first_name, a.last_name, a.phone_number, b.city
from tempdb.guest.phone_book a
join tempdb_adhoc.guest.cities b
on b.first_name = a.first_name
and b.last_name = a.last_name
go

Returns Back in SYBASE ISQL:

 first_name last_namephone_number city   
 --   
 BobMarley   555-555- San Francisco
 Mary   Jane 111-111- New York
 BatMan  911-911- Gotham

B. Drill JDBC Plugin Setups:

DEV:

{
  "type": "jdbc",
  "driver": "com.sybase.jdbc4.jdbc.SybDriver",
  "url": "jdbc:sybase:Tds:my_server:4100",
  "username": "my_login",
  "password": "my_password",
  "enabled": true
}


DEV_TEMPDB:

{
  "type": "jdbc",
  "driver": "com.sybase.jdbc4.jdbc.SybDriver",
  "url": "jdbc:sybase:Tds:my_server:4100/tempdb",
  "username": "my_login",
  "password": "my_password",
  "enabled": true
}


DEV_TEMPDB_ADHOC:

{
  "type": "jdbc",
  "driver": "com.sybase.jdbc4.jdbc.SybDriver",
  "url": "jdbc:sybase:Tds:my_server:4100/tempdb_adhoc",
  "username": "my_login",
  "password": "my_password",
  "enabled": true
}

C. Examples of Drill Statements which work and don't work.

1. Returns back redundant schemas for each JDBC plugin:

0: jdbc:drill:zk=local> show schemas;

+--+
| SCHEMA_NAME  |
+--+
| DEV.tempdb   |
| DEV.tempdb_adhoc |
| DEV_TEMPDB.tempdb|
| DEV_TEMPDB.tempdb_adhoc  |
| DEV_TEMPDB_ADHOC.tempdb  |
| DEV_TEMPDB_ADHOC.tempdb_adhoc|
+--+

2. SQL selects work within schemas and joins across schemas:

0: jdbc:drill:zk=local> select * from DEV_TEMPDB.tempdb.guest.phone_book;
+-++---+
| first_name  | last_name  | phone_number  |
+-++---+
| Bob | Marley | 555-555-  |
| Mary| Jane   | 111-111-  |
| Bat | Man| 911-911-  |
+-++---+
3 rows selected (1.585 seconds)

0: jdbc:drill:zk=local> select * from 
DEV_TEMPDB_ADHOC.tempdb_adhoc.guest.cities;
;
+-+++
| first_name  | last_name  |  city  |
+-+++
| Bob | Marley | San Francisco  |
| Mary| Jane   | New York   |
| Bat | Man| Gotham |
+-+++
3 rows selected (1.173 seconds)

0: jdbc:drill:zk=local> select a.first_name, a.last_name, a.phone_number, b.city
. . . . . . . . . . . > from DEV_TEMPDB.tempdb.guest.phone_book a
. . . . . . . . . . . > join DEV_TEMPDB_ADHOC.tempdb_adhoc.guest.cities b
. . . . . . . . . . . > on b.first_name = a.first_name
. . . . . . . . . . . > and b.last_name = a.last_name;
+-++---++
| first_name  | last_name  | phone_number  |  city  |
+-++---++
| Bob | Marley | 555-555-  | San Francisco  |
| Mary| Jane   | 111-111-  | New York   |
| Bat | Man| 911-911-  | Gotham |