[jira] [Commented] (DRILL-8143) Error querying json with $date field

2022-02-21 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495547#comment-17495547
 ] 

Anton Gozhiy commented on DRILL-8143:
-

Merged into master with commit acac9863e5ca062a6204ba5f9071a4f1201c0229.

> Error querying json with $date field
> 
>
> Key: DRILL-8143
> URL: https://issues.apache.org/jira/browse/DRILL-8143
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.20.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Blocker
> Attachments: extended.json
>
>
> *Test Data:*
>  [^extended.json]  attached.
> *Query:*
> {code:sql}
>  select * from dfs.drillTestDir.`complex/drill-2879/extended.json` where name 
> = 'd'
> {code}
> *Expected Results:*
> Query successful, no exception should be thrown.
> *Actual Result:*
> Exception happened:
> {noformat}
> UserRemoteException : INTERNAL_ERROR ERROR: Text 
> '2015-03-12T21:54:31.809+0530' could not be parsed at index 23
> 
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Text '2015-03-12T21:54:31.809+0530' could not be parsed at index 23
> Fragment: 0:0
> Please, refer to logs for more information.
> [Error Id: c984adbf-a455-4e0e-b3cd-b5aa7d83a765 on userf87d-pc:31010]
>   (java.time.format.DateTimeParseException) Text 
> '2015-03-12T21:54:31.809+0530' could not be parsed at index 23
> java.time.format.DateTimeFormatter.parseResolved0():2046
> java.time.format.DateTimeFormatter.parse():1948
> java.time.Instant.parse():395
> 
> org.apache.drill.exec.vector.complex.fn.VectorOutput$MapVectorOutput.writeTimestamp():364
> org.apache.drill.exec.vector.complex.fn.VectorOutput.innerRun():115
> 
> org.apache.drill.exec.vector.complex.fn.VectorOutput$MapVectorOutput.run():308
> 
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeMapDataIfTyped():386
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData():262
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch():192
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeDocument():178
> 
> org.apache.drill.exec.store.easy.json.reader.BaseJsonReader.writeToVector():99
> org.apache.drill.exec.store.easy.json.reader.BaseJsonReader.write():70
> org.apache.drill.exec.store.easy.json.JSONRecordReader.next():234
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():234
> org.apache.drill.exec.physical.impl.ScanBatch.next():298
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor.lambda$run$0():321
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():423
> org.apache.hadoop.security.UserGroupInformation.doAs():1762
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1128
> java.util.concurrent.ThreadPoolExecutor$Worker.run():628
> java.lang.Thread.run():834
> {noformat}
> *Note:* It is not reproducible in Drill 1.19.0

[jira] [Assigned] (DRILL-8143) Error querying json with $date field

2022-02-19 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-8143:
---

Assignee: Anton Gozhiy

> Error querying json with $date field
> 
>
> Key: DRILL-8143
> URL: https://issues.apache.org/jira/browse/DRILL-8143
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.20.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Attachments: extended.json
>
>
> *Test Data:*
>  [^extended.json]  attached.
> *Query:*
> {code:sql}
>  select * from dfs.drillTestDir.`complex/drill-2879/extended.json` where name 
> = 'd'
> {code}
> *Expected Results:*
> Query successful, no exception should be thrown.
> *Actual Result:*
> Exception happened:
> {noformat}
> UserRemoteException : INTERNAL_ERROR ERROR: Text 
> '2015-03-12T21:54:31.809+0530' could not be parsed at index 23
> 
> [duplicate stack trace elided]
> {noformat}
> *Note:* It is not reproducible in Drill 1.19.0

[jira] [Updated] (DRILL-8143) Error querying json with $date field

2022-02-18 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-8143:

Description: 
*Test Data:*
 [^extended.json]  attached.

*Query:*
{code:sql}
 select * from dfs.drillTestDir.`complex/drill-2879/extended.json` where name = 
'd'
{code}


*Expected Results:*
Query successful, no exception should be thrown.

*Actual Result:*
Exception happened:
{noformat}
UserRemoteException :   INTERNAL_ERROR ERROR: Text 
'2015-03-12T21:54:31.809+0530' could not be parsed at index 23

org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
Text '2015-03-12T21:54:31.809+0530' could not be parsed at index 23

Fragment: 0:0

Please, refer to logs for more information.

[Error Id: c984adbf-a455-4e0e-b3cd-b5aa7d83a765 on userf87d-pc:31010]

  (java.time.format.DateTimeParseException) Text '2015-03-12T21:54:31.809+0530' 
could not be parsed at index 23
java.time.format.DateTimeFormatter.parseResolved0():2046
java.time.format.DateTimeFormatter.parse():1948
java.time.Instant.parse():395

org.apache.drill.exec.vector.complex.fn.VectorOutput$MapVectorOutput.writeTimestamp():364
org.apache.drill.exec.vector.complex.fn.VectorOutput.innerRun():115

org.apache.drill.exec.vector.complex.fn.VectorOutput$MapVectorOutput.run():308
org.apache.drill.exec.vector.complex.fn.JsonReader.writeMapDataIfTyped():386
org.apache.drill.exec.vector.complex.fn.JsonReader.writeData():262
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch():192
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDocument():178

org.apache.drill.exec.store.easy.json.reader.BaseJsonReader.writeToVector():99
org.apache.drill.exec.store.easy.json.reader.BaseJsonReader.write():70
org.apache.drill.exec.store.easy.json.JSONRecordReader.next():234
org.apache.drill.exec.physical.impl.ScanBatch.internalNext():234
org.apache.drill.exec.physical.impl.ScanBatch.next():298
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():111
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
org.apache.drill.exec.record.AbstractRecordBatch.next():170
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():111
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
org.apache.drill.exec.record.AbstractRecordBatch.next():170
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():111
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
org.apache.drill.exec.record.AbstractRecordBatch.next():170
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():111
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
org.apache.drill.exec.record.AbstractRecordBatch.next():170
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():111
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
org.apache.drill.exec.record.AbstractRecordBatch.next():170
org.apache.drill.exec.physical.impl.BaseRootExec.next():103
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
org.apache.drill.exec.physical.impl.BaseRootExec.next():93
org.apache.drill.exec.work.fragment.FragmentExecutor.lambda$run$0():321
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():423
org.apache.hadoop.security.UserGroupInformation.doAs():1762
org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():834
{noformat}
*Note:* It is not reproducible in Drill 1.19.0

  was:
*Test Data:*
extended.json attached.

*Query:*
 # select * from dfs.drillTestDir.`complex/drill-2879/extended.json` where name 
= 'd'

*Expected Results:*
Query successful, no exception should be thrown.

*Actual Result:*
Exception happened:
{noformat}
UserRemoteException :   INTERNAL_ERROR ERROR: Text 
'2015-03-12T21:54:31.809+0530' could not be parsed at index 23

[duplicate stack trace elided]
{noformat}

[jira] [Created] (DRILL-8143) Error querying json with $date field

2022-02-18 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-8143:
---

 Summary: Error querying json with $date field
 Key: DRILL-8143
 URL: https://issues.apache.org/jira/browse/DRILL-8143
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.20.0
Reporter: Anton Gozhiy
 Attachments: extended.json

*Test Data:*
extended.json attached.

*Query:*
 # select * from dfs.drillTestDir.`complex/drill-2879/extended.json` where name 
= 'd'

*Expected Results:*
Query successful, no exception should be thrown.

*Actual Result:*
Exception happened:
{noformat}
UserRemoteException :   INTERNAL_ERROR ERROR: Text 
'2015-03-12T21:54:31.809+0530' could not be parsed at index 23

[duplicate stack trace elided]
{noformat}
*Note:* It is not reproducible in Drill 1.19.0
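The stack trace points at {{java.time.Instant.parse()}}, which uses {{DateTimeFormatter.ISO_INSTANT}} and stops at index 23 of the value, the {{+}} of the colon-less zone offset {{+0530}}. A minimal sketch of the failure and of one formatter pattern that does accept such offsets (illustrative only, not Drill's actual fix):

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class DateParseDemo {
    public static void main(String[] args) {
        String text = "2015-03-12T21:54:31.809+0530";

        // Instant.parse() delegates to DateTimeFormatter.ISO_INSTANT, which
        // rejects the colon-less "+0530" suffix: parsing fails at index 23,
        // the position of the '+' sign, exactly as in the reported error.
        try {
            Instant.parse(text);
        } catch (DateTimeParseException e) {
            System.out.println(e.getMessage());
        }

        // Pattern letter "Z" parses RFC 822-style offsets of the form +HHMM,
        // so a custom formatter handles the same value without trouble.
        DateTimeFormatter f = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSZ");
        Instant instant = OffsetDateTime.parse(text, f).toInstant();
        System.out.println(instant); // 2015-03-12T16:24:31.809Z
    }
}
```

Since extended JSON {{$date}} values may carry such numeric offsets, the JSON reader needs a parser more lenient than {{Instant.parse()}}.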



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (DRILL-8120) Make Drill functional tests viable again.

2022-01-31 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-8120:
---

 Summary: Make Drill functional tests viable again.
 Key: DRILL-8120
 URL: https://issues.apache.org/jira/browse/DRILL-8120
 Project: Apache Drill
  Issue Type: Task
Reporter: Anton Gozhiy
Assignee: Anton Gozhiy


There is an external test framework that was used for Drill regression testing 
before:
[https://github.com/mapr/drill-test-framework]
Although it lives under the mapr domain, it is public and licensed under the 
Apache License 2.0, so it can be used again.

*Problems that need to be solved to make it work:*
 # Environment. It used to run on a quite powerful physical cluster with HDFS 
and Drill, with a static configuration and reusable test data. This makes the 
framework inflexible, and even if you have a suitable environment it may still 
require a fair amount of manual tuning. Possible solution: wrap it in a Docker 
container to make it platform-independent and minimize the setup effort.
 # The tests have not been updated for two years, so they need to be brought up 
to date. This can be done step by step, fixing some test suites and removing or 
disabling those that are no longer needed.
 # Test pipeline. Once the first two items are resolved, a CI tool can be used 
to run the tests regularly.





[jira] [Commented] (DRILL-7785) Some hive tables fail with UndeclaredThrowableException

2020-08-31 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187867#comment-17187867
 ] 

Anton Gozhiy commented on DRILL-7785:
-

Reproduced with Drill 1.17.0 and MapR 6.1.0. So even if it is a Drill problem 
rather than a MapR-FS one, it is not a regression and therefore not a release 
blocker.

> Some hive tables fail with UndeclaredThrowableException
> ---
>
> Key: DRILL-7785
> URL: https://issues.apache.org/jira/browse/DRILL-7785
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Storage - Hive
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Vova Vysotskyi
>Priority: Major
>
> Query: 
> {code}
> Functional/hive/hive_storage/fileformats/orc/transactional/orc_table_clustered_bucketed.sql
> select * from hive_orc_transactional.orc_table_clustered_bucketed
> {code}
> Exception:
> {code}
> java.sql.SQLException: EXECUTION_ERROR ERROR: 
> java.lang.reflect.UndeclaredThrowableException
> Failed to setup reader: HiveDefaultRecordReader
> Fragment: 0:0
> [Error Id: 323434cc-7bd2-4551-94d4-a5925f6a66af on drill80:31010]
>   (org.apache.drill.common.exceptions.ExecutionSetupException) 
> java.lang.reflect.UndeclaredThrowableException
> 
> org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30
> 
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.setup():257
> org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():331
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():227
> org.apache.drill.exec.physical.impl.ScanBatch.next():298
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():323
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():310
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1669
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.util.concurrent.ExecutionException) 
> java.lang.reflect.UndeclaredThrowableException
> 
> org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.getDoneValue():553
> 
> org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.get():534
> 
> org.apache.drill.shaded.guava.com.google.common.util.concurrent.FluentFuture$TrustedFuture.get():88
> 
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.setup():252
> org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():331
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():227
> org.apache.drill.exec.physical.impl.ScanBatch.next():298
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():323
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():310
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1669
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.lang.reflect.UndeclaredThrowableException) null
> 
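On the {{UndeclaredThrowableException}} itself: Java's dynamic proxies throw it when an {{InvocationHandler}} raises a checked exception that the proxied interface method does not declare, which is why reflective layers such as Hadoop's RPC/UGI plumbing surface it in place of the real cause. A minimal, self-contained illustration (not Drill or Hive code):

```java
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;

public class UndeclaredDemo {
    // The proxied interface declares no checked exceptions.
    interface Service {
        String fetch();
    }

    public static void main(String[] args) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            // A checked exception that fetch() does not declare:
            throw new IOException("underlying failure");
        };
        Service s = (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[] { Service.class },
                handler);
        try {
            s.fetch();
        } catch (UndeclaredThrowableException e) {
            // The proxy wraps the undeclared checked exception; the real
            // error is only reachable through getCause().
            System.out.println(e.getCause()); // java.io.IOException: underlying failure
        }
    }
}
```

So when debugging reports like this one, the interesting exception is the one buried in {{getCause()}}, not the wrapper in the query profile.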

[jira] [Updated] (DRILL-7429) Wrong column order when selecting complex data using Hive storage plugin.

2020-07-02 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7429:

Description: 
*Data:*
customer_complex.zip attached
Hive ddl:

{code:sql}
create  external table if not exists customer_complex (
c_custkey int,
c_name string,
c_address string,
c_nation struct<
n_name:string,
n_comment:string, 
n_region:map>,
c_phone string,
c_acctbal double,
c_mktsegment string,
c_comment string,
c_orders array,
l_supplier:struct<
s_name:string,
s_address:string,
s_nationkey:int,
s_phone:string,
s_acctbal:double,
s_comment:string>,
l_linenumber:int,
l_quantity:double,
l_extendedprice:double,
l_discount:double,
l_tax:double,
l_returnflag:string,
l_linestatus:string,
l_shipdate:date,
l_commitdate:date,
l_receiptdate:date,
l_shipinstruct:string,
l_shipmode:string,
l_comment:string
)
STORED AS parquet
LOCATION '/drill/customer_complex';
{code}


*Query:*
{code:sql}
select t3.a, t3.b from (select t2.a, t2.a.o_lineitems[1].l_part.p_name b from 
(select t1.c_orders[0] a from hive.customer_complex t1) t2) t3 limit 1
{code}

*Expected result:*
Column order: a, b

*Actual result:*
Column order: b, a

*Physical plan:*
{noformat}
00-00    Screen
00-01      Project(a=[ROW($0, $1, $2, $3, $4, $5, $6, $7)], b=[$8])
00-02        Project(a=[ITEM($0, 0).o_orderstatus], a1=[ITEM($0, 0).o_totalprice], a2=[ITEM($0, 0).o_orderdate], a3=[ITEM($0, 0).o_orderpriority], a4=[ITEM($0, 0).o_clerk], a5=[ITEM($0, 0).o_shippriority], a6=[ITEM($0, 0).o_comment], a7=[ITEM($0, 0).o_lineitems], b=[ITEM(ITEM(ITEM(ITEM($0, 0).o_lineitems, 1), 'l_part'), 'p_name')])
00-03          Project(c_orders=[$0])
00-04            SelectionVectorRemover
00-05              Limit(fetch=[10])
00-06                Scan(table=[[hive, customer_complex]], groupscan=[HiveDrillNativeParquetScan [entries=[ReadEntryWithPath [path=/drill/customer_complex/00_0]], numFiles=1, numRowGroups=1, columns=[`c_orders`[0].`o_orderstatus`, `c_orders`[0].`o_totalprice`, `c_orders`[0].`o_orderdate`, `c_orders`[0].`o_orderpriority`, `c_orders`[0].`o_clerk`, `c_orders`[0].`o_shippriority`, `c_orders`[0].`o_comment`, `c_orders`[0].`o_lineitems`, `c_orders`[0].`o_lineitems`[1].`l_part`.`p_name`]]])
{noformat}

*Note:* Reproduced with both Hive and Native readers. Non-reproducible with 
Parquet reader.

  was:
*Data:*
customer_complex.zip attached

*Query:*
{code:sql}
select t3.a, t3.b from (select t2.a, t2.a.o_lineitems[1].l_part.p_name b from 
(select t1.c_orders[0] a from hive.customer_complex t1) t2) t3 limit 1
{code}

*Expected result:*
Column order: a, b

*Actual result:*
Column order: b, a

*Physical plan:*
{noformat}
[duplicate physical plan elided]
{noformat}

*Note:* Reproduced with both Hive and Native readers. Non-reproducible with 
Parquet reader.


> Wrong column order when selecting complex data using Hive storage plugin.
> -
>
> Key: DRILL-7429
> URL: https://issues.apache.org/jira/browse/DRILL-7429
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
> Attachments: customer_complex.zip
>
>
> *Data:*
> customer_complex.zip attached
> Hive ddl:
> {code:sql}
> create  external table if not exists customer_complex (
> c_custkey int,
> c_name string,
> c_address string,

[jira] [Updated] (DRILL-7749) Drill-on-Yarn Application Master UI is broken due to bootstrap update

2020-06-18 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7749:

Fix Version/s: 1.18.0

> Drill-on-Yarn Application Master UI is broken due to bootstrap update
> -
>
> Key: DRILL-7749
> URL: https://issues.apache.org/jira/browse/DRILL-7749
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.18.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7749) Drill-on-Yarn Application Master UI is broken due to bootstrap update

2020-06-17 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7749:
---

 Summary: Drill-on-Yarn Application Master UI is broken due to 
bootstrap update
 Key: DRILL-7749
 URL: https://issues.apache.org/jira/browse/DRILL-7749
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy
Assignee: Anton Gozhiy








[jira] [Commented] (DRILL-7705) Update jQuery and Bootstrap libraries

2020-04-30 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096365#comment-17096365
 ] 

Anton Gozhiy commented on DRILL-7705:
-

Merged into master with commit id 0a66da598a986971a77900442e20025ebfc56e9d.

> Update jQuery and Bootstrap libraries
> -
>
> Key: DRILL-7705
> URL: https://issues.apache.org/jira/browse/DRILL-7705
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> There are some vulnerabilities present in jQuery and Bootstrap libraries used 
> in Drill:
> * jQuery before 3.4.0, as used in Drupal, Backdrop CMS, and other products, 
> mishandles jQuery.extend(true, {}, ...) because of Object.prototype 
> pollution. If an unsanitized source object contained an enumerable __proto__ 
> property, it could extend the native Object.prototype.
> * In Bootstrap before 4.1.2, XSS is possible in the collapse data-parent 
> attribute.
> * In Bootstrap before 4.1.2, XSS is possible in the data-container property 
> of tooltip.
> * In Bootstrap before 3.4.0, XSS is possible in the affix configuration 
> target property.
> * In Bootstrap before 3.4.1 and 4.3.x before 4.3.1, XSS is possible in the 
> tooltip or popover data-template attribute.
> The following update is suggested to fix them:
> * jQuery: 3.2.1 -> 3.5.0
> * Bootstrap: 3.1.1 -> 4.4.1





[jira] [Created] (DRILL-7705) Update jQuery and Bootstrap libraries

2020-04-17 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7705:
---

 Summary: Update jQuery and Bootstrap libraries
 Key: DRILL-7705
 URL: https://issues.apache.org/jira/browse/DRILL-7705
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 1.17.0
Reporter: Anton Gozhiy
Assignee: Anton Gozhiy
 Fix For: 1.18.0


There are some vulnerabilities present in jQuery and Bootstrap libraries used 
in Drill:
* jQuery before 3.4.0, as used in Drupal, Backdrop CMS, and other products, 
mishandles jQuery.extend(true, {}, ...) because of Object.prototype pollution. 
If an unsanitized source object contained an enumerable __proto__ property, it 
could extend the native Object.prototype.
* In Bootstrap before 4.1.2, XSS is possible in the collapse data-parent 
attribute.
* In Bootstrap before 4.1.2, XSS is possible in the data-container property of 
tooltip.
* In Bootstrap before 3.4.0, XSS is possible in the affix configuration target 
property.
* In Bootstrap before 3.4.1 and 4.3.x before 4.3.1, XSS is possible in the 
tooltip or popover data-template attribute.

The following update is suggested to fix them:
* jQuery: 3.2.1 -> 3.5.0
* Bootstrap: 3.1.1 -> 4.4.1
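The jQuery advisory above is about `jQuery.extend(true, ...)` recursing into untrusted input and mutating `Object.prototype`. As an illustration only (a Python analogue of the same unsafe deep-merge pattern, not Drill or jQuery code), a naive recursive merge can likewise pollute shared state that the target object merely references:

```python
def unsafe_deep_merge(target, source):
    """Naively merge 'source' into 'target', recursing into nested dicts.

    Like jQuery.extend(true, ...) before 3.4.0, this trusts every key in
    'source' and mutates nested sub-objects in place, even shared ones.
    """
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            unsafe_deep_merge(target[key], value)
        else:
            target[key] = value
    return target

# A shared default config; each session references (does not copy) it.
DEFAULTS = {"ui": {"admin": False}}

session = {"ui": DEFAULTS["ui"]}                     # aliases the shared dict
unsafe_deep_merge(session, {"ui": {"admin": True}})  # untrusted input

print(DEFAULTS["ui"]["admin"])  # the shared defaults were polluted: True
```

The fixed jQuery versions guard against special keys and avoid extending prototypes; the general lesson is that a deep merge of untrusted data must copy, or reject, nested structures it does not own.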





[jira] [Created] (DRILL-7700) Queries to sys schema hang if "use dfs;" was executed before

2020-04-14 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7700:
---

 Summary: Queries to sys schema hang if "use dfs;" was executed 
before
 Key: DRILL-7700
 URL: https://issues.apache.org/jira/browse/DRILL-7700
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy


*Steps:*
# Connect to Drill by sqlline
# Run query "use dfs;"
# Run query "select * from sys.drillbits;"

*Expected result:* The query should be executed successfully.

*Actual result:* The query hangs at the planning stage.

*Note:* The issue is reproduced with Drill built with the "mapr" profile.





[jira] [Created] (DRILL-7693) Upgrade Protobuf to 3.11.1

2020-04-09 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7693:
---

 Summary: Upgrade Protobuf to 3.11.1
 Key: DRILL-7693
 URL: https://issues.apache.org/jira/browse/DRILL-7693
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 1.17.0
Reporter: Anton Gozhiy
Assignee: Anton Gozhiy
 Fix For: 1.18.0








[jira] [Closed] (DRILL-7650) Add option to enable Jetty's dump for troubleshooting

2020-03-27 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7650.
---

Verified with Drill version 1.18.0-SNAPSHOT (commit id 
83bc8e63586051962a6ab39731d3c0ae19b334d1), both embedded and distributed modes.

> Add option to enable Jetty's dump for troubleshooting
> -
>
> Key: DRILL-7650
> URL: https://issues.apache.org/jira/browse/DRILL-7650
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Jetty server implements a useful tool for 
> [dumping|https://www.eclipse.org/jetty/documentation/current/jetty-dump-tool.html#dump-tool-via-jmx]
>  the current state of the server, but in Drill, it's not possible to use it 
> without code changes.  This ticket aims to add option 
> *drill.exec.http.jetty.server.dumpAfterStart* in Apache Drill. 
> Another change is to remove the redundant setProtocol(protocol) method call 
> on sslContextFactory instance. 





[jira] [Closed] (DRILL-7647) Drill Web server doesn't work with TLS protocol version 1.1

2020-03-20 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7647.
---

> Drill Web server doesn't work with TLS protocol version 1.1
> ---
>
> Key: DRILL-7647
> URL: https://issues.apache.org/jira/browse/DRILL-7647
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Anton Gozhiy
>Assignee: Igor Guzenko
>Priority: Major
>
> *Prerequisites:*
> # Set the following config options:
> * drill.exec.http.ssl_enabled: true
> * drill.exec.ssl.protocol: "*TLSv1.1*"
> * Also:
> ** drill.exec.ssl.trustStorePath
> ** drill.exec.ssl.trustStorePassword
> ** drill.exec.ssl.keyStorePath
> ** drill.exec.ssl.keyStorePassword
> * Or, if on MapR platform: 
> ** drill.exec.ssl.useMapRSSLConfig: true
> *Steps:*
> # Start Drill
> # Try to open the Web UI
> # Try to connect by an ssl client:
> {noformat}
> openssl s_client -connect node1.cluster.com:8047 -tls1_1
> {noformat}
> *Expected result:*
> It should accept protocol version v1.1
> *Actual results:*
> * Cannot open the Web UI:
> {noformat}
> This site can't provide a secure connection
> node1.cluster.com uses an unsupported protocol.
> ERR_SSL_VERSION_OR_CIPHER_MISMATCH
> {noformat}
> * Openssl client fails to connect using either v1.1 or v1.2 protocols.
> {noformat}
> $ openssl s_client -connect node1.cluster.com:8047 -tls1_1
> CONNECTED(0003)
> 140310139057816:error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert 
> handshake failure:s3_pkt.c:1487:SSL alert number 40
> 140310139057816:error:1409E0E5:SSL routines:ssl3_write_bytes:ssl handshake 
> failure:s3_pkt.c:656:
> ---
> no peer certificate available
> ---
> No client certificate CA names sent
> ---
> SSL handshake has read 7 bytes and written 0 bytes
> ---
> New, (NONE), Cipher is (NONE)
> Secure Renegotiation IS NOT supported
> Compression: NONE
> Expansion: NONE
> No ALPN negotiated
> SSL-Session:
> Protocol  : TLSv1.1
> Cipher: 
> Session-ID: 
> Session-ID-ctx: 
> Master-Key: 
> Key-Arg   : None
> PSK identity: None
> PSK identity hint: None
> SRP username: None
> Start Time: 1584457371
> Timeout   : 7200 (sec)
> Verify return code: 0 (ok)
> ---
> {noformat}





[jira] [Assigned] (DRILL-7647) Drill Web server doesn't work with TLS protocol version 1.1

2020-03-17 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7647:
---

Assignee: Igor Guzenko

> Drill Web server doesn't work with TLS protocol version 1.1
> ---
>
> Key: DRILL-7647
> URL: https://issues.apache.org/jira/browse/DRILL-7647
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Anton Gozhiy
>Assignee: Igor Guzenko
>Priority: Major
>
> *Prerequisites:*
> # Set the following config options:
> * drill.exec.http.ssl_enabled: true
> * drill.exec.ssl.protocol: "*TLSv1.1*"
> * Also:
> ** drill.exec.ssl.trustStorePath
> ** drill.exec.ssl.trustStorePassword
> ** drill.exec.ssl.keyStorePath
> ** drill.exec.ssl.keyStorePassword
> * Or, if on MapR platform: 
> ** drill.exec.ssl.useMapRSSLConfig: true
> *Steps:*
> # Start Drill
> # Try to open the Web UI
> # Try to connect by an ssl client:
> {noformat}
> openssl s_client -connect node1.cluster.com:8047 -tls1_1
> {noformat}
> *Expected result:*
> It should accept protocol version v1.1
> *Actual results:*
> * Cannot open the Web UI:
> {noformat}
> This site can't provide a secure connection
> node1.cluster.com uses an unsupported protocol.
> ERR_SSL_VERSION_OR_CIPHER_MISMATCH
> {noformat}
> * Openssl client fails to connect using either v1.1 or v1.2 protocols.
> {noformat}
> $ openssl s_client -connect node1.cluster.com:8047 -tls1_1
> CONNECTED(0003)
> 140310139057816:error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert 
> handshake failure:s3_pkt.c:1487:SSL alert number 40
> 140310139057816:error:1409E0E5:SSL routines:ssl3_write_bytes:ssl handshake 
> failure:s3_pkt.c:656:
> ---
> no peer certificate available
> ---
> No client certificate CA names sent
> ---
> SSL handshake has read 7 bytes and written 0 bytes
> ---
> New, (NONE), Cipher is (NONE)
> Secure Renegotiation IS NOT supported
> Compression: NONE
> Expansion: NONE
> No ALPN negotiated
> SSL-Session:
> Protocol  : TLSv1.1
> Cipher: 
> Session-ID: 
> Session-ID-ctx: 
> Master-Key: 
> Key-Arg   : None
> PSK identity: None
> PSK identity hint: None
> SRP username: None
> Start Time: 1584457371
> Timeout   : 7200 (sec)
> Verify return code: 0 (ok)
> ---
> {noformat}





[jira] [Created] (DRILL-7647) Drill Web server doesn't work with TLS protocol version 1.1

2020-03-17 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7647:
---

 Summary: Drill Web server doesn't work with TLS protocol version 
1.1
 Key: DRILL-7647
 URL: https://issues.apache.org/jira/browse/DRILL-7647
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Anton Gozhiy


*Prerequisites:*
# Set the following config options:
* drill.exec.http.ssl_enabled: true
* drill.exec.ssl.protocol: "*TLSv1.1*"
* Also:
** drill.exec.ssl.trustStorePath
** drill.exec.ssl.trustStorePassword
** drill.exec.ssl.keyStorePath
** drill.exec.ssl.keyStorePassword
* Or, if on MapR platform: 
** drill.exec.ssl.useMapRSSLConfig: true

*Steps:*
# Start Drill
# Try to open the Web UI
# Try to connect by an ssl client:
{noformat}
openssl s_client -connect node1.cluster.com:8047 -tls1_1
{noformat}

*Expected result:*
It should accept protocol version v1.1

*Actual results:*
* Cannot open the Web UI:
{noformat}
This site can't provide a secure connection
node1.cluster.com uses an unsupported protocol.
ERR_SSL_VERSION_OR_CIPHER_MISMATCH
{noformat}
* Openssl client fails to connect using either v1.1 or v1.2 protocols.
{noformat}
$ openssl s_client -connect node1.cluster.com:8047 -tls1_1
CONNECTED(0003)
140310139057816:error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert 
handshake failure:s3_pkt.c:1487:SSL alert number 40
140310139057816:error:1409E0E5:SSL routines:ssl3_write_bytes:ssl handshake 
failure:s3_pkt.c:656:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 7 bytes and written 0 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
Protocol  : TLSv1.1
Cipher: 
Session-ID: 
Session-ID-ctx: 
Master-Key: 
Key-Arg   : None
PSK identity: None
PSK identity hint: None
SRP username: None
Start Time: 1584457371
Timeout   : 7200 (sec)
Verify return code: 0 (ok)
---
{noformat}
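As a cross-check against the openssl output above, the same TLS 1.1 probe can be sketched with Python's ssl module. This is a hypothetical helper (the host name is a placeholder, as in the report), not part of Drill:

```python
import ssl

def tls11_only_context():
    """Build a client context pinned to TLS 1.1, mirroring 'openssl s_client -tls1_1'."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False     # probe only; skip certificate verification
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = ssl.TLSVersion.TLSv1_1
    ctx.maximum_version = ssl.TLSVersion.TLSv1_1
    return ctx

# Usage against a live Drillbit ('node1.cluster.com' is a placeholder):
#   import socket
#   with socket.create_connection(("node1.cluster.com", 8047)) as sock:
#       with tls11_only_context().wrap_socket(sock) as tls:
#           print(tls.version())   # "TLSv1.1" on success; SSLError on failure
```

A handshake failure here, with the same server accepting a TLSv1.2-pinned context, would reproduce the mismatch described in this ticket from a second client.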






[jira] [Commented] (DRILL-7637) Add an option to retrieve MapR SSL truststore/keystore credentials using MapR Web Security Manager

2020-03-13 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17058779#comment-17058779
 ] 

Anton Gozhiy commented on DRILL-7637:
-

Merged into Apache master with commit 42f9f66b5ff5e4a9dab99f732ccadf9a94c6eec0.

> Add an option to retrieve MapR SSL truststore/keystore credentials using MapR 
> Web Security Manager
> --
>
> Key: DRILL-7637
> URL: https://issues.apache.org/jira/browse/DRILL-7637
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.18.0
>
>
> If Drill is built with mapr profile and "useMapRSSLConfig" option is set to 
> true, then it will use MapR Web Security Manager to retrieve SSL credentials.
>  Example usage:
>  - Add an option to Drill config:
> {noformat}
> drill.exec.ssl.useMapRSSLConfig: true
> {noformat}
>  - Connect by sqlline:
> {noformat}
> ./bin/sqlline -u 
> "jdbc:drill:drillbit=node1.cluster.com:31010;enableTLS=true;useMapRSSLConfig=true"{noformat}





[jira] [Updated] (DRILL-7638) Cannot login to Drill Web Console, if go to it from YARN Application Master Web-UI

2020-03-11 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7638:

Description: 
*Preconditions:*
 # drill-on-yarn.conf:
{noformat}
auth-type: "drill"
{noformat}

 # drill-override.conf:
{noformat}
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:5181"
  impersonation: {
enabled: true,
max_chained_user_hops: 3
},
security: {
auth.mechanisms : ["PLAIN"],
},
security.user.auth: {
enabled: true,
packages += "org.apache.drill.exec.rpc.user.security",
impl: "pam4j",
pam_profiles: [ "sudo", "login" ]
}
}
{noformat}

 # Start drill on yarn
{noformat}
./bin/drill-on-yarn.sh start --site 
{noformat}

 
 *Steps to reproduce:*
 # Open  Drill-on-YARN Web-UI
 # Login
 # Go to "drillbits" page
 # Click on available host of drillbit
 # Log In to Drill Web Console
 # Click on page "Query"

 

*Expected result:* the "Query" page is opened
 *Actual result:* the user is not logged in
 Also unwanted parameters were added to the URL:
 * mainLogin?redirect=%2Fquery
 * ;jsessionid=1b80pye4814gy56gqdz02jtkg

 

  was:
*Preconditions:*
 # drill-on-yarn.conf:
{noformat}
connection: "maprfs:///"
app-dir: "/users/drill"
auth-type: "drill"
{noformat}
 # drill-override.conf:
{noformat}
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:5181"
  impersonation: {
enabled: true,
max_chained_user_hops: 3
},
security: {
auth.mechanisms : ["PLAIN"],
},
security.user.auth: {
enabled: true,
packages += "org.apache.drill.exec.rpc.user.security",
impl: "pam4j",
pam_profiles: [ "sudo", "login" ]
}
}
{noformat}
 # Start drill on yarn
{noformat}
./bin/drill-on-yarn.sh start --site 
{noformat}

 
 *Steps to reproduce:*
 # Open  Drill-on-YARN Web-UI
 # Login
 # Go to "drillbits" page
 # Click on available host of drillbit
 # Log In to Drill Web Console
 # Click on page "Query"

 

*Expected result:* opened page "Query"
 *Actual result:*  user is not logged in
Also unwanted parameters were added to the URL:
 * mainLogin?redirect=%2Fquery
 * ;jsessionid=1b80pye4814gy56gqdz02jtkg

 


> Cannot login to Drill Web Console, if go to it from YARN Application Master 
> Web-UI
> --
>
> Key: DRILL-7638
> URL: https://issues.apache.org/jira/browse/DRILL-7638
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Dmytro Kondriukov
>Priority: Major
>
> *Preconditions:*
>  # drill-on-yarn.conf:
> {noformat}
> auth-type: "drill"
> {noformat}
>  # drill-override.conf:
> {noformat}
> drill.exec: {
>   cluster-id: "drillbits1",
>   zk.connect: "localhost:5181"
>   impersonation: {
> enabled: true,
> max_chained_user_hops: 3
> },
> security: {
> auth.mechanisms : ["PLAIN"],
> },
> security.user.auth: {
> enabled: true,
> packages += "org.apache.drill.exec.rpc.user.security",
> impl: "pam4j",
> pam_profiles: [ "sudo", "login" ]
> }
> }
> {noformat}
>  # Start drill on yarn
> {noformat}
> ./bin/drill-on-yarn.sh start --site 
> {noformat}
>  
>  *Steps to reproduce:*
>  # Open  Drill-on-YARN Web-UI
>  # Login
>  # Go to "drillbits" page
>  # Click on available host of drillbit
>  # Log In to Drill Web Console
>  # Click on page "Query"
>  
> *Expected result:* the "Query" page is opened
>  *Actual result:* the user is not logged in
>  Also unwanted parameters were added to the URL:
>  * mainLogin?redirect=%2Fquery
>  * ;jsessionid=1b80pye4814gy56gqdz02jtkg
>  





[jira] [Updated] (DRILL-7638) Cannot login to Drill Web Console, if go to it from YARN Application Master Web-UI

2020-03-11 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7638:

Description: 
*Preconditions:*
 # drill-on-yarn.conf:
{noformat}
auth-type: "drill"
{noformat}

 # drill-override.conf:
{noformat}
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:5181"
  impersonation: {
enabled: true,
max_chained_user_hops: 3
},
security: {
auth.mechanisms : ["PLAIN"],
},
security.user.auth: {
enabled: true,
packages += "org.apache.drill.exec.rpc.user.security",
impl: "pam4j",
pam_profiles: [ "sudo", "login" ]
}
}
{noformat}

 # Start drill on yarn
{noformat}
./bin/drill-on-yarn.sh start --site 
{noformat}

 
 *Steps to reproduce:*
 # Open  Drill-on-YARN Web-UI
 # Login
 # Go to "drillbits" page
 # Click on available host of drillbit
 # Log In to Drill Web Console
 # Click on page "Query"

 

*Expected result:* the "Query" page is opened
 *Actual result:* the user is not logged in
 Also unwanted parameters were added to the URL:
 * mainLogin?redirect=%2Fquery
 * jsessionid=1b80pye4814gy56gqdz02jtkg

 

  was:
*Preconditions:*
 # drill-on-yarn.conf:
{noformat}
auth-type: "drill"
{noformat}

 # drill-override.conf:
{noformat}
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:5181"
  impersonation: {
enabled: true,
max_chained_user_hops: 3
},
security: {
auth.mechanisms : ["PLAIN"],
},
security.user.auth: {
enabled: true,
packages += "org.apache.drill.exec.rpc.user.security",
impl: "pam4j",
pam_profiles: [ "sudo", "login" ]
}
}
{noformat}

 # Start drill on yarn
{noformat}
./bin/drill-on-yarn.sh start --site 
{noformat}

 
 *Steps to reproduce:*
 # Open  Drill-on-YARN Web-UI
 # Login
 # Go to "drillbits" page
 # Click on available host of drillbit
 # Log In to Drill Web Console
 # Click on page "Query"

 

*Expected result:* opened page "Query"
 *Actual result:*  user is not logged in
 Also unwanted parameters were added to the URL:
 * mainLogin?redirect=%2Fquery
 * ;jsessionid=1b80pye4814gy56gqdz02jtkg

 


> Cannot login to Drill Web Console, if go to it from YARN Application Master 
> Web-UI
> --
>
> Key: DRILL-7638
> URL: https://issues.apache.org/jira/browse/DRILL-7638
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Dmytro Kondriukov
>Priority: Major
>
> *Preconditions:*
>  # drill-on-yarn.conf:
> {noformat}
> auth-type: "drill"
> {noformat}
>  # drill-override.conf:
> {noformat}
> drill.exec: {
>   cluster-id: "drillbits1",
>   zk.connect: "localhost:5181"
>   impersonation: {
> enabled: true,
> max_chained_user_hops: 3
> },
> security: {
> auth.mechanisms : ["PLAIN"],
> },
> security.user.auth: {
> enabled: true,
> packages += "org.apache.drill.exec.rpc.user.security",
> impl: "pam4j",
> pam_profiles: [ "sudo", "login" ]
> }
> }
> {noformat}
>  # Start drill on yarn
> {noformat}
> ./bin/drill-on-yarn.sh start --site 
> {noformat}
>  
>  *Steps to reproduce:*
>  # Open  Drill-on-YARN Web-UI
>  # Login
>  # Go to "drillbits" page
>  # Click on available host of drillbit
>  # Log In to Drill Web Console
>  # Click on page "Query"
>  
> *Expected result:* the "Query" page is opened
>  *Actual result:* the user is not logged in
>  Also unwanted parameters were added to the URL:
>  * mainLogin?redirect=%2Fquery
>  * jsessionid=1b80pye4814gy56gqdz02jtkg
>  





[jira] [Created] (DRILL-7637) Add an option to retrieve MapR SSL truststore/keystore credentials using MapR Web Security Manager

2020-03-11 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7637:
---

 Summary: Add an option to retrieve MapR SSL truststore/keystore 
credentials using MapR Web Security Manager
 Key: DRILL-7637
 URL: https://issues.apache.org/jira/browse/DRILL-7637
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Anton Gozhiy
Assignee: Anton Gozhiy
 Fix For: 1.18.0


If Drill is built with mapr profile and "useMapRSSLConfig" option is set to 
true, then it will use MapR Web Security Manager to retrieve SSL credentials.
 Example usage:
 - Add an option to Drill config:
{noformat}
drill.exec.ssl.useMapRSSLConfig: true
{noformat}

 - Connect by sqlline:
{noformat}
./bin/sqlline -u 
"jdbc:drill:drillbit=node1.cluster.com:31010;enableTLS=true;useMapRSSLConf=true"
{noformat}





[jira] [Updated] (DRILL-7637) Add an option to retrieve MapR SSL truststore/keystore credentials using MapR Web Security Manager

2020-03-11 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7637:

Description: 
If Drill is built with mapr profile and "useMapRSSLConfig" option is set to 
true, then it will use MapR Web Security Manager to retrieve SSL credentials.
 Example usage:
 - Add an option to Drill config:
{noformat}
drill.exec.ssl.useMapRSSLConfig: true
{noformat}

 - Connect by sqlline:
{noformat}
./bin/sqlline -u 
"jdbc:drill:drillbit=node1.cluster.com:31010;enableTLS=true;useMapRSSLConfig=true"{noformat}

  was:
If Drill is built with mapr profile and "useMapRSSLConfig" option is set to 
true, then it will use MapR Web Security Manager to retrieve SSL credentials.
 Example usage:
 - Add an option to Drill config:
{noformat}
drill.exec.ssl.useMapRSSLConfig: true
{noformat}

 - Connect by sqlline:
{noformat}
./bin/sqlline -u 
"jdbc:drill:drillbit=node1.cluster.com:31010;enableTLS=true;useMapRSSLConf=true"
{noformat}


> Add an option to retrieve MapR SSL truststore/keystore credentials using MapR 
> Web Security Manager
> --
>
> Key: DRILL-7637
> URL: https://issues.apache.org/jira/browse/DRILL-7637
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.18.0
>
>
> If Drill is built with mapr profile and "useMapRSSLConfig" option is set to 
> true, then it will use MapR Web Security Manager to retrieve SSL credentials.
>  Example usage:
>  - Add an option to Drill config:
> {noformat}
> drill.exec.ssl.useMapRSSLConfig: true
> {noformat}
>  - Connect by sqlline:
> {noformat}
> ./bin/sqlline -u 
> "jdbc:drill:drillbit=node1.cluster.com:31010;enableTLS=true;useMapRSSLConfig=true"{noformat}





[jira] [Commented] (DRILL-7624) When Hive plugin is enabled with default config, cannot execute any SQL query

2020-03-04 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051485#comment-17051485
 ] 

Anton Gozhiy commented on DRILL-7624:
-

This is a regression after DRILL-7590.

> When Hive plugin is enabled with default config, cannot execute any SQL query
> -
>
> Key: DRILL-7624
> URL: https://issues.apache.org/jira/browse/DRILL-7624
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Dmytro Kondriukov
>Assignee: Paul Rogers
>Priority: Major
>
> *Preconditions:*
> Enable "hive" plugin, without editing configuration (default config)
> *Steps:*
>  Run any valid query:
> {code:sql}
> SELECT 100; 
> {code}
>  *Expected result:* The query should be successfully executed.
> *Actual result:*  "UserRemoteException : INTERNAL_ERROR ERROR: Failure 
> setting up Hive metastore client." 
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Failure setting up Hive metastore client. 
> Plugin name hive 
> Plugin class org.apache.drill.exec.store.hive.HiveStoragePlugin 
> Please, refer to logs for more information. 
> [Error Id: db44f5c3-5136-4fc6-8158-50b63d775fe0 ]
> {noformat}
>  
> {noformat}
>   (org.apache.drill.common.exceptions.ExecutionSetupException) Failure 
> setting up Hive metastore client.
> org.apache.drill.exec.store.hive.schema.HiveSchemaFactory.<init>():78
> org.apache.drill.exec.store.hive.HiveStoragePlugin.<init>():77
> sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> java.lang.reflect.Constructor.newInstance():423
> org.apache.drill.exec.store.ClassicConnectorLocator.create():274
> org.apache.drill.exec.store.ConnectorHandle.newInstance():98
> org.apache.drill.exec.store.PluginHandle.plugin():143
> 
> org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():616
> 
> org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():601
> org.apache.drill.exec.planner.sql.handlers.SqlHandlerConfig.getRules():48
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():367
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():351
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():338
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel():663
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():198
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():169
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():283
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():163
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():140
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():93
> org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> org.apache.drill.exec.work.foreman.Foreman.run():275
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (org.apache.hadoop.hive.metastore.api.MetaException) Unable to 
> open a test connection to the given database. JDBC url = 
> jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true, username = 
> APP. Terminating connection pool (set lazyInit to true if you expect to start 
> your database after your app). Original Exception: --
> java.sql.SQLException: Failed to create database 
> '../sample-data/drill_hive_db', see the next exception for details.
>   at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
>   at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(Unknown Source)
>   at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
>   at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
>   at org.apache.derby.jdbc.Driver20.connect(Unknown Source)
>   at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:208)
>   at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
>   at 

[jira] [Assigned] (DRILL-7624) When Hive plugin is enabled with default config, cannot execute any SQL query

2020-03-04 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7624:
---

Assignee: Paul Rogers

> When Hive plugin is enabled with default config, cannot execute any SQL query
> -
>
> Key: DRILL-7624
> URL: https://issues.apache.org/jira/browse/DRILL-7624
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Dmytro Kondriukov
>Assignee: Paul Rogers
>Priority: Major
>
> *Preconditions:*
> Enable "hive" plugin, without editing configuration (default config)
> *Steps:*
>  Run any valid query:
> {code:sql}
> SELECT 100; 
> {code}
>  *Expected result:* The query should be successfully executed.
> *Actual result:*  "UserRemoteException : INTERNAL_ERROR ERROR: Failure 
> setting up Hive metastore client." 
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Failure setting up Hive metastore client. 
> Plugin name hive 
> Plugin class org.apache.drill.exec.store.hive.HiveStoragePlugin 
> Please, refer to logs for more information. 
> [Error Id: db44f5c3-5136-4fc6-8158-50b63d775fe0 ]
> {noformat}
>  
> {noformat}
>   (org.apache.drill.common.exceptions.ExecutionSetupException) Failure 
> setting up Hive metastore client.
> org.apache.drill.exec.store.hive.schema.HiveSchemaFactory.<init>():78
> org.apache.drill.exec.store.hive.HiveStoragePlugin.<init>():77
> sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> java.lang.reflect.Constructor.newInstance():423
> org.apache.drill.exec.store.ClassicConnectorLocator.create():274
> org.apache.drill.exec.store.ConnectorHandle.newInstance():98
> org.apache.drill.exec.store.PluginHandle.plugin():143
> 
> org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():616
> 
> org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():601
> org.apache.drill.exec.planner.sql.handlers.SqlHandlerConfig.getRules():48
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():367
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():351
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():338
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel():663
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():198
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():169
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():283
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():163
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():140
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():93
> org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> org.apache.drill.exec.work.foreman.Foreman.run():275
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (org.apache.hadoop.hive.metastore.api.MetaException) Unable to 
> open a test connection to the given database. JDBC url = 
> jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true, username = 
> APP. Terminating connection pool (set lazyInit to true if you expect to start 
> your database after your app). Original Exception: --
> java.sql.SQLException: Failed to create database 
> '../sample-data/drill_hive_db', see the next exception for details.
>   at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
>   at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(Unknown Source)
>   at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
>   at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
>   at org.apache.derby.jdbc.Driver20.connect(Unknown Source)
>   at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:208)
>   at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
>   at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
>   at 
> 

[jira] [Updated] (DRILL-7624) When Hive plugin is enabled with default config, cannot execute any SQL query

2020-03-04 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7624:

Summary: When Hive plugin is enabled with default config, cannot execute 
any SQL query  (was: when enabled Hive plugin with default config, can not 
execute any SQL query)

> When Hive plugin is enabled with default config, cannot execute any SQL query
> -
>
> Key: DRILL-7624
> URL: https://issues.apache.org/jira/browse/DRILL-7624
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Dmytro Kondriukov
>Priority: Major
>
> *Preconditions:*
> Enable "hive" plugin, without editing configuration (default config)
> *Steps:*
>  Run any valid query.
> {code:sql}
> SELECT 100; 
> {code}
>  *Expected result:* The SQL query executes successfully. 
> *Actual result:*  "UserRemoteException : INTERNAL_ERROR ERROR: Failure 
> setting up Hive metastore client." 
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: 
> Failure setting up Hive metastore client. 
> Plugin name hive 
> Plugin class org.apache.drill.exec.store.hive.HiveStoragePlugin 
> Please, refer to logs for more information. 
> [Error Id: db44f5c3-5136-4fc6-8158-50b63d775fe0 ]
> {noformat}
>  
> {noformat}
>   (org.apache.drill.common.exceptions.ExecutionSetupException) Failure 
> setting up Hive metastore client.
> org.apache.drill.exec.store.hive.schema.HiveSchemaFactory.<init>():78
> org.apache.drill.exec.store.hive.HiveStoragePlugin.<init>():77
> sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> java.lang.reflect.Constructor.newInstance():423
> org.apache.drill.exec.store.ClassicConnectorLocator.create():274
> org.apache.drill.exec.store.ConnectorHandle.newInstance():98
> org.apache.drill.exec.store.PluginHandle.plugin():143
> 
> org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():616
> 
> org.apache.drill.exec.store.StoragePluginRegistryImpl$PluginIterator.next():601
> org.apache.drill.exec.planner.sql.handlers.SqlHandlerConfig.getRules():48
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():367
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():351
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():338
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel():663
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():198
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():169
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():283
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():163
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():140
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():93
> org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> org.apache.drill.exec.work.foreman.Foreman.run():275
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (org.apache.hadoop.hive.metastore.api.MetaException) Unable to 
> open a test connection to the given database. JDBC url = 
> jdbc:derby:;databaseName=../sample-data/drill_hive_db;create=true, username = 
> APP. Terminating connection pool (set lazyInit to true if you expect to start 
> your database after your app). Original Exception: --
> java.sql.SQLException: Failed to create database 
> '../sample-data/drill_hive_db', see the next exception for details.
>   at 
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.Util.newEmbedSQLException(Unknown Source)
>   at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown 
> Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
>   at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(Unknown Source)
>   at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
>   at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
>   at org.apache.derby.jdbc.Driver20.connect(Unknown Source)
>   at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:208)
>   at 
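The "Failed to create database" error in the quoted trace appears to stem from the relative databaseName (`../sample-data/drill_hive_db`) in the default Hive plugin's Derby JDBC URL: Derby resolves a relative database name against the working directory of the JVM, not the Drill install root. A minimal sketch of that resolution (the helper and the example directories are illustrative, not Drill code):

```python
import posixpath

def resolve_derby_path(cwd, database_name="../sample-data/drill_hive_db"):
    # Derby interprets a relative databaseName against derby.system.home,
    # which defaults to the JVM's working directory.
    return posixpath.normpath(posixpath.join(cwd, database_name))

# Launched from the Drill bin directory, the path lands inside the install tree:
print(resolve_derby_path("/opt/drill/bin"))  # -> /opt/drill/sample-data/drill_hive_db
# Launched from anywhere else, Derby tries to create the database in an
# unrelated (often non-writable) location, surfacing as the SQLException above:
print(resolve_derby_path("/home/user"))      # -> /home/sample-data/drill_hive_db
```

Using an absolute path (or a metastore reachable over the network) in the plugin's ConnectionURL avoids the dependence on where the Drill process happens to start.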

[jira] [Updated] (DRILL-7623) Link error is displayed at the log content page on Web UI

2020-03-04 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7623:

Reviewer: Anton Gozhiy

> Link error is displayed at the log content page on Web UI
> -
>
> Key: DRILL-7623
> URL: https://issues.apache.org/jira/browse/DRILL-7623
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> *Steps:*
> # Open a log file from the Web UI:
> {noformat}
> /log/sqlline.log/content
> {noformat}
> *Expected result:*
> There should be no errors.
> *Actual result:*
> {noformat}
> GET http://localhost:8047/log/static/js/jquery-3.2.1.min.js net::ERR_ABORTED 
> 500 (Internal Server Error)
> bootstrap.min.js:6 Uncaught Error: Bootstrap's JavaScript requires jQuery
> at bootstrap.min.js:6
> (anonymous)   @   bootstrap.min.js:6
> {noformat}
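One plausible reading of the failing URL (an assumption for illustration; the actual template change is in the fix): the log content page at `/log/sqlline.log/content` references its scripts with a path that resolves relative to `/log/` instead of the server root, so the jQuery request hits a non-static route and returns 500. Sketched with `urljoin`:

```python
from urllib.parse import urljoin

page = "http://localhost:8047/log/sqlline.log/content"

# A root-relative script path always resolves to the server's static route:
good = urljoin(page, "/static/js/jquery-3.2.1.min.js")
print(good)  # -> http://localhost:8047/static/js/jquery-3.2.1.min.js

# A relative path resolves under the page's directory instead; against an
# illustrative page URL under /log/, it produces the failing /log/static/... URL:
bad = urljoin("http://localhost:8047/log/page", "static/js/jquery-3.2.1.min.js")
print(bad)   # -> http://localhost:8047/log/static/js/jquery-3.2.1.min.js
```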



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7623) Link error is displayed at the log content page on Web UI

2020-03-04 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7623:

Labels: ready-to-commit  (was: )

> Link error is displayed at the log content page on Web UI
> -
>
> Key: DRILL-7623
> URL: https://issues.apache.org/jira/browse/DRILL-7623
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
>
> *Steps:*
> # Open a log file from the Web UI:
> {noformat}
> /log/sqlline.log/content
> {noformat}
> *Expected result:*
> There should be no errors.
> *Actual result:*
> {noformat}
> GET http://localhost:8047/log/static/js/jquery-3.2.1.min.js net::ERR_ABORTED 
> 500 (Internal Server Error)
> bootstrap.min.js:6 Uncaught Error: Bootstrap's JavaScript requires jQuery
> at bootstrap.min.js:6
> (anonymous)   @   bootstrap.min.js:6
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (DRILL-7623) Link error is displayed at the log content page on Web UI

2020-03-04 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7623:
---

Assignee: Vova Vysotskyi

> Link error is displayed at the log content page on Web UI
> -
>
> Key: DRILL-7623
> URL: https://issues.apache.org/jira/browse/DRILL-7623
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Vova Vysotskyi
>Priority: Major
>
> *Steps:*
> # Open a log file from the Web UI:
> {noformat}
> /log/sqlline.log/content
> {noformat}
> *Expected result:*
> There should be no errors.
> *Actual result:*
> {noformat}
> GET http://localhost:8047/log/static/js/jquery-3.2.1.min.js net::ERR_ABORTED 
> 500 (Internal Server Error)
> bootstrap.min.js:6 Uncaught Error: Bootstrap's JavaScript requires jQuery
> at bootstrap.min.js:6
> (anonymous)   @   bootstrap.min.js:6
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7623) Link error is displayed at the log content page on Web UI

2020-03-04 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7623:
---

 Summary: Link error is displayed at the log content page on Web UI
 Key: DRILL-7623
 URL: https://issues.apache.org/jira/browse/DRILL-7623
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy


*Steps:*
# Open a log file from the Web UI:
{noformat}
/log/sqlline.log/content
{noformat}

*Expected result:*
There should be no errors.

*Actual result:*
{noformat}
GET http://localhost:8047/log/static/js/jquery-3.2.1.min.js net::ERR_ABORTED 
500 (Internal Server Error)
bootstrap.min.js:6 Uncaught Error: Bootstrap's JavaScript requires jQuery
at bootstrap.min.js:6
(anonymous) @   bootstrap.min.js:6
{noformat}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7620) Storage plugin update page shows that a plugin is disabled though it is actually enabled.

2020-03-03 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050299#comment-17050299
 ] 

Anton Gozhiy commented on DRILL-7620:
-

[~arina], Actually, it was introduced after DRILL-7617, or, more precisely, it 
wasn't fully fixed.

> Storage plugin update page shows that a plugin is disabled though it is 
> actually enabled.
> -
>
> Key: DRILL-7620
> URL: https://issues.apache.org/jira/browse/DRILL-7620
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Anton Gozhiy
>Assignee: Paul Rogers
>Priority: Major
>
> *Steps to reproduce:*
> # On Web UI, open storage page
> # Disable some plugin (e.g. "cp")
> # Enable this plugin (It is displayed in "enabled" section now)
> # Update the plugin, look at the "enabled" property
> *Expected result:*
> "enabled": true
> *Actual result:*
> "enabled": false
> *Note:* Though it is displayed as disabled in the config, queries to it are 
> working.
> *Workaround:* Enable it again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7620) Storage plugin update page shows that a plugin is disabled though it is actually enabled.

2020-03-03 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050228#comment-17050228
 ] 

Anton Gozhiy commented on DRILL-7620:
-

[~arina], I'll check this

> Storage plugin update page shows that a plugin is disabled though it is 
> actually enabled.
> -
>
> Key: DRILL-7620
> URL: https://issues.apache.org/jira/browse/DRILL-7620
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Anton Gozhiy
>Assignee: Paul Rogers
>Priority: Major
>
> *Steps to reproduce:*
> # On Web UI, open storage page
> # Disable some plugin (e.g. "cp")
> # Enable this plugin (It is displayed in "enabled" section now)
> # Update the plugin, look at the "enabled" property
> *Expected result:*
> "enabled": true
> *Actual result:*
> "enabled": false
> *Note:* Though it is displayed as disabled in the config, queries to it are 
> working.
> *Workaround:* Enable it again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7620) Storage plugin update page shows that a plugin is disabled though it is actually enabled.

2020-03-03 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7620:
---

 Summary: Storage plugin update page shows that a plugin is 
disabled though it is actually enabled.
 Key: DRILL-7620
 URL: https://issues.apache.org/jira/browse/DRILL-7620
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Anton Gozhiy
Assignee: Paul Rogers


*Steps to reproduce:*
# On Web UI, open storage page
# Disable some plugin (e.g. "cp")
# Enable this plugin (It is displayed in "enabled" section now)
# Update the plugin, look at the "enabled" property

*Expected result:*
"enabled": true

*Actual result:*
"enabled": false

*Note:* Though it is displayed as disabled in the config, queries to it are 
working.

*Workaround:* Enable it again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7619) Metrics is not displayed due to incorrect endpoint link on the Drill index page

2020-03-03 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7619:
---

 Summary: Metrics is not displayed due to incorrect endpoint link 
on the Drill index page
 Key: DRILL-7619
 URL: https://issues.apache.org/jira/browse/DRILL-7619
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Anton Gozhiy
Assignee: Anton Gozhiy
 Fix For: 1.18.0


Should be /status/metrics/\{hostname} instead of /status/\{hostname}/metrics
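The correction in the description can be sketched as a simple template swap (the helper name is hypothetical, not Drill code):

```python
def metrics_url(hostname, fixed=True):
    # Broken link built the hostname segment before the "metrics" segment;
    # the fix puts "metrics" first, matching the actual REST endpoint layout.
    return (f"/status/metrics/{hostname}" if fixed
            else f"/status/{hostname}/metrics")

print(metrics_url("node1.example.com"))              # -> /status/metrics/node1.example.com
print(metrics_url("node1.example.com", fixed=False)) # -> /status/node1.example.com/metrics
```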



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7592) Add missing licenses and update plugins exclusion list

2020-03-02 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7592:

Labels: ready-to-commit  (was: )

> Add missing licenses and update plugins exclusion list
> --
>
> Key: DRILL-7592
> URL: https://issues.apache.org/jira/browse/DRILL-7592
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> The project contains a lot of files without licenses, like this one: 
> [https://github.com/apache/drill/blob/f2654eed3b6d444280b5c0c2a507edc355c676e6/contrib/format-excel/src/main/resources/drill-module.conf].
> Besides {{*.conf}} files, there are other file types that may and should 
> contain a license.
> Besides adding licenses, some binary files should be excluded from the 
> check to avoid warnings like this one:
> {noformat}
> [INFO] --- license-maven-plugin:3.0:check (default) @ drill-root ---
> [INFO] Checking licenses...
> [WARNING] Unknown file extension: 
> /home/runner/work/drill/drill/contrib/format-esri/src/test/resources/shapefiles/CA-cities.shp
> [WARNING] Unknown file extension: 
> /home/runner/work/drill/drill/contrib/format-esri/src/test/resources/shapefiles/CA-cities.prj
> [WARNING] Unknown file extension: 
> /home/runner/work/drill/drill/contrib/format-esri/src/test/resources/shapefiles/CA-cities.dbf
> [WARNING] Unknown file extension: 
> /home/runner/work/drill/drill/contrib/storage-hive/core/src/test/resources/complex_types/map/map_union_tbl.avro
> [WARNING] Unknown file extension: 
> /home/runner/work/drill/drill/exec/java-exec/src/test/resources/avro/map_string_to_long.avro
> [WARNING] Unknown file extension: 
> /home/runner/work/drill/drill/exec/java-exec/src/test/resources/regex/firewall.ssdlog
> [WARNING] Unable to find a comment style definition for some files. You may 
> want to add a custom mapping for the relevant file extensions.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7605) Web UI should remember last Query text & settings

2020-03-02 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7605:

Labels: ready-to-commit  (was: )

> Web UI should remember last Query text & settings
> -
>
> Key: DRILL-7605
> URL: https://issues.apache.org/jira/browse/DRILL-7605
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.17.0
>Reporter: Dobes Vandermeer
>Assignee: Dobes Vandermeer
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> It would be a great convenience if the web UI would remember your previous 
> options when you run a query and return back to the query page.  This way if 
> you just want to slightly adjust your last query you don't have to re-type 
> it, or find it in the profiles list and click to edit it.  Also it should 
> remember your preference for the row limit and userName.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (DRILL-7586) drill-hive-exec-shaded contains commons-lang3 version 3.1

2020-02-21 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041964#comment-17041964
 ] 

Anton Gozhiy edited comment on DRILL-7586 at 2/21/20 3:48 PM:
--

Merged into Apache master with commit id 
[e8f9b7e.|https://github.com/apache/drill/commit/e8f9b7ee2940984c59783e36e4fbca3fa25d6ad7]


was (Author: angozhiy):
Merged into Apache master with commit id 
[e8f9b7e|https://github.com/apache/drill/commit/e8f9b7ee2940984c59783e36e4fbca3fa25d6ad7]

> drill-hive-exec-shaded contains commons-lang3 version 3.1
> -
>
> Key: DRILL-7586
> URL: https://issues.apache.org/jira/browse/DRILL-7586
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Oleg Zinoviev
>Assignee: Oleg Zinoviev
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> org.apache.hive:hive-exec contains shaded commons-lang3 library version 3.1.
> During the Drill build, these classes are added to the drill-hive-exec-shaded 
> library. This causes an incorrect version of StringUtils to be loaded at runtime (since the 
> jars directory is found in classpath earlier than jars/3rdparty)
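The shadowing described above is plain first-match-wins classpath lookup; a toy model (jar names and versions are illustrative, not the actual Drill layout):

```python
def first_provider(classpath, contents, class_name):
    # JVM-style resolution: the first classpath entry that provides the class
    # wins, so an entry under jars/ shadows the same class under jars/3rdparty/.
    for entry in classpath:
        if class_name in contents.get(entry, ()):
            return entry
    return None

contents = {
    "jars/drill-hive-exec-shaded.jar": {"org.apache.commons.lang3.StringUtils"},   # shaded 3.1 copy
    "jars/3rdparty/commons-lang3-3.9.jar": {"org.apache.commons.lang3.StringUtils"},
}
cp = ["jars/drill-hive-exec-shaded.jar", "jars/3rdparty/commons-lang3-3.9.jar"]
winner = first_provider(cp, contents, "org.apache.commons.lang3.StringUtils")
print(winner)  # -> jars/drill-hive-exec-shaded.jar  (the old 3.1 copy wins)
```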



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (DRILL-7582) Drill docker Web UI doesn't show resources usage information if map the container to a non-default port

2020-02-14 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7582:
---

Assignee: Anton Gozhiy

> Drill docker Web UI doesn't show resources usage information if map the 
> container to a non-default port
> ---
>
> Key: DRILL-7582
> URL: https://issues.apache.org/jira/browse/DRILL-7582
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
>
> *Steps:*
> # Run Drill docker container with non-default port published:
> {noformat}
> $ docker container run -it --rm -p 9047:8047 apache/drill
> {noformat}
> # Open Drill Web UI (localhost:9047)
> *Expected result:*
> The following fields should contain relevant information:
> * Heap Memory Usage
> * Direct Memory Usage
> * CPU Usage
> * Avg Sys Load
> * Uptime
> *Actual result:*
> "Not Available" is displayed.
> *Note:* if the default port is published (-p 8047:8047), everything is shown 
> correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7582) Drill docker Web UI doesn't show resources usage information if map the container to a non-default port

2020-02-13 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7582:
---

 Summary: Drill docker Web UI doesn't show resources usage 
information if map the container to a non-default port
 Key: DRILL-7582
 URL: https://issues.apache.org/jira/browse/DRILL-7582
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy


*Steps:*
# Run Drill docker container with non-default port published:
{noformat}
$ docker container run -it --rm -p 9047:8047 apache/drill
{noformat}
# Open Drill Web UI (localhost:9047)

*Expected result:*
The following fields should contain relevant information:
* Heap Memory Usage
* Direct Memory Usage
* CPU Usage
* Avg Sys Load
* Uptime

*Actual result:*
"Not Available" is displayed.

*Note:* if the default port is published (-p 8047:8047), everything is shown 
correctly.
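One possible explanation, offered purely as an assumption (the report does not state the mechanism): the index page requests metrics against the drillbit's advertised port (8047) rather than the port the browser actually used, so with `-p 9047:8047` the request targets a host port Docker never published. A sketch of that mismatch (all names are hypothetical):

```python
def metrics_request_url(browser_host, advertised_port=8047):
    # Hypothetical: the page uses the advertised drillbit port instead of
    # reusing the port from the browser's address bar (9047 in the report).
    return f"http://{browser_host}:{advertised_port}/status/metrics"

published = {9047}  # host ports mapped by `docker run -p 9047:8047`
url = metrics_request_url("localhost")
port = int(url.rsplit(":", 1)[1].split("/")[0])
print(url, "reachable:", port in published)  # -> ... reachable: False
```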



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7529) Building depends on poorly configured uncommon maven repositories

2020-02-12 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7529:

Priority: Trivial  (was: Blocker)

> Building depends on poorly configured uncommon maven repositories
> -
>
> Key: DRILL-7529
> URL: https://issues.apache.org/jira/browse/DRILL-7529
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.17.0
>Reporter: Niels Basjes
>Priority: Trivial
>
> *Summary: Apache Drill depends on modified copies of open software that is 
> hosted on non-standard (company owned) maven repositories. In addition due to 
> poor security configuration maven simply refuses to download the artifacts 
> from one of those.* 
> That is why I tagged this a blocker.
>  
> I have an open source project where I include a Drill UDF so other can use 
> this in Drill. ( [https://yauaa.basjes.nl/UDF-ApacheDrill.html] ).
> Today I tried to update the drill dependency from 1.16 to 1.17
> Resulting in 
> {{[ERROR] Failed to execute goal on project yauaa-drill: Could not resolve 
> dependencies for project 
> nl.basjes.parse.useragent:yauaa-drill:jar:5.15-SNAPSHOT: The following 
> artifacts could not be resolved: 
> com.github.vvysotskyi.drill-calcite:calcite-core:jar:1.20.0-drill-r2, 
> org.kohsuke:libpam4j:jar:1.8-rev2: Failure to find 
> com.github.vvysotskyi.drill-calcite:calcite-core:jar:1.20.0-drill-r2 in 
> https://oss.sonatype.org/content/repositories/snapshots/ was cached in the 
> local repository, resolution will not be reattempted until the update 
> interval of Sonatype snapshots has elapsed or updates are forced -> [Help 1]}}
> Turns out that 
> {{com.github.vvysotskyi.drill-calcite:calcite-core:jar:1.20.0-drill-r2}} is 
> (most likely) based here [https://github.com/vvysotskyi/drill-calcite/]
> Apparently this is a patched version of Calcite that is hosted under a 
> personal account but IS an important dependency of a released version of 
> Drill. (Side question: Why not simply improve calcite with these changes?)
> It took some digging and I found this one on this non standard maven 
> repository operated by a commercial company:
> [https://repository.mulesoft.org/nexus/content/repositories/public/]
>  
> The second dependency it failed over was even worse. 
>  {{org.kohsuke:libpam4j:jar:1.8-rev2}}
> This project IS present in maven central but NOT this specific version.
> [https://search.maven.org/artifact/org.kohsuke/libpam4j]
> The only place I have found this is here
> [https://repository.mapr.com/nexus/content/groups/mapr-public/]
> I have not yet found the source code repository (GitHub) for this modified version.
> So effectively I was forced to include two "company" repos in my project to 
> get it to build ... so you would think.
>  
> With these two in my pom.xml I got a new error which was much more 
> disturbing...
> {{[ERROR] Failed to execute goal on project yauaa-drill: Could not resolve 
> dependencies for project 
> nl.basjes.parse.useragent:yauaa-drill:jar:5.15-SNAPSHOT: Failed to collect 
> dependencies at org.apache.drill.exec:drill-java-exec:jar:1.17.0 -> 
> org.kohsuke:libpam4j:jar:1.8-rev2: Failed to read artifact descriptor for 
> org.kohsuke:libpam4j:jar:1.8-rev2: Could not transfer artifact 
> org.kohsuke:libpam4j:pom:1.8-rev2 from/to MapR 
> (https://repository.mapr.com/nexus/content/groups/mapr-public/): 
> sun.security.validator.ValidatorException: PKIX path building failed: 
> sun.security.provider.certpath.SunCertPathBuilderException: unable to find 
> valid certification path to requested target -> [Help 1]}}
> This error essentially means: Maven does not trust the certificate path of 
> the provided HTTPS so it refused to download the artifact.
> Why? This MapR site uses a wildcard certificate issued by GoDaddy.
> Apparently this is a "well known" problem with these GoDaddy certificates: 
> [https://tozny.com/blog/godaddys-ssl-certs-dont-work-in-java-the-right-solution/]
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (DRILL-7149) Kerberos Code Missing from Drill on YARN

2020-01-15 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7149:
---

Assignee: Anton Gozhiy  (was: Anton Gozhiy)

> Kerberos Code Missing from Drill on YARN
> 
>
> Key: DRILL-7149
> URL: https://issues.apache.org/jira/browse/DRILL-7149
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: 1.14.0
>Reporter: Charles Givre
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: kerberos, security
>
> My company is trying to deploy Drill using the Drill on Yarn (DoY) and we 
> have run into the issue that DoY does not seem to support passing Kerberos 
> credentials in order to interact with HDFS. 
> Upon checking the source code available in GIT 
> (https://github.com/apache/drill/blob/1.14.0/drill-yarn/src/main/java/org/apache/drill/yarn/core/)
>  and referring to Apache YARN documentation 
> (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html)
>  , we saw no section for passing the security credentials needed by the 
> application to interact with any Hadoop cluster services and applications. 
> This we feel needs to be added to the source code so that delegation tokens 
> can be passed inside the container for the process to be able to access the Drill 
> archive on HDFS and start. It probably should be added to the 
> ContainerLaunchContext within the ApplicationSubmissionContext for DoY as 
> suggested under Apache documentation.
>  
> We tried the same DoY utility on a non-kerberised cluster and the process 
> started well. Although we ran into a different issue there of hosts getting 
> blacklisted.
> We tested with the Single Principal per cluster option.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7149) Kerberos Code Missing from Drill on YARN

2020-01-15 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016003#comment-17016003
 ] 

Anton Gozhiy commented on DRILL-7149:
-

I was able to successfully start Drill-on-YARN with Kerberos security (Drill 
version: 1.18.0-SNAPSHOT, commit 755529f3ac7ca77797f68b60e1d0713ad126e227).
[~cgivre] , if you still have this issue, could you please provide some 
details, such as:
 * Your configuration (hadoop version, config files etc.)
 * Steps to reproduce
 * Expected result
 * Actual result

> Kerberos Code Missing from Drill on YARN
> 
>
> Key: DRILL-7149
> URL: https://issues.apache.org/jira/browse/DRILL-7149
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: 1.14.0
>Reporter: Charles Givre
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: kerberos, security
>
> My company is trying to deploy Drill using the Drill on Yarn (DoY) and we 
> have run into the issue that DoY does not seem to support passing Kerberos 
> credentials in order to interact with HDFS. 
> Upon checking the source code available in GIT 
> (https://github.com/apache/drill/blob/1.14.0/drill-yarn/src/main/java/org/apache/drill/yarn/core/)
>  and referring to Apache YARN documentation 
> (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html)
>  , we saw no section for passing the security credentials needed by the 
> application to interact with any Hadoop cluster services and applications. 
> This we feel needs to be added to the source code so that delegation tokens 
> can be passed inside the container for the process to be able to access the Drill 
> archive on HDFS and start. It probably should be added to the 
> ContainerLaunchContext within the ApplicationSubmissionContext for DoY as 
> suggested under Apache documentation.
>  
> We tried the same DoY utility on a non-kerberised cluster and the process 
> started well. Although we ran into a different issue there of hosts getting 
> blacklisted.
> We tested with the Single Principal per cluster option.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7520) Cannot connect to Drill with PLAIN authentication enabled using JDBC client (mapr profile)

2020-01-13 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014257#comment-17014257
 ] 

Anton Gozhiy commented on DRILL-7520:
-

I made a fix by adding back some missing libraries. Here is the branch: 
https://github.com/agozhiy/drill/tree/DRILL-7520. One major disadvantage of 
this fix is that the jdbc-all jar now contains the MapR native client libs and 
is bloated to 96919225 bytes. 
The better solution would be to divide jdbc-all into separate modules. There 
is already a task for this: DRILL-6800. 
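The underlying symptom in the quoted trace is a missing class (org/apache/hadoop/conf/Configuration) in the shaded jdbc-all jar. A hedged sketch of how one could check a candidate jar for the class (the in-memory "jar" below is a stand-in, not the real artifact):

```python
import io
import zipfile

def jar_contains(jar_bytes, class_name):
    # A jar is a zip; a class is present iff its path-form entry exists.
    # A missing entry is what surfaces at runtime as NoClassDefFoundError,
    # reported here as HANDSHAKE_VALIDATION during the Drill handshake.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return entry in jar.namelist()

# Build a tiny in-memory "jar" that lacks the Hadoop class, to demonstrate:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("org/apache/drill/jdbc/Driver.class", b"")

print(jar_contains(buf.getvalue(), "org.apache.hadoop.conf.Configuration"))  # -> False
```

Running the same check against the real jdbc-all jar built with the mapr profile would confirm whether the fix restores the missing Hadoop classes.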

> Cannot connect to Drill with PLAIN authentication enabled using JDBC client 
> (mapr profile)
> --
>
> Key: DRILL-7520
> URL: https://issues.apache.org/jira/browse/DRILL-7520
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.18.0
>
>
> *Prerequisites:*
> # Drill with the JDBC driver is built with "mapr" profile
> # Security is enabled and PLAIN authentication is configured
> *Steps:*
> # Use some external JDBC client to connect (e.g. DBeaver)
> # Connection string: "jdbc:drill:drillbit=node1:31010"
> # Set appropriate user/password
> # Test Connection
> *Expected result:*
> Connection successful.
> *Actual result:*
> Exception happens:
> {noformat}
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
> details.
> Exception in thread "main" java.sql.SQLNonTransientConnectionException: 
> Failure in connecting to Drill: oadd.org.apache.drill.exec.rpc.RpcException: 
> HANDSHAKE_VALIDATION : org/apache/hadoop/conf/Configuration
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:178)
>   at 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>   at 
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>   at 
> oadd.org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>   at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:247)
>   at TheBestClientEver.main(TheBestClientEver.java:28)
> Caused by: oadd.org.apache.drill.exec.rpc.RpcException: HANDSHAKE_VALIDATION 
> : org/apache/hadoop/conf/Configuration
>   at 
> oadd.org.apache.drill.exec.rpc.user.UserClient$2.connectionFailed(UserClient.java:315)
>   at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler.connectionFailed(QueryResultHandler.java:396)
>   at 
> oadd.org.apache.drill.exec.rpc.ConnectionMultiListener$HandshakeSendHandler.success(ConnectionMultiListener.java:170)
>   at 
> oadd.org.apache.drill.exec.rpc.ConnectionMultiListener$HandshakeSendHandler.success(ConnectionMultiListener.java:143)
>   at 
> oadd.org.apache.drill.exec.rpc.RequestIdMap$RpcListener.set(RequestIdMap.java:134)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClient$ClientHandshakeHandler.consumeHandshake(BasicClient.java:318)
>   at 
> oadd.org.apache.drill.exec.rpc.AbstractHandshakeHandler.decode(AbstractHandshakeHandler.java:57)
>   at 
> oadd.org.apache.drill.exec.rpc.AbstractHandshakeHandler.decode(AbstractHandshakeHandler.java:29)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>   at 
> oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
>   at 
> oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
>   at 
> 

[jira] [Updated] (DRILL-7520) Cannot connect to Drill with PLAIN authentication enabled using JDBC client (mapr profile)

2020-01-08 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7520:

Summary: Cannot connect to Drill with PLAIN authentication enabled using 
JDBC client (mapr profile)  (was: Cannot connect to Drill with PLAIN 
authentication enabled using JDBC client)

> Cannot connect to Drill with PLAIN authentication enabled using JDBC client 
> (mapr profile)
> --
>
> Key: DRILL-7520
> URL: https://issues.apache.org/jira/browse/DRILL-7520
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Priority: Major
>
> *Prerequisites:*
> # Drill with the JDBC driver is built with "mapr" profile
> # Security is enabled and PLAIN authentication is configured
> *Steps:*
> # Use some external JDBC client to connect (e.g. DBeaver)
> # Connection string: "jdbc:drill:drillbit=node1:31010"
> # Set appropriate user/password
> # Test Connection
> *Expected result:*
> Connection successful.
> *Actual result:*
> Exception happens:
> {noformat}
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
> details.
> Exception in thread "main" java.sql.SQLNonTransientConnectionException: 
> Failure in connecting to Drill: oadd.org.apache.drill.exec.rpc.RpcException: 
> HANDSHAKE_VALIDATION : org/apache/hadoop/conf/Configuration
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:178)
>   at 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>   at 
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>   at 
> oadd.org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>   at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>   at java.sql.DriverManager.getConnection(DriverManager.java:247)
>   at TheBestClientEver.main(TheBestClientEver.java:28)
> Caused by: oadd.org.apache.drill.exec.rpc.RpcException: HANDSHAKE_VALIDATION 
> : org/apache/hadoop/conf/Configuration
>   at 
> oadd.org.apache.drill.exec.rpc.user.UserClient$2.connectionFailed(UserClient.java:315)
>   at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler.connectionFailed(QueryResultHandler.java:396)
>   at 
> oadd.org.apache.drill.exec.rpc.ConnectionMultiListener$HandshakeSendHandler.success(ConnectionMultiListener.java:170)
>   at 
> oadd.org.apache.drill.exec.rpc.ConnectionMultiListener$HandshakeSendHandler.success(ConnectionMultiListener.java:143)
>   at 
> oadd.org.apache.drill.exec.rpc.RequestIdMap$RpcListener.set(RequestIdMap.java:134)
>   at 
> oadd.org.apache.drill.exec.rpc.BasicClient$ClientHandshakeHandler.consumeHandshake(BasicClient.java:318)
>   at 
> oadd.org.apache.drill.exec.rpc.AbstractHandshakeHandler.decode(AbstractHandshakeHandler.java:57)
>   at 
> oadd.org.apache.drill.exec.rpc.AbstractHandshakeHandler.decode(AbstractHandshakeHandler.java:29)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>   at 
> oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>   at 
> oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
>   at 
> oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at 
> oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>   at 
> 

[jira] [Created] (DRILL-7520) Cannot connect to Drill with PLAIN authentication enabled using JDBC client

2020-01-08 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7520:
---

 Summary: Cannot connect to Drill with PLAIN authentication enabled 
using JDBC client
 Key: DRILL-7520
 URL: https://issues.apache.org/jira/browse/DRILL-7520
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy


*Prerequisites:*
# Drill with the JDBC driver is built with "mapr" profile
# Security is enabled and PLAIN authentication is configured

*Steps:*
# Use some external JDBC client to connect (e.g. DBeaver)
# Connection string: "jdbc:drill:drillbit=node1:31010"
# Set appropriate user/password
# Test Connection
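Steps 2-4 reduce to a few lines of JDBC. A minimal sketch of the failing connection attempt (the host `node1:31010` and the user/password are the report's placeholders, not real credentials; the Drill JDBC driver jar is assumed to be on the classpath):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class DrillConnectCheck {
    public static void main(String[] args) {
        // Placeholder values taken from the report; not a real cluster.
        String url = "jdbc:drill:drillbit=node1:31010";
        try (Connection c = DriverManager.getConnection(url, "user", "password")) {
            System.out.println("connected");
        } catch (SQLException e) {
            // With the mapr-profile driver missing its Hadoop classes, the
            // failure surfaces here as HANDSHAKE_VALIDATION; with no driver
            // on the classpath at all it is "No suitable driver found".
            System.out.println("connection failed: " + e.getMessage());
        }
    }
}
```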

*Expected result:*
Connection successful.

*Actual result:*
Exception happens:
{noformat}
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
Exception in thread "main" java.sql.SQLNonTransientConnectionException: Failure 
in connecting to Drill: oadd.org.apache.drill.exec.rpc.RpcException: 
HANDSHAKE_VALIDATION : org/apache/hadoop/conf/Configuration
at 
org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:178)
at 
org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
at 
org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
at 
oadd.org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
at org.apache.drill.jdbc.Driver.connect(Driver.java:75)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at TheBestClientEver.main(TheBestClientEver.java:28)
Caused by: oadd.org.apache.drill.exec.rpc.RpcException: HANDSHAKE_VALIDATION : 
org/apache/hadoop/conf/Configuration
at 
oadd.org.apache.drill.exec.rpc.user.UserClient$2.connectionFailed(UserClient.java:315)
at 
oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler.connectionFailed(QueryResultHandler.java:396)
at 
oadd.org.apache.drill.exec.rpc.ConnectionMultiListener$HandshakeSendHandler.success(ConnectionMultiListener.java:170)
at 
oadd.org.apache.drill.exec.rpc.ConnectionMultiListener$HandshakeSendHandler.success(ConnectionMultiListener.java:143)
at 
oadd.org.apache.drill.exec.rpc.RequestIdMap$RpcListener.set(RequestIdMap.java:134)
at 
oadd.org.apache.drill.exec.rpc.BasicClient$ClientHandshakeHandler.consumeHandshake(BasicClient.java:318)
at 
oadd.org.apache.drill.exec.rpc.AbstractHandshakeHandler.decode(AbstractHandshakeHandler.java:57)
at 
oadd.org.apache.drill.exec.rpc.AbstractHandshakeHandler.decode(AbstractHandshakeHandler.java:29)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 
oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at 

[jira] [Assigned] (DRILL-7474) Reduce size of Drill's tar.gz file

2019-12-10 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7474:
---

Assignee: Anton Gozhiy  (was: Anton Gozhiy)

> Reduce size of Drill's tar.gz file
> --
>
> Key: DRILL-7474
> URL: https://issues.apache.org/jira/browse/DRILL-7474
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.17.0
>Reporter: Igor Guzenko
>Assignee: Anton Gozhiy
>Priority: Major
>
> The size difference between *_apache-drill-1.16.0.tar.gz_* and 
> _*apache-drill-1.17.0.tar.gz*_ is 124 MB. The purpose of this Jira is to 
> investigate and try to reduce the difference. 
> Most of the added size is caused by the appearance of the 
> *aws-java-sdk-bundle-1.11.375.jar* and *poi-*.jar* files in our dependencies. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7475) Impersonation on local file system

2019-12-10 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16992505#comment-16992505
 ] 

Anton Gozhiy commented on DRILL-7475:
-

[~kstyrc], it would be very helpful if you described the issue according to 
common bug-reporting standards:
 * Environment / Initial Conditions
 * Steps To Reproduce
 * Expected Result
 * Actual Result
 * Additional Information / Notes (if any)

 

> Impersonation on local file system
> --
>
> Key: DRILL-7475
> URL: https://issues.apache.org/jira/browse/DRILL-7475
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Krzysztof Styrc
>Priority: Minor
>
> Hi,
> we'd like to set up Drill as a SQL interface for files stored on the local 
> file system (non-HDFS) with multi-user access - each user/group authorized to 
> access only selected tables/views.
>  
> In order to achieve this we've configured Drill with plain PAM authentication 
> + impersonation following the docs:
> [https://drill.apache.org/docs/configuring-plain-security/]
> [https://drill.apache.org/docs/configuring-user-impersonation/]
> We've ended up with the following ```drill-override.conf``` config:
> {code:java}
> drill.exec: {
>   cluster-id: "unit8drill",
>   zk.connect: "localhost:2181",
>   impersonation: {
> enabled: true,
>   },
>   security: {
> auth.mechanisms : ["PLAIN"],
>   },
>   security.user.auth: {
> enabled: true,
> packages += "org.apache.drill.exec.rpc.user.security",
> impl: "pam4j",
> pam_profiles: [ "sudo", "login" ],
>   }
> }
> {code}
> The Drill process runs as root in order to have access to ```/etc/shadow``` 
> etc.
>  
> Authentication works fine. We're able to use sqlline as well as Web UI in 
> order to run SQL queries. Also, users that are in the root group have access 
> to Storage, Threads and Logs tabs.
>  
> Unfortunately, all the users have access to all tables/directories/views, 
> regardless of the permissions set on the local file system. Furthermore, 
> inspecting the Drill process with auditctl reveals that the Drill process 
> user (root) accesses the files itself instead of impersonating the query 
> user, as one would expect with impersonation enabled.
>  
> Attaching a Java debugger also reveals that, even though it is a local file 
> system, Drill uses ```ProxyLocalFileSystem``` from the hive-exec JAR in 
> ```ImpersonationUtil.createFileSystem(...)```.
>  
> The question is, does Drill support RBAC on local file system? If so, what 
> could we be doing wrong?
>  
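A stdlib-only illustration (not Drill code) of why mode bits alone cannot enforce per-user access here: the local file system checks the effective user of the *reading process*, so a Drillbit running as root sees every file regardless of permissions, unless impersonation actually switches the effective user. The sketch below shows the owning process reading its own owner-only file:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

public class PermissionDemo {
    public static void main(String[] args) throws IOException {
        // Create a file readable only by its owner (rw-------).
        Path f = Files.createTempFile("drill-demo", ".json");
        Files.setPosixFilePermissions(f, PosixFilePermissions.fromString("rw-------"));
        // The owning process can still read it; a root-owned process could
        // read it too. Permissions restrict *other* users' processes only.
        System.out.println("owner-readable: " + Files.isReadable(f));
        Files.delete(f);
    }
}
```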





[jira] [Assigned] (DRILL-7470) drill-yarn unit tests print stack traces with NoSuchMethodError

2019-12-06 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7470:
---

Assignee: Anton Gozhiy

> drill-yarn unit tests print stack traces with NoSuchMethodError
> ---
>
> Key: DRILL-7470
> URL: https://issues.apache.org/jira/browse/DRILL-7470
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Vova Vysotskyi
>Assignee: Anton Gozhiy
>Priority: Minor
>
> Looks like it was caused by the Hadoop update.
> *Steps to reproduce:*
> 1. run {{mvn clean install}}
> 2. wait until drill-yarn unit tests are finished
> 3. check output
> *Expected output:*
> {noformat}
> [INFO] --- maven-surefire-plugin:3.0.0-M3:test (default-test) @ drill-yarn ---
> [INFO] 
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.drill.yarn.zk.TestAmRegistration
> [INFO] Running org.apache.drill.yarn.zk.TestZkRegistry
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.096 
> s - in org.apache.drill.yarn.zk.TestAmRegistration
> [INFO] Running org.apache.drill.yarn.client.TestCommandLineOptions
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 
> s - in org.apache.drill.yarn.client.TestCommandLineOptions
> [INFO] Running org.apache.drill.yarn.client.TestClient
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.057 
> s - in org.apache.drill.yarn.client.TestClient
> [INFO] Running org.apache.drill.yarn.scripts.TestScripts
> [WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 
> 0.001 s - in org.apache.drill.yarn.scripts.TestScripts
> [INFO] Running org.apache.drill.yarn.core.TestConfig
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.307 
> s - in org.apache.drill.yarn.core.TestConfig
> [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.028 
> s - in org.apache.drill.yarn.zk.TestZkRegistry
> [INFO] 
> [INFO] Results:
> [INFO] 
> [WARNING] Tests run: 11, Failures: 0, Errors: 0, Skipped: 1
> [INFO] 
> [INFO] 
> [INFO] --- maven-surefire-plugin:3.0.0-M3:test (metastore-test) @ drill-yarn 
> ---
> {noformat}
> *Actual output*
> {noformat}
> [INFO] --- maven-surefire-plugin:3.0.0-M3:test (default-test) @ drill-yarn ---
> [INFO] 
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> Failed to instantiate [ch.qos.logback.classic.LoggerContext]
> Reported exception:
> java.lang.NoSuchMethodError: 
> ch.qos.logback.core.util.Loader.getResourceOccurrenceCount(Ljava/lang/String;Ljava/lang/ClassLoader;)Ljava/util/Set;
>   at 
> ch.qos.logback.classic.util.ContextInitializer.multiplicityWarning(ContextInitializer.java:158)
>   at 
> ch.qos.logback.classic.util.ContextInitializer.statusOnResourceSearch(ContextInitializer.java:181)
>   at 
> ch.qos.logback.classic.util.ContextInitializer.findConfigFileURLFromSystemProperties(ContextInitializer.java:109)
>   at 
> ch.qos.logback.classic.util.ContextInitializer.findURLOfDefaultConfigurationFile(ContextInitializer.java:118)
>   at 
> ch.qos.logback.classic.util.ContextInitializer.autoConfig(ContextInitializer.java:146)
>   at org.slf4j.impl.StaticLoggerBinder.init(StaticLoggerBinder.java:85)
>   at 
> org.slf4j.impl.StaticLoggerBinder.<clinit>(StaticLoggerBinder.java:55)
>   at org.slf4j.LoggerFactory.bind(LoggerFactory.java:150)
>   at org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:124)
>   at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:412)
>   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:357)
>   at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:383)
>   at 
> org.apache.drill.common.util.ProtobufPatcher.<clinit>(ProtobufPatcher.java:33)
>   at org.apache.drill.test.BaseTest.<clinit>(BaseTest.java:35)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> 
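The `NoSuchMethodError` above is the classic signature of two different logback versions being visible to the same class loader. A stdlib-only probe (the resource path is just an example) counts how many jars on the classpath supply a given class:

```java
import java.io.IOException;
import java.util.Collections;

public class DuplicateResourceCheck {
    public static void main(String[] args) throws IOException {
        // Example resource: the logback class named in the error above.
        // More than one hit means conflicting jars shade each other.
        String resource = "ch/qos/logback/core/util/Loader.class";
        int count = Collections.list(
            DuplicateResourceCheck.class.getClassLoader()
                .getResources(resource)).size();
        System.out.println(resource + " found " + count + " time(s) on the classpath");
    }
}
```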

[jira] [Created] (DRILL-7465) Revise the approach of handling dependencies with incompatible versions

2019-12-04 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7465:
---

 Summary: Revise the approach of handling dependencies with 
incompatible versions
 Key: DRILL-7465
 URL: https://issues.apache.org/jira/browse/DRILL-7465
 Project: Apache Drill
  Issue Type: Task
Reporter: Anton Gozhiy


In continuation of the conversation started 
[here|https://github.com/apache/drill/pull/1910].
Transitive dependencies with different versions are a common problem. The first 
and obvious solution would be to add them to Dependency Management in a 
pom.xml, and if backward compatibility is preserved this works. But often 
there are API changes, especially when major versions differ.

Current approaches used in Drill to handle this situation:
 * Using Maven Shade plugin
 ** Pros:
 *** Solves the problem, as libraries use their target dependency version.
 ** Cons:
 *** Requires a lot of changes in code and some tricky work bringing all 
component libraries together and relocating them.
 *** Will probably increase the jar size.
 * Patching conflicting classes with Javassist (Guava and Protobuf)
 ** Pros:
 *** Easier than shading to implement.
 *** Only one dependency version is used.
 ** Cons:
 *** It is essentially dark magic.
 *** It is hard to find all the places that need a patch. This may cause ugly 
exceptions when you least expect them.
 *** It often needs rework after a library version upgrade.
 *** For this to work, patching must happen before the patched classes are 
loaded. This can theoretically cause race conditions.
 *** Following from the previous point, patching needs to run before all 
tests, so they all must inherit from a base class that patches. This is 
obvious overhead.

The idea of this task is to stop using patching altogether, due to its cons, 
and to switch either to shading or to some other approach if one exists.
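For reference, the shading option discussed above usually takes the form of a maven-shade-plugin relocation. The fragment below is only an illustrative sketch (the Guava package and the `oadd` prefix are chosen as examples, mirroring the relocated names visible in Drill's client stack traces), not Drill's actual build configuration:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Example only: move a conflicting dependency into a private
               namespace so consumers no longer clash over its version. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>oadd.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```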





[jira] [Updated] (DRILL-7208) Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources

2019-11-29 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7208:

Labels: ready-to-commit  (was: )

> Drill commit is not shown if build Drill from the 1.16.0-rc1 release sources
> 
>
> Key: DRILL-7208
> URL: https://issues.apache.org/jira/browse/DRILL-7208
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> *Steps:*
>  # Download the rc1 sources tarball:
>  
> [apache-drill-1.16.0-src.tar.gz|http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/apache-drill-1.16.0-src.tar.gz]
>  # Unpack
>  # Build:
> {noformat}
> mvn clean install -DskipTests
> {noformat}
>  # Start Drill in embedded mode:
> {noformat}
> Linux:
> distribution/target/apache-drill-1.16.0/apache-drill-1.16.0/bin/drill-embedded
> Windows:
> distribution\target\apache-drill-1.16.0\apache-drill-1.16.0\bin\sqlline.bat 
> -u "jdbc:drill:zk=local"
> {noformat}
>  # Run the query:
> {code:sql}
> select * from sys.version;
> {code}
> *Expected result:*
>  Drill version, commit_id, commit_message, commit_time, build_email, 
> build_time should be correctly displayed.
> *Actual result:*
> {noformat}
> apache drill> select * from sys.version;
> +---------+-----------+----------------+-------------+-------------+------------+
> | version | commit_id | commit_message | commit_time | build_email | build_time |
> +---------+-----------+----------------+-------------+-------------+------------+
> | 1.16.0  | Unknown   |                |             | Unknown     |            |
> +---------+-----------+----------------+-------------+-------------+------------+
> {noformat}





[jira] [Assigned] (DRILL-7393) Revisit Drill tests to ensure that patching is executed before any test run

2019-11-26 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7393:
---

Assignee: Anton Gozhiy

> Revisit Drill tests to ensure that patching is executed before any test run
> ---
>
> Key: DRILL-7393
> URL: https://issues.apache.org/jira/browse/DRILL-7393
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0, 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Anton Gozhiy
>Priority: Major
>
> Apache Drill patches some Protobuf and Guava classes (see GuavaPatcher, 
> ProtobufPatcher); patching must be done before the classes to be patched are 
> loaded. That's why this operation is executed in a static block of the 
> Drillbit class.
> Some tests in java-exec module use Drillbit class, some extend DrillTest 
> class, both of them patch Guava. But there are some tests that do not call 
> patcher but load classes to be patched. For example, 
> {{org.apache.drill.exec.sql.TestSqlBracketlessSyntax}} loads Guava 
> Preconditions class. If such tests run before tests that require patching, 
> the test run will fail since patching won't succeed. The patcher code does 
> not fail the application if patching was incomplete, it just logs a warning 
> ({{logger.warn("Unable to patch Guava classes.", e);}}), so it is sometimes 
> hard to identify the root cause of unit test failures.
> We need to revisit all Drill tests to ensure that all of them extend a common 
> test base class which patches Protobuf and Guava classes in a static block. 
> Also refactor the Patcher classes to fail with an assertion if patching fails 
> during unit testing.
> After all tests are revised, we can remove the {{metastore-test}} execution 
> from main.xml in {{maven-surefire-plugin}}, which was added to ensure that all 
> Metastore tests run in a separate JVM where patching is done first, since the 
> Iceberg Metastore heavily depends on the patched Guava Preconditions class.
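The reason a common base class suffices is JVM class-initialization order: a superclass static initializer always runs before any subclass initializer or instance code. A stdlib-only sketch (class names invented for illustration, standing in for BaseTest and a concrete test):

```java
public class PatchOrderingDemo {
    static final StringBuilder LOG = new StringBuilder();

    // Mimics Drill's BaseTest: "patching" runs in a static initializer of the
    // common base class, so it executes before any subclass code loads.
    static class BaseTest {
        static { LOG.append("patch;"); }
    }

    static class SomeTest extends BaseTest {
        static { LOG.append("test-init;"); }
        void run() { LOG.append("test-run;"); }
    }

    public static void main(String[] args) {
        // Initializing SomeTest first initializes BaseTest (JLS 12.4.2),
        // so the patch step is guaranteed to run before the test.
        new SomeTest().run();
        System.out.println(LOG);
    }
}
```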





[jira] [Created] (DRILL-7440) Failure during loading of RepeatedCount functions

2019-11-06 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7440:
---

 Summary: Failure during loading of RepeatedCount functions
 Key: DRILL-7440
 URL: https://issues.apache.org/jira/browse/DRILL-7440
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy


*Steps:*
# Start Drillbit
# Look at the drillbit.log

*Expected result:* No exceptions should be present.

*Actual result:*
Null Pointer Exceptions occur:
{noformat}
2019-11-06 03:06:40,401 [main] WARN  o.a.d.exec.expr.fn.FunctionConverter - 
Failure loading function class 
org.apache.drill.exec.expr.fn.impl.RepeatedCountFunctions$RepeatedCountRepeatedDict,
 field input. Message: Failure while trying to access the ValueHolder's TYPE 
static variable.  All ValueHolders must contain a static TYPE variable that 
defines their MajorType.
java.lang.NullPointerException: null
at 
sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) 
~[na:1.8.0_171]
at 
sun.reflect.UnsafeObjectFieldAccessorImpl.get(UnsafeObjectFieldAccessorImpl.java:36)
 ~[na:1.8.0_171]
at java.lang.reflect.Field.get(Field.java:393) ~[na:1.8.0_171]
at 
org.apache.drill.exec.expr.fn.FunctionConverter.getStaticFieldValue(FunctionConverter.java:220)
 ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.FunctionConverter.getHolder(FunctionConverter.java:136)
 ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.validate(LocalFunctionRegistry.java:130)
 [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.<init>(LocalFunctionRegistry.java:88)
 [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.<init>(FunctionImplementationRegistry.java:113)
 [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.server.DrillbitContext.<init>(DrillbitContext.java:118) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at org.apache.drill.exec.work.WorkManager.start(WorkManager.java:116) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:222) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:581) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:551) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:547) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
2019-11-06 03:06:40,402 [main] WARN  o.a.d.e.e.f.r.LocalFunctionRegistry - 
Unable to initialize function for class 
org.apache.drill.exec.expr.fn.impl.RepeatedCountFunctions$RepeatedCountRepeatedDict
2019-11-06 03:06:40,487 [main] WARN  o.a.d.exec.expr.fn.FunctionConverter - 
Failure loading function class 
org.apache.drill.exec.expr.fn.impl.gaggr.CountFunctions$RepeatedDictCountFunction,
 field in. Message: Failure while trying to access the ValueHolder's TYPE 
static variable.  All ValueHolders must contain a static TYPE variable that 
defines their MajorType.
java.lang.NullPointerException: null
at 
sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57) 
~[na:1.8.0_171]
at 
sun.reflect.UnsafeObjectFieldAccessorImpl.get(UnsafeObjectFieldAccessorImpl.java:36)
 ~[na:1.8.0_171]
at java.lang.reflect.Field.get(Field.java:393) ~[na:1.8.0_171]
at 
org.apache.drill.exec.expr.fn.FunctionConverter.getStaticFieldValue(FunctionConverter.java:220)
 ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.FunctionConverter.getHolder(FunctionConverter.java:136)
 ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.validate(LocalFunctionRegistry.java:130)
 [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.<init>(LocalFunctionRegistry.java:88)
 [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.<init>(FunctionImplementationRegistry.java:113)
 [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 
org.apache.drill.exec.server.DrillbitContext.<init>(DrillbitContext.java:118) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at org.apache.drill.exec.work.WorkManager.start(WorkManager.java:116) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:222) 
[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
at 

[jira] [Created] (DRILL-7429) Wrong column order when selecting complex data using Hive storage plugin.

2019-10-30 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7429:
---

 Summary: Wrong column order when selecting complex data using Hive 
storage plugin.
 Key: DRILL-7429
 URL: https://issues.apache.org/jira/browse/DRILL-7429
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Anton Gozhiy
 Attachments: customer_complex.zip

*Data:*
customer_complex.zip attached

*Query:*
{code:sql}
select t3.a, t3.b from (select t2.a, t2.a.o_lineitems[1].l_part.p_name b from 
(select t1.c_orders[0] a from hive.customer_complex t1) t2) t3 limit 1
{code}

*Expected result:*
Column order: a, b

*Actual result:*
Column order: b, a

*Physical plan:*
{noformat}
00-00Screen
00-01  Project(a=[ROW($0, $1, $2, $3, $4, $5, $6, $7)], b=[$8])
00-02Project(a=[ITEM($0, 0).o_orderstatus], a1=[ITEM($0, 
0).o_totalprice], a2=[ITEM($0, 0).o_orderdate], a3=[ITEM($0, 
0).o_orderpriority], a4=[ITEM($0, 0).o_clerk], a5=[ITEM($0, 0).o_shippriority], 
a6=[ITEM($0, 0).o_comment], a7=[ITEM($0, 0).o_lineitems], 
b=[ITEM(ITEM(ITEM(ITEM($0, 0).o_lineitems, 1), 'l_part'), 'p_name')])
00-03  Project(c_orders=[$0])
00-04SelectionVectorRemover
00-05  Limit(fetch=[10])
00-06Scan(table=[[hive, customer_complex]], 
groupscan=[HiveDrillNativeParquetScan [entries=[ReadEntryWithPath 
[path=/drill/customer_complex/00_0]], numFiles=1, numRowGroups=1, 
columns=[`c_orders`[0].`o_orderstatus`, `c_orders`[0].`o_totalprice`, 
`c_orders`[0].`o_orderdate`, `c_orders`[0].`o_orderpriority`, 
`c_orders`[0].`o_clerk`, `c_orders`[0].`o_shippriority`, 
`c_orders`[0].`o_comment`, `c_orders`[0].`o_lineitems`, 
`c_orders`[0].`o_lineitems`[1].`l_part`.`p_name`]]])
{noformat}

*Note:* Reproduced with both Hive and Native readers. Non-reproducible with 
Parquet reader.





[jira] [Assigned] (DRILL-6540) Upgrade to HADOOP-3.0 libraries

2019-10-10 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-6540:
---

Assignee: Anton Gozhiy  (was: Vitalii Diravka)

> Upgrade to HADOOP-3.0 libraries 
> 
>
> Key: DRILL-6540
> URL: https://issues.apache.org/jira/browse/DRILL-6540
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.14.0
>Reporter: Vitalii Diravka
>Assignee: Anton Gozhiy
>Priority: Major
>
> Currently Drill uses version 2.7.4 of the Hadoop libraries (hadoop-common, 
> hadoop-hdfs, hadoop-annotations, hadoop-aws, hadoop-yarn-api, hadoop-client, 
> hadoop-yarn-client).
> A year ago, [Hadoop 3.0|https://hadoop.apache.org/docs/r3.0.0/index.html] was 
> released, and recently it was updated to [Hadoop 
> 3.2.0|https://hadoop.apache.org/docs/r3.2.0/].
> To use Drill under a Hadoop 3.0 distribution we need this upgrade. The newer 
> version also includes new features which can be useful for Drill.
>  This upgrade is also needed to leverage the newest versions of the ZooKeeper 
> libraries and Hive 3.1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (DRILL-7381) Query to a map field returns nulls with hive native reader

2019-10-01 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7381.
---

Verified with Drill 1.17-SNAPSHOT (commit 
d2645c7638a88a4afd162bc3f1e2d65353ca3a67).

> Query to a map field returns nulls with hive native reader
> --
>
> Key: DRILL-7381
> URL: https://issues.apache.org/jira/browse/DRILL-7381
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: ready-to-commit
> Attachments: customer_complex.zip
>
>
> *Query:*
> {code:sql}
> select t.c_nation.n_region.r_name from hive.customer_complex t limit 5
> {code}
> *Expected results:*
> {noformat}
> AFRICA
> MIDDLE EAST
> AMERICA
> MIDDLE EAST
> AMERICA
> {noformat}
> *Actual results:*
> {noformat}
> null
> null
> null
> null
> null
> {noformat}
> *Workaround:*
> {code:sql}
> set store.hive.optimize_scan_with_native_readers = false;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-10-01 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7380.
---

Verified with Drill 1.17-SNAPSHOT (commit 
d2645c7638a88a4afd162bc3f1e2d65353ca3a67).

> Query of a field inside of an array of structs returns null
> ---
>
> Key: DRILL-7380
> URL: https://issues.apache.org/jira/browse/DRILL-7380
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: ready-to-commit
> Attachments: customer_complex.zip
>
>
> *Query:*
> {code:sql}
> select t.c_orders[0].o_orderstatus from hive.customer_complex t limit 10;
> {code}
> *Expected results (given from Hive):*
> {noformat}
> OK
> O
> F
> NULL
> O
> O
> NULL
> O
> O
> NULL
> F
> {noformat}
> *Actual results:*
> {noformat}
> null
> null
> null
> null
> null
> null
> null
> null
> null
> null
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7381) Query to a map field returns nulls with hive native reader

2019-09-19 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7381:
---

 Summary: Query to a map field returns nulls with hive native reader
 Key: DRILL-7381
 URL: https://issues.apache.org/jira/browse/DRILL-7381
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy
 Attachments: customer_complex.zip

*Query:*
{code:sql}
select t.c_nation.n_region.r_name from hive.customer_complex t limit 5
{code}

*Expected results:*
{noformat}
AFRICA
MIDDLE EAST
AMERICA
MIDDLE EAST
AMERICA
{noformat}

*Actual results:*
{noformat}
null
null
null
null
null
{noformat}

*Workaround:*

{code:sql}
set store.hive.optimize_scan_with_native_readers = false;
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (DRILL-7380) Query of a field inside of an array of structs returns null

2019-09-19 Thread Anton Gozhiy (Jira)
Anton Gozhiy created DRILL-7380:
---

 Summary: Query of a field inside of an array of structs returns 
null
 Key: DRILL-7380
 URL: https://issues.apache.org/jira/browse/DRILL-7380
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Anton Gozhiy
 Attachments: customer_complex.zip

*Query:*
{code:sql}
select t.c_orders[0].o_orderstatus from hive.customer_complex t limit 10;
{code}

*Expected results (given from Hive):*
{noformat}
OK
O
F
NULL
O
O
NULL
O
O
NULL
F
{noformat}

*Actual results:*
{noformat}
null
null
null
null
null
null
null
null
null
null
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7222) Visualize estimated and actual row counts for a query

2019-08-22 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7222:

Labels: doc-impacting ready-to-commit user-experience  (was: doc-impacting 
user-experience)

> Visualize estimated and actual row counts for a query
> -
>
> Key: DRILL-7222
> URL: https://issues.apache.org/jira/browse/DRILL-7222
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting, ready-to-commit, user-experience
> Fix For: 1.17.0
>
>
> With statistics in place, it would be useful to show the *estimated* rowcount 
> alongside the *actual* rowcount in the query profile's operator overview.
> We can extract this from the Physical Plan section of the profile.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (DRILL-7222) Visualize estimated and actual row counts for a query

2019-08-22 Thread Anton Gozhiy (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7222:

Labels: doc-impacting user-experience  (was: doc-impacting ready-to-commit 
user-experience)

> Visualize estimated and actual row counts for a query
> -
>
> Key: DRILL-7222
> URL: https://issues.apache.org/jira/browse/DRILL-7222
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting, user-experience
> Fix For: 1.17.0
>
>
> With statistics in place, it would be useful to show the *estimated* rowcount 
> alongside the *actual* rowcount in the query profile's operator overview.
> We can extract this from the Physical Plan section of the profile.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (DRILL-7351) WebUI is Vulnerable to CSRF

2019-08-20 Thread Anton Gozhiy (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911212#comment-16911212
 ] 

Anton Gozhiy commented on DRILL-7351:
-

Thank you, much better now.

> WebUI is Vulnerable to CSRF
> ---
>
> Key: DRILL-7351
> URL: https://issues.apache.org/jira/browse/DRILL-7351
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Don Perial
>Priority: Major
> Attachments: Screen Shot 2019-08-19 at 10.11.50 AM.png, 
> drill-csrf.html
>
>
> There is no way to protect the WebUI from CSRF, and the fact that the value 
> for the access-control-allow-origin header is '*' appears to compound this 
> issue as well.
> The attached file demonstrates the vulnerability.
> Steps to replicate:
>  # Login to an instance of the Drill WebUI.
>  # Edit the attached [^drill-csrf.html]. Replace DRILL_HOST with the hostname 
> of the Drill WebUI from step #1.
>  # Load the file from #2 in the same browser as #1 (either a new tab or the 
> same window will do).
>  # Return to the Drill WebUI and click on 'Profiles'.
> Observed results:
> The query 'SELECT 100' appears in the list of executed queries (see:  
> [^Screen Shot 2019-08-19 at 10.11.50 AM.png] ).
> Expected results:
> It should be possible to whitelist or completely restrict code from other 
> domain names to submit queries to the WebUI.
> Risks:
> Potential for code execution by unauthorized parties.
>  
>  
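The whitelist the reporter asks for could be sketched as a simple Origin check on the server side. This is a hypothetical illustration, not Drill's actual code; the class and method names are invented:

```java
import java.util.Set;

// Hypothetical sketch of an Origin allow-list: instead of answering every
// request with "Access-Control-Allow-Origin: *", the web server would
// compare the request's Origin header against a configured whitelist and
// reject state-changing requests from unknown origins.
class OriginFilter {
    private final Set<String> allowedOrigins;

    OriginFilter(Set<String> allowedOrigins) {
        this.allowedOrigins = allowedOrigins;
    }

    boolean isAllowed(String originHeader) {
        // Same-origin requests may omit the Origin header entirely.
        return originHeader == null || allowedOrigins.contains(originHeader);
    }
}
```

An Origin check like this is only one part of a CSRF defense; anti-CSRF tokens would typically complement it.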



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (DRILL-7351) WebUI is Vulnerable to CSRF

2019-08-16 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908992#comment-16908992
 ] 

Anton Gozhiy commented on DRILL-7351:
-

[~perialdon], there is common agreement about what a bug report should 
contain:
- [optional] Initial conditions
- Steps to reproduce
- Expected results
- Actual results
- Logs, screenshots and any additional info that would help to track it down

From your message it is not clear what exactly the problem is and which use 
cases it can affect.

> WebUI is Vulnerable to CSRF
> ---
>
> Key: DRILL-7351
> URL: https://issues.apache.org/jira/browse/DRILL-7351
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Don Perial
>Priority: Major
> Attachments: drill-csrf.html
>
>
> There is no way to protect the WebUI from CSRF and the fact that the value 
> for the access-control-allow-origin header is '*' appears to confound this 
> issue as well.
> The attached file demonstrates the vulnerability.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (DRILL-6961) Error Occurred: Cannot connect to the db. query INFORMATION_SCHEMA.VIEWS : Maybe you have incorrect connection params or db unavailable now (timeout)

2019-08-01 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-6961:

Fix Version/s: 1.17.0

> Error Occurred: Cannot connect to the db. query INFORMATION_SCHEMA.VIEWS : 
> Maybe you have incorrect connection params or db unavailable now (timeout)
> -
>
> Key: DRILL-6961
> URL: https://issues.apache.org/jira/browse/DRILL-6961
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Information Schema
>Affects Versions: 1.13.0
>Reporter: Khurram Faraaz
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.17.0
>
>
> Querying the Drill INFORMATION_SCHEMA.VIEWS table returns an error. Disabling 
> the openTSDB plugin resolves the problem.
> Drill 1.13.0
> Failing query :
> {noformat}
> SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, VIEW_DEFINITION FROM 
> INFORMATION_SCHEMA.`VIEWS` where VIEW_DEFINITION not like 'kraken';
> {noformat}
> Stack Trace from drillbit.log
> {noformat}
> 2019-01-07 15:36:21,975 [23cc39aa-2618-e9f0-e77e-4fafa6edc314:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 23cc39aa-2618-e9f0-e77e-4fafa6edc314: SELECT TABLE_CATALOG, TABLE_SCHEMA, 
> TABLE_NAME, VIEW_DEFINITION FROM INFORMATION_SCHEMA.`VIEWS` where 
> VIEW_DEFINITION not like 'kraken'
> 2019-01-07 15:36:35,221 [23cc39aa-2618-e9f0-e77e-4fafa6edc314:frag:0:0] INFO 
> o.a.d.e.s.o.c.services.ServiceImpl - User Error Occurred: Cannot connect to 
> the db. Maybe you have incorrect connection params or db unavailable now 
> (timeout)
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Cannot 
> connect to the db. Maybe you have incorrect connection params or db 
> unavailable now
> [Error Id: f8b4c074-ba62-4691-b142-a8ea6e4f6b2a ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.client.services.ServiceImpl.getTableNames(ServiceImpl.java:107)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.client.services.ServiceImpl.getAllMetricNames(ServiceImpl.java:70)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.schema.OpenTSDBSchemaFactory$OpenTSDBSchema.getTableNames(OpenTSDBSchemaFactory.java:78)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.calcite.jdbc.SimpleCalciteSchema.addImplicitTableToBuilder(SimpleCalciteSchema.java:106)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema.getTableNames(CalciteSchema.java:318) 
> [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getTableNames(CalciteSchema.java:587)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getTableNames(CalciteSchema.java:548)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.visitTables(InfoSchemaRecordGenerator.java:227)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:216)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:209)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:196)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaTableType.getRecordReader(InfoSchemaTableType.java:58)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:34)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:30)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:146) 
> [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:142) 
> [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_144]
> at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_144]
> at 
> 

[jira] [Assigned] (DRILL-6961) Error Occurred: Cannot connect to the db. query INFORMATION_SCHEMA.VIEWS : Maybe you have incorrect connection params or db unavailable now (timeout)

2019-07-10 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-6961:
---

Assignee: Anton Gozhiy

> Error Occurred: Cannot connect to the db. query INFORMATION_SCHEMA.VIEWS : 
> Maybe you have incorrect connection params or db unavailable now (timeout)
> -
>
> Key: DRILL-6961
> URL: https://issues.apache.org/jira/browse/DRILL-6961
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Information Schema
>Affects Versions: 1.13.0
>Reporter: Khurram Faraaz
>Assignee: Anton Gozhiy
>Priority: Major
>
> Querying the Drill INFORMATION_SCHEMA.VIEWS table returns an error. Disabling 
> the openTSDB plugin resolves the problem.
> Drill 1.13.0
> Failing query :
> {noformat}
> SELECT TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, VIEW_DEFINITION FROM 
> INFORMATION_SCHEMA.`VIEWS` where VIEW_DEFINITION not like 'kraken';
> {noformat}
> Stack Trace from drillbit.log
> {noformat}
> 2019-01-07 15:36:21,975 [23cc39aa-2618-e9f0-e77e-4fafa6edc314:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 23cc39aa-2618-e9f0-e77e-4fafa6edc314: SELECT TABLE_CATALOG, TABLE_SCHEMA, 
> TABLE_NAME, VIEW_DEFINITION FROM INFORMATION_SCHEMA.`VIEWS` where 
> VIEW_DEFINITION not like 'kraken'
> 2019-01-07 15:36:35,221 [23cc39aa-2618-e9f0-e77e-4fafa6edc314:frag:0:0] INFO 
> o.a.d.e.s.o.c.services.ServiceImpl - User Error Occurred: Cannot connect to 
> the db. Maybe you have incorrect connection params or db unavailable now 
> (timeout)
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Cannot 
> connect to the db. Maybe you have incorrect connection params or db 
> unavailable now
> [Error Id: f8b4c074-ba62-4691-b142-a8ea6e4f6b2a ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.client.services.ServiceImpl.getTableNames(ServiceImpl.java:107)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.client.services.ServiceImpl.getAllMetricNames(ServiceImpl.java:70)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.openTSDB.schema.OpenTSDBSchemaFactory$OpenTSDBSchema.getTableNames(OpenTSDBSchemaFactory.java:78)
>  [drill-opentsdb-storage-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.calcite.jdbc.SimpleCalciteSchema.addImplicitTableToBuilder(SimpleCalciteSchema.java:106)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema.getTableNames(CalciteSchema.java:318) 
> [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getTableNames(CalciteSchema.java:587)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getTableNames(CalciteSchema.java:548)
>  [calcite-core-1.15.0-drill-r0.jar:1.15.0-drill-r0]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.visitTables(InfoSchemaRecordGenerator.java:227)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:216)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:209)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema(InfoSchemaRecordGenerator.java:196)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaTableType.getRecordReader(InfoSchemaTableType.java:58)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:34)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch(InfoSchemaBatchCreator.java:30)
>  [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:146) 
> [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.ImplCreator$2.run(ImplCreator.java:142) 
> [drill-java-exec-1.13.0-mapr.jar:1.13.0-mapr]
> at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_144]
> at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_144]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1633)
>  

[jira] [Assigned] (DRILL-7084) ResultSet getObject method throws not implemented exception if the column type is NULL

2019-07-09 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-7084:
---

Assignee: Anton Gozhiy

> ResultSet getObject method throws not implemented exception if the column 
> type is NULL
> --
>
> Key: DRILL-7084
> URL: https://issues.apache.org/jira/browse/DRILL-7084
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
>
> This method is used by some tools, for example DBeaver. Not reproduced with 
> sqlline or Drill Web-UI.
> *Query:*
> {code:sql}
> select coalesce(n_name1, n_name2) from cp.`tpch/nation.parquet` limit 1;
> {code}
> *Expected result:*
> null
> *Actual result:*
> Exception is thrown:
> {noformat}
> java.lang.RuntimeException: not implemented
>   at 
> oadd.org.apache.calcite.avatica.AvaticaSite.notImplemented(AvaticaSite.java:421)
>   at oadd.org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:380)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.getObject(DrillResultSetImpl.java:183)
>   at 
> org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCResultSetImpl.getObject(JDBCResultSetImpl.java:628)
>   at 
> org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCObjectValueHandler.fetchColumnValue(JDBCObjectValueHandler.java:60)
>   at 
> org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCAbstractValueHandler.fetchValueObject(JDBCAbstractValueHandler.java:49)
>   at 
> org.jkiss.dbeaver.ui.controls.resultset.ResultSetDataReceiver.fetchRow(ResultSetDataReceiver.java:122)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.fetchQueryData(SQLQueryJob.java:729)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.executeStatement(SQLQueryJob.java:465)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.lambda$0(SQLQueryJob.java:392)
>   at org.jkiss.dbeaver.model.DBUtils.tryExecuteRecover(DBUtils.java:1598)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.executeSingleQuery(SQLQueryJob.java:390)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.extractData(SQLQueryJob.java:822)
>   at 
> org.jkiss.dbeaver.ui.editors.sql.SQLEditor$QueryResultsContainer.readData(SQLEditor.java:2532)
>   at 
> org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.lambda$0(ResultSetJobDataRead.java:93)
>   at org.jkiss.dbeaver.model.DBUtils.tryExecuteRecover(DBUtils.java:1598)
>   at 
> org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.run(ResultSetJobDataRead.java:91)
>   at org.jkiss.dbeaver.model.runtime.AbstractJob.run(AbstractJob.java:101)
>   at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6909) Graceful shutdown does not kill the process when drill is run in embedded mode

2019-07-09 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881147#comment-16881147
 ] 

Anton Gozhiy commented on DRILL-6909:
-

In embedded mode the Drill JDBC driver starts Drill itself if the specific 
connection string "jdbc:drill:zk=local" is used. Sqlline then uses only this 
JDBC connection, which doesn't imply any callbacks to the process that 
opened it.
A possible solution is to add a callback method, say in the DriverImpl class, 
and then handle it in Sqlline. But that would create unnecessary coupling, 
which is not good.
Another thing to take into account is that Sqlline is not the only way to 
start Drill in embedded mode; for example, we can use Squirrel by providing it 
the right connection string and the Drill classpath.
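The callback idea discussed here might look like the following minimal sketch. The interface and class names are invented for illustration and do not exist in Drill:

```java
// Hypothetical sketch of the coupling described above: the embedded driver
// exposes a shutdown listener that the host application (Sqlline, Squirrel,
// etc.) would register so it learns when the embedded drillbit goes down.
interface EmbeddedShutdownListener {
    void onDrillbitShutdown();
}

class EmbeddedDriver {
    private EmbeddedShutdownListener listener;

    void setShutdownListener(EmbeddedShutdownListener l) {
        this.listener = l;
    }

    void shutdown() {
        // ... stop the embedded drillbit here ...
        if (listener != null) {
            listener.onDrillbitShutdown(); // call back into the host process
        }
    }
}
```

The downside is exactly the one the comment warns about: every host application would have to know about this interface to benefit from it.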

> Graceful shutdown does not kill the process when drill is run in embedded mode
> --
>
> Key: DRILL-6909
> URL: https://issues.apache.org/jira/browse/DRILL-6909
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.13.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Anton Gozhiy
>Priority: Minor
> Fix For: 1.17.0
>
>
> Graceful shutdown does not kill the process when drill is run in embedded 
> mode.
> Steps to reproduce:
> 1. Run drill in embedded mode.
> 2. Press the "Shutdown" button in Web-UI.
> 3. Check that the process which corresponds to the Drill wasn't killed.
> This issue is observed only for embedded mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6909) Graceful shutdown does not kill the process when drill is run in embedded mode

2019-07-09 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-6909:

Fix Version/s: (was: 1.17.0)
   Future

> Graceful shutdown does not kill the process when drill is run in embedded mode
> --
>
> Key: DRILL-6909
> URL: https://issues.apache.org/jira/browse/DRILL-6909
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.13.0
>Reporter: Volodymyr Vysotskyi
>Priority: Minor
> Fix For: Future
>
>
> Graceful shutdown does not kill the process when drill is run in embedded 
> mode.
> Steps to reproduce:
> 1. Run drill in embedded mode.
> 2. Press the "Shutdown" button in Web-UI.
> 3. Check that the process which corresponds to the Drill wasn't killed.
> This issue is observed only for embedded mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6909) Graceful shutdown does not kill the process when drill is run in embedded mode

2019-07-09 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-6909:
---

Assignee: (was: Anton Gozhiy)

> Graceful shutdown does not kill the process when drill is run in embedded mode
> --
>
> Key: DRILL-6909
> URL: https://issues.apache.org/jira/browse/DRILL-6909
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.13.0
>Reporter: Volodymyr Vysotskyi
>Priority: Minor
> Fix For: 1.17.0
>
>
> Graceful shutdown does not kill the process when drill is run in embedded 
> mode.
> Steps to reproduce:
> 1. Run drill in embedded mode.
> 2. Press the "Shutdown" button in Web-UI.
> 3. Check that the process which corresponds to the Drill wasn't killed.
> This issue is observed only for embedded mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-6952) Merge row set based "compliant" text reader

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-6952.
---
Resolution: Fixed

> Merge row set based "compliant" text reader
> ---
>
> Key: DRILL-6952
> URL: https://issues.apache.org/jira/browse/DRILL-6952
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The result set loader project created a revised version of the compliant text 
> reader that uses the result set loader framework (which includes the 
> schema-based projection framework.)
> This task merges that work into master:
> * Review the history of the compliant text reader for changes made in the 
> last year since the code was written.
> * Apply those changes to the row set-based code, as necessary.
> * Issue a PR for the new version of the compliant text reader
> * Work through any test issues that crop up in the pre-commit tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6952) Merge row set based "compliant" text reader

2019-06-25 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872306#comment-16872306
 ] 

Anton Gozhiy edited comment on DRILL-6952 at 6/25/19 1:35 PM:
--

Verified with Drill version 1.17.0-SNAPSHOT (commit 
f3f7dbd40f5e899f2aacba35db8f50ffedfa9d3d)
Cases checked:

Tested with different storage plugin parameters (extractHeader, delimiters, etc.)
The same with the table function.
Complex JSON files with nested maps and arrays.
Data with implicit columns (with the V3 reader, all such columns are moved to 
the end of rows)
Aggregate functions with specific columns and wildcard.
Large text fields (they were limited to 65536 symbols; now fixed)
No significant changes in performance were discovered. (Compared test runs with 
different readers.)
Some bugs were fixed by the V3 reader:
DRILL-5487, DRILL-5554, DRILL- (partially fixed), DRILL-4814, DRILL-7034, 
DRILL-7082, DRILL-7083
Bugs that were introduced by the V3 reader and then fixed:
DRILL-7181, DRILL-7257, DRILL-7258


was (Author: angozhiy):
Verified with Drill version 1.17.0-SNAPSHOT (commit 
f3f7dbd40f5e899f2aacba35db8f50ffedfa9d3d)
Cases checked:

Tested with different storage plugin parameters (extractHeader, delimiters etc.)
The same with table function.
Complex json files with nesting maps and arrays.
Data with implicit columns (with v3 reader, all such columns are moved to the 
end of rows)
Aggregate functions with specific columns and wildcard.
Large text fields (they was limited to 65536 symbols, now fixed)
No significant changes in performance were discovered. (Compared test runs with 
different readers)
Some bugs were fixed by V3 reader:
DRILL-5487, DRILL-5554, DRILL-, DRILL-4814, DRILL-7034, DRILL-7082, 
DRILL-7083
Bugs that were introduced by V3 reader and then fixed:
DRILL-7181, DRILL-7257, DRILL-7258

> Merge row set based "compliant" text reader
> ---
>
> Key: DRILL-6952
> URL: https://issues.apache.org/jira/browse/DRILL-6952
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The result set loader project created a revised version of the compliant text 
> reader that uses the result set loader framework (which includes the 
> schema-based projection framework.)
> This task merges that work into master:
> * Review the history of the compliant text reader for changes made in the 
> last year since the code was written.
> * Apply those changes to the row set-based code, as necessary.
> * Issue a PR for the new version of the compliant text reader
> * Work through any test issues that crop up in the pre-commit tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (DRILL-6952) Merge row set based "compliant" text reader

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reopened DRILL-6952:
-

> Merge row set based "compliant" text reader
> ---
>
> Key: DRILL-6952
> URL: https://issues.apache.org/jira/browse/DRILL-6952
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The result set loader project created a revised version of the compliant text 
> reader that uses the result set loader framework (which includes the 
> schema-based projection framework.)
> This task merges that work into master:
> * Review the history of the compliant text reader for changes made in the 
> last year since the code was written.
> * Apply those changes to the row set-based code, as necessary.
> * Issue a PR for the new version of the compliant text reader
> * Work through any test issues that crop up in the pre-commit tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-5487) Vector corruption in CSV with headers and truncated last row

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-5487.
---
Resolution: Fixed

> Vector corruption in CSV with headers and truncated last row
> 
>
> Key: DRILL-5487
> URL: https://issues.apache.org/jira/browse/DRILL-5487
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
> Fix For: 1.17.0
>
>
> The CSV format plugin allows two ways of reading data:
> * As named columns
> * As a single array, called {{columns}}, that holds all columns for a row
> The named columns feature will corrupt the offset vectors if the last row of 
> the file is truncated, i.e. leaves off one or more columns.
> To illustrate the CSV data corruption, I created a CSV file, test4.csv, of 
> the following form:
> {code}
> h,u
> abc,def
> ghi
> {code}
> Note that the file is truncated: the comma and second field are missing on 
> the last line.
> Then, I created a simple test using the "cluster fixture" framework:
> {code}
>   @Test
>   public void readerTest() throws Exception {
> FixtureBuilder builder = ClusterFixture.builder()
> .maxParallelization(1);
> try (ClusterFixture cluster = builder.build();
>  ClientFixture client = cluster.clientFixture()) {
>   TextFormatConfig csvFormat = new TextFormatConfig();
>   csvFormat.fieldDelimiter = ',';
>   csvFormat.skipFirstLine = false;
>   csvFormat.extractHeader = true;
>   cluster.defineWorkspace("dfs", "data", "/tmp/data", "csv", csvFormat);
>   String sql = "SELECT * FROM `dfs.data`.`csv/test4.csv` LIMIT 10";
>   client.queryBuilder().sql(sql).printCsv();
> }
>   }
> {code}
> The results show we've got a problem:
> {code}
> Exception (no rows returned): 
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: length: -3 (expected: >= 0)
> {code}
> If the last line were:
> {code}
> efg,
> {code}
> Then the offset vector should look like this:
> {code}
> [0, 3, 3]
> {code}
> Very likely we have an offset vector that looks like this instead:
> {code}
> [0, 3, 0]
> {code}
> When we compute the second column of the second row, we should compute:
> {code}
> length = offset[2] - offset[1] = 3 - 3 = 0
> {code}
> Instead we get:
> {code}
> length = offset[2] - offset[1] = 0 - 3 = -3
> {code}
> The summary is that a premature EOF appears to cause the "missing" columns to 
> be skipped; they are not filled with a blank value to "bump" the offset 
> vectors to fill in the last row. Instead, they are left at 0, causing havoc 
> downstream in the query.
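The offset arithmetic described above can be sketched in plain Java (illustrative only, not Drill's actual vector classes): a variable-width column derives each value's length from adjacent entries in its offset vector, so an entry left at its initial 0 after a truncated row yields the negative length seen in the exception.

```java
// Minimal sketch of offset-vector length computation. Assumes the layout
// described in the ticket: offsets[i+1] - offsets[i] gives row i's length.
public class OffsetVectorDemo {
  static int valueLength(int[] offsets, int row) {
    return offsets[row + 1] - offsets[row];
  }

  public static void main(String[] args) {
    // Correctly filled vector for an empty trailing field ("efg," case).
    int[] filled = {0, 3, 3};
    System.out.println(valueLength(filled, 1));    // 3 - 3 = 0

    // Corrupted vector: the missing column left the last offset at 0.
    int[] corrupted = {0, 3, 0};
    System.out.println(valueLength(corrupted, 1)); // 0 - 3 = -3
  }
}
```

This reproduces the `length: -3` in the error message: the reader never "bumped" the final offset for the skipped column.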



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7083) Wrong data type for explicit partition column beyond file depth

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7083.
---
Resolution: Fixed

> Wrong data type for explicit partition column beyond file depth
> ---
>
> Key: DRILL-7083
> URL: https://issues.apache.org/jira/browse/DRILL-7083
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
> Fix For: 1.17.0
>
>
> Consider the simple case in DRILL-7082. That ticket talks about implicit 
> partition columns created by the wildcard. Consider a very similar case:
> {code:sql}
> SELECT a, b, c, dir0, dir1 FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> If the query is run in "stock" Drill, the planner will place both files 
> within a single scan operator as described in DRILL-7082. The result schema 
> will be:
> {noformat}
> (a VARCHAR, b VARCHAR, c VARCHAR, dir0 VARCHAR, dir1 INT)
> {noformat}
> Notice that last column: why is "dir1" a (nullable) INT? The partition 
> mechanism only recognizes partitions that actually exist, leaving the Project 
> operator to fill in (with a Nullable INT) any partitions that don't exist 
> (any directory levels not actually seen by the scan operator.)
> Now, using the same trick as in DRILL-7082, try the query
> {code:sql}
> SELECT a, b, c, dir0 FROM `myTable`
> {code}
> Again, the trick causes Drill to read each file in a separate scan operator 
> (simulating what happens when queries run at scale.)
> The scan operator for {{file1.csv}} will see no partitions, so it will omit 
> "dir0" and the Project operator will helpfully fill in a Nullable INT. The 
> scan operator for {{file2.csv}} sees one level of partition, so sets {{dir0}} 
> to {{nested}} as a Nullable VARCHAR.
> What does the client see? Two records: one with "dir0" as a Nullable INT, the 
> other as a Nullable VARCHAR. Clients such as JDBC and ODBC see a hard schema 
> change between the two records.
> The two cases described above are really two versions of the same issue. 
> Clients expect that, if they use the "dir0", "dir1", ... columns, the 
> type is always Nullable VARCHAR so that the schema stays consistent across 
> batches.
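The behavior the ticket asks for can be sketched as follows (names are illustrative, not Drill's actual API): resolve every requested dirN column on the scan side as a nullable VARCHAR, returning null when the file sits above that partition depth, rather than letting the Project operator invent a nullable INT.

```java
// Hypothetical resolver: dirs holds the directory names between the table
// root and the file; any level beyond the file's depth is a VARCHAR null.
public class PartitionColumnResolver {
  static String resolveDir(String[] dirs, int level) {
    return level < dirs.length ? dirs[level] : null; // always VARCHAR-typed
  }

  public static void main(String[] args) {
    String[] file1Dirs = {};           // myTable/file1.csv
    String[] file2Dirs = {"nested"};   // myTable/nested/file2.csv
    System.out.println(resolveDir(file1Dirs, 0)); // null, but still VARCHAR
    System.out.println(resolveDir(file2Dirs, 0)); // "nested"
  }
}
```

With this scheme both scan operators report the same (a, b, c, dir0) schema, and no hard schema change reaches the client.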



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7083) Wrong data type for explicit partition column beyond file depth

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7083:

Fix Version/s: 1.17.0

> Wrong data type for explicit partition column beyond file depth
> ---
>
> Key: DRILL-7083
> URL: https://issues.apache.org/jira/browse/DRILL-7083
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
> Fix For: 1.17.0
>
>
> Consider the simple case in DRILL-7082. That ticket talks about implicit 
> partition columns created by the wildcard. Consider a very similar case:
> {code:sql}
> SELECT a, b, c, dir0, dir1 FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> If the query is run in "stock" Drill, the planner will place both files 
> within a single scan operator as described in DRILL-7082. The result schema 
> will be:
> {noformat}
> (a VARCHAR, b VARCHAR, c VARCHAR, dir0 VARCHAR, dir1 INT)
> {noformat}
> Notice that last column: why is "dir1" a (nullable) INT? The partition 
> mechanism only recognizes partitions that actually exist, leaving the Project 
> operator to fill in (with a Nullable INT) any partitions that don't exist 
> (any directory levels not actually seen by the scan operator.)
> Now, using the same trick as in DRILL-7082, try the query
> {code:sql}
> SELECT a, b, c, dir0 FROM `myTable`
> {code}
> Again, the trick causes Drill to read each file in a separate scan operator 
> (simulating what happens when queries run at scale.)
> The scan operator for {{file1.csv}} will see no partitions, so it will omit 
> "dir0" and the Project operator will helpfully fill in a Nullable INT. The 
> scan operator for {{file2.csv}} sees one level of partition, so sets {{dir0}} 
> to {{nested}} as a Nullable VARCHAR.
> What does the client see? Two records: one with "dir0" as a Nullable INT, the 
> other as a Nullable VARCHAR. Clients such as JDBC and ODBC see a hard schema 
> change between the two records.
> The two cases described above are really two versions of the same issue. 
> Clients expect that, if they use the "dir0", "dir1", ... columns, the 
> type is always Nullable VARCHAR so that the schema stays consistent across 
> batches.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (DRILL-7083) Wrong data type for explicit partition column beyond file depth

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reopened DRILL-7083:
-

> Wrong data type for explicit partition column beyond file depth
> ---
>
> Key: DRILL-7083
> URL: https://issues.apache.org/jira/browse/DRILL-7083
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Consider the simple case in DRILL-7082. That ticket talks about implicit 
> partition columns created by the wildcard. Consider a very similar case:
> {code:sql}
> SELECT a, b, c, dir0, dir1 FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> If the query is run in "stock" Drill, the planner will place both files 
> within a single scan operator as described in DRILL-7082. The result schema 
> will be:
> {noformat}
> (a VARCHAR, b VARCHAR, c VARCHAR, dir0 VARCHAR, dir1 INT)
> {noformat}
> Notice that last column: why is "dir1" a (nullable) INT? The partition 
> mechanism only recognizes partitions that actually exist, leaving the Project 
> operator to fill in (with a Nullable INT) any partitions that don't exist 
> (any directory levels not actually seen by the scan operator.)
> Now, using the same trick as in DRILL-7082, try the query
> {code:sql}
> SELECT a, b, c, dir0 FROM `myTable`
> {code}
> Again, the trick causes Drill to read each file in a separate scan operator 
> (simulating what happens when queries run at scale.)
> The scan operator for {{file1.csv}} will see no partitions, so it will omit 
> "dir0" and the Project operator will helpfully fill in a Nullable INT. The 
> scan operator for {{file2.csv}} sees one level of partition, so sets {{dir0}} 
> to {{nested}} as a Nullable VARCHAR.
> What does the client see? Two records: one with "dir0" as a Nullable INT, the 
> other as a Nullable VARCHAR. Clients such as JDBC and ODBC see a hard schema 
> change between the two records.
> The two cases described above are really two versions of the same issue. 
> Clients expect that, if they use the "dir0", "dir1", ... columns, the 
> type is always Nullable VARCHAR so that the schema stays consistent across 
> batches.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7082) Inconsistent results with implicit partition columns, multi scans

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-7082:

Fix Version/s: 1.17.0

> Inconsistent results with implicit partition columns, multi scans
> -
>
> Key: DRILL-7082
> URL: https://issues.apache.org/jira/browse/DRILL-7082
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
> Fix For: 1.17.0
>
>
> The runtime behavior of implicit partition columns is wildly inconsistent to 
> the point of being unusable. Consider the following query:
> {code:sql}
> SELECT * FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> Our test files are small. It turns out that, even if we write a test that scans a 
> few files, such as the above example, Drill will group all the reads into a 
> single fragment with a single scan operator. When that happens:
> * The partition columns appear before the data columns: (dir0, a, b, c).
> * The partition columns always appear in every row.
> We get the above result because a single scan operator sees both files and 
> knows the right number of partition columns to create for each.
> But, we know that, if two scans each read files at different depths, the 
> "shallower" one won't see as many partition directories as the "deeper" one. 
> To test this, I modified the text reader to accept a new session option that 
> sets the minimum parallelization. I set it to 2 (same as the number of 
> files.) One could probably also see this by creating large text files so that 
> the Drill parallelizer will choose to create two fragments.
> Then, I ran the above query 10 times. Now, I get these results:
> * Half the time, the first row has only the data columns (a, b, c), the other 
> half of the time the first row has a partition column. (Depending on which 
> file returned data first.)
> * Some of the time the partition column appears in the first position (dir0, 
> a, b, c) and some of the time in the last (a, b, c, dir0). (I have no idea 
> why.)
> The result is, from a two-file query, depending on random factors, your first 
> row schema could be:
> * (a, b, c)
> * (dir0, a, b, c)
> * (a, b, c, dir0)
> In many cases, the second row comes with a hard schema change to a different 
> format.
> The above is demonstrated in the (soon to be provided) {{TestPartitionRace}} 
> unit test.
> IMHO, the behavior is basically unusable as any JDBC/ODBC client will see an 
> inconsistent, changing schema. Instead, what a user would expect is:
> * The partition columns are in the same location in every row (preferably at 
> the end, so data columns remain in fixed positions regardless of the number 
> of partition columns.)
> * The same number of columns in every row. This means that all scan operators 
> must use a single uniform partition depth count, preferably set at plan time 
> in the group scan node that has visibility to all the files to scan.
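The plan-time fix suggested in the last bullet can be sketched like this (illustrative plain Java, not Drill's actual group-scan code): compute one uniform partition depth as the maximum directory depth over all selected files, so every scan operator emits the same number of dirN columns regardless of which files it happens to read.

```java
import java.util.List;

// Hypothetical plan-time helper: depth = number of directories between
// the table root and the file, taken as a maximum over the whole selection.
public class PartitionDepth {
  static int uniformDepth(List<String> relativePaths) {
    int max = 0;
    for (String p : relativePaths) {
      max = Math.max(max, (int) p.chars().filter(c -> c == '/').count());
    }
    return max;
  }

  public static void main(String[] args) {
    List<String> files = List.of("file1.csv", "nested/file2.csv");
    System.out.println(uniformDepth(files)); // 1: every reader emits dir0
  }
}
```

Handing this single number to each fragment removes the race: the schema no longer depends on which file returns data first.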



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (DRILL-7082) Inconsistent results with implicit partition columns, multi scans

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reopened DRILL-7082:
-

> Inconsistent results with implicit partition columns, multi scans
> -
>
> Key: DRILL-7082
> URL: https://issues.apache.org/jira/browse/DRILL-7082
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
>
> The runtime behavior of implicit partition columns is wildly inconsistent to 
> the point of being unusable. Consider the following query:
> {code:sql}
> SELECT * FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> Our test files are small. It turns out that, even if we write a test that scans a 
> few files, such as the above example, Drill will group all the reads into a 
> single fragment with a single scan operator. When that happens:
> * The partition columns appear before the data columns: (dir0, a, b, c).
> * The partition columns always appear in every row.
> We get the above result because a single scan operator sees both files and 
> knows the right number of partition columns to create for each.
> But, we know that, if two scans each read files at different depths, the 
> "shallower" one won't see as many partition directories as the "deeper" one. 
> To test this, I modified the text reader to accept a new session option that 
> sets the minimum parallelization. I set it to 2 (same as the number of 
> files.) One could probably also see this by creating large text files so that 
> the Drill parallelizer will choose to create two fragments.
> Then, I ran the above query 10 times. Now, I get these results:
> * Half the time, the first row has only the data columns (a, b, c), the other 
> half of the time the first row has a partition column. (Depending on which 
> file returned data first.)
> * Some of the time the partition column appears in the first position (dir0, 
> a, b, c) and some of the time in the last (a, b, c, dir0). (I have no idea 
> why.)
> The result is, from a two-file query, depending on random factors, your first 
> row schema could be:
> * (a, b, c)
> * (dir0, a, b, c)
> * (a, b, c, dir0)
> In many cases, the second row comes with a hard schema change to a different 
> format.
> The above is demonstrated in the (soon to be provided) {{TestPartitionRace}} 
> unit test.
> IMHO, the behavior is basically unusable as any JDBC/ODBC client will see an 
> inconsistent, changing schema. Instead, what a user would expect is:
> * The partition columns are in the same location in every row (preferably at 
> the end, so data columns remain in fixed positions regardless of the number 
> of partition columns.)
> * The same number of columns in every row. This means that all scan operators 
> must use a single uniform partition depth count, preferably set at plan time 
> in the group scan node that has visibility to all the files to scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7082) Inconsistent results with implicit partition columns, multi scans

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7082.
---
Resolution: Fixed

> Inconsistent results with implicit partition columns, multi scans
> -
>
> Key: DRILL-7082
> URL: https://issues.apache.org/jira/browse/DRILL-7082
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
> Fix For: 1.17.0
>
>
> The runtime behavior of implicit partition columns is wildly inconsistent to 
> the point of being unusable. Consider the following query:
> {code:sql}
> SELECT * FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> Our test files are small. It turns out that, even if we write a test that scans a 
> few files, such as the above example, Drill will group all the reads into a 
> single fragment with a single scan operator. When that happens:
> * The partition columns appear before the data columns: (dir0, a, b, c).
> * The partition columns always appear in every row.
> We get the above result because a single scan operator sees both files and 
> knows the right number of partition columns to create for each.
> But, we know that, if two scans each read files at different depths, the 
> "shallower" one won't see as many partition directories as the "deeper" one. 
> To test this, I modified the text reader to accept a new session option that 
> sets the minimum parallelization. I set it to 2 (same as the number of 
> files.) One could probably also see this by creating large text files so that 
> the Drill parallelizer will choose to create two fragments.
> Then, I ran the above query 10 times. Now, I get these results:
> * Half the time, the first row has only the data columns (a, b, c), the other 
> half of the time the first row has a partition column. (Depending on which 
> file returned data first.)
> * Some of the time the partition column appears in the first position (dir0, 
> a, b, c) and some of the time in the last (a, b, c, dir0). (I have no idea 
> why.)
> The result is, from a two-file query, depending on random factors, your first 
> row schema could be:
> * (a, b, c)
> * (dir0, a, b, c)
> * (a, b, c, dir0)
> In many cases, the second row comes with a hard schema change to a different 
> format.
> The above is demonstrated in the (soon to be provided) {{TestPartitionRace}} 
> unit test.
> IMHO, the behavior is basically unusable as any JDBC/ODBC client will see an 
> inconsistent, changing schema. Instead, what a user would expect is:
> * The partition columns are in the same location in every row (preferably at 
> the end, so data columns remain in fixed positions regardless of the number 
> of partition columns.)
> * The same number of columns in every row. This means that all scan operators 
> must use a single uniform partition depth count, preferably set at plan time 
> in the group scan node that has visibility to all the files to scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-5554) Wrong error type for "SELECT a" from a CSV file without headers

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-5554.
---
Resolution: Fixed

> Wrong error type for "SELECT a" from a CSV file without headers
> ---
>
> Key: DRILL-5554
> URL: https://issues.apache.org/jira/browse/DRILL-5554
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Trivial
> Fix For: 1.17.0
>
>
> Create a CSV file without headers:
> {code}
> 10,foo,bar
> {code}
> Use a CSV storage plugin configured to not skip the first line and not read 
> headers.
> Then, issue the following query:
> {code}
> SELECT a FROM `dfs.data.example.csv`
> {code}
> The result is correct: an error:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: 
> DATA_READ ERROR: Selected column 'a' must have name 'columns' or must be 
> plain '*'
> {code}
> But, the type of error is wrong. This is not a data read error: the file read 
> just fine. The problem is a semantic error: a query form that is not 
> compatible with the storage plugin.
> Suggest using {{UserException.unsupportedError()}} instead since the user is 
> asking the plugin to do something that the plugin does not support.
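The distinction the ticket draws can be sketched in plain Java (stand-ins for Drill's `UserException` builders; the reader code is illustrative): the selection check is a semantic validation, so it should raise an "unsupported" error rather than a data-read error, since the file itself was read without trouble.

```java
import java.util.List;

public class ColumnCheckDemo {
  // Stand-in for the headerless-CSV reader's selection check; in Drill the
  // throw would go through UserException.unsupportedError() per the ticket.
  static void validateSelection(List<String> selected) {
    for (String col : selected) {
      if (!col.equals("columns") && !col.equals("*")) {
        throw new UnsupportedOperationException(
            "Selected column '" + col + "' must have name 'columns' or must be plain '*'");
      }
    }
  }

  public static void main(String[] args) {
    validateSelection(List.of("columns"));   // accepted
    try {
      validateSelection(List.of("a"));       // semantic error, not I/O
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```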



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (DRILL-5554) Wrong error type for "SELECT a" from a CSV file without headers

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reopened DRILL-5554:
-

> Wrong error type for "SELECT a" from a CSV file without headers
> ---
>
> Key: DRILL-5554
> URL: https://issues.apache.org/jira/browse/DRILL-5554
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Trivial
>
> Create a CSV file without headers:
> {code}
> 10,foo,bar
> {code}
> Use a CSV storage plugin configured to not skip the first line and not read 
> headers.
> Then, issue the following query:
> {code}
> SELECT a FROM `dfs.data.example.csv`
> {code}
> The result is correct: an error:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: 
> DATA_READ ERROR: Selected column 'a' must have name 'columns' or must be 
> plain '*'
> {code}
> But, the type of error is wrong. This is not a data read error: the file read 
> just fine. The problem is a semantic error: a query form that is not 
> compatible with the storage plugin.
> Suggest using {{UserException.unsupportedError()}} instead since the user is 
> asking the plugin to do something that the plugin does not support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5554) Wrong error type for "SELECT a" from a CSV file without headers

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-5554:

Fix Version/s: 1.17.0

> Wrong error type for "SELECT a" from a CSV file without headers
> ---
>
> Key: DRILL-5554
> URL: https://issues.apache.org/jira/browse/DRILL-5554
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Trivial
> Fix For: 1.17.0
>
>
> Create a CSV file without headers:
> {code}
> 10,foo,bar
> {code}
> Use a CSV storage plugin configured to not skip the first line and not read 
> headers.
> Then, issue the following query:
> {code}
> SELECT a FROM `dfs.data.example.csv`
> {code}
> The result is correct: an error:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: 
> DATA_READ ERROR: Selected column 'a' must have name 'columns' or must be 
> plain '*'
> {code}
> But, the type of error is wrong. This is not a data read error: the file read 
> just fine. The problem is a semantic error: a query form that is not 
> compatible with the storage plugin.
> Suggest using {{UserException.unsupportedError()}} instead since the user is 
> asking the plugin to do something that the plugin does not support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5487) Vector corruption in CSV with headers and truncated last row

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy updated DRILL-5487:

Fix Version/s: (was: Future)
   1.17.0

> Vector corruption in CSV with headers and truncated last row
> 
>
> Key: DRILL-5487
> URL: https://issues.apache.org/jira/browse/DRILL-5487
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
> Fix For: 1.17.0
>
>
> The CSV format plugin allows two ways of reading data:
> * As named columns
> * As a single array, called {{columns}}, that holds all columns for a row
> The named columns feature will corrupt the offset vectors if the last row of 
> the file is truncated: leaves off one or more columns.
> To illustrate the CSV data corruption, I created a CSV file, test4.csv, of 
> the following form:
> {code}
> h,u
> abc,def
> ghi
> {code}
> Note that the file is truncated: the comma and second field are missing on 
> the last line.
> Then, I created a simple test using the "cluster fixture" framework:
> {code}
>   @Test
>   public void readerTest() throws Exception {
> FixtureBuilder builder = ClusterFixture.builder()
> .maxParallelization(1);
> try (ClusterFixture cluster = builder.build();
>  ClientFixture client = cluster.clientFixture()) {
>   TextFormatConfig csvFormat = new TextFormatConfig();
>   csvFormat.fieldDelimiter = ',';
>   csvFormat.skipFirstLine = false;
>   csvFormat.extractHeader = true;
>   cluster.defineWorkspace("dfs", "data", "/tmp/data", "csv", csvFormat);
>   String sql = "SELECT * FROM `dfs.data`.`csv/test4.csv` LIMIT 10";
>   client.queryBuilder().sql(sql).printCsv();
> }
>   }
> {code}
> The results show we've got a problem:
> {code}
> Exception (no rows returned): 
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: length: -3 (expected: >= 0)
> {code}
> If the last line were:
> {code}
> efg,
> {code}
> Then the offset vector should look like this:
> {code}
> [0, 3, 3]
> {code}
> Very likely we have an offset vector that looks like this instead:
> {code}
> [0, 3, 0]
> {code}
> When we compute the second column of the second row, we should compute:
> {code}
> length = offset[2] - offset[1] = 3 - 3 = 0
> {code}
> Instead we get:
> {code}
> length = offset[2] - offset[1] = 0 - 3 = -3
> {code}
> The summary is that a premature EOF appears to cause the "missing" columns to 
> be skipped; they are not filled with a blank value to "bump" the offset 
> vectors to fill in the last row. Instead, they are left at 0, causing havoc 
> downstream in the query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (DRILL-5487) Vector corruption in CSV with headers and truncated last row

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reopened DRILL-5487:
-

> Vector corruption in CSV with headers and truncated last row
> 
>
> Key: DRILL-5487
> URL: https://issues.apache.org/jira/browse/DRILL-5487
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
> Fix For: Future
>
>
> The CSV format plugin allows two ways of reading data:
> * As named columns
> * As a single array, called {{columns}}, that holds all columns for a row
> The named columns feature will corrupt the offset vectors if the last row of 
> the file is truncated: leaves off one or more columns.
> To illustrate the CSV data corruption, I created a CSV file, test4.csv, of 
> the following form:
> {code}
> h,u
> abc,def
> ghi
> {code}
> Note that the file is truncated: the comma and second field are missing on 
> the last line.
> Then, I created a simple test using the "cluster fixture" framework:
> {code}
>   @Test
>   public void readerTest() throws Exception {
> FixtureBuilder builder = ClusterFixture.builder()
> .maxParallelization(1);
> try (ClusterFixture cluster = builder.build();
>  ClientFixture client = cluster.clientFixture()) {
>   TextFormatConfig csvFormat = new TextFormatConfig();
>   csvFormat.fieldDelimiter = ',';
>   csvFormat.skipFirstLine = false;
>   csvFormat.extractHeader = true;
>   cluster.defineWorkspace("dfs", "data", "/tmp/data", "csv", csvFormat);
>   String sql = "SELECT * FROM `dfs.data`.`csv/test4.csv` LIMIT 10";
>   client.queryBuilder().sql(sql).printCsv();
> }
>   }
> {code}
> The results show we've got a problem:
> {code}
> Exception (no rows returned): 
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: length: -3 (expected: >= 0)
> {code}
> If the last line were:
> {code}
> efg,
> {code}
> Then the offset vector should look like this:
> {code}
> [0, 3, 3]
> {code}
> Very likely we have an offset vector that looks like this instead:
> {code}
> [0, 3, 0]
> {code}
> When we compute the second column of the second row, we should compute:
> {code}
> length = offset[2] - offset[1] = 3 - 3 = 0
> {code}
> Instead we get:
> {code}
> length = offset[2] - offset[1] = 0 - 3 = -3
> {code}
> The summary is that a premature EOF appears to cause the "missing" columns to 
> be skipped; they are not filled with a blank value to "bump" the offset 
> vectors to fill in the last row. Instead, they are left at 0, causing havoc 
> downstream in the query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-6952) Merge row set based "compliant" text reader

2019-06-25 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-6952.
---

Verified with Drill version 1.17.0-SNAPSHOT (commit 
f3f7dbd40f5e899f2aacba35db8f50ffedfa9d3d)
Cases checked:

Tested with different storage plugin parameters (extractHeader, delimiters, etc.)
The same with the table function.
Complex JSON files with nested maps and arrays.
Data with implicit columns (with the V3 reader, all such columns are moved to the 
end of rows).
Aggregate functions with specific columns and with the wildcard.
Large text fields (they were limited to 65536 characters; now fixed).
No significant changes in performance were discovered (compared test runs with 
the different readers).
Some bugs were fixed by the V3 reader:
DRILL-5487, DRILL-5554, DRILL-, DRILL-4814, DRILL-7034, DRILL-7082, 
DRILL-7083
Bugs that were introduced by the V3 reader and then fixed:
DRILL-7181, DRILL-7257, DRILL-7258

> Merge row set based "compliant" text reader
> ---
>
> Key: DRILL-6952
> URL: https://issues.apache.org/jira/browse/DRILL-6952
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The result set loader project created a revised version of the compliant text 
> reader that uses the result set loader framework (which includes the 
> schema-based projection framework.)
> This task merges that work into master:
> * Review the history of the compliant text reader for changes made in the 
> last year since the code was written.
> * Apply those changes to the row set-based code, as necessary.
> * Issue a PR for the new version of the compliant text reader
> * Work through any test issues that crop up in the pre-commit tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7083) Wrong data type for explicit partition column beyond file depth

2019-06-21 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7083.
---
Resolution: Fixed

Verified with Drill version 1.17.0-SNAPSHOT (commit 
f3f7dbd40f5e899f2aacba35db8f50ffedfa9d3d)
Used the drillTypeOf() function to determine the implicit column type.
Tested with different levels of nesting.

> Wrong data type for explicit partition column beyond file depth
> ---
>
> Key: DRILL-7083
> URL: https://issues.apache.org/jira/browse/DRILL-7083
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Consider the simple case in DRILL-7082. That ticket talks about implicit 
> partition columns created by the wildcard. Consider a very similar case:
> {code:sql}
> SELECT a, b, c, dir0, dir1 FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> If the query is run in "stock" Drill, the planner will place both files 
> within a single scan operator as described in DRILL-7082. The result schema 
> will be:
> {noformat}
> (a VARCHAR, b VARCHAR, c VARCHAR, dir0 VARCHAR, dir1 INT)
> {noformat}
> Notice that last column: why is "dir1" a (nullable) INT? The partition 
> mechanism only recognizes partitions that actually exist, leaving the Project 
> operator to fill in (with a Nullable INT) any partitions that don't exist 
> (any directory levels not actually seen by the scan operator.)
> Now, using the same trick as in DRILL-7082, try the query
> {code:sql}
> SELECT a, b, c, dir0 FROM `myTable`
> {code}
> Again, the trick causes Drill to read each file in a separate scan operator 
> (simulating what happens when queries run at scale.)
> The scan operator for {{file1.csv}} will see no partitions, so it will omit 
> "dir0" and the Project operator will helpfully fill in a Nullable INT. The 
> scan operator for {{file2.csv}} sees one level of partition, so sets {{dir0}} 
> to {{nested}} as a Nullable VARCHAR.
> What does the client see? Two records: one with "dir0" as a Nullable INT, the 
> other as a Nullable VARCHAR. Client such as JDBC and ODBC see a hard schema 
> change between the two records.
> The two cases described above are really two versions of the same issue. 
> Clients expect that, if they use the "dir0", "dir1", ... columns, that the 
> type is always Nullable Varchar so that the schema stays consistent across 
> batches.
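
The consistent-type expectation stated above can also be enforced from the query side; a minimal client-side sketch (the CAST is a standard workaround, not part of this ticket's fix):
{code:sql}
-- Sketch: pin the deeper partition column to a consistent nullable VARCHAR,
-- so clients see the same type whether or not that partition level exists.
SELECT a, b, c, dir0, CAST(dir1 AS VARCHAR) AS dir1
FROM `myTable`
{code}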



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7082) Inconsistent results with implicit partition columns, multi scans

2019-06-21 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7082.
---
Resolution: Fixed

Verified with Drill version 1.17.0-SNAPSHOT (commit 
f3f7dbd40f5e899f2aacba35db8f50ffedfa9d3d)
Used the option `exec.storage.min_width` = 2 (and higher values).
Tested with different levels of nesting.

> Inconsistent results with implicit partition columns, multi scans
> -
>
> Key: DRILL-7082
> URL: https://issues.apache.org/jira/browse/DRILL-7082
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Minor
>
> The runtime behavior of implicit partition columns is wildly inconsistent to 
> the point of being unusable. Consider the following query:
> {code:sql}
> SELECT * FROM `myTable`
> {code}
> Where {{myTable}} is a directory of CSV files, each with schema {{(a, b, c)}}:
> {noformat}
> myTable
> |- file1.csv
> |- nested
>|- file2.csv
> {noformat}
> Our test files are small. It turns out that, even if we write a test that scans a 
> few files, such as the above example, Drill will group all the reads into a 
> single fragment with a single scan operator. When that happens:
> * The partition columns appear before the data columns: (dir0, a, b, c).
> * The partition columns always appear in every row.
> We get the above result because a single scan operator sees both files and 
> knows the right number of partition columns to create for each.
> But, we know that, if two scans each read files at different depths, the 
> "shallower" one won't see as many partition directories as the "deeper" one. 
> To test this, I modified the text reader to accept a new session option that 
> sets the minimum parallelization. I set it to 2 (same as the number of 
> files.) One could probably also see this by creating large text files so that 
> the Drill parallelizer will choose to create two fragments.
> Then, I ran the above query 10 times. Now, I get these results:
> * Half the time, the first row has only the data columns (a, b, c), the other 
> half of the time the first row has a partition column. (Depending on which 
> file returned data first.)
> * Some of the time the partition column appears in the first position (dir0, 
> a, b, c) and some of the time in the last (a, b, c, dir0). (I have no idea 
> why.)
> The result is, from a two-file query, depending on random factors, your first 
> row schema could be:
> * (a, b, c)
> * (dir0, a, b, c)
> * (a, b, c, dir0)
> In many cases, the second row comes with a hard schema change to a different 
> format.
> The above is demonstrated in the (soon to be provided) {{TestPartitionRace}} 
> unit test.
> IMHO, the behavior is basically unusable as any JDBC/ODBC client will see an 
> inconsistent, changing schema. Instead, what a user would expect is:
> * The partition columns are in the same location in every row (preferably at 
> the end, so data columns remain in fixed positions regardless of the number 
> of partition columns.)
> * The same number of columns in every row. This means that all scan operators 
> must use a single uniform partition depth count, preferably set at plan type 
> in the group scan node that has visibility to all the files to scan.
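
The reproduction trick described above can be sketched with the session option named in the verification comment (`exec.storage.min_width`; treat the exact option name as specific to that test build rather than a stock Drill option):
{code:sql}
-- Sketch: request at least two fragments so each CSV file is read by a
-- separate scan operator, exposing the schema race described above.
ALTER SESSION SET `exec.storage.min_width` = 2;
SELECT * FROM `myTable`;
{code}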



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-4814) extractHeader attribute not working with the table function

2019-06-13 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-4814.
---

Verified with Drill version 1.17.0-SNAPSHOT (commit 
de0aec7951254949ae9206d6f63b5077684dac8a).
The issue is not reproducible with V3 text reader.

> extractHeader attribute not working with the table function
> ---
>
> Key: DRILL-4814
> URL: https://issues.apache.org/jira/browse/DRILL-4814
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.8.0
>Reporter: Krystal
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.16.0
>
>
> I have the following table with line delimiter as \r\n:
> Id,col1,col2
> 1,aaa,bbb
> 2,ccc,ddd
> 3,eee
> 4,fff,ggg
> The following queries work fine:
> select * from 
> table(`drill-3149/header.csv`(type=>'text',lineDelimiter=>'\r\n',fieldDelimiter=>','));
> +----------------------+
> | columns              |
> +----------------------+
> | ["Id","col1","col2"] |
> | ["1","aaa","bbb"]    |
> | ["2","ccc","ddd"]    |
> | ["3","eee"]          |
> | ["4","fff","ggg"]    |
> +----------------------+
> select * from 
> table(`drill-3149/header.csv`(type=>'text',lineDelimiter=>'\r\n',fieldDelimiter=>',',skipFirstLine=>true));
> +--------------------+
> | columns            |
> +--------------------+
> | ["1","aaa","bbb"]  |
> | ["2","ccc","ddd"]  |
> | ["3","eee"]        |
> | ["4","fff","ggg"]  |
> +--------------------+
> The following query fails with the extractHeader attribute:
> select * from 
> table(`drill-3149/header.csv`(type=>'text',lineDelimiter=>'\r\n',fieldDelimiter=>',',extractHeader=>true));
> {code}
> java.lang.IndexOutOfBoundsException: index: 254, length: 3 (expected: 
> range(0, 256))
>   at io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1134)
>   at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.getBytes(PooledUnsafeDirectByteBuf.java:136)
>   at io.netty.buffer.WrappedByteBuf.getBytes(WrappedByteBuf.java:289)
>   at 
> io.netty.buffer.UnsafeDirectLittleEndian.getBytes(UnsafeDirectLittleEndian.java:30)
>   at io.netty.buffer.DrillBuf.getBytes(DrillBuf.java:629)
>   at 
> org.apache.drill.exec.vector.VarCharVector$Accessor.get(VarCharVector.java:441)
>   at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getBytes(VarCharAccessor.java:125)
>   at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getString(VarCharAccessor.java:146)
>   at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getObject(VarCharAccessor.java:136)
>   at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getObject(VarCharAccessor.java:94)
>   at 
> org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getObject(BoundCheckingAccessor.java:148)
>   at 
> org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getObject(TypeConvertingSqlAccessor.java:795)
>   at 
> org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:179)
>   at 
> net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.getObject(DrillResultSetImpl.java:420)
>   at sqlline.Rows$Row.<init>(Rows.java:157)
>   at sqlline.IncrementalRows.hasNext(IncrementalRows.java:63)
>   at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
>   at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
>   at sqlline.SqlLine.print(SqlLine.java:1593)
>   at sqlline.Commands.execute(Commands.java:852)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:746)
>   at sqlline.SqlLine.begin(SqlLine.java:621)
>   at sqlline.SqlLine.start(SqlLine.java:375)
>   at sqlline.SqlLine.main(SqlLine.java:268)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7034) Window function over a malformed CSV file crashes the JVM

2019-06-13 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7034.
---

Verified with Drill version 1.17.0-SNAPSHOT (commit 
de0aec7951254949ae9206d6f63b5077684dac8a).
The issue is not reproducible with V3 text reader.

> Window function over a malformed CSV file crashes the JVM 
> --
>
> Key: DRILL-7034
> URL: https://issues.apache.org/jira/browse/DRILL-7034
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.15.0
>Reporter: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: hs_err_pid23450.log, janino8470007454663483217.java
>
>
> The JVM crashes executing window functions over (an ordered) CSV file with a 
> small format issue - an empty line.
> To create: Take the following simple `a.csvh` file:
> {noformat}
> amount
> 10
> 11
> {noformat}
> And execute a simple window function like
> {code:sql}
> select max(amount) over(order by amount) FROM dfs.`/data/a.csvh`;
> {code}
> Then add an empty line between the `10` and the `11`:
> {noformat}
> amount
> 10
>
> 11
> {noformat}
>  and try again:
> {noformat}
> 0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM 
> dfs.`/data/a.csvh`;
> +---------+
> | EXPR$0  |
> +---------+
> | 10      |
> | 11      |
> +---------+
> 2 rows selected (3.554 seconds)
> 0: jdbc:drill:zk=local> select max(amount) over(order by amount) FROM 
> dfs.`/data/a.csvh`;
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x0001064aeae7, pid=23450, tid=0x6103
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 
> 1.8.0_181-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode bsd-amd64 
> compressed oops)
> # Problematic frame:
> # J 6719% C2 
> org.apache.drill.exec.expr.fn.impl.ByteFunctionHelpers.memcmp(JIIJII)I (188 
> bytes) @ 0x0001064aeae7 [0x0001064ae920+0x1c7]
> #
> # Core dump written. Default location: /cores/core or core.23450
> #
> # An error report file with more information is saved as:
> # /Users/boazben-zvi/IdeaProjects/drill/hs_err_pid23450.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> Abort trap: 6 (core dumped)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7020) big varchar doesn't work with extractHeader=true

2019-06-11 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861196#comment-16861196
 ] 

Anton Gozhiy commented on DRILL-7020:
-

Fixed in Drill V3 text reader, see DRILL-7258 for details.

> big varchar doesn't work with extractHeader=true
> 
>
> Key: DRILL-7020
> URL: https://issues.apache.org/jira/browse/DRILL-7020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.15.0
>Reporter: benj
>Priority: Major
>
> with a TEST file of csv type like
> {code:java}
> col1,col2
> w,x
> ...y...,z
> {code}
> where ...y... is a string longer than 65536 characters (say, 66000)
> SELECT with +*extractHeader=false*+ is OK:
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => false));
> +---------+------+
> | col1    | col2 |
> +---------+------+
> | w       | x    |
> | ...y... | z    |
> {code}
> But SELECT with +*extractHeader=true*+ gives an error
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => true));
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> columnIndex 1
> Limit 65536
> Fragment 0:0
> {code}
> Note that it is possible to use extractHeader=false with skipFirstLine=true, but 
> in this case it is not possible to automatically get column names.
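
The skipFirstLine workaround mentioned in the note can be sketched as follows; the column aliases are assigned by hand, since the header line is skipped rather than extracted:
{code:sql}
-- Sketch: skip the header and alias the generic columns array manually.
SELECT columns[0] AS col1, columns[1] AS col2
FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',',
                      skipFirstLine => true));
{code}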



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (DRILL-7020) big varchar doesn't work with extractHeader=true

2019-06-11 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy resolved DRILL-7020.
-
Resolution: Duplicate

> big varchar doesn't work with extractHeader=true
> 
>
> Key: DRILL-7020
> URL: https://issues.apache.org/jira/browse/DRILL-7020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.15.0
>Reporter: benj
>Priority: Major
>
> with a TEST file of csv type like
> {code:java}
> col1,col2
> w,x
> ...y...,z
> {code}
> where ...y... is a string longer than 65536 characters (say, 66000)
> SELECT with +*extractHeader=false*+ is OK:
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => false));
> +---------+------+
> | col1    | col2 |
> +---------+------+
> | w       | x    |
> | ...y... | z    |
> {code}
> But SELECT with +*extractHeader=true*+ gives an error
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => true));
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> columnIndex 1
> Limit 65536
> Fragment 0:0
> {code}
> Note that it is possible to use extractHeader=false with skipFirstLine=true, but 
> in this case it is not possible to automatically get column names.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7020) big varchar doesn't work with extractHeader=true

2019-06-11 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7020.
---

> big varchar doesn't work with extractHeader=true
> 
>
> Key: DRILL-7020
> URL: https://issues.apache.org/jira/browse/DRILL-7020
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.15.0
>Reporter: benj
>Priority: Major
>
> with a TEST file of csv type like
> {code:java}
> col1,col2
> w,x
> ...y...,z
> {code}
> where ...y... is a string longer than 65536 characters (say, 66000)
> SELECT with +*extractHeader=false*+ is OK:
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => false));
> +---------+------+
> | col1    | col2 |
> +---------+------+
> | w       | x    |
> | ...y... | z    |
> {code}
> But SELECT with +*extractHeader=true*+ gives an error
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => true));
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> columnIndex 1
> Limit 65536
> Fragment 0:0
> {code}
> Note that it is possible to use extractHeader=false with skipFirstLine=true, but 
> in this case it is not possible to automatically get column names.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7258) [Text V3 Reader] Unsupported operation error is thrown when select a column with a long string

2019-06-10 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7258.
---

> [Text V3 Reader] Unsupported operation error is thrown when select a column 
> with a long string
> --
>
> Key: DRILL-7258
> URL: https://issues.apache.org/jira/browse/DRILL-7258
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
> Attachments: 10.tbl
>
>
> *Data:*
> 10.tbl is attached
> *Steps:*
> # Set exec.storage.enable_v3_text_reader=true
> # Run the following query:
> {code:sql}
> select * from dfs.`/tmp/drill/data/10.tbl`
> {code}
> *Expected result:*
> The query should return result normally.
> *Actual result:*
> Exception is thrown:
> {noformat}
> UNSUPPORTED_OPERATION ERROR: Drill Remote Exception
>   (java.lang.Exception) UNSUPPORTED_OPERATION ERROR: Text column is too large.
> Column 0
> Limit 65536
> Fragment 0:0
> [Error Id: 5f73232f-f0c0-48aa-ab0f-b5f86495d3c8 on userf87d-pc:31010]
> org.apache.drill.common.exceptions.UserException$Builder.build():630
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.BaseFieldOutput.append():131
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseValueAll():208
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseValue():225
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseField():341
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseRecord():137
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseNext():388
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.CompliantTextBatchReader.next():220
> 
> org.apache.drill.exec.physical.impl.scan.framework.ShimBatchReader.next():132
> org.apache.drill.exec.physical.impl.scan.ReaderState.readBatch():397
> org.apache.drill.exec.physical.impl.scan.ReaderState.next():354
> org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.nextAction():184
> org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.next():159
> org.apache.drill.exec.physical.impl.protocol.OperatorDriver.doNext():176
> org.apache.drill.exec.physical.impl.protocol.OperatorDriver.next():114
> 
> org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch.next():147
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
> ...():0
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
> org.apache.drill.common.SelfCleaningRunnable.run():38
> ...():0
> {noformat}
> *Note:* works fine with v2 reader. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7258) [Text V3 Reader] Unsupported operation error is thrown when select a column with a long string

2019-06-10 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860136#comment-16860136
 ] 

Anton Gozhiy commented on DRILL-7258:
-

Verified with Drill version 1.17.0-SNAPSHOT (commit 
de0aec7951254949ae9206d6f63b5077684dac8a).
Checked column names, columns-array items, and star combinations.

> [Text V3 Reader] Unsupported operation error is thrown when select a column 
> with a long string
> --
>
> Key: DRILL-7258
> URL: https://issues.apache.org/jira/browse/DRILL-7258
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
> Attachments: 10.tbl
>
>
> *Data:*
> 10.tbl is attached
> *Steps:*
> # Set exec.storage.enable_v3_text_reader=true
> # Run the following query:
> {code:sql}
> select * from dfs.`/tmp/drill/data/10.tbl`
> {code}
> *Expected result:*
> The query should return result normally.
> *Actual result:*
> Exception is thrown:
> {noformat}
> UNSUPPORTED_OPERATION ERROR: Drill Remote Exception
>   (java.lang.Exception) UNSUPPORTED_OPERATION ERROR: Text column is too large.
> Column 0
> Limit 65536
> Fragment 0:0
> [Error Id: 5f73232f-f0c0-48aa-ab0f-b5f86495d3c8 on userf87d-pc:31010]
> org.apache.drill.common.exceptions.UserException$Builder.build():630
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.BaseFieldOutput.append():131
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseValueAll():208
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseValue():225
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseField():341
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseRecord():137
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.TextReader.parseNext():388
> 
> org.apache.drill.exec.store.easy.text.compliant.v3.CompliantTextBatchReader.next():220
> 
> org.apache.drill.exec.physical.impl.scan.framework.ShimBatchReader.next():132
> org.apache.drill.exec.physical.impl.scan.ReaderState.readBatch():397
> org.apache.drill.exec.physical.impl.scan.ReaderState.next():354
> org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.nextAction():184
> org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.next():159
> org.apache.drill.exec.physical.impl.protocol.OperatorDriver.doNext():176
> org.apache.drill.exec.physical.impl.protocol.OperatorDriver.next():114
> 
> org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch.next():147
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
> ...():0
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
> org.apache.drill.common.SelfCleaningRunnable.run():38
> ...():0
> {noformat}
> *Note:* works fine with v2 reader. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7257) [Text V3 Reader] dir0 is empty if a column filter returns all lines.

2019-06-10 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7257.
---

Verified with Drill version 1.17.0-SNAPSHOT (commit 
2615d68de4e44b1f03f5c047018548c06a7396b4)
Checked with different filter values and operators.

> [Text V3 Reader] dir0 is empty if a column filter returns all lines.
> 
>
> Key: DRILL-7257
> URL: https://issues.apache.org/jira/browse/DRILL-7257
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
> Attachments: lineitempart.zip
>
>
> *Data:*
> Unzip the attached archive: lineitempart.zip.
> *Steps:*
> # Set exec.storage.enable_v3_text_reader=true
> # Run the following query:
> {code:sql}
> select columns[0], dir0 from dfs.tmp.`/drill/data/lineitempart` where 
> dir0=1994 and columns[0]>29766 order by columns[0] limit 1;
> {code}
> *Expected result:*
> {noformat}
> +--------+------+
> | EXPR$0 | dir0 |
> +--------+------+
> | 29767  | 1994 |
> +--------+------+
> {noformat}
> *Actual result:*
> {noformat}
> +--------+------+
> | EXPR$0 | dir0 |
> +--------+------+
> | 29767  |      |
> +--------+------+
> {noformat}
> *Note:* If the filter is changed a bit so that it doesn't return all lines, 
> everything is OK:
> {noformat}
> apache drill> select columns[0], dir0 from dfs.tmp.`/drill/data/lineitempart` 
> where dir0=1994 and columns[0]>29767 order by columns[0] limit 1;
> +--------+------+
> | EXPR$0 | dir0 |
> +--------+------+
> | 29792  | 1994 |
> +--------+------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7286) Joining a table with itself using subquery results in exception.

2019-06-06 Thread Anton Gozhiy (JIRA)
Anton Gozhiy created DRILL-7286:
---

 Summary: Joining a table with itself using subquery results in 
exception.
 Key: DRILL-7286
 URL: https://issues.apache.org/jira/browse/DRILL-7286
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Anton Gozhiy


*Steps:*
# Create some test table, like:
{code:sql}
create table t as select * from cp.`employee.json`;
{code}
# Execute the query:
{code:sql}
select * from (select * from t) d1 join t d2 on d1.employee_id = d2.employee_id 
limit 1;
{code}

*Expected result:*
A result should be returned normally.

*Actual result:*
Exception happened:
{noformat}
Error: SYSTEM ERROR: IndexOutOfBoundsException: index (2) must be less than 
size (2)


Please, refer to logs for more information.

[Error Id: 92a5ce8e-8640-4636-a897-8f360ddf8ea3 on userf87d-pc:31010]

  (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
during fragment initialization: index (2) must be less than size (2)
org.apache.drill.exec.work.foreman.Foreman.run():305
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
  Caused By (java.lang.IndexOutOfBoundsException) index (2) must be less than 
size (2)
com.google.common.base.Preconditions.checkElementIndex():310
com.google.common.base.Preconditions.checkElementIndex():293
com.google.common.collect.RegularImmutableList.get():67
org.apache.calcite.util.Pair$3.get():295

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject():163

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject():44
org.apache.drill.exec.planner.physical.ProjectPrel.accept():105

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():196

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():44

org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitJoin():51
org.apache.drill.exec.planner.physical.JoinPrel.accept():71

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():196

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():44
org.apache.drill.exec.planner.physical.LimitPrel.accept():88

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():196

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():44

org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitExchange():46
org.apache.drill.exec.planner.physical.ExchangePrel.accept():36

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():196

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitPrel():44
org.apache.drill.exec.planner.physical.LimitPrel.accept():88

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject():157

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject():44
org.apache.drill.exec.planner.physical.ProjectPrel.accept():105

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen():76

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen():44
org.apache.drill.exec.planner.physical.ScreenPrel.accept():65

org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.insertRenameProject():71

org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():513
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():178
org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():226
org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():124
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():90
org.apache.drill.exec.work.foreman.Foreman.runSQL():593
org.apache.drill.exec.work.foreman.Foreman.run():276
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748 (state=,code=0)
{noformat}

*Note:* The same query without subquery works fine:
{code:sql}
select * from t d1 join t d2 on d1.employee_id = d2.employee_id limit 1;
{code}
{noformat}
+-+--++---+-++--+---++---+-+---+-+++---+--+--+-++--+-+---++-+---+-++--+-+-+---+
| employee_id |  full_name   | first_name | last_name | position_id | 
position_title | 

[jira] [Created] (DRILL-7285) A temporary table has a higher priority than the CTE table with the same name.

2019-06-06 Thread Anton Gozhiy (JIRA)
Anton Gozhiy created DRILL-7285:
---

 Summary: A temporary table has a higher priority than the CTE 
table with the same name.
 Key: DRILL-7285
 URL: https://issues.apache.org/jira/browse/DRILL-7285
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Anton Gozhiy


*Steps:*
# Switch to a workspace:
{code:sql}
use dfs.tmp
{code}
# Create a temporary table:
{code:sql}
create temporary table t as select 'temp table' as a;
{code}
# Run the following query:
{code:sql}
with t as (select 'cte' as a) select * from t;
{code}

*Expected result:* content from the CTE table should be returned:
{noformat}
+------------+
| a          |
+------------+
| cte        |
+------------+
{noformat}

*Actual result:* the temporary table content is returned instead:
{noformat}
+------------+
| a          |
+------------+
| temp table |
+------------+
{noformat}
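
Until the lookup order is corrected, the collision can be sidestepped by picking a CTE name that does not match any existing temporary table; a minimal sketch (t2 is an arbitrary, hypothetical name):
{code:sql}
-- Sketch: a non-colliding CTE name avoids the temporary-table shadowing.
with t2 as (select 'cte' as a) select * from t2;
{code}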



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (DRILL-7181) [Text V3 Reader] Exception with inadequate message is thrown if select columns as array with extractHeader set to true

2019-06-05 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7181.
---

Verified with Drill version 1.17.0-SNAPSHOT (commit 
2615d68de4e44b1f03f5c047018548c06a7396b4)
The message is clear now.

> [Text V3 Reader] Exception with inadequate message is thrown if select 
> columns as array with extractHeader set to true
> --
>
> Key: DRILL-7181
> URL: https://issues.apache.org/jira/browse/DRILL-7181
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> *Prerequisites:*
>  # Create a simple .csv file with header, like this:
> {noformat}
> col1,col2,col3
> 1,2,3
> 4,5,6
> 7,8,9
> {noformat}
>  # Set exec.storage.enable_v3_text_reader=true
>  # Set "extractHeader": true for csv format in dfs storage plugin.
> *Query:*
> {code:sql}
> select columns[0] from dfs.tmp.`/test.csv`
> {code}
> *Expected result:* Exception should happen, here is the message from V2 
> reader:
> {noformat}
> UNSUPPORTED_OPERATION ERROR: Drill Remote Exception
>   (java.lang.Exception) UNSUPPORTED_OPERATION ERROR: With extractHeader 
> enabled, only header names are supported
> column name columns
> column index
> Fragment 0:0
> [Error Id: 5affa696-1dbd-43d7-ac14-72d235c00f43 on userf87d-pc:31010]
> org.apache.drill.common.exceptions.UserException$Builder.build():630
> 
> org.apache.drill.exec.store.easy.text.compliant.FieldVarCharOutput.():106
> 
> org.apache.drill.exec.store.easy.text.compliant.CompliantTextRecordReader.setup():139
> org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():321
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():216
> org.apache.drill.exec.physical.impl.ScanBatch.next():271
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():101
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():101
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.record.AbstractRecordBatch.next():126
> org.apache.drill.exec.record.AbstractRecordBatch.next():116
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
> ...():0
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
> org.apache.drill.common.SelfCleaningRunnable.run():38
> ...():0
> {noformat}
> *Actual result:* The exception message is inadequate:
> {noformat}
> org.apache.drill.common.exceptions.UserRemoteException: EXECUTION_ERROR 
> ERROR: Table schema must have exactly one column.
> Exception thrown from 
> org.apache.drill.exec.physical.impl.scan.ScanOperatorExec
> Fragment 0:0
> [Error Id: a76a1576-419a-413f-840f-088157167a6d on userf87d-pc:31010]
>   (java.lang.IllegalStateException) Table schema must have exactly one column.
> 
> org.apache.drill.exec.physical.impl.scan.columns.ColumnsArrayManager.resolveColumn():108
> 
> org.apache.drill.exec.physical.impl.scan.project.ReaderLevelProjection.resolveSpecial():91
> 
> org.apache.drill.exec.physical.impl.scan.project.ExplicitSchemaProjection.resolveRootTuple():62
> 
> org.apache.drill.exec.physical.impl.scan.project.ExplicitSchemaProjection.&lt;init&gt;():52
> 
> 

[jira] [Closed] (DRILL-4843) Trailing spaces in CSV column headers cause IndexOutOfBoundsException

2019-05-21 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-4843.
---
Resolution: Fixed

The issue is not reproducible with Drill version 1.17.0-SNAPSHOT (commit id 
0195d1f34be7fd385ba76d2fd3e14a9fa13bd375)

> Trailing spaces in CSV column headers cause IndexOutOfBoundsException
> -
>
> Key: DRILL-4843
> URL: https://issues.apache.org/jira/browse/DRILL-4843
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.6.0, 1.7.0
> Environment: MapR Community cluster on CentOS 7.2
>Reporter: Matt Keranen
>Assignee: Paul Rogers
>Priority: Major
>
> When a CSV file with a header row has spaces after commas, an IOBE is thrown 
> when trying to reference column names. For example, this will cause the 
> exception:
> {{col1, col2, col3}}
> Where this will not
> {{col1,col2,col3}}
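
The fix amounts to trimming the whitespace around each extracted header name. A minimal illustrative sketch of the idea (hypothetical `HeaderTrimDemo`, not Drill's actual header-parsing code):

```java
public class HeaderTrimDemo {
    // Split a CSV header line and trim surrounding whitespace from each name,
    // so "col1, col2, col3" yields the same column names as "col1,col2,col3".
    static String[] parseHeader(String line) {
        String[] names = line.split(",");
        for (int i = 0; i < names.length; i++) {
            names[i] = names[i].trim();
        }
        return names;
    }

    public static void main(String[] args) {
        // Header with spaces after the commas, as in the bug report
        String[] names = parseHeader("col1, col2, col3");
        System.out.println(String.join("|", names));
    }
}
```

Without the trim, the reader would register names like {{" col2"}}, which then fail to match the projected column names.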





[jira] [Commented] (DRILL-5555) CSV file without headers: "SELECT a" fails, "SELECT columns, a" succeeds

2019-05-21 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844966#comment-16844966
 ] 

Anton Gozhiy commented on DRILL-5555:
-

The issue is fixed in V3 reader for the case with "columns", but the case with 
star is still reproducible.

> CSV file without headers: "SELECT a" fails, "SELECT columns, a" succeeds
> 
>
> Key: DRILL-5555
> URL: https://issues.apache.org/jira/browse/DRILL-5555
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Consider the case discussed in DRILL-5554. Do exactly the same setup, but 
> with a slightly different query. The results are much different.
> Create a CSV file without headers:
> {code}
> 10,foo,bar
> {code}
> Use a CSV storage plugin configured to not skip the first line and not read 
> headers.
> Then, issue the following query:
> {code}
> SELECT columns, a FROM `dfs.data.example.csv`
> {code}
> Result:
> {code}
> columns,a
> ["10","foo","bar"],null
> {code}
> Schema:
> {code}
> columns(VARCHAR:REPEATED), 
> a(INT:OPTIONAL)
> {code}
> Since the query in DRILL-5554 fails:
> {code}
> SELECT a FROM ...
> {code}
> Expected the query described here to also fail, for a similar reason.
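
As a rough model of the behavior described above (hypothetical `ColumnsArrayDemo`, not Drill's scan internals): with headers disabled, the whole row is exposed as a single repeated-VARCHAR array named {{columns}}, and any other projected name such as {{a}} has no backing data, so it comes back null.

```java
import java.util.Arrays;
import java.util.List;

public class ColumnsArrayDemo {
    // A headerless CSV row is surfaced as one array-valued column, "columns":
    // each field of the row becomes one element of the array.
    static List<String> readColumns(String csvRow) {
        return Arrays.asList(csvRow.split(","));
    }

    public static void main(String[] args) {
        // The row from the bug report: SELECT columns, a would return this
        // array for "columns" and null for "a".
        System.out.println(readColumns("10,foo,bar"));
    }
}
```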





[jira] [Closed] (DRILL-5554) Wrong error type for "SELECT a" from a CSV file without headers

2019-05-21 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-5554.
---
Resolution: Fixed

Verified with Drill version 1.17.0-SNAPSHOT (commit id 
0195d1f34be7fd385ba76d2fd3e14a9fa13bd375)

The issue is fixed in V3 Text Reader.
It is a validation error now.

> Wrong error type for "SELECT a" from a CSV file without headers
> ---
>
> Key: DRILL-5554
> URL: https://issues.apache.org/jira/browse/DRILL-5554
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Trivial
>
> Create a CSV file without headers:
> {code}
> 10,foo,bar
> {code}
> Use a CSV storage plugin configured to not skip the first line and not read 
> headers.
> Then, issue the following query:
> {code}
> SELECT a FROM `dfs.data.example.csv`
> {code}
> The result is correct: an error:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: 
> DATA_READ ERROR: Selected column 'a' must have name 'columns' or must be 
> plain '*'
> {code}
> But, the type of error is wrong. This is not a data read error: the file read 
> just fine. The problem is a semantic error: a query form that is not 
> compatible with the storage plugin.
> Suggest using {{UserException.unsupportedError()}} instead since the user is 
> asking the plugin to do something that the plugin does not support.





[jira] [Closed] (DRILL-5487) Vector corruption in CSV with headers and truncated last row

2019-05-21 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-5487.
---
Resolution: Fixed

Verified with Drill version 1.17.0-SNAPSHOT (commit id 
0195d1f34be7fd385ba76d2fd3e14a9fa13bd375)

The issue is fixed in V3 Text Reader.

> Vector corruption in CSV with headers and truncated last row
> 
>
> Key: DRILL-5487
> URL: https://issues.apache.org/jira/browse/DRILL-5487
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Priority: Major
> Fix For: Future
>
>
> The CSV format plugin allows two ways of reading data:
> * As named columns
> * As a single array, called {{columns}}, that holds all columns for a row
> The named columns feature will corrupt the offset vectors if the last row of 
> the file is truncated: leaves off one or more columns.
> To illustrate the CSV data corruption, I created a CSV file, test4.csv, of 
> the following form:
> {code}
> h,u
> abc,def
> ghi
> {code}
> Note that the file is truncated: the comma and second field are missing on 
> the last line.
> Then, I created a simple test using the "cluster fixture" framework:
> {code}
>   @Test
>   public void readerTest() throws Exception {
> FixtureBuilder builder = ClusterFixture.builder()
> .maxParallelization(1);
> try (ClusterFixture cluster = builder.build();
>  ClientFixture client = cluster.clientFixture()) {
>   TextFormatConfig csvFormat = new TextFormatConfig();
>   csvFormat.fieldDelimiter = ',';
>   csvFormat.skipFirstLine = false;
>   csvFormat.extractHeader = true;
>   cluster.defineWorkspace("dfs", "data", "/tmp/data", "csv", csvFormat);
>   String sql = "SELECT * FROM `dfs.data`.`csv/test4.csv` LIMIT 10";
>   client.queryBuilder().sql(sql).printCsv();
> }
>   }
> {code}
> The results show we've got a problem:
> {code}
> Exception (no rows returned): 
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IllegalArgumentException: length: -3 (expected: >= 0)
> {code}
> If the last line were:
> {code}
> ghi,
> {code}
> Then the offset vector should look like this:
> {code}
> [0, 3, 3]
> {code}
> Very likely we have an offset vector that looks like this instead:
> {code}
> [0, 3, 0]
> {code}
> When we compute the second column of the second row, we should compute:
> {code}
> length = offset[2] - offset[1] = 3 - 3 = 0
> {code}
> Instead we get:
> {code}
> length = offset[2] - offset[1] = 0 - 3 = -3
> {code}
> The summary is that a premature EOF appears to cause the "missing" columns to 
> be skipped; they are not filled with a blank value to "bump" the offset 
> vectors to fill in the last row. Instead, they are left at 0, causing havoc 
> downstream in the query.
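
The offset arithmetic above can be sketched directly. This is an illustrative model (hypothetical `OffsetVectorDemo`, not Drill's actual vector classes) showing how a missing trailing offset entry produces the negative length from the error message:

```java
public class OffsetVectorDemo {
    // In a variable-width vector, the length of value i is derived from
    // adjacent offsets: length = offsets[i + 1] - offsets[i].
    static int valueLength(int[] offsets, int i) {
        return offsets[i + 1] - offsets[i];
    }

    public static void main(String[] args) {
        // Correctly filled vector for the second column: row 0 holds "def"
        // (offsets 0..3), row 1 holds an empty value (offsets 3..3).
        int[] good = {0, 3, 3};
        System.out.println(valueLength(good, 1));

        // Corrupted vector: the truncated last row never "bumped" the final
        // offset, leaving it at 0 and yielding the length -3 from the error.
        int[] bad = {0, 3, 0};
        System.out.println(valueLength(bad, 1));
    }
}
```

This is why the fix is to fill missing trailing columns with empty values, so every row still advances the offset vector.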




