[jira] [Created] (DRILL-7764) Cleanup warning messages in GuavaPatcher class

2020-07-03 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7764:
-

 Summary: Cleanup warning messages in GuavaPatcher class
 Key: DRILL-7764
 URL: https://issues.apache.org/jira/browse/DRILL-7764
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


Currently GuavaPatcher contains
{code}
logger.warn("Unable to patch Guava classes.", e);
{code}
which outputs the whole exception stack trace to the logs, which is unnecessarily alarming.

This log message will be changed to 
{code}
logger.warn("Unable to patch Guava classes: {}", e.getMessage());
logger.debug("Exception:", e);
{code}
logging the stack trace only in debug mode.
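The two-level pattern can be sketched with a dependency-free logger (java.util.logging stands in for Drill's slf4j logger here; the PatcherSketch class and the sample exception are illustrative, not GuavaPatcher's actual code):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class PatcherSketch {
  private static final Logger logger = Logger.getLogger(PatcherSketch.class.getName());

  // Builds the concise WARN-level message and logs the full stack trace only
  // at debug (FINE) level, mirroring the proposed change.
  public static String describeFailure(Exception e) {
    String msg = "Unable to patch Guava classes: " + e.getMessage();
    logger.log(Level.WARNING, msg);
    logger.log(Level.FINE, "Exception:", e);
    return msg;
  }

  public static void main(String[] args) {
    // Default log output stays a one-liner; enable FINE to see the trace.
    System.out.println(describeFailure(new IllegalStateException("Stopwatch already patched")));
  }
}
```

With default log levels the user sees only the short message; the stack trace appears only when debug logging is switched on.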



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7760) REFRESH TABLE METADATA increases the planning time while querying over parquet files.

2020-06-30 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148815#comment-17148815
 ] 

Bohdan Kazydub commented on DRILL-7760:
---

[~mtewathia99] it's hard to tell anything without the query and table schema. 
Could you please provide an example with steps to reproduce?

> REFRESH TABLE METADATA increases the planning time while querying over 
> parquet files.
> -
>
> Key: DRILL-7760
> URL: https://issues.apache.org/jira/browse/DRILL-7760
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.17.0
> Environment: Apache Drill Web UI ( 1.17.0 )
>Reporter: Mukul Tewathia
>Priority: Major
> Fix For: Future
>
>
> After running 
> REFRESH TABLE METADATA 
> The planning time for a particular query increases from 35 secs to 
> more than 3 minutes.
> The Parquet file consists of 10M rows and 7 columns and is partitioned on 2 
> columns ( both columns contain 8 unique values ).
> What could be the possible explanation for this behaviour ?





[jira] [Created] (DRILL-7759) Code compilation exception for queries containing (untyped) NULL

2020-06-30 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7759:
-

 Summary: Code compilation exception for queries containing 
(untyped) NULL
 Key: DRILL-7759
 URL: https://issues.apache.org/jira/browse/DRILL-7759
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub








[jira] [Created] (DRILL-7750) Drill fails to read KeyStore password from Credential provider

2020-06-17 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7750:
-

 Summary: Drill fails to read KeyStore password from Credential 
provider
 Key: DRILL-7750
 URL: https://issues.apache.org/jira/browse/DRILL-7750
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub
 Fix For: 1.18


When core-site.xml has keystore- or truststore-specific properties along with 
Hadoop's CredentialProvider path, e.g.:
{code}
<configuration>
...
  <property>
    <name>ssl.server.truststore.location</name>
    <value>/etc/conf/ssl_truststore</value>
  </property>
  <property>
    <name>ssl.server.truststore.type</name>
    <value>jks</value>
  </property>
  <property>
    <name>ssl.server.truststore.reload.interval</name>
    <value>1</value>
  </property>
  <property>
    <name>ssl.server.keystore.location</name>
    <value>/etc/conf/ssl_keystore</value>
  </property>
  <property>
    <name>ssl.server.keystore.type</name>
    <value>jks</value>
  </property>
  <property>
    <name>hadoop.security.credential.provider.path</name>
    <value>jceks://file/etc/conf/ssl_server.jceks</value>
  </property>
</configuration>
{code}
Drill fails to start.
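For context, Hadoop's Configuration.getPassword(name) consults any configured credential providers first and only falls back to the clear-text property. A simplified model of that lookup order (plain Java maps stand in for the real Configuration and JCEKS provider; the class and field names are illustrative assumptions):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class PasswordLookupSketch {
  // Stand-in for entries stored in a JCEKS credential provider.
  final Map<String, char[]> providerEntries = new HashMap<>();
  // Stand-in for clear-text properties in core-site.xml.
  final Map<String, String> clearText = new HashMap<>();

  // Models Configuration.getPassword: credential provider entries win over
  // the clear-text property of the same name.
  public Optional<char[]> getPassword(String name) {
    if (providerEntries.containsKey(name)) {
      return Optional.of(providerEntries.get(name));
    }
    String v = clearText.get(name);
    return v == null ? Optional.empty() : Optional.of(v.toCharArray());
  }

  public static void main(String[] args) {
    PasswordLookupSketch conf = new PasswordLookupSketch();
    conf.clearText.put("ssl.server.keystore.password", "fallback");
    conf.providerEntries.put("ssl.server.keystore.password", "secret".toCharArray());
    // Provider entry shadows the clear-text value.
    System.out.println(new String(conf.getPassword("ssl.server.keystore.password").get()));
  }
}
```

The bug report is about Drill's startup failing somewhere along this lookup path when both kinds of configuration are present.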





[jira] [Created] (DRILL-7694) Register drill.queries.* counter metrics on Drillbit startup

2020-04-09 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7694:
-

 Summary: Register drill.queries.* counter metrics on Drillbit 
startup 
 Key: DRILL-7694
 URL: https://issues.apache.org/jira/browse/DRILL-7694
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub








[jira] [Commented] (DRILL-7677) NPE in getStringFromVarCharHolder(NullableVarCharHolder)

2020-03-30 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070945#comment-17070945
 ] 

Bohdan Kazydub commented on DRILL-7677:
---

In the example UDF you are referencing there is a [nullability 
check|https://github.com/apache/drill/blob/master/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/UserAgentFunctions.java#L104] 
before the call to 
{{org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input)}}, 
which makes sure the method is not invoked when {{input.isSet == 0}}. So no NPE 
should be thrown, should it?
Could you please provide an example of the issue you're seeing?
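The guard being discussed amounts to checking isSet before touching the buffer. A minimal sketch of that pattern (the nested holder class below is a simplified stand-in for Drill's NullableVarCharHolder, not the real vector-backed type):

```java
public class NullableHolderSketch {
  // Minimal stand-in for Drill's NullableVarCharHolder: isSet == 0 means
  // SQL NULL and the underlying buffer must not be read.
  static class NullableVarCharHolder {
    int isSet;
    String value; // stands in for the DrillBuf + start/end trio
  }

  // Guard first, as the referenced UDF does, so the helper that reads the
  // buffer is never invoked for a NULL input.
  static String getStringOrNull(NullableVarCharHolder input) {
    if (input.isSet == 0) {
      return null;
    }
    return input.value;
  }

  public static void main(String[] args) {
    NullableVarCharHolder h = new NullableVarCharHolder();
    h.isSet = 0;
    System.out.println(getStringOrNull(h)); // null, and no NPE
  }
}
```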

> NPE in getStringFromVarCharHolder(NullableVarCharHolder) 
> -
>
> Key: DRILL-7677
> URL: https://issues.apache.org/jira/browse/DRILL-7677
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Niels Basjes
>Priority: Major
>
> Assume you have a function ([like this 
> one|https://github.com/apache/drill/blob/master/contrib/udfs/src/main/java/org/apache/drill/exec/udfs/UserAgentFunctions.java#L110])
>  that has
> {code:java}
> @Param
> NullableVarCharHolder input;
> {code}
> and in the {code:java}eval(){code} function you do
> {code:java}String inputString = 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.getStringFromVarCharHolder(input);{code}
> When this function is called with the input actually being null, the 'input' 
> parameter is an instance of NullableVarCharHolder where isSet is 0 and buffer 
> is null.
> The buffer being null causes an NPE in the call to 
> getStringFromVarCharHolder(NullableVarCharHolder).





[jira] [Updated] (DRILL-7565) ANALYZE TABLE ... REFRESH METADATA does not work for empty Parquet files

2020-02-18 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7565:
--
Labels: ready-to-commit  (was: )

> ANALYZE TABLE ... REFRESH METADATA does not work for empty Parquet files
> 
>
> Key: DRILL-7565
> URL: https://issues.apache.org/jira/browse/DRILL-7565
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Bohdan Kazydub
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> The following query does not create metadata for empty Parquet table:
> {code:java}
> @Test
>   public void testAnalyzeEmptyParquetTable() throws Exception {
> dirTestWatcher.copyResourceToRoot(Paths.get("parquet", "empty"));
> String tableName = "parquet/empty/simple/empty_simple.parquet";
> try {
>   client.alterSession(ExecConstants.METASTORE_ENABLED, true);
>   testBuilder()
>   .sqlQuery("ANALYZE TABLE dfs.`%s` REFRESH METADATA", tableName)
>   .unOrdered()
>   .baselineColumns("ok", "summary")
>   .baselineValues(true, String.format("Collected / refreshed metadata 
> for table [dfs.default.%s]", tableName))
>   .go();
> } finally {
>   run("analyze table dfs.`%s` drop metadata if exists", tableName);
>   client.resetSession(ExecConstants.METASTORE_ENABLED);
> }
>   }
> {code}
> but yields
> {code:java}
> java.lang.AssertionError: Different number of records returned 
> Expected :1
> Actual   :0
> 
>   at 
> org.apache.drill.test.DrillTestWrapper.compareResults(DrillTestWrapper.java:862)
>   at 
> org.apache.drill.test.DrillTestWrapper.compareUnorderedResults(DrillTestWrapper.java:567)
>   at org.apache.drill.test.DrillTestWrapper.run(DrillTestWrapper.java:171)
>   at org.apache.drill.test.TestBuilder.go(TestBuilder.java:145)
>   at 
> org.apache.drill.exec.store.parquet.TestEmptyParquet.testSelectWithDisabledMetastore(TestEmptyParquet.java:430)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> When changing expected result set to empty 
> ({{TestBuilder#expectsEmptyResultSet()}}), {{SHOW TABLES}} command after 
> {{ANALYZE TABLE ...}} does not show any table.





[jira] [Created] (DRILL-7565) ANALYZE TABLE ... REFRESH METADATA does not work for empty Parquet files

2020-02-03 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7565:
-

 Summary: ANALYZE TABLE ... REFRESH METADATA does not work for 
empty Parquet files
 Key: DRILL-7565
 URL: https://issues.apache.org/jira/browse/DRILL-7565
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.17.0
Reporter: Bohdan Kazydub
Assignee: Vova Vysotskyi


The following query does not create metadata for empty Parquet table: 
{code}
@Test
  public void testAnalyzeEmptyParquetTable() throws Exception {

String tableName = "parquet/empty/simple/empty_simple.parquet";

try {
  client.alterSession(ExecConstants.METASTORE_ENABLED, true);
  testBuilder()
  .sqlQuery("ANALYZE TABLE dfs.`%s` REFRESH METADATA", tableName)
  .unOrdered()
  .baselineColumns("ok", "summary")
  .baselineValues(true, String.format("Collected / refreshed metadata 
for table [dfs.default.%s]", tableName))
  .go();
} finally {
  run("analyze table dfs.`%s` drop metadata if exists", tableName);
  client.resetSession(ExecConstants.METASTORE_ENABLED);
}
  }
{code}
but yields
{code}
java.lang.AssertionError: Different number of records returned 
Expected :1
Actual   :0



at 
org.apache.drill.test.DrillTestWrapper.compareResults(DrillTestWrapper.java:862)
at 
org.apache.drill.test.DrillTestWrapper.compareUnorderedResults(DrillTestWrapper.java:567)
at org.apache.drill.test.DrillTestWrapper.run(DrillTestWrapper.java:171)
at org.apache.drill.test.TestBuilder.go(TestBuilder.java:145)
at 
org.apache.drill.exec.store.parquet.TestEmptyParquet.testSelectWithDisabledMetastore(TestEmptyParquet.java:430)
at java.lang.Thread.run(Thread.java:748)
{code}


When changing expected result set to empty 
({{TestBuilder#expectsEmptyResultSet()}}), {{SHOW TABLES}} command after 
{{ANALYZE TABLE ...}} does not show any table.





[jira] [Updated] (DRILL-7527) DROP METADATA doesn't work with table name starting with '/' inside workspace

2020-01-22 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7527:
--
Labels: ready-to-commit  (was: )

> DROP METADATA doesn't work with table name starting with '/' inside workspace
> -
>
> Key: DRILL-7527
> URL: https://issues.apache.org/jira/browse/DRILL-7527
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Denys Ordynskiy
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> *Description:*
> - create parquet table with CTAS: CREATE TABLE dfs.tmp.`folder/file` AS 
> SELECT * FROM cp.`employee.json`;
> - refresh metadata for table: ANALYZE TABLE dfs.tmp.`folder/file` REFRESH 
> METADATA;
> - drop metadata: ANALYZE TABLE dfs.tmp.`/folder/file` DROP METADATA [IF 
> EXISTS];
> *Expected result:*
> Metadata for table [folder/file] dropped.
> *Actual result:*
> Error: VALIDATION ERROR: Metadata for table [/folder/file] not found.
> Should work as DROP TABLE:
> DROP TABLE dfs.tmp.`/folder/file`;
> Table [folder/file] dropped
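The expected behaviour boils down to normalizing a leading '/' before the table-name lookup, the way DROP TABLE already resolves both spellings. A sketch of that normalization (the helper name is hypothetical, not Drill's actual code):

```java
public class TableNameSketch {
  // Strip a leading '/' so `/folder/file` and `folder/file` resolve to the
  // same table key inside a workspace.
  static String normalize(String tableName) {
    return tableName.startsWith("/") ? tableName.substring(1) : tableName;
  }

  public static void main(String[] args) {
    System.out.println(normalize("/folder/file")); // folder/file
    System.out.println(normalize("folder/file"));  // folder/file
  }
}
```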





[jira] [Commented] (DRILL-7504) Upgrade Parquet library to 1.11.0

2020-01-21 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17020188#comment-17020188
 ] 

Bohdan Kazydub commented on DRILL-7504:
---

[~arina], after upgrading the Parquet versions, each of the tests in 
`org.apache.drill.metastore.iceberg.write.TestParquetFileWriter` fails with
{code}
java.lang.ExceptionInInitializerError
at 
org.apache.iceberg.parquet.Parquet$WriteBuilder.build(Parquet.java:212)
at 
org.apache.drill.metastore.iceberg.write.ParquetFileWriter.write(ParquetFileWriter.java:82)
at 
org.apache.drill.metastore.iceberg.write.TestParquetFileWriter.testAllTypes(TestParquetFileWriter.java:100)
Caused by: java.lang.RuntimeException: Cannot find constructor for interface 
org.apache.parquet.column.page.PageWriteStore
Missing 
org.apache.parquet.hadoop.ColumnChunkPageWriteStore(org.apache.parquet.hadoop.CodecFactory$BytesCompressor,org.apache.parquet.schema.MessageType,org.apache.parquet.bytes.ByteBufferAllocator)
 [java.lang.NoSuchMethodException: 
org.apache.parquet.hadoop.ColumnChunkPageWriteStore.<init>(org.apache.parquet.hadoop.CodecFactory$BytesCompressor,
 org.apache.parquet.schema.MessageType, 
org.apache.parquet.bytes.ByteBufferAllocator)]
at 
org.apache.iceberg.common.DynConstructors$Builder.build(DynConstructors.java:235)
at 
org.apache.iceberg.parquet.ParquetWriter.<init>(ParquetWriter.java:57)
... 3 more
{code}
This happens because the constructor of Parquet's 
`org.apache.parquet.hadoop.ColumnChunkPageWriteStore` class was expanded with 
more arguments. Because of this, the upgrade is not possible yet; we should 
wait for a new Iceberg release, as the fix is already done in the 
[PR|https://github.com/apache/incubator-iceberg/pull/708].

The previously ignored tests do pass now. I also checked other Parquet-related 
test classes manually and they seem to pass too, though I haven't run the whole 
test suite (advanced, functional and full unit tests).
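The failure mode is a reflective constructor probe, which is essentially what Iceberg's DynConstructors does: look up a known signature and fail if the library changed it. A self-contained illustration (PageStore is a toy class standing in for ColumnChunkPageWriteStore):

```java
public class ConstructorLookupSketch {
  static class PageStore {
    // "Newer version" of the class: the constructor gained an extra argument.
    PageStore(String schema, int columnIndexTruncateLength) { }
  }

  // Probe for a constructor with the given parameter types, the way a
  // DynConstructors-style helper would; false maps to NoSuchMethodException.
  public static boolean hasConstructor(Class<?> cls, Class<?>... paramTypes) {
    try {
      cls.getDeclaredConstructor(paramTypes);
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // The old expected signature no longer exists...
    System.out.println(hasConstructor(PageStore.class, String.class));              // false
    // ...while the expanded one does.
    System.out.println(hasConstructor(PageStore.class, String.class, int.class));   // true
  }
}
```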

> Upgrade Parquet library to 1.11.0
> -
>
> Key: DRILL-7504
> URL: https://issues.apache.org/jira/browse/DRILL-7504
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.18.0
>
>
> Upgrade Parquet library to 1.11.0
> Apache Parquet Format to 2.7.0 (2.8.0 will be released soon)
> Check ignored tests in:
> org.apache.drill.exec.store.parquet.TestParquetMetadataCache
> org.apache.drill.exec.store.parquet.TestPushDownAndPruningForDecimal
> org.apache.drill.exec.store.parquet.TestPushDownAndPruningForVarchar





[jira] [Updated] (DRILL-7506) Simplify code gen error handling

2020-01-15 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7506:
--
Labels: ready-to-commit  (was: )

> Simplify code gen error handling
> 
>
> Key: DRILL-7506
> URL: https://issues.apache.org/jira/browse/DRILL-7506
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Code generation can generate a variety of errors. Most operators bubble these 
> exceptions up several layers in the code before catching them. This patch 
> moves error handling closer to the code gen itself to allow a) simpler code, 
> and b) clearer error messages.





[jira] [Commented] (DRILL-7359) Add support for DICT type in RowSet Framework

2020-01-02 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006926#comment-17006926
 ] 

Bohdan Kazydub commented on DRILL-7359:
---

Merged into Apache master with commit id 
[7c813ff6440a118de15f552145b40eb07bb8e7a2|https://github.com/apache/drill/commit/7c813ff6440a118de15f552145b40eb07bb8e7a2].

> Add support for DICT type in RowSet Framework
> -
>
> Key: DRILL-7359
> URL: https://issues.apache.org/jira/browse/DRILL-7359
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Add support for new DICT data type (see DRILL-7096) in RowSet Framework





[jira] [Comment Edited] (DRILL-7497) Fix warnings when starting Drill on Windows using Java 11

2020-01-02 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006923#comment-17006923
 ] 

Bohdan Kazydub edited comment on DRILL-7497 at 1/2/20 4:43 PM:
---

Merged into Apache master with commit id 
[43cc1402fc9a569092fe31e8b3b0cd47cbd2ec6d|https://github.com/apache/drill/commit/43cc1402fc9a569092fe31e8b3b0cd47cbd2ec6d].


was (Author: kazydubb):
Merged int Apache master with commit id 
[43cc1402fc9a569092fe31e8b3b0cd47cbd2ec6d|https://github.com/apache/drill/commit/43cc1402fc9a569092fe31e8b3b0cd47cbd2ec6d].

> Fix warnings when starting Drill on Windows using Java 11
> -
>
> Key: DRILL-7497
> URL: https://issues.apache.org/jira/browse/DRILL-7497
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Warnings are displayed in SqlLine when starting Drill in embedded mode on 
> Windows using Java 11:
> {noformat}
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by javassist.util.proxy.SecurityActions 
> (file:/C:/drill_1_17/apache-drill-1.17.0/apache-drill-1.17.0/jars/3rdparty/javassist-3.26.0-GA.jar)
>  to method 
> java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
> WARNING: Please consider reporting this to the maintainers of 
> javassist.util.proxy.SecurityActions
> WARNING: Use --illegal-access=warn to enable warnings of further illegal 
> reflective access operations
> WARNING: All illegal access operations will be denied in a future release
> {noformat}





[jira] [Commented] (DRILL-7406) Update Calcite to 1.21.0

2020-01-02 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006924#comment-17006924
 ] 

Bohdan Kazydub commented on DRILL-7406:
---

Merged into Apache master with commit id 
[4f55e71dc971d42054a031acd000ddf8337e90d9|https://github.com/apache/drill/commit/4f55e71dc971d42054a031acd000ddf8337e90d9].

> Update Calcite to 1.21.0
> 
>
> Key: DRILL-7406
> URL: https://issues.apache.org/jira/browse/DRILL-7406
> Project: Apache Drill
>  Issue Type: Task
>  Components: Query Planning & Optimization, SQL Parser
>Affects Versions: 1.17.0
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> DRILL-7340 should be fixed by this update.





[jira] [Commented] (DRILL-7497) Fix warnings when starting Drill on Windows using Java 11

2020-01-02 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006923#comment-17006923
 ] 

Bohdan Kazydub commented on DRILL-7497:
---

Merged into Apache master with commit id 
[43cc1402fc9a569092fe31e8b3b0cd47cbd2ec6d|https://github.com/apache/drill/commit/43cc1402fc9a569092fe31e8b3b0cd47cbd2ec6d].

> Fix warnings when starting Drill on Windows using Java 11
> -
>
> Key: DRILL-7497
> URL: https://issues.apache.org/jira/browse/DRILL-7497
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Vova Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Warnings are displayed in SqlLine when starting Drill in embedded mode on 
> Windows using Java 11:
> {noformat}
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by javassist.util.proxy.SecurityActions 
> (file:/C:/drill_1_17/apache-drill-1.17.0/apache-drill-1.17.0/jars/3rdparty/javassist-3.26.0-GA.jar)
>  to method 
> java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
> WARNING: Please consider reporting this to the maintainers of 
> javassist.util.proxy.SecurityActions
> WARNING: Use --illegal-access=warn to enable warnings of further illegal 
> reflective access operations
> WARNING: All illegal access operations will be denied in a future release
> {noformat}





[jira] [Commented] (DRILL-7505) PCAP Plugin Fails on IPv6 Packets

2020-01-02 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006921#comment-17006921
 ] 

Bohdan Kazydub commented on DRILL-7505:
---

Merged into Apache master with commit id 
[a00d70db4c3d63eeaec03deb94070ff494a12b64|https://github.com/apache/drill/commit/a00d70db4c3d63eeaec03deb94070ff494a12b64].

> PCAP Plugin Fails on IPv6 Packets
> -
>
> Key: DRILL-7505
> URL: https://issues.apache.org/jira/browse/DRILL-7505
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> In its current implementation, the PCAP parser fails with an exception if it 
> encounters a protocol it does not have in its enumerated list. 
> This is not an acceptable strategy in that there are many protocols out there 
> and having Drill crash when it encounters an unknown protocol is not helpful 
> or useful.  This minor PR makes the PCAP plugin more fault tolerant.
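The fault-tolerance fix described above amounts to replacing a throwing lookup with a defaulting one. A minimal sketch of that pattern (the protocol numbers and the UNKNOWN fallback are illustrative, not the PCAP plugin's actual enum):

```java
import java.util.HashMap;
import java.util.Map;

public class ProtocolLookupSketch {
  enum Protocol { TCP, UDP, ICMP, UNKNOWN }

  private static final Map<Integer, Protocol> BY_NUMBER = new HashMap<>();
  static {
    // Standard IP protocol numbers for the few cases modeled here.
    BY_NUMBER.put(6, Protocol.TCP);
    BY_NUMBER.put(17, Protocol.UDP);
    BY_NUMBER.put(1, Protocol.ICMP);
  }

  // Instead of throwing on an unlisted protocol, fall back to UNKNOWN so a
  // single odd packet cannot fail the whole query.
  static Protocol fromNumber(int n) {
    return BY_NUMBER.getOrDefault(n, Protocol.UNKNOWN);
  }

  public static void main(String[] args) {
    System.out.println(fromNumber(6));  // TCP
    System.out.println(fromNumber(41)); // UNKNOWN (not in the modeled list)
  }
}
```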





[jira] [Updated] (DRILL-7509) Incorrect TupleSchema is created for DICT column when querying Parquet files

2020-01-02 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7509:
--
Affects Version/s: 1.16.0

> Incorrect TupleSchema is created for DICT column when querying Parquet files
> 
>
> Key: DRILL-7509
> URL: https://issues.apache.org/jira/browse/DRILL-7509
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Bohdan Kazydub
>Priority: Major
>
> When a {{DICT}} column is queried from a Parquet file, its {{TupleSchema}} 
> contains a nested element, e.g. `map`, which itself contains the `key` and 
> `value` fields, rather than containing the `key` and `value` fields in the 
> {{DICT}}'s {{TupleSchema}} directly. The nested `map` element comes from the 
> inner structure of Parquet's {{MAP}} representation (which corresponds to 
> Drill's {{DICT}}).





[jira] [Updated] (DRILL-7509) Incorrect TupleSchema is created for DICT column when querying Parquet files

2020-01-02 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7509:
--
Description: When a {{DICT}} column is queried from a Parquet file, its 
{{TupleSchema}} contains a nested element, e.g. `map`, which itself contains 
the `key` and `value` fields, rather than containing the `key` and `value` 
fields in the {{DICT}}'s {{TupleSchema}} directly. The nested `map` element 
comes from the inner structure of Parquet's {{MAP}} representation (which 
corresponds to Drill's {{DICT}}).

> Incorrect TupleSchema is created for DICT column when querying Parquet files
> 
>
> Key: DRILL-7509
> URL: https://issues.apache.org/jira/browse/DRILL-7509
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Bohdan Kazydub
>Priority: Major
>
> When a {{DICT}} column is queried from a Parquet file, its {{TupleSchema}} 
> contains a nested element, e.g. `map`, which itself contains the `key` and 
> `value` fields, rather than containing the `key` and `value` fields in the 
> {{DICT}}'s {{TupleSchema}} directly. The nested `map` element comes from the 
> inner structure of Parquet's {{MAP}} representation (which corresponds to 
> Drill's {{DICT}}).





[jira] [Updated] (DRILL-7509) Incorrect TupleSchema is created for DICT column when querying Parquet files

2020-01-02 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7509:
--
Summary: Incorrect TupleSchema is created for DICT column when querying 
Parquet files  (was: Incorrect TupleSchema is created when )

> Incorrect TupleSchema is created for DICT column when querying Parquet files
> 
>
> Key: DRILL-7509
> URL: https://issues.apache.org/jira/browse/DRILL-7509
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Bohdan Kazydub
>Priority: Major
>






[jira] [Created] (DRILL-7509) Incorrect TupleSchema is created when

2020-01-02 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7509:
-

 Summary: Incorrect TupleSchema is created when 
 Key: DRILL-7509
 URL: https://issues.apache.org/jira/browse/DRILL-7509
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub








[jira] [Created] (DRILL-7453) Update joda-time to 2.10.5 to have correct time zone info

2019-11-21 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7453:
-

 Summary: Update joda-time to 2.10.5 to have correct time zone info
 Key: DRILL-7453
 URL: https://issues.apache.org/jira/browse/DRILL-7453
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


As Brazil decided not to observe DST starting from 2019 
(https://www.timeanddate.com/news/time/brazil-scraps-dst.html), update 
joda-time to the latest {{2.10.5}} version, which contains the most recent 
tzdb info.
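The effect can be checked against the JDK's own tz data (java.time is shown here instead of joda-time so the snippet is dependency-free; the DST answer depends on the tz database version bundled with the JDK):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneRules;

public class BrazilTzCheck {
  public static void main(String[] args) {
    // With up-to-date tz data, America/Sao_Paulo stays on standard time
    // (UTC-3) year-round from 2019 onwards, even in the southern summer.
    ZoneRules rules = ZoneId.of("America/Sao_Paulo").getRules();
    Instant southernSummer = Instant.parse("2020-01-15T12:00:00Z");
    System.out.println(rules.getOffset(southernSummer));         // -03:00 with current tz data
    System.out.println(rules.isDaylightSavings(southernSummer)); // false with current tz data
  }
}
```

An outdated tz database would instead report a -02:00 DST offset for that instant, which is exactly the kind of wrong answer the library update fixes.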






[jira] [Assigned] (DRILL-7448) Fix warnings when running Drill memory tests

2019-11-18 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub reassigned DRILL-7448:
-

Assignee: Bohdan Kazydub  (was: Bohdan Kazydub)

> Fix warnings when running Drill memory tests
> 
>
> Key: DRILL-7448
> URL: https://issues.apache.org/jira/browse/DRILL-7448
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.17.0
>
>
> {noformat}
> -- drill-memory-base 
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.drill.exec.memory.TestEndianess
> [INFO] Running org.apache.drill.exec.memory.TestAccountant
> 16:21:45,719 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could 
> NOT find resource [logback.groovy]
> 16:21:45,719 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found 
> resource [logback-test.xml] at 
> [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml]
> 16:21:45,733 |-INFO in 
> ch.qos.logback.core.joran.spi.ConfigurationWatchList@dbd940d - URL 
> [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml]
>  is not of type file
> 16:21:45,780 |-INFO in 
> ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not 
> set
> 16:21:45,802 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - 
> Could not find Janino library on the class path. Skipping conditional 
> processing.
> 16:21:45,802 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See 
> also http://logback.qos.ch/codes.html#ifJanino
> 16:21:45,803 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - 
> About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
> 16:21:45,811 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - 
> Naming appender as [STDOUT]
> 16:21:45,826 |-INFO in 
> ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default 
> type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] 
> property
> 16:21:45,866 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - ROOT 
> level set to ERROR
> 16:21:45,866 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - 
> Could not find Janino library on the class path. Skipping conditional 
> processing.
> 16:21:45,866 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See 
> also http://logback.qos.ch/codes.html#ifJanino
> 16:21:45,866 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - 
> The object on the top the of the stack is not the root logger
> 16:21:45,866 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - 
> It is: ch.qos.logback.core.joran.conditional.IfAction
> 16:21:45,866 |-INFO in 
> ch.qos.logback.classic.joran.action.ConfigurationAction - End of 
> configuration.
> 16:21:45,867 |-INFO in 
> ch.qos.logback.classic.joran.JoranConfigurator@71d15f18 - Registering current 
> configuration as safe fallback point
> 16:21:45,717 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could 
> NOT find resource [logback.groovy]
> 16:21:45,717 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found 
> resource [logback-test.xml] at 
> [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml]
> 16:21:45,729 |-INFO in 
> ch.qos.logback.core.joran.spi.ConfigurationWatchList@2698dc7 - URL 
> [jar:file:/Users/arina/Development/git_repo/drill/common/target/drill-common-1.17.0-SNAPSHOT-tests.jar!/logback-test.xml]
>  is not of type file
> 16:21:45,778 |-INFO in 
> ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not 
> set
> 16:21:45,807 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - 
> Could not find Janino library on the class path. Skipping conditional 
> processing.
> 16:21:45,807 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See 
> also http://logback.qos.ch/codes.html#ifJanino
> 16:21:45,808 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - 
> About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
> 16:21:45,814 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - 
> Naming appender as [STDOUT]
> 16:21:45,829 |-INFO in 
> ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default 
> type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] 
> property
> 16:21:45,868 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - ROOT 
> level set to ERROR
> 16:21:45,868 |-ERROR in 

[jira] [Assigned] (DRILL-7440) Failure during loading of RepeatedCount functions

2019-11-08 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub reassigned DRILL-7440:
-

Assignee: Bohdan Kazydub  (was: Bohdan Kazydub)

> Failure during loading of RepeatedCount functions
> -
>
> Key: DRILL-7440
> URL: https://issues.apache.org/jira/browse/DRILL-7440
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Critical
> Fix For: 1.17.0
>
>
> *Steps:*
> # Start Drillbit
> # Look at the drillbit.log
> *Expected result:* No exceptions should be present.
> *Actual result:*
> Null Pointer Exceptions occur:
> {noformat}
> 2019-11-06 03:06:40,401 [main] WARN  o.a.d.exec.expr.fn.FunctionConverter - 
> Failure loading function class 
> org.apache.drill.exec.expr.fn.impl.RepeatedCountFunctions$RepeatedCountRepeatedDict,
>  field input. Message: Failure while trying to access the ValueHolder's TYPE 
> static variable.  All ValueHolders must contain a static TYPE variable that 
> defines their MajorType.
> java.lang.NullPointerException: null
>   at 
> sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57)
>  ~[na:1.8.0_171]
>   at 
> sun.reflect.UnsafeObjectFieldAccessorImpl.get(UnsafeObjectFieldAccessorImpl.java:36)
>  ~[na:1.8.0_171]
>   at java.lang.reflect.Field.get(Field.java:393) ~[na:1.8.0_171]
>   at 
> org.apache.drill.exec.expr.fn.FunctionConverter.getStaticFieldValue(FunctionConverter.java:220)
>  ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.FunctionConverter.getHolder(FunctionConverter.java:136)
>  ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.validate(LocalFunctionRegistry.java:130)
>  [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.&lt;init&gt;(LocalFunctionRegistry.java:88)
>  [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.&lt;init&gt;(FunctionImplementationRegistry.java:113)
>  [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.server.DrillbitContext.&lt;init&gt;(DrillbitContext.java:118) 
> [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at org.apache.drill.exec.work.WorkManager.start(WorkManager.java:116) 
> [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:222) 
> [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:581) 
> [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:551) 
> [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:547) 
> [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
> 2019-11-06 03:06:40,402 [main] WARN  o.a.d.e.e.f.r.LocalFunctionRegistry - 
> Unable to initialize function for class 
> org.apache.drill.exec.expr.fn.impl.RepeatedCountFunctions$RepeatedCountRepeatedDict
> 2019-11-06 03:06:40,487 [main] WARN  o.a.d.exec.expr.fn.FunctionConverter - 
> Failure loading function class 
> org.apache.drill.exec.expr.fn.impl.gaggr.CountFunctions$RepeatedDictCountFunction,
>  field in. Message: Failure while trying to access the ValueHolder's TYPE 
> static variable.  All ValueHolders must contain a static TYPE variable that 
> defines their MajorType.
> java.lang.NullPointerException: null
>   at 
> sun.reflect.UnsafeFieldAccessorImpl.ensureObj(UnsafeFieldAccessorImpl.java:57)
>  ~[na:1.8.0_171]
>   at 
> sun.reflect.UnsafeObjectFieldAccessorImpl.get(UnsafeObjectFieldAccessorImpl.java:36)
>  ~[na:1.8.0_171]
>   at java.lang.reflect.Field.get(Field.java:393) ~[na:1.8.0_171]
>   at 
> org.apache.drill.exec.expr.fn.FunctionConverter.getStaticFieldValue(FunctionConverter.java:220)
>  ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.FunctionConverter.getHolder(FunctionConverter.java:136)
>  ~[drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.validate(LocalFunctionRegistry.java:130)
>  [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.registry.LocalFunctionRegistry.&lt;init&gt;(LocalFunctionRegistry.java:88)
>  [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.expr.fn.FunctionImplementationRegistry.&lt;init&gt;(FunctionImplementationRegistry.java:113)
>  
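The failure above reduces to the reflection pattern in {{FunctionConverter.getStaticFieldValue}}: a holder's static {{TYPE}} field is read with a null receiver. A minimal standalone sketch of why a missing {{static}} modifier produces exactly this NPE (class names here are hypothetical, not Drill's actual ones):

```java
import java.lang.reflect.Field;

public class TypeFieldDemo {

    // Mimics a ValueHolder that declares the required static TYPE variable.
    static class GoodHolder {
        static final String TYPE = "INT";
    }

    // Same field without the static modifier: reading it with a null
    // receiver fails, just as in the stack trace above.
    static class BadHolder {
        final String TYPE = "INT";
    }

    // Reads the TYPE field the way a registry might: reflectively, with
    // a null receiver, which is only legal for static fields.
    static String readStaticType(Class<?> holder) {
        try {
            Field f = holder.getDeclaredField("TYPE");
            return (String) f.get(null); // throws NPE if TYPE is not static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unable to access TYPE", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readStaticType(GoodHolder.class)); // INT
        try {
            readStaticType(BadHolder.class);
        } catch (NullPointerException e) {
            System.out.println("NPE: TYPE is not static");
        }
    }
}
```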

[jira] [Commented] (DRILL-7397) Fix logback errors when building the project

2019-10-30 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963136#comment-16963136
 ] 

Bohdan Kazydub commented on DRILL-7397:
---

Merged into master with commit id 81dba65268fd86fd04d9c98a4de0be5288d43ae9.

> Fix logback errors when building the project
> 
>
> Key: DRILL-7397
> URL: https://issues.apache.org/jira/browse/DRILL-7397
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> {noformat}
> [INFO] Compiling 75 source files to /.../drill/common/target/classes
> [WARNING] Unable to autodetect 'javac' path, using 'javac' from the 
> environment.
> [INFO] 
> [INFO] --- exec-maven-plugin:1.6.0:java (default) @ drill-common ---
> 17:46:05,674 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could 
> NOT find resource [logback.groovy]
> 17:46:05,675 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found 
> resource [logback-test.xml] at 
> [file:/.../drill/common/src/test/resources/logback-test.xml]
> 17:46:05,712 |-INFO in 
> ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not 
> set
> 17:46:05,714 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - 
> Could not find Janino library on the class path. Skipping conditional 
> processing.
> 17:46:05,714 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See 
> also http://logback.qos.ch/codes.html#ifJanino
> 17:46:05,714 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - 
> About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
> 17:46:05,719 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - 
> Naming appender as [STDOUT]
> 17:46:05,724 |-INFO in 
> ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default 
> type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] 
> property
> 17:46:05,740 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - ROOT 
> level set to ERROR
> 17:46:05,740 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - 
> Could not find Janino library on the class path. Skipping conditional 
> processing.
> 17:46:05,740 |-ERROR in ch.qos.logback.core.joran.conditional.IfAction - See 
> also http://logback.qos.ch/codes.html#ifJanino
> 17:46:05,740 |-ERROR in ch.qos.logback.core.joran.action.AppenderRefAction - 
> Could not find an AppenderAttachable at the top of execution stack. Near 
> [appender-ref] line 59
> 17:46:05,740 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - 
> The object on the top the of the stack is not the root logger
> 17:46:05,740 |-WARN in ch.qos.logback.classic.joran.action.RootLoggerAction - 
> It is: ch.qos.logback.core.joran.conditional.IfAction
> 17:46:05,740 |-INFO in 
> ch.qos.logback.classic.joran.action.ConfigurationAction - End of 
> configuration.
> 17:46:05,741 |-INFO in 
> ch.qos.logback.classic.joran.JoranConfigurator@58e3a2c7 - Registering current 
> configuration as safe fallback point
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7406) Update Calcite to 1.21.0

2019-10-28 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16961118#comment-16961118
 ] 

Bohdan Kazydub commented on DRILL-7406:
---

Also, please undo the changes made in 
{{org.apache.drill.exec.planner.physical.JoinPrel}} in the scope of DRILL-7200 (as 
the issue is already fixed in CALCITE-3174).

> Update Calcite to 1.21.0
> 
>
> Key: DRILL-7406
> URL: https://issues.apache.org/jira/browse/DRILL-7406
> Project: Apache Drill
>  Issue Type: Task
>  Components: Query Planning & Optimization, SQL Parser
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
>
> DRILL-7340 should be fixed by this update.





[jira] [Resolved] (DRILL-2000) Hive generated parquet files with maps show up in drill as map(key value)

2019-09-19 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub resolved DRILL-2000.
---
Fix Version/s: (was: Future)
   Resolution: Fixed

Fixed in scope of DRILL-7096

> Hive generated parquet files with maps show up in drill as map(key value)
> -
>
> Key: DRILL-2000
> URL: https://issues.apache.org/jira/browse/DRILL-2000
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 0.7.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Bohdan Kazydub
>Priority: Major
>
> Created a parquet file in hive having the following DDL
> hive> desc alltypesparquet; 
> OK
> c1 int 
> c2 boolean 
> c3 double 
> c4 string 
> c5 array 
> c6 map 
> c7 map 
> c8 struct
> c9 tinyint 
> c10 smallint 
> c11 float 
> c12 bigint 
> c13 array>  
> c15 struct>
> c16 array,n:int>> 
> Time taken: 0.076 seconds, Fetched: 15 row(s)
> Columns which are maps such as c6 map 
> show up as 
> 0: jdbc:drill:> select c6 from `/user/hive/warehouse/alltypesparquet`;
> ++
> | c6 |
> ++
> | {"map":[]} |
> | {"map":[]} |
> | {"map":[{"key":1,"value":"eA=="},{"key":2,"value":"eQ=="}]} |
> ++
> 3 rows selected (0.078 seconds)
> hive> select c6 from alltypesparquet;   
> NULL
> NULL
> {1:"x",2:"y"}
> Ignore the wrong values, I have raised DRILL-1997 for the same. 





[jira] [Commented] (DRILL-7373) Fix problems involving reading from DICT type

2019-09-17 Thread Bohdan Kazydub (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931569#comment-16931569
 ] 

Bohdan Kazydub commented on DRILL-7373:
---

Merged into Apache master with commit id 
e32488c7d5992b1d111e8d6e4bbaf6369e8dd433.

> Fix problems involving reading from DICT type
> -
>
> Key: DRILL-7373
> URL: https://issues.apache.org/jira/browse/DRILL-7373
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.17.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Add better support for different key types ({{boolean}}, {{decimal}}, 
> {{float}}, {{double}}, etc.) when retrieving values by key from a {{DICT}} 
> column while querying a data source whose field types are known during the 
> query validation phase (such as a Hive table), so that the actual key object 
> instance is created in generated code and passed to the given {{DICT}} reader 
> instead of generating its value for every row based on an {{int}} 
> ({{ArraySegment}}) or {{String}} ({{NamedSegment}}) value.
> This may be achieved by storing the original literal value of the passed key 
> (as {{Object}}) and its type (as {{MajorType}}) in {{PathSegment}}, and using 
> them during code generation when reading the {{DICT}}'s values by key in 
> {{EvaluationVisitor}}.
> Also, fix an NPE in some cases involving reading values from {{DICT}}, and 
> fix wrong results when reading complex structures using multiple ITEM 
> operators (i.e., [] brackets), e.g. 
> {code}
> SELECT rid, mc.map_arr_map['key01'][1]['key01.1'] p16 FROM 
> hive.map_complex_tbl mc
> {code}
> where {{map_arr_map}} is of following type: {{MAP INT>>>}}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (DRILL-7252) Read Hive map using Dict vector

2019-09-16 Thread Bohdan Kazydub (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7252:
--
Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Read Hive map using Dict vector
> 
>
> Key: DRILL-7252
> URL: https://issues.apache.org/jira/browse/DRILL-7252
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.17.0
>
>
> Described in DRILL-3290 design doc. 





[jira] [Created] (DRILL-7373) Fix problems involving reading from DICT type

2019-09-11 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7373:
-

 Summary: Fix problems involving reading from DICT type
 Key: DRILL-7373
 URL: https://issues.apache.org/jira/browse/DRILL-7373
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


Add better support for different key types ({{boolean}}, {{decimal}}, 
{{float}}, {{double}}, etc.) when retrieving values by key from a {{DICT}} column 
while querying a data source whose field types are known during the query 
validation phase (such as a Hive table), so that the actual key object instance 
is created in generated code and passed to the given {{DICT}} reader instead of 
generating its value for every row based on an {{int}} ({{ArraySegment}}) or 
{{String}} ({{NamedSegment}}) value.

This may be achieved by storing the original literal value of the passed key (as 
{{Object}}) and its type (as {{MajorType}}) in {{PathSegment}}, and using them 
during code generation when reading the {{DICT}}'s values by key in 
{{EvaluationVisitor}}.

Also, fix an NPE in some cases involving reading values from {{DICT}}, and fix 
wrong results when reading complex structures using multiple ITEM operators 
(i.e., [] brackets), e.g. 
{code}
SELECT rid, mc.map_arr_map['key01'][1]['key01.1'] p16 FROM hive.map_complex_tbl 
mc
{code}
where {{map_arr_map}} is of following type: {{MAP>>}}






[jira] [Created] (DRILL-7359) Add support for DICT type in RowSet Framework

2019-08-23 Thread Bohdan Kazydub (Jira)
Bohdan Kazydub created DRILL-7359:
-

 Summary: Add support for DICT type in RowSet Framework
 Key: DRILL-7359
 URL: https://issues.apache.org/jira/browse/DRILL-7359
 Project: Apache Drill
  Issue Type: New Feature
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


Add support for new DICT data type (see DRILL-7096) in RowSet Framework





[jira] [Commented] (DRILL-7336) `cast_empty_string_to_null` option doesn't work when text file has > 1 column

2019-08-12 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905057#comment-16905057
 ] 

Bohdan Kazydub commented on DRILL-7336:
---

This option works when casting an empty string to some other type, e.g. 
{{CAST(columns[0] as INT)}}. The option's description is wrong.
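As a standalone illustration of the semantics described above (a hypothetical helper, not Drill's actual cast implementation), an empty string maps to SQL NULL only when it is cast to another type:

```java
public class EmptyStringCastDemo {

    // Hypothetical stand-in for casting a text column value to INT with
    // drill.exec.functions.cast_empty_string_to_null enabled: an empty
    // (or missing) string becomes null instead of failing the cast.
    static Integer castToInt(String value) {
        if (value == null || value.isEmpty()) {
            return null;
        }
        return Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(castToInt("4")); // 4
        System.out.println(castToInt(""));  // null
    }
}
```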

> `cast_empty_string_to_null` option doesn't work when text file has > 1 column
> -
>
> Key: DRILL-7336
> URL: https://issues.apache.org/jira/browse/DRILL-7336
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Denys Ordynskiy
>Priority: Major
>
> *Description:*
> 1 - create 2 nullable csv files with 1 and 2 columns:
> _one_col.csv_
> {noformat}
> 1
> 2
> 4
> {noformat}
> _two_col.csv_
> {noformat}
> 1,1
> 2,
> ,3
> 4,4
> {noformat}
> 2 - enable option:
> {noformat}
> alter system set `drill.exec.functions.cast_empty_string_to_null`=true;
> {noformat}
> 3 - query file with 1 column:
> {noformat}
> select columns[0] from dfs.tmp.`one_col.csv`;
> {noformat}
> | EXPR$0  |
> | 1   |
> | 2   |
> | null|
> | 4   |
> 4 - query file with 2 columns:
> {noformat}
> select columns[0] from dfs.tmp.`two_col.csv`;
> {noformat}
> *Expected result:*
> Table with NULL in the 3rd row:
> | EXPR$0  |
> | 1   |
> | 2   |
> | null|
> | 4   |
> *Actual result:*
> {color:#d04437}Drill returns an empty string in the 3rd row:{color}
> | EXPR$0  |
> | 1   |
> | 2   |
> | |
> | 4   |



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (DRILL-7084) ResultSet getObject method throws not implemented exception if the column type is NULL

2019-07-29 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7084:
--
Labels: ready-to-commit  (was: )

> ResultSet getObject method throws not implemented exception if the column 
> type is NULL
> --
>
> Key: DRILL-7084
> URL: https://issues.apache.org/jira/browse/DRILL-7084
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> This method is used by some tools, for example DBeaver. Not reproduced with 
> sqlline or Drill Web-UI.
> *Query:*
> {code:sql}
> select coalesce(n_name1, n_name2) from cp.`tpch/nation.parquet` limit 1;
> {code}
> *Expected result:*
> null
> *Actual result:*
> Exception is thrown:
> {noformat}
> java.lang.RuntimeException: not implemented
>   at 
> oadd.org.apache.calcite.avatica.AvaticaSite.notImplemented(AvaticaSite.java:421)
>   at oadd.org.apache.calcite.avatica.AvaticaSite.get(AvaticaSite.java:380)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.getObject(DrillResultSetImpl.java:183)
>   at 
> org.jkiss.dbeaver.model.impl.jdbc.exec.JDBCResultSetImpl.getObject(JDBCResultSetImpl.java:628)
>   at 
> org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCObjectValueHandler.fetchColumnValue(JDBCObjectValueHandler.java:60)
>   at 
> org.jkiss.dbeaver.model.impl.jdbc.data.handlers.JDBCAbstractValueHandler.fetchValueObject(JDBCAbstractValueHandler.java:49)
>   at 
> org.jkiss.dbeaver.ui.controls.resultset.ResultSetDataReceiver.fetchRow(ResultSetDataReceiver.java:122)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.fetchQueryData(SQLQueryJob.java:729)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.executeStatement(SQLQueryJob.java:465)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.lambda$0(SQLQueryJob.java:392)
>   at org.jkiss.dbeaver.model.DBUtils.tryExecuteRecover(DBUtils.java:1598)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.executeSingleQuery(SQLQueryJob.java:390)
>   at 
> org.jkiss.dbeaver.runtime.sql.SQLQueryJob.extractData(SQLQueryJob.java:822)
>   at 
> org.jkiss.dbeaver.ui.editors.sql.SQLEditor$QueryResultsContainer.readData(SQLEditor.java:2532)
>   at 
> org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.lambda$0(ResultSetJobDataRead.java:93)
>   at org.jkiss.dbeaver.model.DBUtils.tryExecuteRecover(DBUtils.java:1598)
>   at 
> org.jkiss.dbeaver.ui.controls.resultset.ResultSetJobDataRead.run(ResultSetJobDataRead.java:91)
>   at org.jkiss.dbeaver.model.runtime.AbstractJob.run(AbstractJob.java:101)
>   at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)
> {noformat}





[jira] [Created] (DRILL-7312) Allow case sensitivity for column names when it is supported by storage format

2019-07-01 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-7312:
-

 Summary: Allow case sensitivity for column names when it is 
supported by storage format
 Key: DRILL-7312
 URL: https://issues.apache.org/jira/browse/DRILL-7312
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub


After the upgrade to Calcite 1.20.0 (DRILL-7200), there is the following issue:
If an HBase table has two columns whose names are equal when case is ignored but 
not equal when case is considered, e.g. a table with columns 'F' and 'f', the 
following query
{code}
select * from hbase.`TestTableMultiCF` t
{code}
fails with the following exception
{code}
 (org.apache.calcite.runtime.CalciteContextException) At line 1, column 8: 
Column 'F' is ambiguous
sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
sun.reflect.NativeConstructorAccessorImpl.newInstance():62
sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
java.lang.reflect.Constructor.newInstance():423
org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
org.apache.calcite.sql.SqlUtil.newContextException():824
org.apache.calcite.sql.SqlUtil.newContextException():809
org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
org.apache.calcite.sql.validate.DelegatingScope.fullyQualify():496
org.apache.calcite.sql.validate.SqlValidatorImpl.findTableColumnPair():3501
org.apache.calcite.sql.validate.SqlValidatorImpl.isRolledUpColumn():3535
org.apache.calcite.sql.validate.SqlValidatorImpl.expandStar():519
org.apache.calcite.sql.validate.SqlValidatorImpl.expandSelectItem():429
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelectList():4069
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3376
org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
org.apache.calcite.sql.validate.AbstractNamespace.validate():84
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
org.apache.calcite.sql.SqlSelect.validate():216
{code}

Since HBase is case-sensitive with regard to column names, Drill should support 
this as well when querying HBase tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7297) Query hangs in planning stage when Error is thrown

2019-06-21 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7297:
--
Labels: ready-to-commit  (was: )

> Query hangs in planning stage when Error is thrown
> --
>
> Key: DRILL-7297
> URL: https://issues.apache.org/jira/browse/DRILL-7297
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Query hangs in the planning stage when an Error (not an OOM or AssertionError) 
> is thrown during query planning. After canceling the query, it will stay in 
> the Cancellation Requested state.
> Such an error may be thrown due to a mistake in the code, including in a UDF. 
> Since users may provide custom UDFs, Drill should be able to handle such cases 
> as well.
> Steps to reproduce this issue:
> 1. Create a UDF which throws an Error in either its {{eval()}} or {{setup()}} 
> method (instructions on how to create a custom UDF may be found 
> [here|https://drill.apache.org/docs/tutorial-develop-a-simple-function/]).
>  2. Register the custom UDF which throws the error (instructions are 
> [here|https://drill.apache.org/docs/adding-custom-functions-to-drill-introduction/]).
>  3. Run the query with this UDF.
> After submitting the query, the following stack trace is printed:
> {noformat}
> Exception in thread "drill-executor-1" java.lang.Error
>   at 
> org.apache.drill.contrib.function.FunctionExample.setup(FunctionExample.java:19)
>   at 
> org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator.evaluateFunction(InterpreterEvaluator.java:139)
>   at 
> org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator$EvalVisitor.visitFunctionHolderExpression(InterpreterEvaluator.java:355)
>   at 
> org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator$EvalVisitor.visitFunctionHolderExpression(InterpreterEvaluator.java:204)
>   at 
> org.apache.drill.common.expression.FunctionHolderExpression.accept(FunctionHolderExpression.java:53)
>   at 
> org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator.evaluateConstantExpr(InterpreterEvaluator.java:70)
>   at 
> org.apache.drill.exec.planner.logical.DrillConstExecutor.reduce(DrillConstExecutor.java:152)
>   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressionsInternal(ReduceExpressionsRule.java:620)
>   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressions(ReduceExpressionsRule.java:541)
>   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:288)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:643)
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:339)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:430)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:370)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:250)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:319)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:177)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:226)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:124)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:90)
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:593)
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:276)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> 4. Check that the query is still in the in-progress state, then cancel the query.
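The root cause follows from the Java exception hierarchy: {{catch (Exception e)}} does not intercept {{java.lang.Error}}, so the planning thread dies without the query being failed. A standalone sketch of the distinction (method names are hypothetical, not Drill's):

```java
public class ErrorCatchDemo {

    // Stand-in for a UDF setup() that throws an Error, as in the steps above.
    static void faultySetup() {
        throw new Error("broken UDF");
    }

    // catch (Exception e) misses java.lang.Error entirely: the Error
    // propagates and kills the calling thread.
    static String runCatchingException() {
        try {
            faultySetup();
            return "ok";
        } catch (Exception e) {
            return "handled";
        }
    }

    // Catching Throwable covers Exception and Error alike, letting the
    // caller fail the query instead of leaving it hanging.
    static String runCatchingThrowable() {
        try {
            faultySetup();
            return "ok";
        } catch (Throwable t) {
            return "handled";
        }
    }

    public static void main(String[] args) {
        try {
            runCatchingException();
        } catch (Error e) {
            System.out.println("Error escaped catch (Exception)");
        }
        System.out.println(runCatchingThrowable()); // handled
    }
}
```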





[jira] [Updated] (DRILL-7200) Update Calcite to 1.19.0

2019-04-24 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7200:
--
Summary: Update Calcite to 1.19.0  (was: Update Calcite to 1.19)

> Update Calcite to 1.19.0
> 
>
> Key: DRILL-7200
> URL: https://issues.apache.org/jira/browse/DRILL-7200
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.16.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.17.0
>
>
> Calcite has released version 1.19.0. Upgrade the Calcite dependency in Drill 
> to the newest version.





[jira] [Created] (DRILL-7200) Update Calcite to 1.19

2019-04-24 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-7200:
-

 Summary: Update Calcite to 1.19
 Key: DRILL-7200
 URL: https://issues.apache.org/jira/browse/DRILL-7200
 Project: Apache Drill
  Issue Type: Task
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


Calcite has released version 1.19.0. Upgrade the Calcite dependency in Drill to 
the newest version.





[jira] [Comment Edited] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-04-03 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808416#comment-16808416
 ] 

Bohdan Kazydub edited comment on DRILL-7038 at 4/3/19 6:48 AM:
---

Hi, [~bbevens]. I think it's OK, but it should be specified that, in addition to 
the {{DISTINCT}} or {{GROUP BY}} operation, the query has to select ({{SELECT}}) 
only partition columns (dir0, dir1, ..., dirN).


was (Author: kazydubb):
Hi, [~bbevens]. I think it's OK, but I think it is needed to specify that 
additionally for {{DISTINCT}} or {{GROUP BY}} operation the query has to query 
({{SELECT}}) partition columns (dir0, dir1,..., dirN) only.

> Queries on partitioned columns scan the entire datasets
> ---
>
> Key: DRILL-7038
> URL: https://issues.apache.org/jira/browse/DRILL-7038
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>
> For tables with hive-style partitions like
> {code}
> /table/2018/Q1
> /table/2018/Q2
> /table/2019/Q1
> etc.
> {code}
> if any of the following queries is run:
> {code}
> select distinct dir0 from dfs.`/table`
> {code}
> {code}
> select dir0 from dfs.`/table` group by dir0
> {code}
> it will actually scan every single record in the table rather than just 
> getting a list of directories at the dir0 level. This applies even when 
> cached metadata is available. This is a big penalty especially as the 
> datasets grow.
> To avoid such situations, a logical prune rule can be used to collect 
> partition columns (`dir0`), either from metadata cache (if available) or 
> group scan, and drop unnecessary files from being read. The rule will be 
> applied under the following conditions:
> 1) all queried columns are partition columns, and
> 2) either {{DISTINCT}} or {{GROUP BY}} operations are performed.





[jira] [Commented] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-04-03 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808416#comment-16808416
 ] 

Bohdan Kazydub commented on DRILL-7038:
---

Hi, [~bbevens]. I think it's OK, but it should be specified that, in addition to 
the {{DISTINCT}} or {{GROUP BY}} operation, the query has to select ({{SELECT}}) 
only partition columns (dir0, dir1, ..., dirN).

> Queries on partitioned columns scan the entire datasets
> ---
>
> Key: DRILL-7038
> URL: https://issues.apache.org/jira/browse/DRILL-7038
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>
> For tables with hive-style partitions like
> {code}
> /table/2018/Q1
> /table/2018/Q2
> /table/2019/Q1
> etc.
> {code}
> if any of the following queries is run:
> {code}
> select distinct dir0 from dfs.`/table`
> {code}
> {code}
> select dir0 from dfs.`/table` group by dir0
> {code}
> it will actually scan every single record in the table rather than just 
> getting a list of directories at the dir0 level. This applies even when 
> cached metadata is available. This is a big penalty especially as the 
> datasets grow.
> To avoid such situations, a logical prune rule can be used to collect 
> partition columns (`dir0`), either from metadata cache (if available) or 
> group scan, and drop unnecessary files from being read. The rule will be 
> applied under the following conditions:
> 1) all queried columns are partition columns, and
> 2) either {{DISTINCT}} or {{GROUP BY}} operations are performed.





[jira] [Commented] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-03-28 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803689#comment-16803689
 ] 

Bohdan Kazydub commented on DRILL-7038:
---

Hi, [~bbevens]. No, it's not like that. Those {{dir0}}, {{dir1}}, ... columns 
refer to directory levels from the root directory (see [Querying 
Directories|https://drill.apache.org/docs/querying-directories/]). For 
example, if {{table1}} had the following directory structure:
{code}
/table1/2016/Q1
/table1/2016/Q2
...
{code}
and when querying
{code}
select distinct dir0[, dir1[,...]] from dfs.`/table1`;
select dir0[, dir1[,...]] from dfs.`/table1` group by dir0;
{code}
{{dir0}} references first-level directories under `table1` (which is the root), 
i.e. the '2016' directory, {{dir1}} references second-level directories 'Q1' and 
'Q2', and so on.

Before, Drill scanned all the *files* in all directories. With this 
optimization, file scanning is skipped and the Scan operator is replaced with a 
Values operator containing literal values, with these values being collected 
from the directory metadata cache file (if it exists) or from the scan's file 
selection.
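The idea can be sketched in isolation: distinct {{dir0}} values are simply the first-level subdirectory names, obtainable without reading any file contents (a standalone illustration, not Drill's actual prune rule):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

public class Dir0Demo {

    // Distinct dir0 values are just the first-level subdirectory names;
    // no file inside those directories is ever opened.
    static Set<String> distinctDir0(Path root) {
        try (Stream<Path> entries = Files.list(root)) {
            Set<String> dirs = new TreeSet<>();
            entries.filter(Files::isDirectory)
                   .forEach(p -> dirs.add(p.getFileName().toString()));
            return dirs;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a throwaway /table1/<year>/<quarter> layout like the one above.
    static Path sampleTable() {
        try {
            Path root = Files.createTempDirectory("table1");
            Files.createDirectories(root.resolve("2016").resolve("Q1"));
            Files.createDirectories(root.resolve("2016").resolve("Q2"));
            Files.createDirectories(root.resolve("2017").resolve("Q1"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(distinctDir0(sampleTable())); // [2016, 2017]
    }
}
```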

> Queries on partitioned columns scan the entire datasets
> ---
>
> Key: DRILL-7038
> URL: https://issues.apache.org/jira/browse/DRILL-7038
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>
> For tables with hive-style partitions like
> {code}
> /table/2018/Q1
> /table/2018/Q2
> /table/2019/Q1
> etc.
> {code}
> if any of the following queries is run:
> {code}
> select distinct dir0 from dfs.`/table`
> {code}
> {code}
> select dir0 from dfs.`/table` group by dir0
> {code}
> it will actually scan every single record in the table rather than just 
> getting a list of directories at the dir0 level. This applies even when 
> cached metadata is available. This is a big penalty especially as the 
> datasets grow.
> To avoid such situations, a logical prune rule can be used to collect 
> partition columns (`dir0`), either from metadata cache (if available) or 
> group scan, and drop unnecessary files from being read. The rule will be 
> applied under the following conditions:
> 1) all queried columns are partition columns, and
> 2) either {{DISTINCT}} or {{GROUP BY}} operations are performed.





[jira] [Commented] (DRILL-6430) Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or Locally

2019-03-19 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795938#comment-16795938
 ] 

Bohdan Kazydub commented on DRILL-6430:
---

The functionality seems to be implemented in DRILL-2304.

> Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or 
> Locally
> --
>
> Key: DRILL-6430
> URL: https://issues.apache.org/jira/browse/DRILL-6430
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.17.0
>
>
> This is required for resource management since we will likely remove many 
> options.





[jira] [Resolved] (DRILL-6430) Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or Locally

2019-03-19 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub resolved DRILL-6430.
---
   Resolution: Done
Fix Version/s: (was: 1.17.0)
   1.16.0

> Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or 
> Locally
> --
>
> Key: DRILL-6430
> URL: https://issues.apache.org/jira/browse/DRILL-6430
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> This is required for resource management since we will likely remove many 
> options.





[jira] [Updated] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-03-07 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7038:
--
Labels:   (was: doc-impacting)

> Queries on partitioned columns scan the entire datasets
> ---
>
> Key: DRILL-7038
> URL: https://issues.apache.org/jira/browse/DRILL-7038
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> For tables with hive-style partitions like
> {code}
> /table/2018/Q1
> /table/2018/Q2
> /table/2019/Q1
> etc.
> {code}
> if any of the following queries is run:
> {code}
> select distinct dir0 from dfs.`/table`
> {code}
> {code}
> select dir0 from dfs.`/table` group by dir0
> {code}
> it will actually scan every single record in the table rather than just 
> getting a list of directories at the dir0 level. This applies even when 
> cached metadata is available. This is a big penalty especially as the 
> datasets grow.
> To avoid such situations, a logical prune rule can be used to collect 
> partition columns (`dir0`), either from metadata cache (if available) or 
> group scan, and drop unnecessary files from being read. The rule will be 
> applied under the following conditions:
> 1) all queried columns are partition columns, and
> 2) either {{DISTINCT}} or {{GROUP BY}} operations are performed.





[jira] [Updated] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-03-06 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-7038:
--
Labels: doc-impacting  (was: )

> Queries on partitioned columns scan the entire datasets
> ---
>
> Key: DRILL-7038
> URL: https://issues.apache.org/jira/browse/DRILL-7038
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
>
> For tables with hive-style partitions like
> {code}
> /table/2018/Q1
> /table/2018/Q2
> /table/2019/Q1
> etc.
> {code}
> if any of the following queries is run:
> {code}
> select distinct dir0 from dfs.`/table`
> {code}
> {code}
> select dir0 from dfs.`/table` group by dir0
> {code}
> it will actually scan every single record in the table rather than just 
> getting a list of directories at the dir0 level. This applies even when 
> cached metadata is available. This is a big penalty especially as the 
> datasets grow.
> To avoid such situations, a logical prune rule can be used to collect 
> partition columns (`dir0`), either from metadata cache (if available) or 
> group scan, and drop unnecessary files from being read. The rule will be 
> applied under the following conditions:
> 1) all queried columns are partition columns, and
> 2) either {{DISTINCT}} or {{GROUP BY}} operations are performed.





[jira] [Updated] (DRILL-4858) REPEATED_COUNT on an array of maps and an array of arrays is not implemented

2019-02-27 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-4858:
--
Description: 
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays. For the JSON file
{code}
{"id": 1, "array": [[1, 2], [1, 3], [2, 3]]}
{"id": 2, "array": []}
{"id": 3, "array": [[2, 3], [1, 3, 4]]}
{"id": 4, "array": [[1], [2], [3, 4], [5], [6]]}
{"id": 5, "array": [[1, 2, 3], [4, 5], [6], [7], [8, 9], [2, 3], [2, 3], [2, 
3], [2]]}
{"id": 6, "array": [[1, 2], [3], [4], [5]]}
{"id": 7, "array": []}
{"id": 8, "array": [[1], [2], [3]]}
{code}
the following error is shown
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) from 
`arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(LIST-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] 
(state=,code=0)
{code}

Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
Looks like it's not enabled yet. 
{code}
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthMap implements DrillSimpleFunc {
...
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthList implements DrillSimpleFunc {
{code}


Also, the {{REPEATED_COUNT}} function was extended to support additional 
REPEATED types. Drill's {{REPEATED_COUNT}} function now supports the following 
REPEATED types: RepeatedBit, RepeatedInt, RepeatedBigInt, RepeatedFloat4, 
RepeatedFloat8, RepeatedDate, RepeatedTimeStamp, RepeatedTime, 
RepeatedIntervalDay, RepeatedIntervalYear, RepeatedInterval, RepeatedVarChar, 
RepeatedVarBinary, RepeatedVarDecimal, RepeatedDecimal9, RepeatedDecimal18, 
RepeatedDecimal28Sparse, RepeatedDecimal38Sparse, RepeatedList, RepeatedMap
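For reference, a minimal sketch in Python (not Drill code; the sample rows mirror the example JSON above) of what {{REPEATED_COUNT}} computes: the element count of the outermost array per row, regardless of whether the elements are scalars, maps, or nested arrays.

```python
import json

# Each input line is one JSON row; REPEATED_COUNT("array") is simply the
# length of the top-level "array" value in that row.
rows = [
    '{"id": 1, "array": [[1, 2], [1, 3], [2, 3]]}',      # array of arrays
    '{"id": 2, "array": []}',                            # empty array
    '{"id": 3, "array": [{"name": "foo"}, {"name": "bar"}]}',  # array of maps
]
counts = [len(json.loads(r)["array"]) for r in rows]
print(counts)  # [3, 0, 2]
```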


  was:
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays
for JSON file
{code}

[jira] [Updated] (DRILL-4858) REPEATED_COUNT on an array of maps and an array of arrays is not implemented

2019-02-27 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-4858:
--
Description: 
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays. For the JSON file
{code}
{"id": 1, "array": [[1, 2], [1, 3], [2, 3]]}
{"id": 2, "array": []}
{"id": 3, "array": [[2, 3], [1, 3, 4]]}
{"id": 4, "array": [[1], [2], [3, 4], [5], [6]]}
{"id": 5, "array": [[1, 2, 3], [4, 5], [6], [7], [8, 9], [2, 3], [2, 3], [2, 
3], [2]]}
{"id": 6, "array": [[1, 2], [3], [4], [5]]}
{"id": 7, "array": []}
{"id": 8, "array": [[1], [2], [3]]}
{code}
the following error is shown
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) from 
`arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(LIST-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] 
(state=,code=0)
{code}

Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
Looks like it's not enabled yet. 
{code}
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthMap implements DrillSimpleFunc {
...
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthList implements DrillSimpleFunc {
{code}

*Added support for more REPEATED types* in the {{REPEATED_COUNT}} function


  was:
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays
for JSON file
{code}
{"id": 1, "array": [[1, 2], [1, 3], [2, 3]]}
{"id": 2, "array": []}
{"id": 3, "array": [[2, 3], [1, 3, 4]]}
{"id": 4, "array": [[1], [2], [3, 4], [5], [6]]}
{"id": 5, "array": [[1, 2, 3], [4, 5], [6], [7], [8, 9], [2, 3], [2, 3], [2, 
3], [2]]}
{"id": 6, "array": [[1, 2], [3], [4], [5]]}
{"id": 7, "array": []}
{"id": 8, "array": [[1], [2], [3]]}
{code}
the following error is shown
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) 

[jira] [Updated] (DRILL-4858) REPEATED_COUNT on an array of maps and an array of arrays is not implemented

2019-02-14 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-4858:
--
Description: 
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays
for JSON file
{code}
{"id": 1, "array": [[1, 2], [1, 3], [2, 3]]}
{"id": 2, "array": []}
{"id": 3, "array": [[2, 3], [1, 3, 4]]}
{"id": 4, "array": [[1], [2], [3, 4], [5], [6]]}
{"id": 5, "array": [[1, 2, 3], [4, 5], [6], [7], [8, 9], [2, 3], [2, 3], [2, 
3], [2]]}
{"id": 6, "array": [[1, 2], [3], [4], [5]]}
{"id": 7, "array": []}
{"id": 8, "array": [[1], [2], [3]]}
{code}
the following error is shown
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) from 
`arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(LIST-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] 
(state=,code=0)
{code}

Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
Looks like it's not enabled yet. 
{code}
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthMap implements DrillSimpleFunc {
...
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthList implements DrillSimpleFunc {
{code}



  was:
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays
for JSON file
{code}
{"id": 1, "array": [1, 2, 3]}
{"id": 2, "array": []}
{"id": 3, "array": [2, 3]}
{"id": 4, "array": [1, 2, 3, 4, 5]}
{"id": 5, "array": [1, 2, 3, 4, 5, 6, 7, 8, 9]}
{"id": 6, "array": [1, 2, 3, 4]}
{"id": 7, "array": []}
{"id": 8, "array": [1, 2, 3]}
{code}
the following error is shown
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) from 
`arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:

Error in expression 

[jira] [Updated] (DRILL-4858) REPEATED_COUNT on an array of maps and an array of arrays is not implemented

2019-02-14 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-4858:
--
Description: 
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays
for JSON file
{code}
{"id": 1, "array": [1, 2, 3]}
{"id": 2, "array": []}
{"id": 3, "array": [2, 3]}
{"id": 4, "array": [1, 2, 3, 4, 5]}
{"id": 5, "array": [1, 2, 3, 4, 5, 6, 7, 8, 9]}
{"id": 6, "array": [1, 2, 3, 4]}
{"id": 7, "array": []}
{"id": 8, "array": [1, 2, 3]}
{code}
the following error is shown
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(array) from 
`arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(LIST-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] 
(state=,code=0)
{code}

Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
Looks like it's not enabled yet. 
{code}
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthMap implements DrillSimpleFunc {
...
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthList implements DrillSimpleFunc {
{code}



  was:
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays:
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(outarray) from 
`arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(LIST-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] 
(state=,code=0)
{code}

Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
Looks like it's not enabled yet. 
{code}
  // TODO - need to confirm that these 

[jira] [Updated] (DRILL-4858) REPEATED_COUNT on an array of maps and an array of arrays is not implemented

2019-02-14 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-4858:
--
Description: 
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

The same issue is present for an array of arrays:
{code}
0: jdbc:drill:schema=dfs.tmp> select REPEATED_COUNT(outarray) from 
`arrayOfArrays.json`;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(LIST-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 12b81b85-c84b-4773-8427-48b80098cafe on qa102-45.qa.lab:31010] 
(state=,code=0)
{code}

Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
Looks like it's not enabled yet. 
{code}
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthMap implements DrillSimpleFunc {
{code}



  was:
REPEATED_COUNT of JSON containing an array of map does not work.

JSON file
{code}
drill$ cat /Users/jccote/repeated_count.json 
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], "label": 
"foo"}
{code}

select
{code}
0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
dfs.`/Users/jccote/repeated_count.json`;
{code}

error
{code}
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize 
incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
(state=,code=0)
{code}

Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
Looks like it's not enabled yet. 
{code}
  // TODO - need to confirm that these work   SMP: They do not
  @FunctionTemplate(name = "repeated_count", scope = 
FunctionTemplate.FunctionScope.SIMPLE)
  public static class RepeatedLengthMap implements DrillSimpleFunc {
{code}




> REPEATED_COUNT on an array of maps and an array of arrays is not implemented
> 
>
> Key: DRILL-4858
> URL: https://issues.apache.org/jira/browse/DRILL-4858
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: jean-claude
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.16.0
>
>
> REPEATED_COUNT of JSON containing an array of map does not work.
> JSON file
> {code}
> drill$ cat /Users/jccote/repeated_count.json 
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 

[jira] [Updated] (DRILL-4858) REPEATED_COUNT on an array of maps and an array of arrays is not implemented

2019-02-14 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-4858:
--
Summary: REPEATED_COUNT on an array of maps and an array of arrays is not 
implemented  (was: REPEATED_COUNT on JSON containing an array of maps)

> REPEATED_COUNT on an array of maps and an array of arrays is not implemented
> 
>
> Key: DRILL-4858
> URL: https://issues.apache.org/jira/browse/DRILL-4858
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: jean-claude
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.16.0
>
>
> REPEATED_COUNT of JSON containing an array of map does not work.
> JSON file
> {code}
> drill$ cat /Users/jccote/repeated_count.json 
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {code}
> select
> {code}
> 0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
> dfs.`/Users/jccote/repeated_count.json`;
> {code}
> error
> {code}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 0:0
> [Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
> (state=,code=0)
> {code}
> Looking at the org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions
> Looks like it's not enabled yet. 
> {code}
>   // TODO - need to confirm that these work   SMP: They do not
>   @FunctionTemplate(name = "repeated_count", scope = 
> FunctionTemplate.FunctionScope.SIMPLE)
>   public static class RepeatedLengthMap implements DrillSimpleFunc {
> {code}
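The expected semantics are straightforward; as a rough sketch in plain Python (illustrative only, not Drill code; `repeated_count` below is a hypothetical stand-in for the SQL function):

```python
# Illustrative sketch: REPEATED_COUNT should return the number of elements in
# an array column, regardless of whether the elements are scalars, maps, or
# nested arrays. A missing column behaves as SQL NULL.

def repeated_count(record, column):
    value = record.get(column)
    return len(value) if value is not None else None

row = {"intArray": [1, 2, 3, 4],
       "mapArray": [{"name": "foo"}, {"name": "foo"}],
       "label": "foo"}
print(repeated_count(row, "intArray"))  # 4
print(repeated_count(row, "mapArray"))  # 2
```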



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-02-13 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-7038:
-

 Summary: Queries on partitioned columns scan the entire datasets
 Key: DRILL-7038
 URL: https://issues.apache.org/jira/browse/DRILL-7038
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub
 Fix For: 1.16.0


For tables with hive-style partitions like
{code}
/table/2018/Q1
/table/2018/Q2
/table/2019/Q1
etc.
{code}
if any of the following queries is run:
{code}
select distinct dir0 from dfs.`/table`
{code}
{code}
select dir0 from dfs.`/table` group by dir0
{code}
it will actually scan every single record in the table rather than just getting 
a list of directories at the dir0 level. This applies even when cached metadata 
is available. This is a big penalty especially as the datasets grow.

To avoid such situations, a logical prune rule can be used to collect partition 
columns (`dir0`), either from the metadata cache (if available) or from the group 
scan, and drop unnecessary files from being read. The rule will be applied under 
the following conditions:
1) all queried columns are partition columns, and
2) either {{DISTINCT}} or {{GROUP BY}} operations are performed.
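The intended optimization can be sketched in plain Python (illustrative only; the helper below is a hypothetical stand-in for the prune rule, not Drill code): the distinct values of a partition column can be derived from the directory structure alone, without reading any file contents.

```python
# Illustrative sketch: compute the distinct dir<depth> values for a
# hive-style partitioned table from its file listing only.

def distinct_partition_values(file_paths, table_root, depth):
    """Return the sorted distinct dir<depth> values from a list of file paths."""
    values = set()
    for path in file_paths:
        # Strip the table root and split the remainder into path segments.
        relative = path[len(table_root):].strip("/")
        segments = relative.split("/")
        # The last segment is the file name; the rest are partition dirs.
        dirs = segments[:-1]
        if depth < len(dirs):
            values.add(dirs[depth])
    return sorted(values)

files = [
    "/table/2018/Q1/part-0.parquet",
    "/table/2018/Q2/part-0.parquet",
    "/table/2019/Q1/part-0.parquet",
]
print(distinct_partition_values(files, "/table", 0))  # ['2018', '2019']
print(distinct_partition_values(files, "/table", 1))  # ['Q1', 'Q2']
```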





[jira] [Commented] (DRILL-6976) SchemaChangeException happens when using split function in subquery if it returns empty result.

2019-02-07 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762478#comment-16762478
 ] 

Bohdan Kazydub commented on DRILL-6976:
---

[~agozhiy], please verify whether this is fixed

> SchemaChangeException happens when using split function in subquery if it 
> returns empty result.
> ---
>
> Key: DRILL-6976
> URL: https://issues.apache.org/jira/browse/DRILL-6976
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.16.0
>
>
> *Query:*
> {code:sql}
> select substr(col, 2, 3) 
> from (select split(n_comment, ' ') [3] col 
>   from cp.`tpch/nation.parquet` 
>   where n_nationkey = -1 
>   group by n_comment 
>   order by n_comment 
>   limit 5);
> {code}
> *Expected result:*
> {noformat}
> +-+
> | EXPR$0  |
> +-+
> +-+
> {noformat}
> *Actual result:*
> {noformat}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: 86515d74-7b9c-4949-8ece-c9c17e00afc3 on userf87d-pc:31010]
>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while 
> trying to materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():498
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748 (state=,code=0)
> {noformat}
> *Note:* Filter "where n_nationkey = -1" doesn't return any rows. In case of " 
> = 1", for example, the query will return result without error.
> *Workaround:* Use cast on the split function, like
> {code:sql}
> cast(split(n_comment, ' ') [3] as varchar)
> {code}





[jira] [Assigned] (DRILL-6976) SchemaChangeException happens when using split function in subquery if it returns empty result.

2019-02-07 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub reassigned DRILL-6976:
-

Assignee: Anton Gozhiy  (was: Bohdan Kazydub)

> SchemaChangeException happens when using split function in subquery if it 
> returns empty result.
> ---
>
> Key: DRILL-6976
> URL: https://issues.apache.org/jira/browse/DRILL-6976
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Anton Gozhiy
>Priority: Major
> Fix For: 1.16.0
>
>
> *Query:*
> {code:sql}
> select substr(col, 2, 3) 
> from (select split(n_comment, ' ') [3] col 
>   from cp.`tpch/nation.parquet` 
>   where n_nationkey = -1 
>   group by n_comment 
>   order by n_comment 
>   limit 5);
> {code}
> *Expected result:*
> {noformat}
> +-+
> | EXPR$0  |
> +-+
> +-+
> {noformat}
> *Actual result:*
> {noformat}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: 86515d74-7b9c-4949-8ece-c9c17e00afc3 on userf87d-pc:31010]
>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while 
> trying to materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():498
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748 (state=,code=0)
> {noformat}
> *Note:* Filter "where n_nationkey = -1" doesn't return any rows. In case of " 
> = 1", for example, the query will return result without error.
> *Workaround:* Use cast on the split function, like
> {code:sql}
> cast(split(n_comment, ' ') [3] as varchar)
> {code}





[jira] [Created] (DRILL-6993) VARBINARY length is ignored on cast

2019-01-22 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6993:
-

 Summary: VARBINARY length is ignored on cast
 Key: DRILL-6993
 URL: https://issues.apache.org/jira/browse/DRILL-6993
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


{{VARBINARY}} precision is not set when casting to {{VARBINARY}} with specified 
length.
For example, test case 
{code}
  String query = "select cast(r_name as varbinary(31)) as vb from cp.`tpch/region.parquet`";
  MaterializedField field = new ColumnBuilder("vb", TypeProtos.MinorType.VARBINARY)
      .setMode(TypeProtos.DataMode.OPTIONAL)
      .setWidth(31)
      .build();
  BatchSchema expectedSchema = new SchemaBuilder()
      .add(field)
      .build();

  // Validate the schema: the expected type carries precision (width) 31
  testBuilder()
      .sqlQuery(query)
      .schemaBaseLine(expectedSchema)
      .go();
{code}
will fail with
{code}
java.lang.Exception: Schema path or type mismatch for column #0:
Expected schema path: vb
Actual   schema path: vb
Expected type: MajorType[minor_type: VARBINARY mode: OPTIONAL precision: 31 
scale: 0]
Actual   type: MajorType[minor_type: VARBINARY mode: OPTIONAL]
{code}
while for other types, like {{VARCHAR}}, it seems to work.





[jira] [Commented] (DRILL-6962) Function coalesce returns an Error when none of the columns in coalesce exist in a parquet file

2019-01-18 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746246#comment-16746246
 ] 

Bohdan Kazydub commented on DRILL-6962:
---

[~benj641] thank you for correcting the error message. Regarding your second query 
with {{NULL}} literals - this is expected behavior. The aim of the 
improvement is to allow using columns that are not present in the queried files in 
the {{coalesce}} function, because some files may contain the mentioned columns 
while others may not.

> Function coalesce returns an Error when none of the columns in coalesce exist 
> in a parquet file
> ---
>
> Key: DRILL-6962
> URL: https://issues.apache.org/jira/browse/DRILL-6962
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> As Drill is schema-free, COALESCE function is expected to return a result and 
> not error out even if none of the columns being referred to exists in files 
> being queried.
> Here is an example for 2 columns, `unk_col` and `unk_col2`, which do not 
> exist in the parquet files
> {code:java}
> select coalesce(unk_col, unk_col2) from dfs.`/tmp/parquetfiles`;
> java.lang.IndexOutOfBoundsException: index (0) must be less than size (0)
> at 
> org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1196)
> Fragment 1:0
> [Error Id: 7b9193fb-289b-4fbf-a52a-2b93b01f0cd0 on dkvm2c:31010] 
> (state=,code=0)
> {code}





[jira] [Commented] (DRILL-6976) SchemaChangeException happens when using split function in subquery if it returns empty result.

2019-01-18 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16746132#comment-16746132
 ] 

Bohdan Kazydub commented on DRILL-6976:
---

The issue is fixed by DRILL-6962

> SchemaChangeException happens when using split function in subquery if it 
> returns empty result.
> ---
>
> Key: DRILL-6976
> URL: https://issues.apache.org/jira/browse/DRILL-6976
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> *Query:*
> {code:sql}
> select substr(col, 2, 3) 
> from (select split(n_comment, ' ') [3] col 
>   from cp.`tpch/nation.parquet` 
>   where n_nationkey = -1 
>   group by n_comment 
>   order by n_comment 
>   limit 5);
> {code}
> *Expected result:*
> {noformat}
> +-+
> | EXPR$0  |
> +-+
> +-+
> {noformat}
> *Actual result:*
> {noformat}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: 86515d74-7b9c-4949-8ece-c9c17e00afc3 on userf87d-pc:31010]
>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while 
> trying to materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():498
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748 (state=,code=0)
> {noformat}
> *Note:* Filter "where n_nationkey = -1" doesn't return any rows. In case of " 
> = 1", for example, the query will return result without error.
> *Workaround:* Use cast on the split function, like
> {code:sql}
> cast(split(n_comment, ' ') [3] as varchar)
> {code}





[jira] [Updated] (DRILL-6962) Function coalesce returns an Error when none of the columns in coalesce exist in a parquet file

2019-01-18 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6962:
--
Description: 
As Drill is schema-free, the COALESCE function is expected to return a result and 
not error out even if none of the columns being referred to exist in the files 
being queried.

Here is an example for 2 columns, `unk_col` and `unk_col2`, which do not exist 
in the parquet files
{code:java}
select coalesce(unk_col, unk_col2) from dfs.`/tmp/parquetfiles`;
java.lang.IndexOutOfBoundsException: index (0) must be less than size (0)
at 
org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1196)

Fragment 1:0

[Error Id: 7b9193fb-289b-4fbf-a52a-2b93b01f0cd0 on dkvm2c:31010] (state=,code=0)
{code}

  was:
As Drill is schema-free, COALESCE function is expected to return a result and 
not error out even if none of the columns being referred to exists in files 
being queried.

Here is an example for 2 columns, `unk_col` and `unk_col2`, which do not exist 
in the parquet files
{code:java}
select coalesce(unk_col, unk_col2) from dfs.`/tmp/parquetfiles`;
Error: SYSTEM ERROR: CompileException: Line 56, Column 27: Assignment 
conversion not possible from type 
“org.apache.drill.exec.expr.holders.NullableIntHolder” to type 
“org.apache.drill.exec.vector.UntypedNullHolder”

Fragment 1:0

[Error Id: 7b9193fb-289b-4fbf-a52a-2b93b01f0cd0 on dkvm2c:31010] (state=,code=0)
{code}


> Function coalesce returns an Error when none of the columns in coalesce exist 
> in a parquet file
> ---
>
> Key: DRILL-6962
> URL: https://issues.apache.org/jira/browse/DRILL-6962
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>
> As Drill is schema-free, COALESCE function is expected to return a result and 
> not error out even if none of the columns being referred to exists in files 
> being queried.
> Here is an example for 2 columns, `unk_col` and `unk_col2`, which do not 
> exist in the parquet files
> {code:java}
> select coalesce(unk_col, unk_col2) from dfs.`/tmp/parquetfiles`;
> java.lang.IndexOutOfBoundsException: index (0) must be less than size (0)
> at 
> org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1196)
> Fragment 1:0
> [Error Id: 7b9193fb-289b-4fbf-a52a-2b93b01f0cd0 on dkvm2c:31010] 
> (state=,code=0)
> {code}





[jira] [Updated] (DRILL-6928) Update description for exec.query.return_result_set_for_ddl option to reflect it affects JDBC connections only

2019-01-10 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6928:
--
Summary: Update description for exec.query.return_result_set_for_ddl option 
to reflect it affects JDBC connections only  (was: 
exec.query.return_result_set_for_ddl does not affect Web-UI query results)

> Update description for exec.query.return_result_set_for_ddl option to reflect 
> it affects JDBC connections only
> --
>
> Key: DRILL-6928
> URL: https://issues.apache.org/jira/browse/DRILL-6928
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Option {{exec.query.return_result_set_for_ddl}} was designed to work with 
> JDBC connections only, but this is not clearly indicated in the option's name 
> or its description. Thus option's description should be updated to reflect 
> this.





[jira] [Updated] (DRILL-6928) exec.query.return_result_set_for_ddl does not affect Web-UI query results

2019-01-10 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6928:
--
Description: Option {{exec.query.return_result_set_for_ddl}} was designed 
to work with JDBC connections only, but this is not clearly indicated in the 
option's name or its description. Thus the option's description should be updated 
to reflect this.  (was: For the case when 
{{exec.query.return_result_set_for_ddl}} is set to {{false}} at the system 
level, it should affect also results returned by Web-UI, but independently of 
the value of this option, the result is always displayed for non-"select" 
queries.)

> exec.query.return_result_set_for_ddl does not affect Web-UI query results
> -
>
> Key: DRILL-6928
> URL: https://issues.apache.org/jira/browse/DRILL-6928
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Option {{exec.query.return_result_set_for_ddl}} was designed to work with 
> JDBC connections only, but this is not clearly indicated in the option's name 
> or its description. Thus option's description should be updated to reflect 
> this.





[jira] [Created] (DRILL-6962) Function coalesce returns an Error when none of the columns in coalesce exist in a parquet file

2019-01-10 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6962:
-

 Summary: Function coalesce returns an Error when none of the 
columns in coalesce exist in a parquet file
 Key: DRILL-6962
 URL: https://issues.apache.org/jira/browse/DRILL-6962
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


As Drill is schema-free, the COALESCE function is expected to return a result and 
not error out even if none of the columns being referred to exist in the files 
being queried.

Here is an example for 2 columns, `unk_col` and `unk_col2`, which do not exist 
in the parquet files
{code:java}
select coalesce(unk_col, unk_col2) from dfs.`/tmp/parquetfiles`;
Error: SYSTEM ERROR: CompileException: Line 56, Column 27: Assignment 
conversion not possible from type 
“org.apache.drill.exec.expr.holders.NullableIntHolder” to type 
“org.apache.drill.exec.vector.UntypedNullHolder”

Fragment 1:0

[Error Id: 7b9193fb-289b-4fbf-a52a-2b93b01f0cd0 on dkvm2c:31010] (state=,code=0)
{code}
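The expected NULL-propagation semantics can be sketched in plain Python (illustrative only, not Drill code): a column that is absent from a record behaves as SQL NULL, and COALESCE returns the first non-null argument, or NULL if there is none.

```python
# Illustrative sketch of COALESCE over records that may be missing columns.

def coalesce(record, *columns):
    for col in columns:
        value = record.get(col)  # missing column -> None, i.e. SQL NULL
        if value is not None:
            return value
    return None

row = {"a": None, "b": 42}
print(coalesce(row, "a", "b"))               # 42
print(coalesce(row, "unk_col", "unk_col2"))  # None
```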





[jira] [Updated] (DRILL-6962) Function coalesce returns an Error when none of the columns in coalesce exist in a parquet file

2019-01-10 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6962:
--
Affects Version/s: 1.13.0

> Function coalesce returns an Error when none of the columns in coalesce exist 
> in a parquet file
> ---
>
> Key: DRILL-6962
> URL: https://issues.apache.org/jira/browse/DRILL-6962
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.13.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>
> As Drill is schema-free, COALESCE function is expected to return a result and 
> not error out even if none of the columns being referred to exists in files 
> being queried.
> Here is an example for 2 columns, `unk_col` and `unk_col2`, which do not 
> exist in the parquet files
> {code:java}
> select coalesce(unk_col, unk_col2) from dfs.`/tmp/parquetfiles`;
> Error: SYSTEM ERROR: CompileException: Line 56, Column 27: Assignment 
> conversion not possible from type 
> “org.apache.drill.exec.expr.holders.NullableIntHolder” to type 
> “org.apache.drill.exec.vector.UntypedNullHolder”
> Fragment 1:0
> [Error Id: 7b9193fb-289b-4fbf-a52a-2b93b01f0cd0 on dkvm2c:31010] 
> (state=,code=0)
> {code}





[jira] [Commented] (DRILL-6928) exec.query.return_result_set_for_ddl does not affect Web-UI query results

2019-01-08 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736904#comment-16736904
 ] 

Bohdan Kazydub commented on DRILL-6928:
---

[~arina], my bad. I thought I had mentioned in the description that it works for 
JDBC connections only. Now I see that it is not included. The description can 
be updated in the scope of this issue. 
The behaviour could also be supported for REST for consistency, but I am not sure 
it is mandatory, as there are no prerequisites for doing so (yet). 

> exec.query.return_result_set_for_ddl does not affect Web-UI query results
> -
>
> Key: DRILL-6928
> URL: https://issues.apache.org/jira/browse/DRILL-6928
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> For the case when {{exec.query.return_result_set_for_ddl}} is set to 
> {{false}} at the system level, it should affect also results returned by 
> Web-UI, but independently of the value of this option, the result is always 
> displayed for non-"select" queries.





[jira] [Created] (DRILL-6922) QUERY-level options are shown on Profiles tab

2018-12-21 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6922:
-

 Summary: QUERY-level options are shown on Profiles tab
 Key: DRILL-6922
 URL: https://issues.apache.org/jira/browse/DRILL-6922
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


Option `exec.return_result_set_for_ddl` is shown on the Web UI's Profiles tab even 
when it was not set explicitly. This happens because the option is set at the 
query level internally.





[jira] [Commented] (DRILL-6768) Improve to_date, to_time and to_timestamp and corresponding cast functions to handle empty string when `drill.exec.functions.cast_empty_string_to_null` option is enable

2018-12-21 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16726646#comment-16726646
 ] 

Bohdan Kazydub commented on DRILL-6768:
---

Hi [~bbevens],

your note looks good. Thanks! :)

> Improve to_date, to_time and to_timestamp and corresponding cast functions to 
> handle empty string when `drill.exec.functions.cast_empty_string_to_null` 
> option is enabled
> -
>
> Key: DRILL-6768
> URL: https://issues.apache.org/jira/browse/DRILL-6768
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-complete, ready-to-commit
> Fix For: 1.15.0
>
>
> When the `drill.exec.functions.cast_empty_string_to_null` option is enabled, 
> the `to_date`, `to_time` and `to_timestamp` functions will return NULL when a 
> null or an empty string value is passed, instead of erroring out. This avoids 
> CASE clauses littering a query and matches the behaviour of their respective 
> CAST counterparts.
> CASTs will be handled in a similar way (uniformly with numeric types):
>  
> ||Value to cast||Now||Will be||
> |NULL|NULL|NULL|
> |'' (empty string)|Error in many cases (except numerical types)|NULL|
>  Casting an empty string to NULL (when the option is enabled) will be supported 
> for DATE, TIME, TIMESTAMP, INTERVAL YEAR, INTERVAL MONTH and INTERVAL DAY 
> types in addition to numeric types.
>  
> *For documentation*
> TBA
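The behavior described above can be sketched in plain Python (illustrative only, not Drill code; `to_date` below is a hypothetical stand-in that assumes ISO date strings):

```python
from datetime import date

# Illustrative sketch: with cast_empty_string_to_null enabled, a date
# conversion returns NULL for both NULL and empty-string inputs instead of
# raising an error; other strings are converted normally.

def to_date(value, cast_empty_string_to_null=True):
    if value is None:
        return None  # NULL always maps to NULL
    if value == "" and cast_empty_string_to_null:
        return None  # empty string maps to NULL when the option is enabled
    return date.fromisoformat(value)

print(to_date(None))          # None
print(to_date(""))            # None
print(to_date("2018-10-01"))  # 2018-10-01
```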





[jira] [Commented] (DRILL-6768) Improve to_date, to_time and to_timestamp and corresponding cast functions to handle empty string when `drill.exec.functions.cast_empty_string_to_null` option is enable

2018-12-19 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725011#comment-16725011
 ] 

Bohdan Kazydub commented on DRILL-6768:
---

Hi [~bbevens],

Your note is OK. Regarding 
https://drill.apache.org/docs/text-files-csv-tsv-psv/#cast-data: one still 
needs to use CAST functions as you wrote.

> Improve to_date, to_time and to_timestamp and corresponding cast functions to 
> handle empty string when `drill.exec.functions.cast_empty_string_to_null` 
> option is enabled
> -
>
> Key: DRILL-6768
> URL: https://issues.apache.org/jira/browse/DRILL-6768
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> When the `drill.exec.functions.cast_empty_string_to_null` option is enabled, 
> the `to_date`, `to_time` and `to_timestamp` functions will return NULL when a 
> null or an empty string value is passed, instead of erroring out. This avoids 
> CASE clauses littering a query and matches the behaviour of their respective 
> CAST counterparts.
> CASTs will be handled in a similar way (uniformly with numeric types):
>  
> ||Value to cast||Now||Will be||
> |NULL|NULL|NULL|
> |'' (empty string)|Error in many cases (except numerical types)|NULL|
>  Casting an empty string to NULL (when the option is enabled) will be supported 
> for DATE, TIME, TIMESTAMP, INTERVAL YEAR, INTERVAL MONTH and INTERVAL DAY 
> types in addition to numeric types.
>  
> *For documentation*
> TBA





[jira] [Comment Edited] (DRILL-6913) Excessive error output

2018-12-19 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724941#comment-16724941
 ] 

Bohdan Kazydub edited comment on DRILL-6913 at 12/19/18 12:12 PM:
--

This issue arises because org.apache.calcite.runtime.CalciteException and 
org.apache.calcite.sql.validate.SqlValidatorException log the exception being 
created. There is a bug filed for the issue 
([CALCITE-2463|https://issues.apache.org/jira/browse/CALCITE-2463]) in 
Calcite's JIRA, which already has an [open 
PR|https://github.com/apache/calcite/pull/797].


was (Author: kazydubb):
This issue arises because org.apache.calcite.runtime.CalciteException and 
org.apache.calcite.sql.validate.SqlValidatorException log the exception being 
created. There is a [bug|https://issues.apache.org/jira/browse/CALCITE-2463] 
filed for the issue in Calcite's JIRA which has an already [open 
PR|https://github.com/apache/calcite/pull/797].

> Excessive error output
> --
>
> Key: DRILL-6913
> URL: https://issues.apache.org/jira/browse/DRILL-6913
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Vitalii Diravka
>Assignee: Arina Ielchiieva
>Priority: Blocker
> Fix For: 1.15.0
>
>
> There are redundant and duplicate error messages in query outputs in case of 
> some mistake in query syntax or in case of any other error from Calcite:
> {code:java}
> 0: jdbc:drill:zk=local> select * from dfs.`tpch/nation.parquet`;
> 19:23:09.335 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.sql.validate.SqlValidatorException: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.336 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to 
> line 1, column 17: Object 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.411 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.sql.validate.SqlValidatorException: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.411 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to 
> line 1, column 17: Object 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.432 [Client-1] ERROR o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.sql.validate.SqlValidatorException: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.432 [Client-1] ERROR o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to 
> line 1, column 17: Object 'tpch/nation.parquet' not found within 'dfs': 
> Object 'tpch/nation.parquet' not found within 'dfs'
> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> [Error Id: e3b6b9f6-1e8c-468f-954f-41b57defcf6a on vitalii-pc:31010] 
> (state=,code=0)
> {code}
> {code:java}
> 0: jdbc:drill:zk=local> slect * from cp.`tpch/nation.parquet` limit 1;
> 20:40:27.783 [23e6c0e4-6dd7-2f72-12dc-abdb9f2f7634:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteException: Non-query expression encountered 
> in illegal context
> 20:40:27.783 [23e6c0e4-6dd7-2f72-12dc-abdb9f2f7634:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 1 to 
> line 1, column 5: Non-query expression encountered in illegal context
> 20:40:27.787 [Client-1] ERROR o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteException: Non-query expression encountered 
> in illegal context
> Error: PARSE ERROR: Non-query expression encountered in illegal context
> SQL Query slect * from cp.`tpch/nation.parquet` limit 1
> ^
> [Error Id: c1bf1800-6b70-420b-b95c-907d11889a6f on vitalii-pc:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local>
> {code}
> Errors from drill code are fine:
> {code}
> 0: jdbc:drill:zk=local> select count(4,5) from cp.`tpch/nation.parquet` limit 
> 1;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while materializing 
> expression. 
> Error in expression at index -1.  Error: Missing function implementation: 
> [count(INT-REQUIRED, INT-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: de0abc5a-cda9-4ac7-b99f-58f0ef0c7a67 on vitalii-pc:31010] 
> (state=,code=0)
> {code}
> In drill Web UI the output is fine for all above cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6913) Excessive error output

2018-12-19 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724941#comment-16724941
 ] 

Bohdan Kazydub commented on DRILL-6913:
---

This issue arises because org.apache.calcite.runtime.CalciteException and 
org.apache.calcite.sql.validate.SqlValidatorException log the exception being 
created. There is a [bug|https://issues.apache.org/jira/browse/CALCITE-2463] 
filed for this issue in Calcite's JIRA, which already has an [open 
PR|https://github.com/apache/calcite/pull/797].

> Excessive error output
> --
>
> Key: DRILL-6913
> URL: https://issues.apache.org/jira/browse/DRILL-6913
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Vitalii Diravka
>Assignee: Bohdan Kazydub
>Priority: Blocker
> Fix For: 1.15.0
>
>
> There are redundant and duplicate error messages in query outputs in case of 
> some mistake in query syntax or in case of any other error from Calcite:
> {code:java}
> 0: jdbc:drill:zk=local> select * from dfs.`tpch/nation.parquet`;
> 19:23:09.335 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.sql.validate.SqlValidatorException: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.336 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to 
> line 1, column 17: Object 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.411 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.sql.validate.SqlValidatorException: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.411 [23e6d301-9bf4-2d4d-69e4-335a3674fd53:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to 
> line 1, column 17: Object 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.432 [Client-1] ERROR o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.sql.validate.SqlValidatorException: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> 19:23:09.432 [Client-1] ERROR o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to 
> line 1, column 17: Object 'tpch/nation.parquet' not found within 'dfs': 
> Object 'tpch/nation.parquet' not found within 'dfs'
> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object 
> 'tpch/nation.parquet' not found within 'dfs'
> [Error Id: e3b6b9f6-1e8c-468f-954f-41b57defcf6a on vitalii-pc:31010] 
> (state=,code=0)
> {code}
> {code:java}
> 0: jdbc:drill:zk=local> slect * from cp.`tpch/nation.parquet` limit 1;
> 20:40:27.783 [23e6c0e4-6dd7-2f72-12dc-abdb9f2f7634:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteException: Non-query expression encountered 
> in illegal context
> 20:40:27.783 [23e6c0e4-6dd7-2f72-12dc-abdb9f2f7634:foreman] ERROR 
> o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteContextException: From line 1, column 1 to 
> line 1, column 5: Non-query expression encountered in illegal context
> 20:40:27.787 [Client-1] ERROR o.a.calcite.runtime.CalciteException - 
> org.apache.calcite.runtime.CalciteException: Non-query expression encountered 
> in illegal context
> Error: PARSE ERROR: Non-query expression encountered in illegal context
> SQL Query slect * from cp.`tpch/nation.parquet` limit 1
> ^
> [Error Id: c1bf1800-6b70-420b-b95c-907d11889a6f on vitalii-pc:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local>
> {code}
> Errors from drill code are fine:
> {code}
> 0: jdbc:drill:zk=local> select count(4,5) from cp.`tpch/nation.parquet` limit 
> 1;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while materializing 
> expression. 
> Error in expression at index -1.  Error: Missing function implementation: 
> [count(INT-REQUIRED, INT-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: de0abc5a-cda9-4ac7-b99f-58f0ef0c7a67 on vitalii-pc:31010] 
> (state=,code=0)
> {code}
> In drill Web UI the output is fine for all above cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (DRILL-6875) Drill doesn't try to update connection for S3 after session expired

2018-12-18 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub resolved DRILL-6875.
---
Resolution: Not A Bug

> Drill doesn't try to update connection for S3 after session expired
> ---
>
> Key: DRILL-6875
> URL: https://issues.apache.org/jira/browse/DRILL-6875
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: drillbit.log, not_a_bug_drillbit.log
>
>
> *Steps to reproduce:*
> - Drill has S3 storage plugin.
> - Open sqlline and run query to S3.
> - Leave sqlline opened for more than 12 hours.
> - In opened sqlline run query to S3.
> *Expected result:*
> Drill should update authorization session and successfully execute query.
> *Actual result:*
> Sqlline returns an error:
> *{color:#d04437}Error: VALIDATION ERROR: Forbidden (Service: Amazon S3; 
> Status Code: 403; Error Code: 403 Forbidden; Request ID: 4A94DD331A035625; S3 
> Extended Request ID: 
> uy94YdRpQ3ZriCz9xbnDi0yinB4O9kGrH7XPAURhjh8WZoxsbawojQA6v7mfvu920yOYbEI5WP8=)
> [Error Id: 4b44a83b-0e47-45a4-92e3-75f94f5a70cb on maprhost:31010] 
> (state=,code=0){color}*
> *Reopening sqlline doesn't help to get S3 access.*
> *Access problem can be solved only by restarting Drill.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6834) Introduce option to disable result set for DDL queries for JDBC connection

2018-12-11 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716432#comment-16716432
 ] 

Bohdan Kazydub commented on DRILL-6834:
---

[~bbevens], looks good.

> Introduce option to disable result set for DDL queries for JDBC connection
> --
>
> Key: DRILL-6834
> URL: https://issues.apache.org/jira/browse/DRILL-6834
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> There are some tools (Unica, DBeaver, Talend) that do not expect to obtain a 
> result set for a CTAS query. As a result, the query gets canceled. Hive, on 
> the other hand, does not return a result set for such queries, and these 
> tools work well.
> To improve Drill's integration with such tools, a session option 
> {{exec.return_result_set_for_ddl}} is introduced. If the option is 
> enabled (set to `true`), Drill's behaviour is unchanged, i.e. a result 
> set is returned for all queries. If the option is disabled (set to 
> `false`), CTAS, CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP 
> FUNCTION, USE schema, SET option, and REFRESH TABLE METADATA queries will 
> not return a result set but an {{updateCount}} instead.
> The option affects JDBC connections only.
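For reference, the session option named above would be toggled with Drill's standard `ALTER SESSION` syntax (shown for illustration; the option name is taken from the issue description):

```sql
-- Suppress result sets for DDL statements on this JDBC session
ALTER SESSION SET `exec.return_result_set_for_ddl` = false;

-- Restore the default behaviour (result set returned for all queries)
ALTER SESSION SET `exec.return_result_set_for_ddl` = true;
```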



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6816) NPE - Concurrent query execution using PreparedStatement

2018-12-10 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716397#comment-16716397
 ] 

Bohdan Kazydub commented on DRILL-6816:
---

[~khfaraaz], my fault: I confused the method that takes a String argument with 
the no-argument one you used. So it works fine. Sorry!

> NPE - Concurrent query execution using PreparedStatement 
> -
>
> Key: DRILL-6816
> URL: https://issues.apache.org/jira/browse/DRILL-6816
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Vitalii Diravka
>Priority: Major
> Attachments: test_tbl.json
>
>
> Concurrent query execution from JDBC program using PreparedStatement results 
> in NPE.
> Queries that were executed concurrently are (part of a query file),
> {noformat}
> select id from `test_tbl.json`
> select count(id) from `test_tbl.json`
> select count(*) from `test_tbl.json`
> select * from `test_tbl.json`
> {noformat}
> Drill 1.14.0
>  git.commit.id=35a1ae23c9b280b9e73cb0f6f01808c996515454
>  MapR version => 6.1.0.20180911143226.GA (secure cluster)
> JDBC driver used was org.apache.drill.jdbc.Driver
> Executing the above queries concurrently using a Statement object results in 
> successful query execution.
> {noformat}
> Statement stmt = conn.createStatement();
> ResultSet rs = stmt.executeQuery(query);
> {noformat}
> However, when the same queries listed above are executed using a 
> PreparedStatement object we see an NPE
> {noformat}
> PreparedStatement prdstmnt = conn.prepareStatement(query);
> prdstmnt.executeUpdate();
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 17:04:32.941 [pool-1-thread-3] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@35757005
> 17:04:32.941 [pool-1-thread-2] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d4413b8
> 17:04:32.956 [pool-1-thread-1] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@5eb3b9ab
> 17:04:32.956 [pool-1-thread-4] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d9367d0
> java.lang.NullPointerException
>  at java.util.Objects.requireNonNull(Objects.java:203)
>  at org.apache.calcite.avatica.Meta$MetaResultSet.create(Meta.java:577)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1143)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1150)
>  at 
> org.apache.calcite.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:511)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeLargeUpdate(AvaticaPreparedStatement.java:146)
>  at 
> org.apache.drill.jdbc.impl.DrillPreparedStatementImpl.executeLargeUpdate(DrillPreparedStatementImpl.java:512)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeUpdate(AvaticaPreparedStatement.java:142)
>  at RunQuery.executeQuery(RunQuery.java:61)
>  at RunQuery.run(RunQuery.java:30)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6816) NPE - Concurrent query execution using PreparedStatement

2018-12-07 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713054#comment-16713054
 ] 

Bohdan Kazydub edited comment on DRILL-6816 at 12/7/18 4:30 PM:


[~khfaraaz] according to the docs 
([Statement#executeQuery(java.lang.String)|https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String)]),
 executeQuery(String) should not be called on a PreparedStatement.


was (Author: kazydubb):
[~khfaraaz] according to docs 
([https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String))]
 executeQuery should not be called on PreparedStatement.

> NPE - Concurrent query execution using PreparedStatement 
> -
>
> Key: DRILL-6816
> URL: https://issues.apache.org/jira/browse/DRILL-6816
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Vitalii Diravka
>Priority: Major
> Attachments: test_tbl.json
>
>
> Concurrent query execution from JDBC program using PreparedStatement results 
> in NPE.
> Queries that were executed concurrently are (part of a query file),
> {noformat}
> select id from `test_tbl.json`
> select count(id) from `test_tbl.json`
> select count(*) from `test_tbl.json`
> select * from `test_tbl.json`
> {noformat}
> Drill 1.14.0
>  git.commit.id=35a1ae23c9b280b9e73cb0f6f01808c996515454
>  MapR version => 6.1.0.20180911143226.GA (secure cluster)
> JDBC driver used was org.apache.drill.jdbc.Driver
> Executing the above queries concurrently using a Statement object results in 
> successful query execution.
> {noformat}
> Statement stmt = conn.createStatement();
> ResultSet rs = stmt.executeQuery(query);
> {noformat}
> However, when the same queries listed above are executed using a 
> PreparedStatement object we see an NPE
> {noformat}
> PreparedStatement prdstmnt = conn.prepareStatement(query);
> prdstmnt.executeUpdate();
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 17:04:32.941 [pool-1-thread-3] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@35757005
> 17:04:32.941 [pool-1-thread-2] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d4413b8
> 17:04:32.956 [pool-1-thread-1] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@5eb3b9ab
> 17:04:32.956 [pool-1-thread-4] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d9367d0
> java.lang.NullPointerException
>  at java.util.Objects.requireNonNull(Objects.java:203)
>  at org.apache.calcite.avatica.Meta$MetaResultSet.create(Meta.java:577)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1143)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1150)
>  at 
> org.apache.calcite.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:511)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeLargeUpdate(AvaticaPreparedStatement.java:146)
>  at 
> org.apache.drill.jdbc.impl.DrillPreparedStatementImpl.executeLargeUpdate(DrillPreparedStatementImpl.java:512)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeUpdate(AvaticaPreparedStatement.java:142)
>  at RunQuery.executeQuery(RunQuery.java:61)
>  at RunQuery.run(RunQuery.java:30)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6816) NPE - Concurrent query execution using PreparedStatement

2018-12-07 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713054#comment-16713054
 ] 

Bohdan Kazydub edited comment on DRILL-6816 at 12/7/18 4:28 PM:


[~khfaraaz] according to docs 
([https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String))]
 executeQuery should not be called on PreparedStatement.


was (Author: kazydubb):
[~khfaraaz] according to docs 
([Statement#executeQuery(java.lang.String)|[https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String)]])
 should not be called on PreparedStatement.

> NPE - Concurrent query execution using PreparedStatement 
> -
>
> Key: DRILL-6816
> URL: https://issues.apache.org/jira/browse/DRILL-6816
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Vitalii Diravka
>Priority: Major
> Attachments: test_tbl.json
>
>
> Concurrent query execution from JDBC program using PreparedStatement results 
> in NPE.
> Queries that were executed concurrently are (part of a query file),
> {noformat}
> select id from `test_tbl.json`
> select count(id) from `test_tbl.json`
> select count(*) from `test_tbl.json`
> select * from `test_tbl.json`
> {noformat}
> Drill 1.14.0
>  git.commit.id=35a1ae23c9b280b9e73cb0f6f01808c996515454
>  MapR version => 6.1.0.20180911143226.GA (secure cluster)
> JDBC driver used was org.apache.drill.jdbc.Driver
> Executing the above queries concurrently using a Statement object results in 
> successful query execution.
> {noformat}
> Statement stmt = conn.createStatement();
> ResultSet rs = stmt.executeQuery(query);
> {noformat}
> However, when the same queries listed above are executed using a 
> PreparedStatement object we see an NPE
> {noformat}
> PreparedStatement prdstmnt = conn.prepareStatement(query);
> prdstmnt.executeUpdate();
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 17:04:32.941 [pool-1-thread-3] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@35757005
> 17:04:32.941 [pool-1-thread-2] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d4413b8
> 17:04:32.956 [pool-1-thread-1] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@5eb3b9ab
> 17:04:32.956 [pool-1-thread-4] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d9367d0
> java.lang.NullPointerException
>  at java.util.Objects.requireNonNull(Objects.java:203)
>  at org.apache.calcite.avatica.Meta$MetaResultSet.create(Meta.java:577)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1143)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1150)
>  at 
> org.apache.calcite.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:511)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeLargeUpdate(AvaticaPreparedStatement.java:146)
>  at 
> org.apache.drill.jdbc.impl.DrillPreparedStatementImpl.executeLargeUpdate(DrillPreparedStatementImpl.java:512)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeUpdate(AvaticaPreparedStatement.java:142)
>  at RunQuery.executeQuery(RunQuery.java:61)
>  at RunQuery.run(RunQuery.java:30)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6816) NPE - Concurrent query execution using PreparedStatement

2018-12-07 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713054#comment-16713054
 ] 

Bohdan Kazydub edited comment on DRILL-6816 at 12/7/18 4:25 PM:


[~khfaraaz] according to docs 
([Statement#executeQuery(java.lang.String)|[https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String)]])
 should not be called on PreparedStatement.


was (Author: kazydubb):
[~khfaraaz] according to docs 
([Statement#executeQuery(java.lang.String)|[https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String)])]
 should not be called on PreparedStatement.

> NPE - Concurrent query execution using PreparedStatement 
> -
>
> Key: DRILL-6816
> URL: https://issues.apache.org/jira/browse/DRILL-6816
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Vitalii Diravka
>Priority: Major
> Attachments: test_tbl.json
>
>
> Concurrent query execution from JDBC program using PreparedStatement results 
> in NPE.
> Queries that were executed concurrently are (part of a query file),
> {noformat}
> select id from `test_tbl.json`
> select count(id) from `test_tbl.json`
> select count(*) from `test_tbl.json`
> select * from `test_tbl.json`
> {noformat}
> Drill 1.14.0
>  git.commit.id=35a1ae23c9b280b9e73cb0f6f01808c996515454
>  MapR version => 6.1.0.20180911143226.GA (secure cluster)
> JDBC driver used was org.apache.drill.jdbc.Driver
> Executing the above queries concurrently using a Statement object results in 
> successful query execution.
> {noformat}
> Statement stmt = conn.createStatement();
> ResultSet rs = stmt.executeQuery(query);
> {noformat}
> However, when the same queries listed above are executed using a 
> PreparedStatement object we see an NPE
> {noformat}
> PreparedStatement prdstmnt = conn.prepareStatement(query);
> prdstmnt.executeUpdate();
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 17:04:32.941 [pool-1-thread-3] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@35757005
> 17:04:32.941 [pool-1-thread-2] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d4413b8
> 17:04:32.956 [pool-1-thread-1] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@5eb3b9ab
> 17:04:32.956 [pool-1-thread-4] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d9367d0
> java.lang.NullPointerException
>  at java.util.Objects.requireNonNull(Objects.java:203)
>  at org.apache.calcite.avatica.Meta$MetaResultSet.create(Meta.java:577)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1143)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1150)
>  at 
> org.apache.calcite.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:511)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeLargeUpdate(AvaticaPreparedStatement.java:146)
>  at 
> org.apache.drill.jdbc.impl.DrillPreparedStatementImpl.executeLargeUpdate(DrillPreparedStatementImpl.java:512)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeUpdate(AvaticaPreparedStatement.java:142)
>  at RunQuery.executeQuery(RunQuery.java:61)
>  at RunQuery.run(RunQuery.java:30)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6816) NPE - Concurrent query execution using PreparedStatement

2018-12-07 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713054#comment-16713054
 ] 

Bohdan Kazydub commented on DRILL-6816:
---

[~khfaraaz] according to docs 
([Statement#executeQuery(java.lang.String)|[https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String)])]
 should not be called on PreparedStatement.

> NPE - Concurrent query execution using PreparedStatement 
> -
>
> Key: DRILL-6816
> URL: https://issues.apache.org/jira/browse/DRILL-6816
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Vitalii Diravka
>Priority: Major
> Attachments: test_tbl.json
>
>
> Concurrent query execution from JDBC program using PreparedStatement results 
> in NPE.
> Queries that were executed concurrently are (part of a query file),
> {noformat}
> select id from `test_tbl.json`
> select count(id) from `test_tbl.json`
> select count(*) from `test_tbl.json`
> select * from `test_tbl.json`
> {noformat}
> Drill 1.14.0
>  git.commit.id=35a1ae23c9b280b9e73cb0f6f01808c996515454
>  MapR version => 6.1.0.20180911143226.GA (secure cluster)
> JDBC driver used was org.apache.drill.jdbc.Driver
> Executing the above queries concurrently using a Statement object results in 
> successful query execution.
> {noformat}
> Statement stmt = conn.createStatement();
> ResultSet rs = stmt.executeQuery(query);
> {noformat}
> However, when the same queries listed above are executed using a 
> PreparedStatement object we see an NPE
> {noformat}
> PreparedStatement prdstmnt = conn.prepareStatement(query);
> prdstmnt.executeUpdate();
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 17:04:32.941 [pool-1-thread-3] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@35757005
> 17:04:32.941 [pool-1-thread-2] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d4413b8
> 17:04:32.956 [pool-1-thread-1] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@5eb3b9ab
> 17:04:32.956 [pool-1-thread-4] DEBUG o.a.d.j.impl.DrillStatementRegistry - 
> Adding to open-statements registry: 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory$DrillJdbc41PreparedStatement@d9367d0
> java.lang.NullPointerException
>  at java.util.Objects.requireNonNull(Objects.java:203)
>  at org.apache.calcite.avatica.Meta$MetaResultSet.create(Meta.java:577)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1143)
>  at org.apache.drill.jdbc.impl.DrillMetaImpl.execute(DrillMetaImpl.java:1150)
>  at 
> org.apache.calcite.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:511)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeLargeUpdate(AvaticaPreparedStatement.java:146)
>  at 
> org.apache.drill.jdbc.impl.DrillPreparedStatementImpl.executeLargeUpdate(DrillPreparedStatementImpl.java:512)
>  at 
> org.apache.calcite.avatica.AvaticaPreparedStatement.executeUpdate(AvaticaPreparedStatement.java:142)
>  at RunQuery.executeQuery(RunQuery.java:61)
>  at RunQuery.run(RunQuery.java:30)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6874) CTAS from json to parquet is not working on S3 storage

2018-12-07 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712497#comment-16712497
 ] 

Bohdan Kazydub commented on DRILL-6874:
---

The issue is due to S3 connections not being released as soon as a column 
finishes processing, so the connection pool has no free connection with which 
to start processing another column asynchronously. To raise the maximum number 
of connections, use the `fs.s3a.connection.maximum` configuration parameter 
(an integer, 15 by default).
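Assuming the s3a client picks up its settings from Drill's `conf/core-site.xml` (where `fs.s3a.*` options such as the access keys are typically configured), raising the pool limit would look like this sketch; the value 100 is an arbitrary example, not a recommendation:

```xml
<property>
  <!-- Maximum simultaneous connections in the s3a connection pool
       (default 15); raise it when many columns are processed concurrently -->
  <name>fs.s3a.connection.maximum</name>
  <value>100</value>
</property>
```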

> CTAS from json to parquet is not working on S3 storage
> --
>
> Key: DRILL-6874
> URL: https://issues.apache.org/jira/browse/DRILL-6874
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: ctasjsontoparquet.zip, drillbit.log, 
> drillbit_queries.json, s3src.json, sqlline.log
>
>
> Json file "s3src.json" was uploaded to the s3 storage.
> Query from Json works fine:
> select * from s3.tmp.`s3src.json`;
> | id  |  first_name  |  last_name  |
> | 1   | first_name1  | last_name1  |
> | 2   | first_name2  | last_name2  |
> | 3   | first_name3  | last_name3  |
> | 4   | first_name4  | last_name4  |
> | 5   | first_name5  | last_name5  |
> 5 rows selected (2.803 seconds)
> CTAS from this json file returns successfully result:
> create table s3.tmp.`ctasjsontoparquet` as select * from s3.tmp.`s3src.json`;
> | Fragment  | Number of records written  |
> | 0_0   | 5  |
> 1 row selected (9.264 seconds)
> *Query from the created parquet table {color:#d04437}throws an error:{color}*
> select * from s3.tmp.`ctasjsontoparquet`;
> {code:java}
> Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
> Message: Failure in setting up reader
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
>   optional int64 id;
>   optional binary first_name (UTF8);
>   optional binary last_name (UTF8);
> }
> , metadata: {drill-writer.version=2, drill.version=1.15.0-SNAPSHOT}}, blocks: 
> [BlockMetaData{5, 360 [ColumnMetaData{UNCOMPRESSED [id] optional int64 id  
> [BIT_PACKED, RLE, PLAIN], 4}, ColumnMetaData{UNCOMPRESSED [first_name] 
> optional binary first_name (UTF8)  [BIT_PACKED, RLE, PLAIN], 111}, 
> ColumnMetaData{UNCOMPRESSED [last_name] optional binary last_name (UTF8)  
> [BIT_PACKED, RLE, PLAIN], 241}]}]}
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: 885723e4-8385-4fb0-87dd-c08b0570db95 on maprhost:31010] 
> (state=,code=0)
> {code}
> The same CTAS query works fine on MapRFS and FileSystem storages.
> Log files, json file and created parquet file from S3 are in the attachments.





[jira] [Commented] (DRILL-6875) Drill doesn't try to update connection for S3 after session expired

2018-12-05 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709853#comment-16709853
 ] 

Bohdan Kazydub commented on DRILL-6875:
---

As a workaround, one may set `fs.s3a.impl.disable.cache` to true in the Drill S3 
plugin configuration or in any of the core-site.xml files. Enabling this option 
ensures that the S3AFileSystem instance is not cached but is created anew, so a 
fresh connection with valid credentials is established.
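For reference, this is the standard Hadoop way to express the workaround in 
core-site.xml (a sketch of the property entry only; the rest of the file is 
unchanged):

{code:xml}
<configuration>
  <!-- do not cache S3AFileSystem instances; create a new one per request -->
  <property>
    <name>fs.s3a.impl.disable.cache</name>
    <value>true</value>
  </property>
</configuration>
{code}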

> Drill doesn't try to update connection for S3 after session expired
> ---
>
> Key: DRILL-6875
> URL: https://issues.apache.org/jira/browse/DRILL-6875
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: drillbit.log
>
>
> *Steps to reproduce:*
> - Drill has S3 storage plugin.
> - Open sqlline and run query to S3.
> - Leave sqlline opened for more than 12 hours.
> - In opened sqlline run query to S3.
> *Expected result:*
> Drill should update authorization session and successfully execute query.
> *Actual result:*
> Sqlline returns an error:
> *{color:#d04437}Error: VALIDATION ERROR: Forbidden (Service: Amazon S3; 
> Status Code: 403; Error Code: 403 Forbidden; Request ID: 4A94DD331A035625; S3 
> Extended Request ID: 
> uy94YdRpQ3ZriCz9xbnDi0yinB4O9kGrH7XPAURhjh8WZoxsbawojQA6v7mfvu920yOYbEI5WP8=)
> [Error Id: 4b44a83b-0e47-45a4-92e3-75f94f5a70cb on maprhost:31010] 
> (state=,code=0){color}*
> *Reopening sqlline doesn't help to get S3 access.*
> *Access problem can be solved only by restarting Drill.*





[jira] [Updated] (DRILL-6869) Drill allows to create views outside workspace

2018-11-28 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6869:
--
Description: 
Parameter 'allowAccessOutsideWorkspace' is false for tested workspaces.

On MapRFS and S3 storages Drill allows creating views outside the workspace.

*Example on MapRFS:*

create view dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugonmfs' *created successfully in 'dfs.tmp' schema*|

1 row selected (0.93 seconds)

The file "testbugonmfs.view.drill" was *created* in the *root "/" folder*, not 
in the workspace folder "/tmp".

A select query works with the root "/" folder {color:#d04437}*outside*{color} the 
dfs.tmp workspace:
 select count(*) from dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs`;
|EXPR$0|
|20|

1 row selected (1.813 seconds)

 

*Example on S3*:

create view s3.tmp.`{color:#d04437}*/*{color}testbugons3` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugons3' *created successfully in 's3.tmp' schema*|

1 row selected (3.455 seconds)

 

The file "testbugons3.view.drill" was *created* in the *root "/" bucket 
folder*, not in the workspace folder "/tmp".

A select query also works with the root "/" bucket folder 
{color:#d04437}*outside*{color} the s3.tmp workspace:
 select count(*) from s3.tmp.`/testbugons3`;
|EXPR$0|
|20|

1 row selected (3.209 seconds)

 

*Expected result:* 

The view should be created within the workspace.

With the FileSystem storage plugin Drill doesn't allow creating views outside 
the workspace.
 The query "create view dfs.tmp.`/testbugonfs` as SELECT * FROM cp.`employee.json` 
LIMIT 20;"
 returns an error: "{color:#d04437}Error: SYSTEM ERROR: FileNotFoundException: 
/testbugonfs.view.drill (Permission denied){color}".

  was:
Parameter 'allowAccessOutsideWorkspace' is false for tested workspaces.

On MaprFS and S3 storages Drill allows to create views outside workspace.

*Example on MapRFS:*

create view dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugonmfs' *created successfully in 'dfs.tmp' schema*|

1 row selected (0.93 seconds)

The file "testbugonmfs.view.drill" was *created* in the *root "/" folder,* but 
not in used workspace "/tmp" folder.

Select query works with root "/" folder {color:#d04437}*outside*{color} the 
dfs.tmp workspace:
 select count * from dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs`;
|EXPR$0|
|20|

1 row selected (1.813 seconds)

 

*Example on S3*:

create view s3.tmp.`{color:#d04437}*/*{color}testbugons3` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugons3' *created successfully in 's3.tmp' schema*|

1 row selected (3.455 seconds)

 

The file "testbugons3.view.drill" was *created* in the *root "/" bucket 
folder*, but not in used workspace "/tmp" folder.

Select query also works with root "/" bucket folder 
{color:#d04437}*outside*{color} the s3.tmp workspace:
 select count * from s3.tmp.`/testbugons3`;
|EXPR$0|
|20|

1 row selected (3.209 seconds)

 

*Expected result:* should be like on FileSystem storage:

On FileSystem storage plugin Drill doesn't allow to create views outside 
workspace.
 Query "create view dfs.tmp.`/testbugonfs` as SELECT * FROM cp.`employee.json` 
LIMIT 20;"
 Returns an error: "{color:#d04437}Error: SYSTEM ERROR: FileNotFoundException: 
/testbugonfs.view.drill (Permission denied){color}".


> Drill allows to create views outside workspace
> --
>
> Key: DRILL-6869
> URL: https://issues.apache.org/jira/browse/DRILL-6869
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.16.0
>
> Attachments: Amazon_S3_FS_stor_plugin.json, 
> FileSystem_stor_plugin.json, MapR_FS_stor_plugin.json
>
>
> Parameter 'allowAccessOutsideWorkspace' is false for tested workspaces.
> On MaprFS and S3 storages Drill allows to create views outside workspace.
> *Example on MapRFS:*
> create view dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs` as SELECT * FROM 
> cp.`employee.json` LIMIT 20;
> |ok|summary|
> |true|View '/testbugonmfs' *created successfully in 'dfs.tmp' schema*|
> 1 row selected (0.93 seconds)
> The file "testbugonmfs.view.drill" was *created* in the *root "/" folder,* 
> but not in used workspace "/tmp" folder.
> Select query works with root "/" folder {color:#d04437}*outside*{color} the 
> dfs.tmp workspace:
>  select count * from dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs`;
> |EXPR$0|
> |20|
> 1 row selected (1.813 seconds)
>  
> *Example on S3*:
> create view s3.tmp.`{color:#d04437}*/*{color}testbugons3` as SELECT * FROM 
> cp.`employee.json` LIMIT 20;
> |ok|summary|

[jira] [Updated] (DRILL-6869) Drill allows to create views outside workspace

2018-11-28 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6869:
--
Description: 
Parameter 'allowAccessOutsideWorkspace' is false for tested workspaces.

On MaprFS and S3 storages Drill allows to create views outside workspace.

*Example on MapRFS:*

create view dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugonmfs' *created successfully in 'dfs.tmp' schema*|

1 row selected (0.93 seconds)

The file "testbugonmfs.view.drill" was *created* in the *root "/" folder,* but 
not in used workspace "/tmp" folder.

Select query works with root "/" folder {color:#d04437}*outside*{color} the 
dfs.tmp workspace:
 select count * from dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs`;
|EXPR$0|
|20|

1 row selected (1.813 seconds)

 

*Example on S3*:

create view s3.tmp.`{color:#d04437}*/*{color}testbugons3` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugons3' *created successfully in 's3.tmp' schema*|

1 row selected (3.455 seconds)

 

The file "testbugons3.view.drill" was *created* in the *root "/" bucket 
folder*, but not in used workspace "/tmp" folder.

Select query also works with root "/" bucket folder 
{color:#d04437}*outside*{color} the s3.tmp workspace:
 select count * from s3.tmp.`/testbugons3`;
|EXPR$0|
|20|

1 row selected (3.209 seconds)

 

*Expected result:* should be like on FileSystem storage:

On FileSystem storage plugin Drill doesn't allow to create views outside 
workspace.
 Query "create view dfs.tmp.`/testbugonfs` as SELECT * FROM cp.`employee.json` 
LIMIT 20;"
 Returns an error: "{color:#d04437}Error: SYSTEM ERROR: FileNotFoundException: 
/testbugonfs.view.drill (Permission denied){color}".

  was:
Parameter 'allowAccessOutsideWorkspace' is false for tested workspaces.

On MaprFS and S3 storages Drill allows to create views outside workspace.

*Example on MapRFS:*

create view dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugonmfs' *created successfully in 'dfs.tmp' schema*|

1 row selected (0.93 seconds)

The file "testbugonmfs.view.drill" was *created* in the *root "/" folder,* but 
not in used workspace "/tmp" folder.

Select query works with root "/" folder {color:#d04437}*outside*{color} the 
dfs.tmp workspace:
 select count * from dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs`;


|EXPR$0|
|20|

1 row selected (1.813 seconds)

 

*Example on S3*:

create view s3.tmp.`{color:#d04437}*/*{color}testbugons3` as SELECT * FROM 
cp.`employee.json` LIMIT 20;
|ok|summary|
|true|View '/testbugons3' *created successfully in 's3.tmp' schema*|

1 row selected (3.455 seconds)

 

The file "testbugons3.view.drill" was *created* in the *root "/" bucket 
folder*, but not in used workspace "/tmp" folder.

Select query also works with root "/" bucket folder 
{color:#d04437}*outside*{color} the s3.tmp workspace:
 select count * from s3.tmp.`/testbugons3`;
|EXPR$0|
|20|

1 row selected (3.209 seconds)

 

*Expected result* should be like on FileSystem storage:

On FileSystem storage plugin Drill doesn't allow to create views outside 
workspace.
 Query "create view dfs.tmp.`/testbugonfs` as SELECT * FROM cp.`employee.json` 
LIMIT 20;"
 Returns an error: "{color:#d04437}Error: SYSTEM ERROR: FileNotFoundException: 
/testbugonfs.view.drill (Permission denied){color}".


> Drill allows to create views outside workspace
> --
>
> Key: DRILL-6869
> URL: https://issues.apache.org/jira/browse/DRILL-6869
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.16.0
>
> Attachments: Amazon_S3_FS_stor_plugin.json, 
> FileSystem_stor_plugin.json, MapR_FS_stor_plugin.json
>
>
> Parameter 'allowAccessOutsideWorkspace' is false for tested workspaces.
> On MaprFS and S3 storages Drill allows to create views outside workspace.
> *Example on MapRFS:*
> create view dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs` as SELECT * FROM 
> cp.`employee.json` LIMIT 20;
> |ok|summary|
> |true|View '/testbugonmfs' *created successfully in 'dfs.tmp' schema*|
> 1 row selected (0.93 seconds)
> The file "testbugonmfs.view.drill" was *created* in the *root "/" folder,* 
> but not in used workspace "/tmp" folder.
> Select query works with root "/" folder {color:#d04437}*outside*{color} the 
> dfs.tmp workspace:
>  select count * from dfs.tmp.`{color:#d04437}*/*{color}testbugonmfs`;
> |EXPR$0|
> |20|
> 1 row selected (1.813 seconds)
>  
> *Example on S3*:
> create view s3.tmp.`{color:#d04437}*/*{color}testbugons3` as SELECT * FROM 
> cp.`employee.json` LIMIT 20;
> |ok|summary|
> 

[jira] [Commented] (DRILL-6863) Drop table is not working if path within workspace starts with '/'

2018-11-27 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16700172#comment-16700172
 ] 

Bohdan Kazydub commented on DRILL-6863:
---

[~denysord88], this happens because when a path within a workspace is prefixed 
with '/', the path is treated as absolute and the workspace is ignored (there 
was a similar issue, DRILL-1130, for SELECT). To be consistent with other 
queries, no error should be returned and both queries should behave the same 
way (i.e. actually drop the table).
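Under the consistent behaviour described above, both of the following forms 
(using the paths from this issue) would resolve against the s3.tmp workspace 
and drop the same table:

{code:sql}
-- relative path: resolved within the workspace today
drop table s3.tmp.`drill/transitive_closure/tab1`;
-- leading '/': should resolve to the same table instead of the root directory
drop table s3.tmp.`/drill/transitive_closure/tab1`;
{code}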

> Drop table is not working if path within workspace starts with '/'
> --
>
> Key: DRILL-6863
> URL: https://issues.apache.org/jira/browse/DRILL-6863
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> Drill works incorrectly if the path to a table within a workspace starts with '/'.
> The request "drop table s3.tmp.`drill/transitive_closure/tab1`" works fine,
>  but if '/' is added at the beginning of the table's path, "drop table 
> s3.tmp.`{color:#d04437}/{color}drill/transitive_closure/tab1`", Drill 
> tries to find the table in the root directory instead of in the workspace path.
> *Actual result:*
>  Drill returns a successful response,
>  "Table [/drill/transitive_closure/tab1] dropped",
>  but the table was not dropped.
>  
> *Expected result:*
> Drill returns an error message in the response.
> The bug can be reproduced on S3 and DFS storages. On the FileSystem storage, 
> Drill correctly returns an error message if a "drop table" query starts with 
> '/' in the table path.





[jira] [Updated] (DRILL-6863) Drop table is not working if path within workspace starts with '/'

2018-11-27 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6863:
--
Affects Version/s: (was: 1.16.0)
   1.15.0

> Drop table is not working if path within workspace starts with '/'
> --
>
> Key: DRILL-6863
> URL: https://issues.apache.org/jira/browse/DRILL-6863
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.15.0
>
>
> Drill works incorrectly if the path to a table within a workspace starts with '/'.
> The request "drop table s3.tmp.`drill/transitive_closure/tab1`" works fine,
>  but if '/' is added at the beginning of the table's path, "drop table 
> s3.tmp.`{color:#d04437}/{color}drill/transitive_closure/tab1`", Drill 
> tries to find the table in the root directory instead of in the workspace path.
> *Actual result:*
>  Drill returns a successful response,
>  "Table [/drill/transitive_closure/tab1] dropped",
>  but the table was not dropped.
>  
> *Expected result:*
> Drill returns an error message in the response.
> The bug can be reproduced on S3 and DFS storages. On the FileSystem storage, 
> Drill correctly returns an error message if a "drop table" query starts with 
> '/' in the table path.





[jira] [Updated] (DRILL-6863) Drop table is not working if path within workspace starts with '/'

2018-11-27 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6863:
--
Affects Version/s: (was: 1.15.0)
   1.16.0

> Drop table is not working if path within workspace starts with '/'
> --
>
> Key: DRILL-6863
> URL: https://issues.apache.org/jira/browse/DRILL-6863
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.15.0
>
>
> Drill works incorrectly if the path to a table within a workspace starts with '/'.
> The request "drop table s3.tmp.`drill/transitive_closure/tab1`" works fine,
>  but if '/' is added at the beginning of the table's path, "drop table 
> s3.tmp.`{color:#d04437}/{color}drill/transitive_closure/tab1`", Drill 
> tries to find the table in the root directory instead of in the workspace path.
> *Actual result:*
>  Drill returns a successful response,
>  "Table [/drill/transitive_closure/tab1] dropped",
>  but the table was not dropped.
>  
> *Expected result:*
> Drill returns an error message in the response.
> The bug can be reproduced on S3 and DFS storages. On the FileSystem storage, 
> Drill correctly returns an error message if a "drop table" query starts with 
> '/' in the table path.





[jira] [Updated] (DRILL-6863) Drop table is not working if path within workspace starts with '/'

2018-11-27 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6863:
--
Fix Version/s: (was: 1.15.0)
   1.16.0

> Drop table is not working if path within workspace starts with '/'
> --
>
> Key: DRILL-6863
> URL: https://issues.apache.org/jira/browse/DRILL-6863
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> Drill works incorrectly if the path to a table within a workspace starts with '/'.
> The request "drop table s3.tmp.`drill/transitive_closure/tab1`" works fine,
>  but if '/' is added at the beginning of the table's path, "drop table 
> s3.tmp.`{color:#d04437}/{color}drill/transitive_closure/tab1`", Drill 
> tries to find the table in the root directory instead of in the workspace path.
> *Actual result:*
>  Drill returns a successful response,
>  "Table [/drill/transitive_closure/tab1] dropped",
>  but the table was not dropped.
>  
> *Expected result:*
> Drill returns an error message in the response.
> The bug can be reproduced on S3 and DFS storages. On the FileSystem storage, 
> Drill correctly returns an error message if a "drop table" query starts with 
> '/' in the table path.





[jira] [Updated] (DRILL-6834) Introduce option to disable result set for CTAS, CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE schema, SET option, REFRESH METADATA TABLE for JD

2018-11-26 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6834:
--
Summary: Introduce option to disable result set for CTAS, CREATE VIEW, 
CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE schema, SET option, 
REFRESH METADATA TABLE for JDBC connection  (was: Introduce option to disable 
result set on CTAS, create view and drop table/view etc. for JDBC connection)

> Introduce option to disable result set for CTAS, CREATE VIEW, CREATE 
> FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE schema, SET option, 
> REFRESH METADATA TABLE for JDBC connection
> ---
>
> Key: DRILL-6834
> URL: https://issues.apache.org/jira/browse/DRILL-6834
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> Some tools (Unica, dBeaver, Talend) do not expect to receive a result set for 
> a CTAS query; as a result, the query gets canceled. Hive, on the other hand, 
> does not return a result set for such queries, and these tools work well with 
> it.
> To improve Drill's integration with such tools a session option 
> {{exec.return_result_set_for_ddl}} is introduced. If the option is enabled 
> (set to `true`), Drill's behaviour is unchanged, i.e. a result set is 
> returned for all queries. If the option is disabled (set to `false`), CTAS, 
> CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE 
> schema, SET option, and REFRESH TABLE METADATA queries will not return a 
> result set but an {{updateCount}} instead.
> The option affects JDBC connections only.
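Using the option name as given in this issue, the behaviour would be toggled 
per session like any other Drill session option:

{code:sql}
-- return update counts instead of result sets for DDL statements
ALTER SESSION SET `exec.return_result_set_for_ddl` = false;
{code}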





[jira] [Updated] (DRILL-6834) Introduce option to disable result set for DDL queries for JDBC connection

2018-11-26 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6834:
--
Summary: Introduce option to disable result set for DDL queries for JDBC 
connection  (was: Introduce option to disable result set for CTAS, CREATE VIEW, 
CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE schema, SET option, 
REFRESH METADATA TABLE for JDBC connection)

> Introduce option to disable result set for DDL queries for JDBC connection
> --
>
> Key: DRILL-6834
> URL: https://issues.apache.org/jira/browse/DRILL-6834
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> Some tools (Unica, dBeaver, Talend) do not expect to receive a result set for 
> a CTAS query; as a result, the query gets canceled. Hive, on the other hand, 
> does not return a result set for such queries, and these tools work well with 
> it.
> To improve Drill's integration with such tools a session option 
> {{exec.return_result_set_for_ddl}} is introduced. If the option is enabled 
> (set to `true`), Drill's behaviour is unchanged, i.e. a result set is 
> returned for all queries. If the option is disabled (set to `false`), CTAS, 
> CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE 
> schema, SET option, and REFRESH TABLE METADATA queries will not return a 
> result set but an {{updateCount}} instead.
> The option affects JDBC connections only.





[jira] [Updated] (DRILL-6834) Introduce option to disable result set on CTAS, create view and drop table/view etc. for JDBC connection

2018-11-26 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6834:
--
Description: 
Some tools (Unica, dBeaver, Talend) do not expect to receive a result set for a 
CTAS query; as a result, the query gets canceled. Hive, on the other hand, does 
not return a result set for such queries, and these tools work well with it.

To improve Drill's integration with such tools a session option 
{{exec.return_result_set_for_ddl}} is introduced. If the option is enabled (set 
to `true`), Drill's behaviour is unchanged, i.e. a result set is returned for 
all queries. If the option is disabled (set to `false`), CTAS, CREATE VIEW, 
CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE schema, SET option, 
and REFRESH TABLE METADATA queries will not return a result set but an 
{{updateCount}} instead.

The option affects JDBC connections only.

  was:
There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain 
result set on CTAS query. As a result the query gets canceled. Hive, on the 
other hand, does not return result set for the query and these tools work well.

To improve Drill's integration with such tools a session option 
{{`drill.exec.fetch_resultset_for_ddl`}} is introduced. If the option is 
enabled (set to `true`) Drill's behaviour will be unchanged, i.e. all result 
set will be returned for all queries. If the option is disabled (set to 
`false`), CTAS, CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP 
FUNCTION, USE schema, SET option, REFRESH METADATA TABLE etc. queries will not 
return result set but {{updateCount}} instead.

The option affects JDBC connections only.


> Introduce option to disable result set on CTAS, create view and drop 
> table/view etc. for JDBC connection
> 
>
> Key: DRILL-6834
> URL: https://issues.apache.org/jira/browse/DRILL-6834
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> Some tools (Unica, dBeaver, Talend) do not expect to receive a result set for 
> a CTAS query; as a result, the query gets canceled. Hive, on the other hand, 
> does not return a result set for such queries, and these tools work well with 
> it.
> To improve Drill's integration with such tools a session option 
> {{exec.return_result_set_for_ddl}} is introduced. If the option is enabled 
> (set to `true`), Drill's behaviour is unchanged, i.e. a result set is 
> returned for all queries. If the option is disabled (set to `false`), CTAS, 
> CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE 
> schema, SET option, and REFRESH TABLE METADATA queries will not return a 
> result set but an {{updateCount}} instead.
> The option affects JDBC connections only.





[jira] [Updated] (DRILL-6834) Introduce option to disable result set on CTAS, create view and drop table/view etc. for JDBC connection

2018-11-21 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6834:
--
Summary: Introduce option to disable result set on CTAS, create view and 
drop table/view etc. for JDBC connection  (was: New option to disable result 
set on CTAS, create view and drop table/view)

> Introduce option to disable result set on CTAS, create view and drop 
> table/view etc. for JDBC connection
> 
>
> Key: DRILL-6834
> URL: https://issues.apache.org/jira/browse/DRILL-6834
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.15.0
>
>
> There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain 
> result set on CTAS query. As a result the query gets canceled. Hive, on the 
> other hand, does not return result set for the query and these tools work 
> well.
> To improve Drill's integration with such tools a session option 
> {{`drill.exec.fetch_resultset_for_ddl`}} is introduced. If the option is 
> enabled (set to `true`) Drill's behaviour will be unchanged, i.e. all result 
> set will be returned for all queries. If the option is disabled (set to 
> `false`), CTAS, CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP 
> FUNCTION, USE schema, SET option, REFRESH METADATA TABLE etc. queries will 
> not return result set but {{updateCount}} instead.
> The option affects JDBC connections only.





[jira] [Updated] (DRILL-6834) New option to disable result set on CTAS, create view and drop table/view

2018-11-21 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6834:
--
Description: 
There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain 
result set on CTAS query. As a result the query gets canceled. Hive, on the 
other hand, does not return result set for the query and these tools work well.

To improve Drill's integration with such tools a session option 
`drill.exec.fetch_resultset_for_ddl` is introduced. If the option is 
enabled (set to `true`) Drill's behaviour will be unchanged, i.e. all result 
set will be returned for all queries. If the option is disabled (set to 
`false`), CTAS, CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP 
FUNCTION, USE schema, SET option, REFRESH METADATA TABLE etc. queries will not 
return result set but \{{updateCount}} instead.

The option affects JDBC connections only.

  was:
There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain 
result set on CTAS query. As a result the query gets canceled. Hive, on the 
other hand, does not return result set for the query and these tools work well.

To improve Drill's integration with such tools a session option 
`exec.fetch_resultset` will be introduced. If the option is enabled (set to 
`true`) Drill's behaviour will be unchanged. If the option is disabled (set to 
`false`), CTAS, create view and drop table/view queries will not return result 
set but show messages instead.


> New option to disable result set on CTAS, create view and drop table/view
> -
>
> Key: DRILL-6834
> URL: https://issues.apache.org/jira/browse/DRILL-6834
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.15.0
>
>
> There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain 
> result set on CTAS query. As a result the query gets canceled. Hive, on the 
> other hand, does not return result set for the query and these tools work 
> well.
> To improve Drill's integration with such tools a session option 
> `drill.exec.fetch_resultset_for_ddl` is introduced. If the option is 
> enabled (set to `true`) Drill's behaviour will be unchanged, i.e. all result 
> set will be returned for all queries. If the option is disabled (set to 
> `false`), CTAS, CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP 
> FUNCTION, USE schema, SET option, REFRESH METADATA TABLE etc. queries will 
> not return result set but \{{updateCount}} instead.
> The option affects JDBC connections only.





[jira] [Updated] (DRILL-6834) New option to disable result set on CTAS, create view and drop table/view

2018-11-21 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6834:
--
Description: 
There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain 
result set on CTAS query. As a result the query gets canceled. Hive, on the 
other hand, does not return result set for the query and these tools work well.

To improve Drill's integration with such tools a session option 
{{`drill.exec.fetch_resultset_for_ddl`}} is introduced. If the option is 
enabled (set to `true`) Drill's behaviour will be unchanged, i.e. all result 
set will be returned for all queries. If the option is disabled (set to 
`false`), CTAS, CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP 
FUNCTION, USE schema, SET option, REFRESH METADATA TABLE etc. queries will not 
return result set but {{updateCount}} instead.

The option affects JDBC connections only.

  was:
There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain a 
result set for a CTAS query; as a result, the query gets canceled. Hive, on the 
other hand, does not return a result set for such queries, and these tools work 
well with it.

To improve Drill's integration with such tools, a session option 
`drill.exec.fetch_resultset_for_ddl` is introduced. If the option is enabled 
(set to `true`), Drill's behaviour is unchanged, i.e. a result set is returned 
for all queries. If the option is disabled (set to `false`), CTAS, CREATE VIEW, 
CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE schema, SET option, 
REFRESH TABLE METADATA etc. queries will not return a result set but 
{{updateCount}} instead.

The option affects JDBC connections only.


> New option to disable result set on CTAS, create view and drop table/view
> -
>
> Key: DRILL-6834
> URL: https://issues.apache.org/jira/browse/DRILL-6834
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.15.0
>
>
> There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain a 
> result set for a CTAS query; as a result, the query gets canceled. Hive, on 
> the other hand, does not return a result set for such queries, and these 
> tools work well with it.
> To improve Drill's integration with such tools, a session option 
> {{`drill.exec.fetch_resultset_for_ddl`}} is introduced. If the option is 
> enabled (set to `true`), Drill's behaviour is unchanged, i.e. a result set is 
> returned for all queries. If the option is disabled (set to `false`), CTAS, 
> CREATE VIEW, CREATE FUNCTION, DROP TABLE, DROP VIEW, DROP FUNCTION, USE 
> schema, SET option, REFRESH TABLE METADATA etc. queries will not return a 
> result set but {{updateCount}} instead.
> The option affects JDBC connections only.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6817) Update to_number function to be consistent with CAST function

2018-11-19 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692760#comment-16692760
 ] 

Bohdan Kazydub commented on DRILL-6817:
---

[~vvysotskyi], yes, it was fixed in the scope of DRILL-6768. I marked this one 
as resolved.

Thank you for pointing this out.

> Update to_number function to be consistent with CAST function
> -
>
> Key: DRILL-6817
> URL: https://issues.apache.org/jira/browse/DRILL-6817
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.15.0
>
>
> When `drill.exec.functions.cast_empty_string_to_null` is enabled, casting an 
> empty string ('') to a numeric type returns NULL. If `to_number` is used to 
> convert an empty string to a number, an UnsupportedOperationException is 
> thrown instead.
> The aim is to make these functions (CASTs and `to_number`) work consistently, 
> as is done for the date/time functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6834) New option to disable result set on CTAS, create view and drop table/view

2018-11-07 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6834:
-

 Summary: New option to disable result set on CTAS, create view and 
drop table/view
 Key: DRILL-6834
 URL: https://issues.apache.org/jira/browse/DRILL-6834
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


There are some tools (Unica, dBeaver, TalenD) that do not expect to obtain a 
result set for a CTAS query; as a result, the query gets canceled. Hive, on the 
other hand, does not return a result set for such queries, and these tools work 
well with it.

To improve Drill's integration with such tools, a session option 
`exec.fetch_resultset` will be introduced. If the option is enabled (set to 
`true`), Drill's behaviour will be unchanged. If the option is disabled (set to 
`false`), CTAS, create view and drop table/view queries will not return a 
result set but a status message instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6768) Improve to_date, to_time and to_timestamp and corresponding cast functions to handle empty string when `drill.exec.functions.cast_empty_string_to_null` option is enable

2018-10-31 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670007#comment-16670007
 ] 

Bohdan Kazydub commented on DRILL-6768:
---

[~vitalii], yes, it is possible.

One way is to introduce a `NullHandling.NULL_IF_NULL_OR_EMPTY_STRING` strategy 
which handles empty string input as if it were NULL.

The other way is to enhance `NullHandling.NULL_IF_NULL` (there is an open issue 
for this) to allow setting null output values from function body methods. While 
there would still be a need to handle empty strings yourself, there could be 
fewer function definitions, because the handling framework would take care of 
these cases.

> Improve to_date, to_time and to_timestamp and corresponding cast functions to 
> handle empty string when `drill.exec.functions.cast_empty_string_to_null` 
> option is enabled
> -
>
> Key: DRILL-6768
> URL: https://issues.apache.org/jira/browse/DRILL-6768
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> When the `drill.exec.functions.cast_empty_string_to_null` option is enabled, 
> the `to_date`, `to_time` and `to_timestamp` functions will return NULL when 
> either a NULL or an empty string value is passed, in accordance with their 
> respective CAST counterparts (this avoids CASE clauses littering a query).
>
> CASTs will be handled in a similar way (uniformly with the numeric types):
>
> ||Value to cast||Now||Will be||
> |NULL|NULL|NULL|
> |'' (empty string)|Error in many cases (except numerical types)|NULL|
> Casting an empty string to NULL (when the option is enabled) will be 
> supported for the DATE, TIME, TIMESTAMP, INTERVAL YEAR, INTERVAL MONTH and 
> INTERVAL DAY types in addition to the numeric types.
>  
> *For documentation*
> TBA
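The table above can be sketched for the date/time case as one guard clause in
front of the normal parse. The helper below is illustrative only (the name
`toDate`, the pattern argument, and the option flag are assumptions, not
Drill's generated code):

```java
// Sketch (assumed semantics): to_date returning NULL for NULL and for ''
// when cast_empty_string_to_null is enabled, instead of failing to parse
// the empty string.
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class EmptyStringToDate {

    static LocalDate toDate(String input, String pattern, boolean castEmptyToNull) {
        if (input == null || (castEmptyToNull && input.isEmpty())) {
            return null; // '' behaves like NULL under the option
        }
        return LocalDate.parse(input, DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        System.out.println(toDate("2018-10-31", "yyyy-MM-dd", true)); // 2018-10-31
        System.out.println(toDate("", "yyyy-MM-dd", true));           // null
    }
}
```

The point of the guard is exactly what the description says: queries no longer
need a CASE expression around every such conversion.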



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6679) Error should be displayed when trying to connect Drill to unsupported version of Hive

2018-10-30 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6679:
--
Fix Version/s: (was: 1.15.0)
   Future

> Error should be displayed when trying to connect Drill to unsupported version 
> of Hive
> -
>
> Key: DRILL-6679
> URL: https://issues.apache.org/jira/browse/DRILL-6679
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: Future
>
>
> For example, there is no backward compatibility between Hive 2.3 and Hive 
> 2.1, but it is possible to connect Drill with a Hive 2.3 client to Hive 2.1; 
> it just won't work correctly. So I suggest that enabling the Hive storage 
> plugin should not be allowed if the Hive version is unsupported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6817) Update to_number function to be consistent with CAST function

2018-10-30 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6817:
-

 Summary: Update to_number function to be consistent with CAST 
function
 Key: DRILL-6817
 URL: https://issues.apache.org/jira/browse/DRILL-6817
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


When `drill.exec.functions.cast_empty_string_to_null` is enabled, casting an 
empty string ('') to a numeric type returns NULL. If `to_number` is used to 
convert an empty string to a number, an UnsupportedOperationException is thrown 
instead.

The aim is to make these functions (CASTs and `to_number`) work consistently, 
as is done for the date/time functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5332) DateVector support uses questionable conversions to a time

2018-10-29 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667275#comment-16667275
 ] 

Bohdan Kazydub commented on DRILL-5332:
---

It should be noted that, while there was a change to read date type values 
using the java.time package (as shown by [~vitalii]), there still are functions 
(generated at least by the ToDateTypeFunctions and SqlToDateTypeFunctions 
templates) that write these values using Joda-Time (org.joda.time).

> DateVector support uses questionable conversions to a time
> --
>
> Key: DRILL-5332
> URL: https://issues.apache.org/jira/browse/DRILL-5332
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.9.0
>Reporter: Paul Rogers
>Priority: Major
>
> The following code in {{DateVector}} is worrisome:
> {code}
> @Override
> public DateTime getObject(int index) {
> org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), 
> org.joda.time.DateTimeZone.UTC);
> date = 
> date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
> return date;
> }
> {code}
> This code takes a date/time value stored in a value vector, converts it to 
> UTC, then strips the time zone and replaces it with local time. The result is 
> a "timestamp" in Java format (milliseconds since the epoch), but not really: 
> it is the time since the epoch as if the epoch had started in the local time 
> zone rather than UTC.
> This is the kind of fun & games that people used to do in Java with the 
> {{Date}} type before the advent of Joda-Time (and the migration of Joda-style 
> APIs into Java 8).
> It is, in short, very bad practice and nearly impossible to get right.
> Further, converting a pure date (since this is a {{DateVector}}) into a 
> date/time is fraught with peril. A date has no corresponding time. 1 AM on 
> Friday in one time zone might be 11 PM on Thursday in another. Converting 
> from dates to times is very difficult.
> If the {{DateVector}} corresponds to a date, then it should be simple date 
> with no implied time zone and no implied relationship to time. If there is to 
> be a mapping of time, it must be to a {{LocalTime}} (in Joda and Java 8) that 
> has no implied time zone.
> Note that this code directly contradicts the statement in [Drill 
> documentation|http://drill.apache.org/docs/date-time-and-timestamp/]: "Drill 
> stores values in Coordinated Universal Time (UTC)." Actually, even the 
> documentation is questionable: what does it mean to store a date in UTC 
> because of the above issues?
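The shift the description complains about can be demonstrated directly. The
snippet below redoes the Joda `withZoneRetainFields` move with java.time
(the method name `retainFieldsInZone` is mine, for illustration): keeping the
wall-clock fields while swapping the zone silently changes the stored epoch
value by the zone offset.

```java
// Demonstration of the pitfall: same calendar fields, different zone
// => different instant, so the "epoch" effectively moves to the local zone.
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class DateShiftPitfall {

    static long retainFieldsInZone(long epochMillis, ZoneId zone) {
        // Interpret the stored value in UTC and strip the zone...
        LocalDateTime fields = Instant.ofEpochMilli(epochMillis)
                .atZone(ZoneOffset.UTC).toLocalDateTime();
        // ...then re-attach a different zone, keeping the wall-clock fields.
        return fields.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        long utcMidnight = 86_400_000L; // 1970-01-02T00:00:00Z
        long shifted = retainFieldsInZone(utcMidnight, ZoneId.of("America/New_York"));
        // New York was UTC-5 in January 1970, so the value moved by 5 hours.
        System.out.println(shifted - utcMidnight); // 18000000
    }
}
```

This is why round-tripping a pure date through such a conversion can land on
the previous or next calendar day depending on the reader's time zone.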



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6815) Improve code generation to handle functions with NullHandling.NULL_IF_NULL better

2018-10-29 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6815:
--
Description: 
If a (simple) function is declared with the NULL_IF_NULL null handling strategy 
(`nulls = NullHandling.NULL_IF_NULL`), additional code is generated which 
checks whether any of the inputs is NULL (not set). If one is, the output is 
set to null; otherwise the function's code is executed and, at the end, the 
output value is marked as set if ANY of the inputs is OPTIONAL (see 
https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).

The problem is that this behavior makes it impossible to set the output value 
to NULL from within the function's evaluation body, which may prove useful in 
certain situations, e.g. when the input is an empty string and the output 
should be NULL in that case. Sometimes it results in the creation of two 
separate functions with NullHandling.INTERNAL (one for OPTIONAL and one for 
REQUIRED inputs) instead of one with NULL_IF_NULL. It does not follow the 
Principle of Least Astonishment, as effectively the strategy behaves more like 
"null if and only if null", while the documentation for NULL_IF_NULL is as 
follows:
{code}
enum NullHandling {
    ...

    /**
     * Null output if any null input:
     * Indicates that a method's associated logical operation returns NULL if
     * either input is NULL, and therefore that the method must not be called
     * with null inputs.  (The calling framework must handle NULLs.)
     */
    NULL_IF_NULL
}
{code}
It looks as if this behavior was not intended.

The intent of this improvement is to allow outputting NULL values from a 
function's eval() method when the NULL_IF_NULL null handling strategy is 
chosen.
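The generated pattern described above can be sketched in a few lines. This is a
simplification, not the actual code emitted by DrillSimpleFuncHolder; the
wrapper name and use of boxed Integers are illustrative assumptions:

```java
// Sketch of NULL_IF_NULL code generation: the framework short-circuits to
// NULL on any unset input, so eval() only ever runs on non-null values and
// has no way to turn the output back into NULL -- the limitation raised above.
import java.util.function.BinaryOperator;

public class NullIfNullSketch {

    static Integer applyNullIfNull(Integer left, Integer right,
                                   BinaryOperator<Integer> eval) {
        if (left == null || right == null) {
            return null; // generated null check: eval() is never called
        }
        // The output is then unconditionally marked "set"; eval() itself
        // cannot produce NULL under this strategy.
        return eval.apply(left, right);
    }

    public static void main(String[] args) {
        System.out.println(applyNullIfNull(2, 3, Integer::sum));    // 5
        System.out.println(applyNullIfNull(null, 3, Integer::sum)); // null
    }
}
```

The proposed improvement amounts to letting the body behind `eval` mark the
output as not set, instead of the wrapper forcing it to "set" after the call.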

  was:
If a (simple) function is declared with the NULL_IF_NULL null handling strategy 
(`nulls = NullHandling.NULL_IF_NULL`), additional code is generated which 
checks whether any of the inputs is NULL (not set). If one is, the output is 
set to null; otherwise the function's code is executed and, at the end, the 
output value is marked as set if ANY of the inputs is OPTIONAL (see 
[https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143]).

The problem is that this behavior makes it impossible to set the output value 
to NULL from within the [function's evaluation 
body|https://github.com/apache/drill/blob/7b0c9034753a8c5035fd1c0f1f84a37b376e6748/exec/java-exec/src/main/java/org/apache/drill/exec/expr/DrillSimpleFunc.java#L22], 
which may prove useful in certain situations, e.g. when the input is an empty 
string and the output should be NULL in that case. Sometimes it results in two 
separate functions instead of one with NULL_IF_NULL. It does not follow the 
[Principle of Least 
Astonishment|https://en.wikipedia.org/wiki/Principle_of_least_astonishment], as 
effectively the strategy behaves more like "null if and only if null", and the 
documentation for NULL_IF_NULL is currently


> Improve code generation to handle functions with NullHandling.NULL_IF_NULL 
> better
> -
>
> Key: DRILL-6815
> URL: https://issues.apache.org/jira/browse/DRILL-6815
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Priority: Major
>
> If a (simple) function is declared with the NULL_IF_NULL null handling 
> strategy (`nulls = NullHandling.NULL_IF_NULL`), additional code is generated 
> which checks whether any of the inputs is NULL (not set). If one is, the 
> output is set to null; otherwise the function's code is executed and, at the 
> end, the output value is marked as set if ANY of the inputs is OPTIONAL (see 
> https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).
> The problem is that this behavior makes it impossible to set the output value 
> to NULL from within the function's evaluation body, which may prove useful in 
> certain situations, e.g. when the input is an empty string and the output 
> should be NULL in that case. Sometimes it results in the creation of two 
> separate functions with NullHandling.INTERNAL (one for OPTIONAL and one for 
> REQUIRED inputs) instead of one with NULL_IF_NULL. It does not follow the 
> Principle of Least Astonishment, as effectively the strategy behaves more 
> like "null if and only if null", while the documentation for NULL_IF_NULL is 
> as follows:
> {code}
> enum NullHandling {
>     ...
>     /**
>  * Null output if any null input:
>  * Indicates that a method's associated logical operation returns NULL if
>  * either input is NULL, and therefore that the method 

[jira] [Updated] (DRILL-6815) Improve code generation to handle functions with NullHandling.NULL_IF_NULL better

2018-10-29 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6815:
--
Priority: Minor  (was: Major)

> Improve code generation to handle functions with NullHandling.NULL_IF_NULL 
> better
> -
>
> Key: DRILL-6815
> URL: https://issues.apache.org/jira/browse/DRILL-6815
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Priority: Minor
>
> If a (simple) function is declared with the NULL_IF_NULL null handling 
> strategy (`nulls = NullHandling.NULL_IF_NULL`), additional code is generated 
> which checks whether any of the inputs is NULL (not set). If one is, the 
> output is set to null; otherwise the function's code is executed and, at the 
> end, the output value is marked as set if ANY of the inputs is OPTIONAL (see 
> https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).
> The problem is that this behavior makes it impossible to set the output value 
> to NULL from within the function's evaluation body, which may prove useful in 
> certain situations, e.g. when the input is an empty string and the output 
> should be NULL in that case. Sometimes it results in the creation of two 
> separate functions with NullHandling.INTERNAL (one for OPTIONAL and one for 
> REQUIRED inputs) instead of one with NULL_IF_NULL. It does not follow the 
> Principle of Least Astonishment, as effectively the strategy behaves more 
> like "null if and only if null", while the documentation for NULL_IF_NULL is 
> as follows:
> {code}
> enum NullHandling {
>     ...
>     /**
>  * Null output if any null input:
>  * Indicates that a method's associated logical operation returns NULL if
>  * either input is NULL, and therefore that the method must not be called
>  * with null inputs.  (The calling framework must handle NULLs.)
>  */
>     NULL_IF_NULL
> }
> {code}
> It looks as if this behavior was not intended.
> Intent of this improvement is to allow output NULL values based on function's 
> eval() method when NULL_IF_NULL null handling strategy is chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6815) Improve code generation to handle functions with NullHandling.NULL_IF_NULL better

2018-10-29 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6815:
-

 Summary: Improve code generation to handle functions with 
NullHandling.NULL_IF_NULL better
 Key: DRILL-6815
 URL: https://issues.apache.org/jira/browse/DRILL-6815
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Bohdan Kazydub


If a (simple) function is declared with the NULL_IF_NULL null handling strategy 
(`nulls = NullHandling.NULL_IF_NULL`), additional code is generated which 
checks whether any of the inputs is NULL (not set). If one is, the output is 
set to null; otherwise the function's code is executed and, at the end, the 
output value is marked as set if ANY of the inputs is OPTIONAL (see 
[https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143]).

The problem is that this behavior makes it impossible to set the output value 
to NULL from within the [function's evaluation 
body|https://github.com/apache/drill/blob/7b0c9034753a8c5035fd1c0f1f84a37b376e6748/exec/java-exec/src/main/java/org/apache/drill/exec/expr/DrillSimpleFunc.java#L22], 
which may prove useful in certain situations, e.g. when the input is an empty 
string and the output should be NULL in that case. Sometimes it results in two 
separate functions instead of one with NULL_IF_NULL. It does not follow the 
[Principle of Least 
Astonishment|https://en.wikipedia.org/wiki/Principle_of_least_astonishment], as 
effectively the strategy behaves more like "null if and only if null", and the 
documentation for NULL_IF_NULL is currently



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (DRILL-6771) Queries on Hive 2.3.x fails with SYSTEM ERROR: ArrayIndexOutOfBoundsException

2018-10-29 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub resolved DRILL-6771.
---
Resolution: Fixed

> Queries on Hive 2.3.x fails with SYSTEM ERROR: ArrayIndexOutOfBoundsException
> -
>
> Key: DRILL-6771
> URL: https://issues.apache.org/jira/browse/DRILL-6771
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - Hive
>Affects Versions: 1.15.0
> Environment: Hive 2.3.3
> MapR 6.1.0
>Reporter: Abhishek Girish
>Assignee: Bohdan Kazydub
>Priority: Critical
> Fix For: 1.15.0
>
>
> Query: Functional/partition_pruning/hive/general/plan/orc1.q
> {code}
> select * from hive.orc_create_people_dp where state = 'Ca'
> java.sql.SQLException: SYSTEM ERROR: ArrayIndexOutOfBoundsException: 6
> {code}
> Stack Trace:
> {code}
>   (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
> during fragment initialization: Error while applying rule Prel.ScanPrule, 
> args [rel#2103503:DrillScanRel.LOGICAL.ANY([]).[](table=[hive, 
> orc_create_people_dp],groupscan=HiveScan [table=Table(dbName:default, 
> tableName:orc_create_people_dp), columns=[`id`, `first_name`, `last_name`, 
> `address`, `state`, `**`], numPartitions=1, partitions= 
> [Partition(values:[Ca])], 
> inputDirectories=[maprfs:/drill/testdata/hive_storage/orc_create_people_dp/state=Ca],
>  confProperties={}])]
> org.apache.drill.exec.work.foreman.Foreman.run():300
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.lang.RuntimeException) Error while applying rule 
> Prel.ScanPrule, args 
> [rel#2103503:DrillScanRel.LOGICAL.ANY([]).[](table=[hive, 
> orc_create_people_dp],groupscan=HiveScan [table=Table(dbName:default, 
> tableName:orc_create_people_dp), columns=[`id`, `first_name`, `last_name`, 
> `address`, `state`, `**`], numPartitions=1, partitions= 
> [Partition(values:[Ca])], 
> inputDirectories=[maprfs:/drill/testdata/hive_storage/orc_create_people_dp/state=Ca],
>  confProperties={}])]
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():236
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():648
> org.apache.calcite.tools.Programs$RuleSetProgram.run():339
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():425
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():455
> org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan():68
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
> org.apache.drill.exec.work.foreman.Foreman.runSQL():584
> org.apache.drill.exec.work.foreman.Foreman.run():272
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (org.apache.drill.common.exceptions.DrillRuntimeException) Failed 
> to get InputSplits
> org.apache.drill.exec.store.hive.HiveMetadataProvider.getInputSplits():182
> org.apache.drill.exec.store.hive.HiveScan.getInputSplits():288
> org.apache.drill.exec.store.hive.HiveScan.getMaxParallelizationWidth():197
> org.apache.drill.exec.planner.physical.ScanPrule.onMatch():42
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():212
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():648
> org.apache.calcite.tools.Programs$RuleSetProgram.run():339
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():425
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():455
> org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan():68
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
> org.apache.drill.exec.work.foreman.Foreman.runSQL():584
> org.apache.drill.exec.work.foreman.Foreman.run():272
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.lang.RuntimeException) ORC split generation failed with 
> exception: java.lang.ArrayIndexOutOfBoundsException: 6
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo():1579
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits():1665
> 
> org.apache.drill.exec.store.hive.HiveMetadataProvider.lambda$splitInputWithUGI$2():258
> java.security.AccessController.doPrivileged():-2
> 

[jira] [Commented] (DRILL-6771) Queries on Hive 2.3.x fails with SYSTEM ERROR: ArrayIndexOutOfBoundsException

2018-10-29 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667066#comment-16667066
 ] 

Bohdan Kazydub commented on DRILL-6771:
---

Fixed in commit 0a3cfde. 

> Queries on Hive 2.3.x fails with SYSTEM ERROR: ArrayIndexOutOfBoundsException
> -
>
> Key: DRILL-6771
> URL: https://issues.apache.org/jira/browse/DRILL-6771
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - Hive
>Affects Versions: 1.15.0
> Environment: Hive 2.3.3
> MapR 6.1.0
>Reporter: Abhishek Girish
>Assignee: Bohdan Kazydub
>Priority: Critical
> Fix For: 1.15.0
>
>
> Query: Functional/partition_pruning/hive/general/plan/orc1.q
> {code}
> select * from hive.orc_create_people_dp where state = 'Ca'
> java.sql.SQLException: SYSTEM ERROR: ArrayIndexOutOfBoundsException: 6
> {code}
> Stack Trace:
> {code}
>   (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
> during fragment initialization: Error while applying rule Prel.ScanPrule, 
> args [rel#2103503:DrillScanRel.LOGICAL.ANY([]).[](table=[hive, 
> orc_create_people_dp],groupscan=HiveScan [table=Table(dbName:default, 
> tableName:orc_create_people_dp), columns=[`id`, `first_name`, `last_name`, 
> `address`, `state`, `**`], numPartitions=1, partitions= 
> [Partition(values:[Ca])], 
> inputDirectories=[maprfs:/drill/testdata/hive_storage/orc_create_people_dp/state=Ca],
>  confProperties={}])]
> org.apache.drill.exec.work.foreman.Foreman.run():300
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.lang.RuntimeException) Error while applying rule 
> Prel.ScanPrule, args 
> [rel#2103503:DrillScanRel.LOGICAL.ANY([]).[](table=[hive, 
> orc_create_people_dp],groupscan=HiveScan [table=Table(dbName:default, 
> tableName:orc_create_people_dp), columns=[`id`, `first_name`, `last_name`, 
> `address`, `state`, `**`], numPartitions=1, partitions= 
> [Partition(values:[Ca])], 
> inputDirectories=[maprfs:/drill/testdata/hive_storage/orc_create_people_dp/state=Ca],
>  confProperties={}])]
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():236
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():648
> org.apache.calcite.tools.Programs$RuleSetProgram.run():339
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():425
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():455
> org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan():68
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
> org.apache.drill.exec.work.foreman.Foreman.runSQL():584
> org.apache.drill.exec.work.foreman.Foreman.run():272
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (org.apache.drill.common.exceptions.DrillRuntimeException) Failed 
> to get InputSplits
> org.apache.drill.exec.store.hive.HiveMetadataProvider.getInputSplits():182
> org.apache.drill.exec.store.hive.HiveScan.getInputSplits():288
> org.apache.drill.exec.store.hive.HiveScan.getMaxParallelizationWidth():197
> org.apache.drill.exec.planner.physical.ScanPrule.onMatch():42
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():212
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():648
> org.apache.calcite.tools.Programs$RuleSetProgram.run():339
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():425
> 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel():455
> org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan():68
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():145
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():83
> org.apache.drill.exec.work.foreman.Foreman.runSQL():584
> org.apache.drill.exec.work.foreman.Foreman.run():272
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.lang.RuntimeException) ORC split generation failed with 
> exception: java.lang.ArrayIndexOutOfBoundsException: 6
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo():1579
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits():1665
> 
> org.apache.drill.exec.store.hive.HiveMetadataProvider.lambda$splitInputWithUGI$2():258
> 

[jira] [Commented] (DRILL-6810) Disable NULL_IF_NULL NullHandling for functions with ComplexWriter

2018-10-24 Thread Bohdan Kazydub (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662465#comment-16662465
 ] 

Bohdan Kazydub commented on DRILL-6810:
---

[~Paul.Rogers], actually by 'Disable' I meant 'fail to load the function' (as 
is currently done for aggregate functions). Should I rephrase the description 
text to indicate this better?

Handling for these types is hard because the root may be either a list or a 
map.

> Disable NULL_IF_NULL NullHandling for functions with ComplexWriter
> --
>
> Key: DRILL-6810
> URL: https://issues.apache.org/jira/browse/DRILL-6810
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently NullHandling.NULL_IF_NULL is allowed for UDFs with an @Output of 
> type org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter, 
> but no null handling is performed for this kind of function, which leads to 
> confusion. The problem is that ComplexWriter holds list/map values, and Drill 
> does not yet support NULL values for these types (there is an issue to allow 
> null maps/lists: [DRILL-4824|https://issues.apache.org/jira/browse/DRILL-4824]).
> For such functions, support for NULL_IF_NULL will be disabled, as is done for 
> aggregate functions, and NullHandling.INTERNAL should be used instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6810) Disable NULL_IF_NULL NullHandling for functions with ComplexWriter

2018-10-23 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6810:
-

 Summary: Disable NULL_IF_NULL NullHandling for functions with 
ComplexWriter
 Key: DRILL-6810
 URL: https://issues.apache.org/jira/browse/DRILL-6810
 Project: Apache Drill
  Issue Type: Bug
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub


Currently NullHandling.NULL_IF_NULL is allowed for UDFs with an @Output of type 
org.apache.drill.exec.vector.complex.writer.BaseWriter.ComplexWriter, but no 
null handling is performed for this kind of function, which leads to confusion. 
The problem is that ComplexWriter holds list/map values, and Drill does not yet 
support NULL values for these types (there is an issue to allow null maps/lists: 
[DRILL-4824|https://issues.apache.org/jira/browse/DRILL-4824]).
For such functions, support for NULL_IF_NULL will be disabled, as is done for 
aggregate functions, and NullHandling.INTERNAL should be used instead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

