[jira] [Updated] (DRILL-6254) IllegalArgumentException: the requested size must be non-negative

2018-03-21 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6254:
-
Reviewer: Karthikeyan Manivannan

> IllegalArgumentException: the requested size must be non-negative
> -
>
> Key: DRILL-6254
> URL: https://issues.apache.org/jira/browse/DRILL-6254
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.13.0
>Reporter: Khurram Faraaz
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: genAllTypesJSN.py
>
>
> A flatten query fails with IllegalArgumentException: the requested size must 
> be non-negative.
> The script that generates the JSON data file is attached.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_all_types_jsn_to_parquet AS 
> . . . . . . . . . . . . . . > SELECT
> . . . . . . . . . . . . . . > CAST( col_int AS INT) col_int, 
> . . . . . . . . . . . . . . > CAST( col_bigint AS BIGINT) col_bigint, 
> . . . . . . . . . . . . . . > CAST( col_char AS CHAR(10)) col_char, 
> . . . . . . . . . . . . . . > CAST( col_fxdln_str AS VARCHAR(256)) 
> col_fxdln_str, 
> . . . . . . . . . . . . . . > CAST( col_varln_str AS VARCHAR(256)) 
> col_varln_str, 
> . . . . . . . . . . . . . . > CAST( col_float AS FLOAT) col_float, 
> . . . . . . . . . . . . . . > CAST( col_double AS DOUBLE PRECISION) 
> col_double, 
> . . . . . . . . . . . . . . > CAST( col_date AS DATE) col_date, 
> . . . . . . . . . . . . . . > CAST( col_time AS TIME) col_time, 
> . . . . . . . . . . . . . . > CAST( col_tmstmp AS TIMESTAMP) col_tmstmp, 
> . . . . . . . . . . . . . . > CAST( col_boolean AS BOOLEAN) col_boolean, 
> . . . . . . . . . . . . . . > col_binary, 
> . . . . . . . . . . . . . . > array_of_ints from `all_supported_types.json`;
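> +----------+---------------------------+
> | Fragment | Number of records written |
> +----------+---------------------------+
> | 0_0      | 9                         |
> +----------+---------------------------+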
> 1 row selected (0.29 seconds)
> {noformat}
> Reset all options and set planner.slice_target to 1:
> alter system reset all;
> alter system set `planner.slice_target`=1;
> output_batch_size was left at its default value:
> drill.exec.memory.operator.output_batch_size = 16777216
>  
> {noformat}
> select *, flatten(array_of_ints) from tbl_all_types_jsn_to_parquet;
> Error: SYSTEM ERROR: IllegalArgumentException: the requested size must be 
> non-negative
> Fragment 0:0
> [Error Id: 480bae96-ae89-45a7-b937-011c0f87c14d on qa102-45.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp>
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-03-15 12:19:43,916 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa: select *, flatten(array_of_ints) from 
> tbl_all_types_jsn_to_parquet
> 2018-03-15 12:19:43,952 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
> numFiles: 1
> 2018-03-15 12:19:43,953 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
> numFiles: 1
> 2018-03-15 12:19:43,966 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Time: 2ms total, 2.927366ms avg, 2ms max.
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Earliest start: 2.829000 μs, Latest start: 2.829000 μs, 
> Average start: 2.829000 μs .
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Took 3 ms to read file metadata
> 2018-03-15 12:19:44,000 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:frag:0:0] INFO 
> o.a.d.e.w.fragment.FragmentExecutor - 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa:0:0: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2018-03-15 12:19:44,000 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:frag:0:0] INFO 
> o.a.d.e.w.f.FragmentStatusReporter - 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa:0:0: State to report: RUNNING
> 2018-03-15 12:19:44,905 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:frag:0:0] INFO 
> o.a.d.e.w.fragment.FragmentExecutor - 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa:0:0: State change requested RUNNING --> 
> FAILED
> 2018-03-15 12:19:44,927 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:frag:0:0] INFO 
> o.a.d.e.w.fragment.FragmentExecutor - 
> 

[jira] [Closed] (DRILL-6059) Apply needed StoragePlugins's RuleSet to the planner

2018-03-21 Thread weijie.tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weijie.tong closed DRILL-6059.
--
   Resolution: Duplicate
Fix Version/s: (was: 1.14.0)
   1.12.0

> Apply needed StoragePlugins's RuleSet to the planner
> 
>
> Key: DRILL-6059
> URL: https://issues.apache.org/jira/browse/DRILL-6059
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Minor
> Fix For: 1.12.0
>
>
> Currently, once Drill is configured with more than one storage plugin, it applies 
> the rules of every plugin to a user's query, even when the query does not reference 
> the corresponding storage plugin. The cause is the following method of QueryContext:
> {code:java}
>   public StoragePluginRegistry getStorage() {
>     return drillbitContext.getStorage();
>   }
> {code}
>  
> Judging by QueryContext's name, the method should return the registry of the 
> storage plugins involved in the query, not all configured storage plugins. 
> So we need to identify the involved storage plugins at the parse stage and 
> set the collected plugins on the QueryContext. This will 
> also benefit the work on schema-level security control. A new 
> method named getInvolvedStorage may be more accurate.
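
A minimal sketch of the proposal above, assuming each plugin contributes a list of planner rules and the parse stage yields the set of plugin names a query references. `InvolvedStorageSketch`, `register`, and `rulesFor` are hypothetical names used only for illustration; they are not part of Drill's QueryContext or StoragePluginRegistry, and plugins and rules are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a per-query view of the registry; not Drill code.
class InvolvedStorageSketch {
  // Every configured plugin, mapped to the planner rules it contributes.
  private final Map<String, List<String>> rulesByPlugin = new LinkedHashMap<>();

  void register(String pluginName, List<String> rules) {
    rulesByPlugin.put(pluginName, new ArrayList<>(rules));
  }

  // Returns only the rules of plugins that the query actually references.
  List<String> rulesFor(Set<String> involvedPlugins) {
    List<String> rules = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : rulesByPlugin.entrySet()) {
      if (involvedPlugins.contains(entry.getKey())) {
        rules.addAll(entry.getValue());
      }
    }
    return rules;
  }
}
```

With plugins "dfs" and "hbase" registered, a query that references only "dfs" would receive only the "dfs" rules, which is the behaviour the issue asks for.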



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6283) WebServer stores SPNEGO client principal without taking any conversion rule

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408934#comment-16408934
 ] 

ASF GitHub Bot commented on DRILL-6283:
---

Github user sohami commented on the issue:

https://github.com/apache/drill/pull/1180
  
@arina-ielchiieva - Please help to review this PR


> WebServer stores SPNEGO client principal without taking any conversion rule
> ---
>
> Key: DRILL-6283
> URL: https://issues.apache.org/jira/browse/DRILL-6283
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.14.0
>
>
> Drill's WebServer uses the exact client principal (us...@qa.lab) as the 
> stored username; it does not provide any configuration to specify rules that 
> can be used to extract the desired username from the client's principal.
> For example, the default rule provided by HadoopKerberosName extracts only the 
> primary part (user1) of the client principal. 
> Also, while checking whether the authenticated client principal has admin 
> privileges, it uses the realm (e.g. QA.LAB) information to verify against the configured 
> admin user/group list. To be consistent with the JDBC/ODBC Kerberos path, it 
> should use the shortName of the client principal to determine admin privileges.
> In short, the server side should store the shortName extracted from the client 
> principal based on the configured rule and use it to determine admin 
> privileges too.
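
A rough illustration of what "extract only the primary part" means, assuming the usual Kerberos principal layout `primary[/instance][@REALM]`. `PrincipalShortNameSketch` is a hypothetical helper that uses plain string handling; the actual fix may instead delegate to Hadoop's configurable name rules, which this sketch does not model.

```java
// Hypothetical helper for illustration only; not Drill or Hadoop code.
class PrincipalShortNameSketch {
  // Returns the primary component of "primary[/instance][@REALM]".
  static String shortName(String principal) {
    int end = principal.length();
    int at = principal.indexOf('@');
    if (at >= 0) {
      end = at;            // drop the realm
    }
    int slash = principal.indexOf('/');
    if (slash >= 0 && slash < end) {
      end = slash;         // drop the instance component
    }
    return principal.substring(0, end);
  }
}
```

Under this assumption, a principal such as `user1@QA.LAB` maps to `user1`, and the admin check would then compare `user1` rather than the full principal against the admin user list.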





[jira] [Updated] (DRILL-6283) WebServer stores SPNEGO client principal without taking any conversion rule

2018-03-21 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6283:
-
Reviewer: Arina Ielchiieva

> WebServer stores SPNEGO client principal without taking any conversion rule
> ---
>
> Key: DRILL-6283
> URL: https://issues.apache.org/jira/browse/DRILL-6283
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.14.0
>
>
> Drill's WebServer uses the exact client principal (us...@qa.lab) as the 
> stored username; it does not provide any configuration to specify rules that 
> can be used to extract the desired username from the client's principal.
> For example, the default rule provided by HadoopKerberosName extracts only the 
> primary part (user1) of the client principal. 
> Also, while checking whether the authenticated client principal has admin 
> privileges, it uses the realm (e.g. QA.LAB) information to verify against the configured 
> admin user/group list. To be consistent with the JDBC/ODBC Kerberos path, it 
> should use the shortName of the client principal to determine admin privileges.
> In short, the server side should store the shortName extracted from the client 
> principal based on the configured rule and use it to determine admin 
> privileges too.





[jira] [Commented] (DRILL-6053) Avoid excessive locking in LocalPersistentStore

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408912#comment-16408912
 ] 

ASF GitHub Bot commented on DRILL-6053:
---

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1163
  
@arina-ielchiieva Addressed review comments. Please keep both commits (do 
not squash) during the merge.


> Avoid excessive locking in LocalPersistentStore
> ---
>
> Key: DRILL-6053
> URL: https://issues.apache.org/jira/browse/DRILL-6053
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
>
> When query profiles are written to LocalPersistentStore, the write is 
> unnecessarily serialized due to the read/write lock that was introduced for the 
> versioned PersistentStore. Only versioned access needs to be protected by the 
> read/write lock.
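
A small sketch of the locking split described above, under the assumption that unversioned and versioned keys do not overlap. `SelectiveLockStoreSketch` is a hypothetical in-memory class, not Drill's LocalPersistentStore (which is file-backed); it only shows unversioned writes bypassing the read/write lock while versioned access still takes it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical in-memory store for illustration; not Drill code.
class SelectiveLockStoreSketch {
  private final Map<String, String> values = new ConcurrentHashMap<>();
  private final Map<String, Integer> versions = new ConcurrentHashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  // Unversioned write: the concurrent map is enough, so no lock is taken
  // and concurrent writers are not serialized.
  void put(String key, String value) {
    values.put(key, value);
  }

  String get(String key) {
    return values.get(key);
  }

  // Versioned write: value and version must change together, so it locks.
  boolean putIfVersion(String key, String value, int expectedVersion) {
    lock.writeLock().lock();
    try {
      int current = versions.getOrDefault(key, 0);
      if (current != expectedVersion) {
        return false;
      }
      values.put(key, value);
      versions.put(key, current + 1);
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Versioned read: shared lock, so it never sees a half-applied versioned write.
  int version(String key) {
    lock.readLock().lock();
    try {
      return versions.getOrDefault(key, 0);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Query-profile writes would correspond to the lock-free `put` path in this sketch.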





[jira] [Created] (DRILL-6286) Regression: incorrect reference to shutdown in drillbit.log

2018-03-21 Thread Vlad Rozov (JIRA)
Vlad Rozov created DRILL-6286:
-

 Summary: Regression: incorrect reference to shutdown in 
drillbit.log
 Key: DRILL-6286
 URL: https://issues.apache.org/jira/browse/DRILL-6286
 Project: Apache Drill
  Issue Type: Bug
Reporter: Vlad Rozov
Assignee: Timothy Farkas


drillbit.log refers to shutdown even in cases when no shutdown sequence was 
initiated:
{noformat}
2018-03-16 11:55:52,693 [drill-executor-19] INFO  
o.apache.drill.exec.work.WorkManager - Waiting for 0 queries to complete before 
shutting down
2018-03-16 11:55:52,693 [drill-executor-19] INFO  
o.apache.drill.exec.work.WorkManager - Waiting for 3 running fragments to 
complete before shutting down
{noformat}





[jira] [Commented] (DRILL-6252) Foreman node is going down when the non foreman node is stopped

2018-03-21 Thread Vlad Rozov (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408854#comment-16408854
 ] 

Vlad Rozov commented on DRILL-6252:
---

[~vdonapati] How was it determined that the foreman node is going down? I don't 
see a normal shutdown sequence in the log file. Did the JVM crash? If so, please 
attach the crash report. 

> Foreman node is going down when the non foreman node is stopped
> ---
>
> Key: DRILL-6252
> URL: https://issues.apache.org/jira/browse/DRILL-6252
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Venkata Jyothsna Donapati
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: foreman_drillbit.log, nonforeman_drillbit.log
>
>
> Two drillbits are running. I was running a join query over parquet and tried to 
> stop the non-foreman node using drillbit.sh stop. The query fails with 
> *"Error: DATA_READ ERROR: Exception occurred while reading from disk".* The 
> non-foreman node goes down. The foreman node also goes down. When I looked at 
> the drillbit.log of both the foreman and the non-foreman, I found a memory 
> leak: "Memory was leaked by query. Memory leaked: 
> (2097152)\nAllocator(op:2:0:0:HashPartitionSender) 
> 100/6291456/6832128/100 (res/actual/peak/limit)\n". Following are 
> the stack traces for the memory leaks 
> {noformat} 
> [Error Id: 0d9a2799-7e97-46b3-953b-1f8d0dd87a04 on qa102-34.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Memory was leaked by query. Memory leaked: (3145728)
> Allocator(op:2:1:0:HashPartitionSender) 100/6291456/6291456/100 
> (res/actual/peak/limit)
>  
>  
> Fragment 2:1 
> [Error Id: 0d9a2799-7e97-46b3-953b-1f8d0dd87a04 on qa102-34.qa.lab:31010]
>         at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:297)
>  [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:266)
>  [drill-java-exec-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
>         at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.13.0-SNAPSHOT.jar:1.13.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>         at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
> Memory leaked: (3145728)
> Allocator(op:2:1:0:HashPartitionSender) 100/6291456/6291456/100 
> (res/actual/peak/limit)
> {noformat} 
>  
> Ping me for the logs and more information.
>  





[jira] [Commented] (DRILL-6284) Add operator metrics for batch sizing for flatten

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408820#comment-16408820
 ] 

ASF GitHub Bot commented on DRILL-6284:
---

GitHub user ppadma opened a pull request:

https://github.com/apache/drill/pull/1181

DRILL-6284: Add operator metrics for batch sizing for flatten

@kkhatua please review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppadma/drill DRILL-6284

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1181.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1181


commit f0b7bed20aef64cdc9e025a5ca209e1ad6220aa6
Author: Padma Penumarthy 
Date:   2018-03-20T20:44:50Z

DRILL-6284: Add operator metrics for batch sizing for flatten




> Add operator metrics for batch sizing for flatten
> -
>
> Key: DRILL-6284
> URL: https://issues.apache.org/jira/browse/DRILL-6284
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 1.13.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Critical
> Fix For: 1.14.0
>
>
> Add the following operator metrics for flatten.
> NUM_INCOMING_BATCHES,
> AVG_INPUT_BATCH_SIZE,
> AVG_INPUT_ROW_WIDTH,
> TOTAL_INPUT_RECORDS,
> NUM_OUTGOING_BATCHES,
> AVG_OUTPUT_BATCH_SIZE,
> AVG_OUTPUT_ROW_WIDTH,
> TOTAL_OUTPUT_RECORDS;
>  
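
One way the metric list above could be expressed, as a hedged sketch: `FlattenMetricSketch` and `metricId` are hypothetical names, and using the declaration order as the numeric id is an assumption. The definition in the linked pull request may differ.

```java
// Hypothetical enum for illustration; not the definition from the pull request.
enum FlattenMetricSketch {
  NUM_INCOMING_BATCHES,
  AVG_INPUT_BATCH_SIZE,
  AVG_INPUT_ROW_WIDTH,
  TOTAL_INPUT_RECORDS,
  NUM_OUTGOING_BATCHES,
  AVG_OUTPUT_BATCH_SIZE,
  AVG_OUTPUT_ROW_WIDTH,
  TOTAL_OUTPUT_RECORDS;

  // Assumption: the declaration order doubles as the numeric metric id.
  int metricId() {
    return ordinal();
  }
}
```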





[jira] [Updated] (DRILL-6235) Flatten query leads to out of memory in RPC layer.

2018-03-21 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-6235:
--
Description: 
Flatten query leads to out of memory in RPC layer. Query profile is attached 
here.

Total number of JSON files = 4095
 Each JSON file has nine rows
 And each row in the JSON has an array with 1024 integer values, and there are 
other string values outside of the array.
 Two major fragments and eighty eight minor fragments were created
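
To put the data volume above in perspective, here is a back-of-the-envelope sketch, assuming FLATTEN emits one output row per array element. `FlattenRowCountSketch` is a hypothetical helper, not Drill code.

```java
// Hypothetical helper for illustration only.
class FlattenRowCountSketch {
  // Output rows = files * rows per file * elements per array,
  // assuming FLATTEN produces one row per array element.
  static long flattenedRows(long files, long rowsPerFile, long arrayLength) {
    return files * rowsPerFile * arrayLength;
  }
}
```

With the numbers above, `flattenedRows(4095, 9, 1024)` evaluates to 37,739,520 output rows, and with `SELECT *` each of those rows also carries the original columns alongside the flattened value.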

On a 4 node CentOS cluster
 number of CPU cores
 [root@qa102-45 ~]# grep -c ^processor /proc/cpuinfo
 32

Details of memory
{noformat}
0: jdbc:drill:schema=dfs.tmp> select * from sys.memory;
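+-----------------+-----------+--------------+------------+----------------+--------------------+------------+
| hostname        | user_port | heap_current | heap_max   | direct_current | jvm_direct_current | direct_max |
+-----------------+-----------+--------------+------------+----------------+--------------------+------------+
| qa102-45.qa.lab | 31010     | 1130364912   | 4294967296 | 0              | 170528             | 8589934592 |
| qa102-47.qa.lab | 31010     | 171823104    | 4294967296 | 0              | 21912              | 8589934592 |
| qa102-48.qa.lab | 31010     | 201326576    | 4294967296 | 0              | 21912              | 8589934592 |
| qa102-46.qa.lab | 31010     | 214780896    | 4294967296 | 0              | 21912              | 8589934592 |
+-----------------+-----------+--------------+------------+----------------+--------------------+------------+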
4 rows selected (0.166 seconds)
{noformat}
Steps to repro the failure.

Reset all options and set slice_target=1 and run the below SQL query.


 alter system reset all;
 alter system set `planner.slice_target`=1;
{noformat}
SELECT * , FLATTEN(arr) FROM many_json_files
...

Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the 
query.

Failure allocating buffer.
Fragment 1:38

[Error Id: cf4fd273-d8a2-45e8-8d72-15c738e53b0f on qa102-45.qa.lab:31010] 
(state=,code=0)
{noformat}
Stack trace from drillbit.log for the above failing query.
{noformat}
2018-03-12 11:52:33,849 [25593391-512d-23ab-7c84-3651006931e2:frag:0:0] INFO 
o.a.d.e.w.fragment.FragmentExecutor - 25593391-512d-23ab-7c84-3651006931e2:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2018-03-12 11:52:33,849 [25593391-512d-23ab-7c84-3651006931e2:frag:0:0] INFO 
o.a.d.e.w.f.FragmentStatusReporter - 25593391-512d-23ab-7c84-3651006931e2:0:0: 
State to report: RUNNING
2018-03-12 11:52:33,854 [25593391-512d-23ab-7c84-3651006931e2:frag:0:0] INFO 
o.a.d.e.c.ClassCompilerSelector - Java compiler policy: DEFAULT, Debug option: 
true
2018-03-12 11:52:35,929 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 92340224.
2018-03-12 11:52:35,929 [BitServer-3] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 92340224.
2018-03-12 11:52:35,930 [BitServer-3] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2018-03-12 11:52:35,930 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2018-03-12 11:52:35,930 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 83886080.
2018-03-12 11:52:35,930 [BitServer-3] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 83886080.
2018-03-12 11:52:35,930 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2018-03-12 11:52:35,930 [BitServer-3] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2018-03-12 11:52:35,931 [BitServer-3] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 83886080.
2018-03-12 11:52:35,931 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 83886080.
2018-03-12 11:52:35,931 [BitServer-3] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2018-03-12 11:52:35,931 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
...
...
2018-03-12 11:52:35,939 [BitServer-4] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 67174400.
2018-03-12 11:52:35,939 [BitServer-4] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2018-03-12 11:52:35,939 [BitServer-2] WARN o.a.d.exec.rpc.ProtobufLengthDecoder 
- Failure allocating buffer on incoming stream due to memory limits. Current 
Allocation: 84017152.
2018-03-12 11:52:35,939 [BitServer-2] ERROR o.a.drill.exec.rpc.data.DataServer 
- Out of memory in RPC layer.
2018-03-12 

[jira] [Updated] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread Vlad Rozov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vlad Rozov updated DRILL-6280:
--
Description: 
{{exec:java}} has a dependency on {{logback-classic}} that is only available in 
{{test}} scope.
{noformat}
[INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common ---
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
{noformat}

  was:{{exec:java}} has a dependency on {{logback-classic}} that is only 
available in {{test}} scope.


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.
> {noformat}
> [INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common ---
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
> details.
> {noformat}





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408759#comment-16408759
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1177
  
@vdiravka Please clarify your concern regarding logback-classic. Without 
the fix, logback-classic is not on the classpath during `exec:java` maven 
plugin execution. This causes the following issue:
```
[INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common ---
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
```


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6283) WebServer stores SPNEGO client principal without taking any conversion rule

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408733#comment-16408733
 ] 

ASF GitHub Bot commented on DRILL-6283:
---

GitHub user sohami opened a pull request:

https://github.com/apache/drill/pull/1180

DRILL-6283: WebServer stores SPNEGO client principal without taking a…

…ny conversion rule

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sohami/drill DRILL-6283

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1180


commit d80633021d8d81d786921051b92eda63476375de
Author: Sorabh Hamirwasia 
Date:   2018-03-21T22:53:25Z

DRILL-6283: WebServer stores SPNEGO client principal without taking any 
conversion rule




> WebServer stores SPNEGO client principal without taking any conversion rule
> ---
>
> Key: DRILL-6283
> URL: https://issues.apache.org/jira/browse/DRILL-6283
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.13.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.14.0
>
>
> Drill's WebServer uses the exact client principal (us...@qa.lab) as the 
> stored username; it does not provide any configuration to specify rules that 
> can be used to extract the desired username from the client's principal.
> For example, the default rule provided by HadoopKerberosName extracts only the 
> primary part (user1) of the client principal. 
> Also, while checking whether the authenticated client principal has admin 
> privileges, it uses the realm (e.g. QA.LAB) information to verify against the configured 
> admin user/group list. To be consistent with the JDBC/ODBC Kerberos path, it 
> should use the shortName of the client principal to determine admin privileges.
> In short, the server side should store the shortName extracted from the client 
> principal based on the configured rule and use it to determine admin 
> privileges too.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408721#comment-16408721
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1177#discussion_r176263130
  
--- Diff: exec/java-exec/pom.xml ---
@@ -828,31 +828,9 @@
   
 
   
-   
+  
 org.codehaus.mojo
 exec-maven-plugin
-1.2.1
-
-  
-org.apache.drill
-drill-common
--- End diff --

`drill-common` and `drill-common-test` are included as part of `test` scope 
dependencies


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408714#comment-16408714
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1177#discussion_r176262322
  
--- Diff: common/pom.xml ---
@@ -113,20 +113,10 @@
   
 org.codehaus.mojo
 exec-maven-plugin
-1.2.1
-
-  
-process-classes
-
-  java
-
-  
-
 
-  
org.apache.drill.common.scanner.BuildTimeScan
-  
-${project.build.outputDirectory}
-  
+  
+
${project.basedir}/src/test/resources/
--- End diff --

The `logback-test.xml` was not in the classpath before, but it did not 
matter as slf4j did not bind to logback classic before anyway (it was not on 
classpath).


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408690#comment-16408690
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1177#discussion_r176260351
  
--- Diff: exec/java-exec/pom.xml ---
@@ -828,31 +828,9 @@
   
 
   
-   
+  
 org.codehaus.mojo
 exec-maven-plugin
-1.2.1
-
-  
-org.apache.drill
-drill-common
-${project.version}
-tests
-  
-
-
-  
-process-classes
-java
-  
-
-
-  
org.apache.drill.common.scanner.BuildTimeScan
-  true
--- End diff --

The plugin is now configured in the `PluginManagement` section in the drill 
root pom.


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6243) Alert box to confirm shutdown of drillbit after clicking shutdown button

2018-03-21 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6243:
-
Labels: ready-to-commit  (was: )

> Alert box to confirm shutdown of drillbit after clicking shutdown button 
> -
>
> Key: DRILL-6243
> URL: https://issues.apache.org/jira/browse/DRILL-6243
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>






[jira] [Updated] (DRILL-6243) Alert box to confirm shutdown of drillbit after clicking shutdown button

2018-03-21 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6243:
-
Component/s: Web Server

> Alert box to confirm shutdown of drillbit after clicking shutdown button 
> -
>
> Key: DRILL-6243
> URL: https://issues.apache.org/jira/browse/DRILL-6243
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>






[jira] [Commented] (DRILL-6243) Alert box to confirm shutdown of drillbit after clicking shutdown button

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408660#comment-16408660
 ] 

ASF GitHub Bot commented on DRILL-6243:
---

Github user sohami commented on the issue:

https://github.com/apache/drill/pull/1169
  
+1 LGTM. Thanks for making the changes.


> Alert box to confirm shutdown of drillbit after clicking shutdown button 
> -
>
> Key: DRILL-6243
> URL: https://issues.apache.org/jira/browse/DRILL-6243
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Minor
> Fix For: 1.14.0
>
>






[jira] [Created] (DRILL-6284) Add operator metrics for batch sizing for flatten

2018-03-21 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6284:
---

 Summary: Add operator metrics for batch sizing for flatten
 Key: DRILL-6284
 URL: https://issues.apache.org/jira/browse/DRILL-6284
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow
Affects Versions: 1.13.0
Reporter: Padma Penumarthy
Assignee: Padma Penumarthy
 Fix For: 1.14.0


Add the following operator metrics for flatten.

NUM_INCOMING_BATCHES,
AVG_INPUT_BATCH_SIZE,
AVG_INPUT_ROW_WIDTH,
TOTAL_INPUT_RECORDS,
NUM_OUTGOING_BATCHES,
AVG_OUTPUT_BATCH_SIZE,
AVG_OUTPUT_ROW_WIDTH,
TOTAL_OUTPUT_RECORDS;
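
A minimal sketch of how the metric names above could be declared as an enum. The `MetricDefSketch` interface and the class names are hypothetical stand-ins written for this illustration; they are not taken from the Drill source or from the eventual patch.

```java
// Hypothetical stand-in for an operator metric definition interface.
interface MetricDefSketch {
  int metricId();
  String name();
}

// The eight batch-sizing metrics listed above, in the same order.
enum FlattenMetricSketch implements MetricDefSketch {
  NUM_INCOMING_BATCHES,
  AVG_INPUT_BATCH_SIZE,
  AVG_INPUT_ROW_WIDTH,
  TOTAL_INPUT_RECORDS,
  NUM_OUTGOING_BATCHES,
  AVG_OUTPUT_BATCH_SIZE,
  AVG_OUTPUT_ROW_WIDTH,
  TOTAL_OUTPUT_RECORDS;

  // Assumption: the metric id is simply the declaration position.
  @Override
  public int metricId() {
    return ordinal();
  }
}
```

Using the declaration order as the id keeps the sketch short, but it means that reordering the constants would change the ids.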

 





[jira] [Commented] (DRILL-6254) IllegalArgumentException: the requested size must be non-negative

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408615#comment-16408615
 ] 

ASF GitHub Bot commented on DRILL-6254:
---

Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/1179
  
@bitblender can you please review this?


> IllegalArgumentException: the requested size must be non-negative
> -
>
> Key: DRILL-6254
> URL: https://issues.apache.org/jira/browse/DRILL-6254
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.13.0
>Reporter: Khurram Faraaz
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: genAllTypesJSN.py
>
>
> Flatten query fails due to IllegalArgumentException: the requested size must 
> be non-negative.
> Script to generate JSON data file is attached here.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_all_types_jsn_to_parquet AS 
> . . . . . . . . . . . . . . > SELECT
> . . . . . . . . . . . . . . > CAST( col_int AS INT) col_int, 
> . . . . . . . . . . . . . . > CAST( col_bigint AS BIGINT) col_bigint, 
> . . . . . . . . . . . . . . > CAST( col_char AS CHAR(10)) col_char, 
> . . . . . . . . . . . . . . > CAST( col_fxdln_str AS VARCHAR(256)) 
> col_fxdln_str, 
> . . . . . . . . . . . . . . > CAST( col_varln_str AS VARCHAR(256)) 
> col_varln_str, 
> . . . . . . . . . . . . . . > CAST( col_float AS FLOAT) col_float, 
> . . . . . . . . . . . . . . > CAST( col_double AS DOUBLE PRECISION) 
> col_double, 
> . . . . . . . . . . . . . . > CAST( col_date AS DATE) col_date, 
> . . . . . . . . . . . . . . > CAST( col_time AS TIME) col_time, 
> . . . . . . . . . . . . . . > CAST( col_tmstmp AS TIMESTAMP) col_tmstmp, 
> . . . . . . . . . . . . . . > CAST( col_boolean AS BOOLEAN) col_boolean, 
> . . . . . . . . . . . . . . > col_binary, 
> . . . . . . . . . . . . . . > array_of_ints from `all_supported_types.json`;
> +---++
> | Fragment | Number of records written |
> +---++
> | 0_0 | 9 |
> +---++
> 1 row selected (0.29 seconds)
> {noformat}
> Reset all options and set slice_target=1
> alter system reset all;
> alter system set `planner.slice_target`=1;
> output_batch_size was set to its default value
> drill.exec.memory.operator.output_batch_size = 16777216
>  
> {noformat}
> select *, flatten(array_of_ints) from tbl_all_types_jsn_to_parquet;
> Error: SYSTEM ERROR: IllegalArgumentException: the requested size must be 
> non-negative
> Fragment 0:0
> [Error Id: 480bae96-ae89-45a7-b937-011c0f87c14d on qa102-45.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp>
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-03-15 12:19:43,916 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa: select *, flatten(array_of_ints) from 
> tbl_all_types_jsn_to_parquet
> 2018-03-15 12:19:43,952 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
> numFiles: 1
> 2018-03-15 12:19:43,953 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
> numFiles: 1
> 2018-03-15 12:19:43,966 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Time: 2ms total, 2.927366ms avg, 2ms max.
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Earliest start: 2.829000 μs, Latest start: 2.829000 μs, 
> Average start: 2.829000 μs .
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Took 3 ms to read file metadata
> 2018-03-15 12:19:44,000 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:frag:0:0] INFO 
> o.a.d.e.w.fragment.FragmentExecutor - 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa:0:0: State change requested 
> AWAITING_ALLOCATION --> RUNNING
> 2018-03-15 12:19:44,000 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:frag:0:0] INFO 
> o.a.d.e.w.f.FragmentStatusReporter - 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa:0:0: State to report: RUNNING
> 2018-03-15 12:19:44,905 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:frag:0:0] INFO 
> o.a.d.e.w.fragment.FragmentExecutor - 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa:0:0: State change requested 

[jira] [Commented] (DRILL-6254) IllegalArgumentException: the requested size must be non-negative

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408569#comment-16408569
 ] 

ASF GitHub Bot commented on DRILL-6254:
---

GitHub user ppadma opened a pull request:

https://github.com/apache/drill/pull/1179

DRILL-6254: IllegalArgumentException: the requested size must be non-…

…negative

We should limit memory allocation to the number of records that will be 
in the next batch, not the total number of records remaining. For a very large 
remaining record count, multiplying by a high cardinality overflows the 
integer, and the requested size becomes negative.
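
The overflow described above can be reproduced in isolation. The following is only a sketch under stated assumptions: the class, the method names, and the numbers are hypothetical and are not taken from the pull request. It shows how an `int` product wraps to a negative value, and how bounding the row count by the next batch first keeps the product small.

```java
final class FlattenAllocationSketch {

  // Naive sizing: uses every remaining record, so the int product can wrap.
  static int naiveValueCount(int remainingRecords, int avgCardinality) {
    return remainingRecords * avgCardinality;
  }

  // Bounded sizing: only the rows that fit in the next batch are counted.
  // Math.multiplyExact throws ArithmeticException instead of wrapping silently.
  static int boundedValueCount(int remainingRecords, int batchRowLimit, int avgCardinality) {
    int rowsInNextBatch = Math.min(remainingRecords, batchRowLimit);
    return Math.multiplyExact(rowsInNextBatch, avgCardinality);
  }
}
```

With 1,000,000 remaining records and a cardinality of 4096, the naive product exceeds `Integer.MAX_VALUE` and comes out negative, while the bounded version with a 4096-row batch limit yields 16,777,216.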

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppadma/drill DRILL-6254

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1179.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1179


commit c26e8f44b9c873d0d7acabac3623a3a8e19086eb
Author: Padma Penumarthy 
Date:   2018-03-21T20:39:43Z

DRILL-6254: IllegalArgumentException: the requested size must be 
non-negative




> IllegalArgumentException: the requested size must be non-negative
> -
>
> Key: DRILL-6254
> URL: https://issues.apache.org/jira/browse/DRILL-6254
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.13.0
>Reporter: Khurram Faraaz
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: genAllTypesJSN.py
>
>
> Flatten query fails due to IllegalArgumentException: the requested size must 
> be non-negative.
> Script to generate JSON data file is attached here.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_all_types_jsn_to_parquet AS 
> . . . . . . . . . . . . . . > SELECT
> . . . . . . . . . . . . . . > CAST( col_int AS INT) col_int, 
> . . . . . . . . . . . . . . > CAST( col_bigint AS BIGINT) col_bigint, 
> . . . . . . . . . . . . . . > CAST( col_char AS CHAR(10)) col_char, 
> . . . . . . . . . . . . . . > CAST( col_fxdln_str AS VARCHAR(256)) 
> col_fxdln_str, 
> . . . . . . . . . . . . . . > CAST( col_varln_str AS VARCHAR(256)) 
> col_varln_str, 
> . . . . . . . . . . . . . . > CAST( col_float AS FLOAT) col_float, 
> . . . . . . . . . . . . . . > CAST( col_double AS DOUBLE PRECISION) 
> col_double, 
> . . . . . . . . . . . . . . > CAST( col_date AS DATE) col_date, 
> . . . . . . . . . . . . . . > CAST( col_time AS TIME) col_time, 
> . . . . . . . . . . . . . . > CAST( col_tmstmp AS TIMESTAMP) col_tmstmp, 
> . . . . . . . . . . . . . . > CAST( col_boolean AS BOOLEAN) col_boolean, 
> . . . . . . . . . . . . . . > col_binary, 
> . . . . . . . . . . . . . . > array_of_ints from `all_supported_types.json`;
> +---++
> | Fragment | Number of records written |
> +---++
> | 0_0 | 9 |
> +---++
> 1 row selected (0.29 seconds)
> {noformat}
> Reset all options and set slice_target=1
> alter system reset all;
> alter system set `planner.slice_target`=1;
> output_batch_size was set to its default value
> drill.exec.memory.operator.output_batch_size = 16777216
>  
> {noformat}
> select *, flatten(array_of_ints) from tbl_all_types_jsn_to_parquet;
> Error: SYSTEM ERROR: IllegalArgumentException: the requested size must be 
> non-negative
> Fragment 0:0
> [Error Id: 480bae96-ae89-45a7-b937-011c0f87c14d on qa102-45.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp>
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-03-15 12:19:43,916 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 255538af-bcd5-98ee-32e0-68d98fc4a6fa: select *, flatten(array_of_ints) from 
> tbl_all_types_jsn_to_parquet
> 2018-03-15 12:19:43,952 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
> numFiles: 1
> 2018-03-15 12:19:43,953 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
> numFiles: 1
> 2018-03-15 12:19:43,966 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 
> 1 using 1 threads. Time: 2ms total, 2.927366ms avg, 2ms max.
> 2018-03-15 12:19:43,969 [255538af-bcd5-98ee-32e0-68d98fc4a6fa:foreman] INFO 
> 

[jira] [Commented] (DRILL-6123) Limit batch size for Merge Join based on memory

2018-03-21 Thread Bridget Bevens (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408548#comment-16408548
 ] 

Bridget Bevens commented on DRILL-6123:
---

Added content here: 
[https://drill.apache.org/docs/configuring-drill-memory/#modifying-memory-allocated-to-queries]
 

Removed the doc-impacting label. Please add the label back if doc coverage is 
not sufficient.


Thanks,
Bridget

> Limit batch size for Merge Join based on memory
> ---
>
> Key: DRILL-6123
> URL: https://issues.apache.org/jira/browse/DRILL-6123
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 1.12.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.13.0
>
>
> Merge join limits output batch size to 32K rows irrespective of row size. 
> This can create very large or very small batches (in terms of memory), 
> depending upon average row width. Change this to figure out output row count 
> based on memory specified with the new outputBatchSize option and average row 
> width of incoming left and right batches. Output row count will be a minimum of 
> 1 and a maximum of 64k. 
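
The sizing rule in the description can be sketched as follows. This is an illustration only: it assumes the output row width is the sum of the average left and right row widths, reads "64k" as 65536, and uses hypothetical names that do not come from the Drill source.

```java
final class OutputRowCountSketch {
  static final int MIN_ROWS = 1;
  static final int MAX_ROWS = 64 * 1024; // assumption: "64k" means 65536

  // Derive the output row count from a memory budget and average row widths,
  // then clamp it to the [MIN_ROWS, MAX_ROWS] range.
  static int outputRowCount(long outputBatchSizeBytes, int avgLeftRowWidth, int avgRightRowWidth) {
    long avgOutputRowWidth = (long) avgLeftRowWidth + avgRightRowWidth;
    if (avgOutputRowWidth <= 0) {
      return MAX_ROWS; // no width information: fall back to the upper bound
    }
    long rows = outputBatchSizeBytes / avgOutputRowWidth;
    return (int) Math.max(MIN_ROWS, Math.min(MAX_ROWS, rows));
  }
}
```

For a 16 MB budget (16777216 bytes), rows of 2048 bytes give 8192 rows per batch, while rows wider than the whole budget still produce the minimum of one row.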





[jira] [Updated] (DRILL-6123) Limit batch size for Merge Join based on memory

2018-03-21 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens updated DRILL-6123:
--
Labels: ready-to-commit  (was: doc-impacting ready-to-commit)

> Limit batch size for Merge Join based on memory
> ---
>
> Key: DRILL-6123
> URL: https://issues.apache.org/jira/browse/DRILL-6123
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 1.12.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.13.0
>
>
> Merge join limits output batch size to 32K rows irrespective of row size. 
> This can create very large or very small batches (in terms of memory), 
> depending upon average row width. Change this to figure out output row count 
> based on memory specified with the new outputBatchSize option and average row 
> width of incoming left and right batches. Output row count will be a minimum of 
> 1 and a maximum of 64k. 





[jira] [Created] (DRILL-6283) WebServer stores SPNEGO client principal without taking any conversion rule

2018-03-21 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-6283:


 Summary: WebServer stores SPNEGO client principal without taking 
any conversion rule
 Key: DRILL-6283
 URL: https://issues.apache.org/jira/browse/DRILL-6283
 Project: Apache Drill
  Issue Type: Bug
  Components: Web Server
Affects Versions: 1.13.0
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia
 Fix For: 1.14.0


Drill's WebServer uses the exact client principal (us...@qa.lab) as the stored 
username. It doesn't provide any configuration to specify rules that can be 
used to extract the desired username from the client's principal.

For example, the default rule provided by HadoopKerberosName extracts only the 
primary part (user1) of the client principal. 

Also, when checking whether the authenticated client principal has admin 
privileges, it uses the realm (e.g. QA.LAB) information to verify against the 
configured admin user/group list. To make it consistent with the JDBC/ODBC 
Kerberos path, it should use the shortName in the client principal to 
determine admin privileges.

Basically, the server side should store the shortName from the client principal, 
extracted based on the configured rule, and use that to determine the admin 
privileges too.
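
A minimal sketch of the intended behaviour, assuming the simplest possible rule: keep only the primary part of a `primary/instance@REALM` principal. It is a plain string-based stand-in and does not use HadoopKerberosName or any configured conversion rules; the class and method names are hypothetical.

```java
import java.util.Set;

final class PrincipalShortNameSketch {

  // Keep everything before the first '/' or '@', i.e. the primary part.
  static String shortName(String principal) {
    int end = principal.length();
    int at = principal.indexOf('@');
    if (at >= 0) {
      end = at;
    }
    int slash = principal.indexOf('/');
    if (slash >= 0 && slash < end) {
      end = slash;
    }
    return principal.substring(0, end);
  }

  // Admin check against the short name rather than the full principal.
  static boolean isAdmin(String principal, Set<String> adminUsers) {
    return adminUsers.contains(shortName(principal));
  }
}
```

With this rule, `user1@QA.LAB` is stored and checked as `user1`, so an admin list containing `user1` matches regardless of the realm.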





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408395#comment-16408395
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1177#discussion_r176195052
  
--- Diff: common/pom.xml ---
@@ -113,20 +113,10 @@
   
 org.codehaus.mojo
 exec-maven-plugin
-1.2.1
-
-  
-process-classes
-
-  java
-
-  
-
 
-  
org.apache.drill.common.scanner.BuildTimeScan
-  
-${project.build.outputDirectory}
-  
+  
+
${project.basedir}/src/test/resources/
--- End diff --

How was it on the classpath earlier?


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408396#comment-16408396
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1177#discussion_r176195852
  
--- Diff: exec/java-exec/pom.xml ---
@@ -828,31 +828,9 @@
   
 
   
-   
+  
 org.codehaus.mojo
 exec-maven-plugin
-1.2.1
-
-  
-org.apache.drill
-drill-common
--- End diff --

Don't we need the dependency on `drill-common`, or is it specified in some 
other place?


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408397#comment-16408397
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1177#discussion_r176178853
  
--- Diff: exec/java-exec/pom.xml ---
@@ -828,31 +828,9 @@
   
 
   
-   
+  
 org.codehaus.mojo
 exec-maven-plugin
-1.2.1
-
-  
-org.apache.drill
-drill-common
-${project.version}
-tests
-  
-
-
-  
-process-classes
-java
-  
-
-
-  
org.apache.drill.common.scanner.BuildTimeScan
-  true
--- End diff --

Is it no longer needed?


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408394#comment-16408394
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1177#discussion_r176165870
  
--- Diff: 
common/src/main/java/org/apache/drill/common/scanner/BuildTimeScan.java ---
@@ -118,10 +118,10 @@ private static void save(ScanResult scanResult, File 
file) {
*/
   public static void main(String[] args) throws Exception {
 if (args.length != 1) {
-  throw new IllegalArgumentException("Usage: java {cp} " + 
ClassPathScanner.class.getName() + " path/to/scan");
+  throw new IllegalArgumentException("Usage: java {cp} " + 
BuildTimeScan.class.getName() + " path/to/scan");
--- End diff --

Consider String.format
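
A sketch of what the suggestion could look like. The helper class and method are hypothetical and exist only to make the example self-contained; the message text is the one from the diff above.

```java
final class UsageMessageSketch {

  // Build the usage message with String.format instead of string concatenation.
  static String usage(Class<?> mainClass) {
    return String.format("Usage: java {cp} %s path/to/scan", mainClass.getName());
  }
}
```

In `main`, the check would then read `throw new IllegalArgumentException(UsageMessageSketch.usage(BuildTimeScan.class));`, assuming `BuildTimeScan` is on the classpath.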


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6278) DRILL-5993 Made Debugging Generated Code Harder

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408340#comment-16408340
 ] 

ASF GitHub Bot commented on DRILL-6278:
---

Github user vvysotskyi commented on the issue:

https://github.com/apache/drill/pull/1178
  
@ilooner, I have noticed that `ClassBuilder.CODE_DIR_OPTION` is used in 
`TopNBatchTest` class. Should we also remove it?


> DRILL-5993 Made Debugging Generated Code Harder
> ---
>
> Key: DRILL-6278
> URL: https://issues.apache.org/jira/browse/DRILL-6278
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> DRILL-5993 made debugging generated code more difficult since it stored 
> generated code in unique directories in target. This required adding possibly 
> many tmp directories as source folders in order to be able to set break 
> points in generated code for different tests. This change should be reverted 
> to store generated code in the original default /tmp/drill/codegen directory.





[jira] [Commented] (DRILL-6278) DRILL-5993 Made Debugging Generated Code Harder

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408329#comment-16408329
 ] 

ASF GitHub Bot commented on DRILL-6278:
---

GitHub user ilooner opened a pull request:

https://github.com/apache/drill/pull/1178

DRILL-6278: Removed temp codegen directory in testing framework.

@vvysotskyi Please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ilooner/drill DRILL-6278

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1178.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1178


commit bde89d68b570eaa8792baa2517fab3c9765c28a4
Author: Timothy Farkas 
Date:   2018-03-21T06:00:22Z

DRILL-6278: Removed temp codegen directory in testing framework.




> DRILL-5993 Made Debugging Generated Code Harder
> ---
>
> Key: DRILL-6278
> URL: https://issues.apache.org/jira/browse/DRILL-6278
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> DRILL-5993 made debugging generated code more difficult since it stored 
> generated code in unique directories in target. This required adding possibly 
> many tmp directories as source folders in order to be able to set break 
> points in generated code for different tests. This change should be reverted 
> to store generated code in the original default /tmp/drill/codegen directory.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408216#comment-16408216
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1177
  
@vdiravka Please review


> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Updated] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread Vlad Rozov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vlad Rozov updated DRILL-6280:
--
Reviewer: Vitalii Diravka

> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Updated] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread Vlad Rozov (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vlad Rozov updated DRILL-6280:
--
Description: {{exec:java}} has a dependency on {{logback-classic}} that is 
only available in {{test}} scope.  (was: {{exec:java}} requires {{test}} scope 
due to dependency on {{logback-classic}}.)

> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} has a dependency on {{logback-classic}} that is only available 
> in {{test}} scope.





[jira] [Commented] (DRILL-6280) Cleanup execution of BuildTimeScan during maven build

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408195#comment-16408195
 ] 

ASF GitHub Bot commented on DRILL-6280:
---

GitHub user vrozov opened a pull request:

https://github.com/apache/drill/pull/1177

DRILL-6280: Cleanup execution of BuildTimeScan during maven build



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/drill DRILL-6280

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1177


commit 5305320bb3745102dae071a035648f377de84464
Author: Vlad Rozov 
Date:   2018-03-21T02:24:11Z

DRILL-6280: Cleanup execution of BuildTimeScan during maven build




> Cleanup execution of BuildTimeScan during maven build
> -
>
> Key: DRILL-6280
> URL: https://issues.apache.org/jira/browse/DRILL-6280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> {{exec:java}} requires {{test}} scope due to dependency on 
> {{logback-classic}}.





[jira] [Created] (DRILL-6282) Excluding io.dropwizard.metrics dependencies

2018-03-21 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-6282:
--

 Summary: Excluding io.dropwizard.metrics dependencies
 Key: DRILL-6282
 URL: https://issues.apache.org/jira/browse/DRILL-6282
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build & Test
Affects Versions: 1.13.0
Reporter: Vitalii Diravka
Assignee: Vitalii Diravka
 Fix For: 1.14.0


There are three types of metrics-core in Drill: 
1. _com.yammer.metrics_, 
2. _com.codahale.metrics_, 
3. _io.dropwizard.metrics_
Drill uses only 1 and 2. The third one is used by Hive.
The first one has different fully qualified class names, but the second and 
third have the same fully qualified class names, so Maven doesn't know which 
library to use ([https://github.com/dropwizard/metrics/issues/1044]).
However, I found that the third library is used by Hive only for tests, so it 
is not required for Drill and can be excluded from hive-metastore and hive-exec.
The dependency conflict affects not only metrics-core, but 
metrics-servlets and metrics-json as well.

All these metrics libraries should be organized with proper exclusions and 
dependency management blocks.





[jira] [Closed] (DRILL-6241) Saffron properties config has the excessive permissions

2018-03-21 Thread Anton Gozhiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-6241.
---

Verified with Drill version 1.14.0-SNAPSHOT (commit id: 
b4c599e33606f3e2fef132dbd38ee69b516e681e)

> Saffron properties config  has the excessive permissions
> 
>
> Key: DRILL-6241
> URL: https://issues.apache.org/jira/browse/DRILL-6241
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Volodymyr Tkach
>Assignee: Volodymyr Tkach
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Current Drill config permissions:
> {noformat}
> ls -al ./drill-1.13.0/conf/saffron.properties
> -rw-rw-r-- 1 mapr mapr 1191 Mar 12 09:36 saffron.properties
> {noformat}
> *Expected result:*
> It should have permissions 0640
> *Actual result:*
> It has the permissions 0664
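
The expected and actual modes can be compared with the standard `java.nio.file` POSIX permission API. This is only an illustrative check with hypothetical class and method names, not the fix that was committed for this issue, and it applies to POSIX file systems only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

final class ConfigPermissionsSketch {
  // 0640: owner read/write, group read, nothing for others.
  static final Set<PosixFilePermission> EXPECTED =
      PosixFilePermissions.fromString("rw-r-----");

  // True if the file grants any permission beyond the expected 0640 set.
  static boolean isTooPermissive(Set<PosixFilePermission> actual) {
    return !EXPECTED.containsAll(actual);
  }

  // Tighten the file to exactly the expected permissions.
  static void restrict(Path file) throws IOException {
    Files.setPosixFilePermissions(file, EXPECTED);
  }
}
```

A file with mode 0664 (`rw-rw-r--`) is reported as too permissive because it adds group write and others read.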





[jira] [Updated] (DRILL-4364) Image Metadata Format Plugin

2018-03-21 Thread Akihiko Kusanagi (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akihiko Kusanagi updated DRILL-4364:

Description: 
Support querying of metadata in various image formats. This plugin leverages 
[metadata-extractor|https://github.com/drewnoakes/metadata-extractor]. This 
plugin is especially useful when querying a large number of image files 
stored in a distributed file system without building a metadata repository in 
advance.

This plugin supports the following file formats.
 * JPEG, TIFF, PSD, PNG, BMP, GIF, ICO, PCX, WAV, AVI, WebP, MOV, MP4, EPS
 * Camera Raw: ARW (Sony), CRW/CR2 (Canon), NEF (Nikon), ORF (Olympus), RAF 
(FujiFilm), RW2 (Panasonic), RWL (Leica), SRW (Samsung), X3F (Foveon)

This plugin can read the following metadata.
 * Exif, IPTC, XMP, JFIF / JFXX, ICC Profiles, Photoshop fields, PNG 
properties, BMP properties, GIF properties, ICO properties, PCX properties, WAV 
properties, AVI properties, WebP properties, QuickTime properties, MP4 
properties, EPS properties

Since each type of metadata has a different set of fields, the plugin returns a 
set of commonly-used fields such as the image width, height and bits per pixel 
for ease of use.

*Examples:*

Querying on a JPEG file with the property descriptive: true
{noformat}
0: jdbc:drill:zk=local> select FileName, * from 
dfs.`4349313028_f69ffa0257_o.jpg`;
+--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
| FileName | FileSize | FileDateTime | Format | PixelWidth | PixelHeight | 
BitsPerPixel | DPIWidth | DPIHeight | Orientaion | ColorMode | HasAlpha | 
Duration | VideoCodec | FrameRate | AudioCodec | AudioSampleSize | 
AudioSampleRate | JPEG | JFIF | ExifIFD0 | ExifSubIFD | Interoperability | GPS 
| ExifThumbnail | Photoshop | IPTC | Huffman | FileType |
+--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
| 4349313028_f69ffa0257_o.jpg | 257213 bytes | Fri Mar 09 12:09:34 +08:00 2018 
| JPEG | 1199 | 800 | 24 | 96 | 96 | Unknown (0) | RGB | false | 00:00:00 | 
Unknown | 0 | Unknown | 0 | 0 | 
{"CompressionType":"Baseline","DataPrecision":"8 bits","ImageHeight":"800 
pixels","ImageWidth":"1199 pixels","NumberOfComponents":"3","Component1":"Y 
component: Quantization table 0, Sampling factors 2 horiz/2 
vert","Component2":"Cb component: Quantization table 1, Sampling factors 1 
horiz/1 vert","Component3":"Cr component: Quantization table 1, Sampling 
factors 1 horiz/1 vert"} | 
{"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 
dots","YResolution":"96 
dots","ThumbnailWidthPixels":"0","ThumbnailHeightPixels":"0"} | 
{"Software":"Picasa 3.0"} | 
{"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | 
{"InteroperabilityIndex":"Unknown ()","InteroperabilityVersion":"1.00"} | 
{"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 
15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 
6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | 
{"Compression":"JPEG (old-style)","XResolution":"72 dots per 
inch","YResolution":"72 dots per 
inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 
bytes","ThumbnailLength":"7213 bytes"} | {} | 
{"Keywords":"135;2002;issaquah;police car;wa;washington"} | 
{"NumberOfTables":"4 Huffman tables"} | 
{"DetectedFileTypeName":"JPEG","DetectedFileTypeLongName":"Joint Photographic 
Experts 
Group","DetectedMIMEType":"image/jpeg","ExpectedFileNameExtension":"jpg"} |
+--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+
1 row selected (0.229 seconds)
{noformat}
Querying a JPEG file with the property descriptive: false
{noformat}
0: jdbc:drill:zk=local> select FileName, * from 
dfs.`4349313028_f69ffa0257_o.jpg`;
+--+--+--+++-+--+--+---++---+--+--++---++-+-+--+--+--++--+-+---+---+--+-+--+

[jira] [Commented] (DRILL-4364) Image Metadata Format Plugin

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407820#comment-16407820
 ] 

ASF GitHub Bot commented on DRILL-4364:
---

Github user nagix commented on the issue:

https://github.com/apache/drill/pull/367
  
I have updated the dependent library and rebased the code. The Image 
Metadata Format Plugin now supports:
- Querying metadata in JPEG, TIFF, PSD, PNG, BMP, GIF, ICO, PCX, WAV, AVI, 
WebP, MOV, MP4, EPS, ARW (Sony), CRW/CR2 (Canon), NEF (Nikon), ORF (Olympus), 
RAF (FujiFilm), RW2 (Panasonic), RWL (Leica), SRW (Samsung) and X3F (Foveon) 
files
- Retrieval of both a human-readable descriptive string and a 
machine-readable typed value from each tag
- A set of commonly used fields, such as the image width, height and bits 
per pixel, for ease of use
- Querying directories of image files to create your own metadata DB

See also the original JIRA 
([DRILL-4364](https://issues.apache.org/jira/browse/DRILL-4364)) and the 
[reference](https://gist.github.com/nagix/6cac019b7bec698a1b8a234a5268d09d) 
(still incomplete).

@cgivre, @kkhatua, could you please review this?


> Image Metadata Format Plugin
> 
>
> Key: DRILL-4364
> URL: https://issues.apache.org/jira/browse/DRILL-4364
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Reporter: Akihiko Kusanagi
>Priority: Major
>
> Support querying of metadata in various image formats. This plugin leverages 
> [metadata-extractor|https://github.com/drewnoakes/metadata-extractor]. This 
> plugin is especially useful when querying a large number of image files 
> stored in a distributed file system without building a metadata repository in 
> advance.
> This plugin supports the following file formats.
> * JPEG, TIFF, WebP, PSD, PNG, BMP, GIF, ICO, PCX
> * Camera Raw: NEF (Nikon), CR2 (Canon), ORF (Olympus), ARW (Sony), RW2 
> (Panasonic), RWL (Leica), SRW (Samsung)
> This plugin can read the following metadata.
> * Exif, IPTC, XMP, JFIF / JFXX, ICC Profiles, Photoshop fields, WebP 
> properties, PNG properties, BMP properties, GIF properties, ICO properties, 
> PCX properties
> Since each type of metadata has a different set of fields, the plugin returns 
> a set of commonly used fields, such as the image width, height and bits per 
> pixel, for ease of use.
> *Examples:*
> Querying on a JPEG file with the property descriptive: true
> {noformat}
> 0: jdbc:drill:zk=local> select * from dfs.`4349313028_f69ffa0257_o.jpg`;
> +--+--+--++--+---++-+--++---+--+-+---+--+--+--++--+--+
> | FileName | FileSize | FileDateTime | Format | DPIWidth | DPIHeight | 
> PixelWidth | PixelHeight | BitsPerPixel | Orientaion | ColorMode | HasAlpha | 
> GPS | ExifThumbnail | JFIF | IPTC | JPEG | ExifSubIFD | ExifIFD0 | 
> Interoperability |
> +--+--+--++--+---++-+--++---+--+-+---+--+--+--++--+--+
> | 4349313028_f69ffa0257_o.jpg | 257213 bytes | Mon Feb 01 18:00:56 JST 2016 | 
> JPEG | 96.0 | 96.0 | 1199 | 800 | 24 | Unknown (0) | RGB | false | 
> {"GPSVersionID":".022","GPSLatitudeRef":"N","GPSLatitude":"47° 32' 
> 15.98\"","GPSLongitudeRef":"W","GPSLongitude":"-122° 2' 
> 6.37\"","GPSAltitudeRef":"Sea level","GPSAltitude":"0 metres"} | 
> {"ThumbnailCompression":"JPEG (old-style)","XResolution":"72 dots per 
> inch","YResolution":"72 dots per 
> inch","ResolutionUnit":"Inch","ThumbnailOffset":"414 
> bytes","ThumbnailLength":"7213 bytes"} | 
> {"Version":"1.1","ResolutionUnits":"inch","XResolution":"96 
> dots","YResolution":"96 dots"} | {"Keywords":"135;2002;issaquah;police 
> car;wa;washington"} | {"CompressionType":"Baseline","DataPrecision":"8 
> bits","ImageHeight":"800 pixels","ImageWidth":"1199 
> pixels","NumberOfComponents":"3","Component1":"Y component: Quantization 
> table 0, Sampling factors 2 horiz/2 vert","Component2":"Cb component: 
> Quantization table 1, Sampling factors 1 horiz/1 vert","Component3":"Cr 
> component: Quantization table 1, Sampling factors 1 horiz/1 vert"} | 
> {"ExifVersion":"2.10","UniqueImageID":"d65e93b836d15a0c5e041e6b7258c76e"} | 
> {"Software":"Picasa 3.0"} | {"InteroperabilityIndex":"Unknown (
> )","InteroperabilityVersion":"1.00"} |
> +--+--+--++--+---++-+--++---+--+-+---+--+--+--++--+--+
> 1 row selected (1.712 

[jira] [Updated] (DRILL-6145) Implement Hive MapR-DB JSON handler.

2018-03-21 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6145:

Labels: doc-impacting  (was: doc-impacting ready-to-commit)

> Implement Hive MapR-DB JSON handler. 
> -
>
> Key: DRILL-6145
> URL: https://issues.apache.org/jira/browse/DRILL-6145
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive, Storage - MapRDB
>Affects Versions: 1.12.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
>
> Similar to "hive-hbase-storage-handler", a "hive-maprdb-json-handler" is 
> needed to support querying Hive external tables backed by MapR-DB.
> Use case:
>  # Create a MapR-DB JSON table:
> {code}
> > mapr dbshell
> maprdb root:> create /tmp/table/json  (make sure /tmp/table exists)
> {code}
> -- insert data
> {code}
> insert /tmp/table/json --value '{"_id":"movie002" , "title":"Developers 
> on the Edge", "studio":"Command Line Studios"}'
> insert /tmp/table/json --id movie003 --value '{"title":"The Golden 
> Master", "studio":"All-Nighter"}'
> {code} 
>  #  Create a Hive external table:
> {code}
> hive> CREATE EXTERNAL TABLE mapr_db_json_hive_tbl ( 
> > movie_id string, title string, studio string) 
> > STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler' 
> > TBLPROPERTIES("maprdb.table.name" = 
> "/tmp/table/json","maprdb.column.id" = "movie_id");
> {code}
>  
>  #  Use the hive schema to query this table via Drill:
> {code}
> 0: jdbc:drill:> select * from hive.mapr_db_json_hive_tbl;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6053) Avoid excessive locking in LocalPersistentStore

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407744#comment-16407744
 ] 

ASF GitHub Bot commented on DRILL-6053:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1163
  
@vrozov, please address code review comments so we can merge the Jira.


> Avoid excessive locking in LocalPersistentStore
> ---
>
> Key: DRILL-6053
> URL: https://issues.apache.org/jira/browse/DRILL-6053
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Major
> Fix For: 1.14.0
>
>
> When query profiles are written to LocalPersistentStore, the write is 
> unnecessarily serialized because of the read/write lock that was introduced 
> for the versioned PersistentStore. Only versioned access needs to be protected 
> by the read/write lock.
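
The idea in the description can be sketched as follows. This is a simplified, hypothetical in-memory store, not Drill's LocalPersistentStore and not the change made in the pull request: the class name SketchStore and all of its methods are assumptions for illustration. Plain writes rely on the thread safety of the underlying map and take no lock, while only the versioned path takes the read/write lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical store for illustration only; NOT Drill's LocalPersistentStore.
final class SketchStore {
    private final ConcurrentHashMap<String, String> values = new ConcurrentHashMap<>();
    private final ReadWriteLock versionLock = new ReentrantReadWriteLock();
    private int version = 0; // guarded by versionLock

    // Unversioned write: no lock, so concurrent writers are not serialized
    // beyond what ConcurrentHashMap itself requires.
    void put(String key, String value) {
        values.put(key, value);
    }

    // Unversioned read: no lock either.
    String get(String key) {
        return values.get(key);
    }

    // Versioned write: the value and the version counter must change together,
    // so this path takes the write lock.
    int putVersioned(String key, String value) {
        versionLock.writeLock().lock();
        try {
            values.put(key, value);
            return ++version;
        } finally {
            versionLock.writeLock().unlock();
        }
    }

    // Versioned read: takes the read lock so it never observes a half-applied
    // versioned write.
    int getVersion() {
        versionLock.readLock().lock();
        try {
            return version;
        } finally {
            versionLock.readLock().unlock();
        }
    }
}
```

With this split, writers of unversioned entries (such as query profiles in the issue's scenario) never contend on the read/write lock.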





[jira] [Updated] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-21 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-6016:
-
Labels: ready-to-commit  (was: )

> Error reading INT96 created by Apache Spark
> ---
>
> Key: DRILL-6016
> URL: https://issues.apache.org/jira/browse/DRILL-6016
> Project: Apache Drill
>  Issue Type: Bug
> Environment: Drill 1.11
>Reporter: Rahul Raj
>Assignee: Rahul Raj
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Hi,
> I am getting the error "SYSTEM ERROR: ClassCastException: 
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to 
> org.apache.drill.exec.vector.VariableWidthVector" while trying to read a Spark 
> INT96 datetime field on Drill 1.11, in spite of setting the property 
> store.parquet.reader.int96_as_timestamp to true.
> I believe this was fixed in Drill 1.10 
> (https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong?
> I have attached the dataset at 
> https://github.com/rajrahul/files/blob/master/result.tar.gz
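
For background on what reading INT96 "as timestamp" involves, here is a minimal sketch of decoding such a value. It assumes the layout commonly written by Spark and Impala: 12 bytes, little-endian, with the first 8 bytes holding nanoseconds within the day and the last 4 bytes holding the Julian day number. The class Int96Timestamps and its methods are hypothetical; this is not Drill's reader code and does not address the ClassCastException itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

// Hypothetical helper for illustration only; NOT Drill's Parquet reader.
final class Int96Timestamps {
    // Julian day number corresponding to 1970-01-01 (the Unix epoch).
    private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    private static final long SECONDS_PER_DAY = 86_400L;

    // Decodes a 12-byte INT96 value, assuming the layout described above.
    static Instant toInstant(byte[] int96) {
        if (int96.length != 12) {
            throw new IllegalArgumentException("INT96 value must be exactly 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong(); // first 8 bytes
        long julianDay = buf.getInt();   // last 4 bytes
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY;
        return Instant.ofEpochSecond(epochSeconds, nanosOfDay);
    }

    // Builds an INT96 value in the same assumed layout; handy for experiments.
    static byte[] encode(int julianDay, long nanosOfDay) {
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }
}
```

The result is a fixed-width timestamp, whereas the raw 12 bytes would otherwise be handled as binary data.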



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407690#comment-16407690
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1166
  
+1. LGTM


> Error reading INT96 created by Apache Spark
> ---
>
> Key: DRILL-6016
> URL: https://issues.apache.org/jira/browse/DRILL-6016
> Project: Apache Drill
>  Issue Type: Bug
> Environment: Drill 1.11
>Reporter: Rahul Raj
>Assignee: Rahul Raj
>Priority: Major
> Fix For: 1.14.0
>
>
> Hi,
> I am getting the error "SYSTEM ERROR: ClassCastException: 
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to 
> org.apache.drill.exec.vector.VariableWidthVector" while trying to read a Spark 
> INT96 datetime field on Drill 1.11, in spite of setting the property 
> store.parquet.reader.int96_as_timestamp to true.
> I believe this was fixed in Drill 1.10 
> (https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong?
> I have attached the dataset at 
> https://github.com/rajrahul/files/blob/master/result.tar.gz


