[jira] [Created] (DRILL-4647) C++ client is not propagating a connection failed error when a drillbit goes down

2016-04-29 Thread Parth Chandra (JIRA)
Parth Chandra created DRILL-4647:


 Summary: C++ client is not propagating a connection failed error 
when a drillbit goes down
 Key: DRILL-4647
 URL: https://issues.apache.org/jira/browse/DRILL-4647
 Project: Apache Drill
  Issue Type: Bug
Reporter: Parth Chandra


When a drillbit goes down, there are two conditions under which the client does 
not propagate the error back to the application:
1) The application is in a submitQuery call: the ODBC driver expects the error 
to be reported through the query results listener, which has not been 
registered at the point the error is encountered.
2) A submitQuery call succeeded but never reached the drillbit because it was 
shut down. In this case the application has a handle to a query and is listening 
for results that will never arrive. The heartbeat mechanism detects the 
failure but does not propagate the error to the query results listener.
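
The fix presumably needs to route the heartbeat-detected failure to whichever 
error sink exists at that moment. A minimal sketch of the second condition, 
with hypothetical names (ClientSketch and ResultsListener are illustrative, 
not the actual Drill client API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not actual Drill client code): a connection failure
// detected by the heartbeat must reach every query results listener that is
// still waiting for results which will never arrive.
interface ResultsListener {
    void queryFailed(Exception e);
}

class ClientSketch {
    private final Map<String, ResultsListener> listeners = new ConcurrentHashMap<>();

    void register(String queryId, ResultsListener l) {
        listeners.put(queryId, l);
    }

    // Called by the heartbeat thread when the drillbit stops responding.
    void onConnectionLost(Exception cause) {
        // Condition 2 from the report: fail each pending listener explicitly
        // instead of leaving the application waiting forever.
        for (ResultsListener l : listeners.values()) {
            l.queryFailed(cause);
        }
        listeners.clear();
    }
}
```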



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4544) Improve error messages for REFRESH TABLE METADATA command

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4544:
-
Reviewer: Rahul Challapalli

> Improve error messages for REFRESH TABLE METADATA command
> -
>
> Key: DRILL-4544
> URL: https://issues.apache.org/jira/browse/DRILL-4544
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: 1.7.0
>
>
> Improve the error messages thrown by the REFRESH TABLE METADATA command:
> In the first case below, the error is that maprfs.abc doesn't exist. It should 
> throw an "Object not found" or "workspace not found" error. It currently 
> throws an unhelpful message:
> 0: jdbc:drill:> refresh table metadata maprfs.abc.`my_table`;
> +--------+--------------+
> |   ok   |   summary    |
> +--------+--------------+
> | false  | Error: null  |
> +--------+--------------+
> 1 row selected (0.355 seconds)
> In the second case below, it says refresh table metadata is supported only 
> for single-directory based Parquet tables. But the command works for nested 
> multi-directory Parquet files.
> 0: jdbc:drill:> refresh table metadata maprfs.vnaranammalpuram.`rfm_sales_vw`;
> +--------+----------------------------------------------------------------+
> |   ok   |                            summary                             |
> +--------+----------------------------------------------------------------+
> | false  | Table rfm_sales_vw does not support metadata refresh. Support  |
> |        | is currently limited to single-directory-based Parquet tables. |
> +--------+----------------------------------------------------------------+
> 1 row selected (0.418 seconds)
> 0: jdbc:drill:>





[jira] [Updated] (DRILL-4237) Skew in hash distribution

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4237:
-
Reviewer: Dechang Gu  (was: Aman Sinha)

> Skew in hash distribution
> -
>
> Key: DRILL-4237
> URL: https://issues.apache.org/jira/browse/DRILL-4237
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.4.0
>Reporter: Aman Sinha
>Assignee: Chunhui Shi
> Fix For: 1.7.0
>
>
> Apparently, the fix in DRILL-4119 did not fully resolve the data skew issue.  
> It worked fine on a smaller sample of the data set, but on another sample of 
> the same data set it still produces skewed values - see the hash 
> values below, which are all odd numbers. 
> {noformat}
> 0: jdbc:drill:zk=local> select columns[0], hash32(columns[0]) from `test.csv` 
> limit 10;
> +---+--+
> |  EXPR$0   |EXPR$1|
> +---+--+
> | f71aaddec3316ae18d43cb1467e88a41  | 1506011089   |
> | 3f3a13bb45618542b5ac9d9536704d3a  | 1105719049   |
> | 6935afd0c693c67bba482cedb7a2919b  | -18137557|
> | ca2a938d6d7e57bda40501578f98c2a8  | -1372666789  |
> | fab7f08402c8836563b0a5c94dbf0aec  | -1930778239  |
> | 9eb4620dcb68a84d17209da279236431  | -970026001   |
> | 16eed4a4e801b98550b4ff504242961e  | 356133757|
> | a46f7935fea578ce61d8dd45bfbc2b3d  | -94010449|
> | 7fdf5344536080c15deb2b5a2975a2b7  | -141361507   |
> | b82560a06e2e51b461c9fe134a8211bd  | -375376717   |
> +---+--+
> {noformat}
> This indicates an underlying issue with the XXHash64 Java implementation, 
> which is Drill's port of the C version.  One of the key differences, 
> as pointed out by [~jnadeau], was the use of unsigned int64 in the C version 
> compared to the Java version, which uses (signed) long.  I created an XXHash 
> version using com.google.common.primitives.UnsignedLong.  However, 
> UnsignedLong does not have the bit-wise operations that XXHash needs, 
> such as rotateLeft(), XOR, etc.  One could write wrappers for these, but at 
> this point the question is: should we consider an alternative hash function? 
> The alternative approach could be the murmur hash for numeric data types that 
> we were using earlier, and the Mahout version of the hash function for string 
> types 
> (https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/HashHelper.java#L28).
>   As a test, I reverted to this function and was getting good hash 
> distribution for the test data. 
> I could not find any performance comparisons of our perf tests (TPC-H or DS) 
> with the original and newer (XXHash) hash functions.  If performance is 
> comparable, should we revert to the original function?  
> As an aside, I would like to remove the hash64 versions of the functions, 
> since they are not used anywhere. 
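
For what it's worth, Java's signed long can reproduce the unsigned 64-bit 
arithmetic XXHash needs without UnsignedLong wrappers: two's-complement 
addition, multiplication, XOR, and Long.rotateLeft are bit-identical regardless 
of how the sign bit is interpreted (only comparison, division, and printing 
differ). A sketch of one XXHash64 accumulator round on plain long (the primes 
are XXHash64's published constants; this is not Drill's actual implementation):

```java
// Demonstration that XXHash64-style mixing steps behave identically on Java's
// signed long, because two's-complement add/multiply/XOR/rotate do not depend
// on how the top bit is interpreted. The constants are XXHash64's published
// primes; the round function is a sketch, not Drill's code.
public class XXHashOps {
    static final long PRIME64_1 = 0x9E3779B185EBCA87L;
    static final long PRIME64_2 = 0xC2B2AE3D27D4EB4FL;

    // One XXHash64 accumulator round: acc = rotl64(acc + input * P2, 31) * P1
    static long round(long acc, long input) {
        acc += input * PRIME64_2;          // wraps exactly like unsigned add/mul
        acc = Long.rotateLeft(acc, 31);    // bit rotation is sign-agnostic
        return acc * PRIME64_1;
    }
}
```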





[jira] [Updated] (DRILL-4549) Add support for more truncation units in date_trunc function

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4549:
-
Reviewer: Khurram Faraaz

> Add support for more truncation units in date_trunc function
> 
>
> Key: DRILL-4549
> URL: https://issues.apache.org/jira/browse/DRILL-4549
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
> Fix For: 1.7.0
>
>
> Currently we support only {{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND}} truncate 
> units for types {{TIME, TIMESTAMP and DATE}}. Extend the functions to support 
> {{YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, WEEK, QUARTER, DECADE, CENTURY, 
> MILLENNIUM}} truncate units for types {{TIME, TIMESTAMP, DATE, INTERVAL DAY, 
> INTERVAL YEAR}}.
> Also, get rid of the if-and-else (on truncation unit) implementation. Instead, 
> resolve to a direct function based on the truncation unit during the Calcite -> 
> Drill (DrillOptiq) expression conversion.
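
The proposed direct resolution could be sketched as a one-time table lookup 
during expression conversion instead of a runtime if-else chain. The class and 
function names below are illustrative only, not the actual DrillOptiq change:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of unit-to-function resolution: rather than branching
// on the truncation unit with if-else at runtime, resolve the unit to a
// concrete per-unit function name once, during expression conversion.
public class DateTruncResolver {
    private static final Map<String, String> FUNCTIONS = new HashMap<>();
    static {
        for (String unit : new String[] {"YEAR", "MONTH", "DAY", "HOUR",
                "MINUTE", "SECOND", "WEEK", "QUARTER", "DECADE",
                "CENTURY", "MILLENNIUM"}) {
            // e.g. "QUARTER" -> "date_trunc_quarter" (illustrative naming)
            FUNCTIONS.put(unit, "date_trunc_" + unit.toLowerCase());
        }
    }

    static String resolve(String unit) {
        String fn = FUNCTIONS.get(unit.toUpperCase());
        if (fn == null) {
            throw new IllegalArgumentException("Unsupported truncation unit: " + unit);
        }
        return fn;
    }
}
```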





[jira] [Updated] (DRILL-4592) Explain plan statement should show plan in WebUi

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4592:
-
Reviewer: Krystal

> Explain plan statement should show plan in WebUi
> 
>
> Key: DRILL-4592
> URL: https://issues.apache.org/jira/browse/DRILL-4592
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
> Fix For: 1.7.0
>
>
> When an explain plan statement is run, the physical plan is generated and 
> returned. However, the plan is not put in the profile and does not show up in 
> the physical plan / visual plan tab in the WebUI. If someone wants to look at 
> the visual plan, the only way is to execute the query, which sometimes 
> requires a long execution time. This makes it a bit hard to analyze the plan 
> for a problematic query.  
> As with regular queries and CTAS statements, we should store the plan for the 
> EXPLAIN PLAN statement and display it properly in the WebUI.





[jira] [Updated] (DRILL-4529) SUM() with windows function result in mismatch nullability

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4529:
-
Reviewer: Krystal

> SUM() with windows function result in mismatch nullability
> --
>
> Key: DRILL-4529
> URL: https://issues.apache.org/jira/browse/DRILL-4529
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Krystal
>Assignee: Sean Hsuan-Yi Chu
>  Labels: limit0
> Fix For: 1.7.0
>
>
> git.commit.id.abbrev=cee5317
> select 
>   sum(1)  over w sum1, 
>   sum(5)  over w sum5,
>   sum(10) over w sum10
> from 
>   j1_v
> where 
>   c_date is not null
> window w as (partition by c_date);
> Output from test:
> limit 0: [columnNoNulls, columnNoNulls, columnNoNulls]
> regular: [columnNullable, columnNullable, columnNullable]





[jira] [Updated] (DRILL-4472) Pushing Filter past Union All fails: DRILL-3257 regressed DRILL-2746 but unit test update break test goal

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4472:
-
Reviewer: Krystal

> Pushing Filter past Union All fails: DRILL-3257 regressed DRILL-2746 but unit 
> test update break test goal
> -
>
> Key: DRILL-4472
> URL: https://issues.apache.org/jira/browse/DRILL-4472
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Jacques Nadeau
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.7.0
>
>
> While reviewing DRILL-4467, I discovered this test. 
> https://github.com/apache/drill/blame/master/exec/java-exec/src/test/java/org/apache/drill/TestUnionAll.java#L560
> As you can see, the test name confirms that the filter should be pushed below 
> the union all. However, the expected result in 
> DRILL-3257 was updated to a plan which doesn't push the IN clause below the 
> filter. I'm disabling the test since DRILL-4467 happens to remove what becomes 
> a trivial project. However, we really should fix the core problem (a 
> regression of DRILL-2746).





[jira] [Updated] (DRILL-3745) Hive CHAR not supported

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-3745:
-
Reviewer: Krystal

> Hive CHAR not supported
> ---
>
> Key: DRILL-3745
> URL: https://issues.apache.org/jira/browse/DRILL-3745
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Nathaniel Auvil
>Assignee: Arina Ielchiieva
>  Labels: doc-impacting
> Fix For: 1.7.0
>
>
> It doesn’t look like Drill 1.1.0 supports the Hive CHAR type?
> In Hive:
> create table development.foo
> (
>   bad CHAR(10)
> );
> And then in sqlline:
> > use `hive.development`;
> > select * from foo;
> Error: PARSE ERROR: Unsupported Hive data type CHAR.
> Following Hive data types are supported in Drill INFORMATION_SCHEMA:
> BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, TIMESTAMP,
> BINARY, DECIMAL, STRING, VARCHAR, LIST, MAP, STRUCT and UNION
> [Error Id: 58bf3940-3c09-4ad2-8f52-d052dffd4b17 on dtpg05:31010] 
> (state=,code=0)
> This was originally found when getting failures trying to connect via JDBC 
> using SQuirreL.  We have the Hive plugin enabled with tables using CHAR.





[jira] [Updated] (DRILL-4476) Enhance Union-All operator for dealing with empty left input or empty both inputs

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4476:
-
Reviewer: Khurram Faraaz

> Enhance Union-All operator for dealing with empty left input or empty both 
> inputs
> -
>
> Key: DRILL-4476
> URL: https://issues.apache.org/jira/browse/DRILL-4476
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.7.0
>
>
> The Union-All operator does not handle the situation where the left side 
> comes from an empty source.
> With DRILL-2288's enhancement for empty sources, the Union-All operator can 
> now support this scenario.





[jira] [Updated] (DRILL-4459) SchemaChangeException while querying hive json table

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4459:
-
Reviewer: Krystal

> SchemaChangeException while querying hive json table
> 
>
> Key: DRILL-4459
> URL: https://issues.apache.org/jira/browse/DRILL-4459
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill, Functions - Hive
>Affects Versions: 1.4.0
> Environment: MapR-Drill 1.4.0
> Hive-1.2.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
> Fix For: 1.7.0
>
>
> Getting a SchemaChangeException while querying JSON documents stored in a 
> Hive table.
> {noformat}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castBIT(VAR16CHAR-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
> {noformat}
> Minimal reproduction:
> {noformat}
> Created sample JSON documents using the attached script (randomdata.sh):
> hive>create table simplejson(json string);
> hive>load data local inpath '/tmp/simple.json' into table simplejson;
> Now query it through Drill.
> Drill Version
> select * from sys.version;
> +---++-+-++
> | commit_id | commit_message | commit_time | build_email | build_time |
> +---++-+-++
> | eafe0a245a0d4c0234bfbead10c6b2d7c8ef413d | DRILL-3901:  Don't do early 
> expansion of directory in the non-metadata-cache case because it already 
> happens during ParquetGroupScan's metadata gathering operation. | 07.10.2015 
> @ 17:12:57 UTC | Unknown | 07.10.2015 @ 17:36:16 UTC |
> +---++-+-++
> 0: jdbc:drill:zk=> select * from hive.`default`.simplejson where 
> GET_JSON_OBJECT(simplejson.json, '$.DocId') = 'DocId2759947' limit 1;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castBIT(VAR16CHAR-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 1:1
> [Error Id: 74f054a8-6f1d-4ddd-9064-3939fcc82647 on ip-10-0-0-233:31010] 
> (state=,code=0)
> {noformat}





[jira] [Updated] (DRILL-4484) NPE when querying empty directory

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4484:
-
Reviewer: Krystal

> NPE when querying  empty directory 
> ---
>
> Key: DRILL-4484
> URL: https://issues.apache.org/jira/browse/DRILL-4484
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.5.0
>Reporter: Victoria Markman
>Assignee: Deneche A. Hakim
> Fix For: 1.7.0
>
>
> {code}
> 0: jdbc:drill:drillbit=localhost> select count(*) from 
> dfs.`/drill/xyz/201604*`;
> Error: VALIDATION ERROR: null
> SQL Query null
> 0: jdbc:drill:drillbit=localhost> select count(*) from 
> dfs.`/drill/xyz/20160401`;
> Error: VALIDATION ERROR: null
> SQL Query null
> [Error Id: 87366a2d-fc90-42f3-a076-aed5efdd27cb on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> 0: jdbc:drill:drillbit=localhost> select count(*) from 
> dfs.`/drill/xyz/20160401/`;
> Error: VALIDATION ERROR: null
> SQL Query null
> [Error Id: ac122243-488e-4fb8-b89f-dc01c7e5c63a on atsqa4-133.qa.lab:31010] 
> (state=,code=0)
> {code}
> {code}
> [Mon Mar 07 15:00:19 root@/drill/xyz ] # ls -lR
> .:
> total 5
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> drwxr-xr-x 2 root root   2 Feb 26 16:31 20160101
> drwxr-xr-x 2 root root   2 Feb 26 16:31 20160102
> drwxr-xr-x 2 root root   2 Feb 26 16:31 20160103
> drwxr-xr-x 2 root root   2 Feb 26 16:31 20160104
> drwxr-xr-x 2 root root   2 Feb 26 16:31 20160105
> drwxr-xr-x 2 root root   1 Feb 26 16:31 20160201
> drwxr-xr-x 2 root root   3 Feb 26 16:31 20160202
> drwxr-xr-x 2 root root   4 Feb 26 16:31 20160301
> drwxr-xr-x 2 root root   0 Feb 26 16:31 20160401
> ./20160101:
> total 1
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> ./20160102:
> total 1
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> ./20160103:
> total 1
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> ./20160104:
> total 1
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> ./20160105:
> total 1
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> ./20160201:
> total 0
> ./20160202:
> total 1
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> -rw-r--r-- 1 root root 395 Feb 26 16:31 1_0_0.parquet
> ./20160301:
> total 2
> -rw-r--r-- 1 root root 395 Feb 26 16:31 0_0_0.parquet
> -rw-r--r-- 1 root root 395 Feb 26 16:31 1_0_0.parquet
> -rw-r--r-- 1 root root 395 Feb 26 16:31 2_0_0.parquet
> ./20160401:
> total 0
> {code}
> Hakim's analysis:
> {code}
> More details about the NPE; actually it's an IllegalArgumentException: what 
> happens is that during planning no file meets the wildcard selection, and the 
> query should fail during planning with a "Table not found" message; instead 
> execution starts and the scanners fail because no files were assigned to them
> {code}
> Drill version:
> {code}
> #Generated by Git-Commit-Id-Plugin
> #Mon Mar 07 19:38:24 UTC 2016
> git.commit.id.abbrev=a2fec78
> git.commit.user.email=adene...@gmail.com
> git.commit.message.full=DRILL-4457\: Difference in results returned by window 
> function over BIGINT data\n\nthis closes \#410\n
> git.commit.id=a2fec78695df979e240231cb9d32c7f18274a333
> git.commit.message.short=DRILL-4457\: Difference in results returned by 
> window function over BIGINT data
> git.commit.user.name=adeneche
> git.build.user.name=Unknown
> git.commit.id.describe=0.9.0-625-ga2fec78-dirty
> git.build.user.email=Unknown
> git.branch=master
> git.commit.time=07.03.2016 @ 17\:38\:42 UTC
> git.build.time=07.03.2016 @ 19\:38\:24 UTC
> git.remote.origin.url=https\://github.com/apache/drill
> {code}





[jira] [Updated] (DRILL-4376) Wrong results when doing a count(*) on part of directories with metadata cache

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4376:
-
Reviewer: Rahul Challapalli

> Wrong results when doing a count(*) on part of directories with metadata cache
> --
>
> Key: DRILL-4376
> URL: https://issues.apache.org/jira/browse/DRILL-4376
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.4.0
>Reporter: Deneche A. Hakim
>Assignee: Deneche A. Hakim
>Priority: Critical
> Fix For: 1.7.0
>
>
> First create some parquet tables in multiple subfolders:
> {noformat}
> create table dfs.tmp.`test/201501` as select employee_id, full_name from 
> cp.`employee.json` limit 2;
> create table dfs.tmp.`test/201502` as select employee_id, full_name from 
> cp.`employee.json` limit 2;
> create table dfs.tmp.`test/201601` as select employee_id, full_name from 
> cp.`employee.json` limit 2;
> create table dfs.tmp.`test/201602` as select employee_id, full_name from 
> cp.`employee.json` limit 2;
> {noformat}
> Running the following query gives the expected count:
> {noformat}
> select count(*) from dfs.tmp.`test/20160*`;
> +-+
> | EXPR$0  |
> +-+
> | 4   |
> +-+
> {noformat}
> But once you create the metadata cache files, the query no longer returns the 
> correct results:
> {noformat}
> refresh table metadata dfs.tmp.`test`;
> select count(*) from dfs.tmp.`test/20160*`;
> +-+
> | EXPR$0  |
> +-+
> | 2   |
> +-+
> {noformat}





[jira] [Updated] (DRILL-3743) query hangs on sqlline once Drillbit on foreman node is killed

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-3743:
-
Reviewer: Khurram Faraaz

> query hangs on sqlline once Drillbit on foreman node is killed
> --
>
> Key: DRILL-3743
> URL: https://issues.apache.org/jira/browse/DRILL-3743
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster CentOS
>Reporter: Khurram Faraaz
>Assignee: Sudheesh Katkam
>Priority: Critical
> Fix For: 1.7.0
>
>
> sqlline/query hangs once the Drillbit (on the Foreman node) is killed (kill -9 )
> The query was issued from the Foreman node. The query returns many records, 
> and it is a long-running query.
> Steps to reproduce the problem.
> set planner.slice_target=1
> 1.  clush -g khurram service mapr-warden stop
> 2.  clush -g khurram service mapr-warden start
> 3.  ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 200;
> 4.  Immediately from another console, do a jps and kill the Drillbit process 
> (in this case the foreman) while the query is running in sqlline. You will 
> notice that sqlline just hangs; we do not see any exceptions or errors 
> reported at the sqlline prompt or in drillbit.log or drillbit.out.
> I do see this exception in sqlline.log on the node from where sqlline was 
> started:
> {code}
> 2015-09-04 18:45:12,069 [Client-1] INFO  o.a.d.e.rpc.user.QueryResultHandler 
> - User Error Occurred
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: 
> Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) 
> closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-09-04 18:45:12,069 [Client-1] INFO  
> o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#7] Query failed:
> org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: 
> Connection /10.10.100.201:53425 <--> /10.10.100.201:31010 (user client) 
> closed unexpectedly.
> [Error Id: ec316cfd-c9a5-4905-98e3-da20cb799ba5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
>  ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler$SubmissionListener$ChannelClosedListener.operationComplete(QueryResultHandler.java:298)
>  [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
> at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:254) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 

[jira] [Updated] (DRILL-4523) Disallow using loopback address in distributed mode

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4523:
-
Reviewer: Krystal

> Disallow using loopback address in distributed mode
> ---
>
> Key: DRILL-4523
> URL: https://issues.apache.org/jira/browse/DRILL-4523
> Project: Apache Drill
>  Issue Type: Improvement
>  Components:  Server
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
> Fix For: 1.7.0
>
>
> If we enable debug for org.apache.drill.exec.coord.zk in logback.xml, we only 
> get the hostname and ports information. For example:
> {code}
> 2015-11-04 19:47:02,927 [ServiceCache-0] DEBUG 
> o.a.d.e.c.zk.ZKClusterCoordinator - Cache changed, updating.
> 2015-11-04 19:47:02,932 [ServiceCache-0] DEBUG 
> o.a.d.e.c.zk.ZKClusterCoordinator - Active drillbit set changed.  Now 
> includes 2 total bits.  New active drillbits:
>  h3.poc.com:31010:31011:31012
>  h2.poc.com:31010:31011:31012
> {code}
> We need to know the IP address of each hostname to do further troubleshooting.
> If any drillbit registers itself as "localhost.localdomain" in 
> ZooKeeper, we will never know where it came from. Enabling IP address 
> tracking would help in this case.
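
A sketch of the kind of check the improvement title implies (hypothetical, not 
the actual Drill patch): when running in distributed mode, refuse to advertise 
an address that resolves to loopback.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch: refuse to register a loopback address with ZooKeeper
// when running in distributed mode, so a drillbit never advertises itself
// as localhost/127.0.0.1 to the rest of the cluster.
public class AddressCheck {
    static void validateForDistributedMode(String host) throws UnknownHostException {
        InetAddress addr = InetAddress.getByName(host);
        if (addr.isLoopbackAddress()) {
            throw new IllegalStateException(
                "Drillbit is configured to use loopback address " + host
                + "; this is not allowed in distributed mode");
        }
    }
}
```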





[jira] [Updated] (DRILL-4317) Exceptions on SELECT and CTAS with large CSV files

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-4317:
-
Reviewer: Krystal

> Exceptions on SELECT and CTAS with large CSV files
> --
>
> Key: DRILL-4317
> URL: https://issues.apache.org/jira/browse/DRILL-4317
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.4.0, 1.5.0, 1.6.0
> Environment: 4 node cluster, Hadoop 2.7.0, 14.04.1-Ubuntu
>Reporter: Matt Keranen
>Assignee: Deneche A. Hakim
> Fix For: 1.7.0
>
>
> Selecting from a CSV file or running a CTAS into Parquet generates exceptions.
> Source file is ~650MB, a table of 4 key columns followed by 39 numeric data 
> columns, otherwise a fairly simple format. Example:
> {noformat}
> 2015-10-17 
> 00:00,f5e9v8u2,err,fr7,226020793,76.094,26307,226020793,76.094,26307,
> 2015-10-17 
> 00:00,c3f9x5z2,err,mi1,1339159295,216.004,177690,1339159295,216.004,177690,
> 2015-10-17 
> 00:00,r5z2f2i9,err,mi1,7159994629,39718.011,65793,6142021303,30687.811,64630,143777403,40.521,146,75503742,41.905,89,170771174,168.165,198,192565529,370.475,222,97577280,318.068,120,62631452,288.253,68,32371173,189.527,39,41712265,299.184,46,39046408,363.418,47,34182318,465.343,43,127834582,6485.341,145
> 2015-10-17 
> 00:00,j9s6i8t2,err,fr7,20580443899,277445.055,67826,2814893469,85447.816,54275,2584757097,608.001,2044,1395571268,769.113,1051,3070616988,3000.005,2284,3413811671,6489.060,2569,1772235156,5806.214,1339,1097879284,5064.120,858,691884865,4035.397,511,672967845,4815.875,518,789163614,7306.684,599,813910495,10632.464,627,1462752147,143470.306,1151
> {noformat}
> A "SELECT from `/path/to/file.csv`" runs for 10's of minutes and eventually 
> results in:
> {noformat}
> java.lang.IndexOutOfBoundsException: index: 547681, length: 1 (expected: 
> range(0, 547681))
> at 
> io.netty.buffer.AbstractByteBuf.checkIndex(AbstractByteBuf.java:1134)
> at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.getBytes(PooledUnsafeDirectByteBuf.java:136)
> at io.netty.buffer.WrappedByteBuf.getBytes(WrappedByteBuf.java:289)
> at 
> io.netty.buffer.UnsafeDirectLittleEndian.getBytes(UnsafeDirectLittleEndian.java:26)
> at io.netty.buffer.DrillBuf.getBytes(DrillBuf.java:586)
> at io.netty.buffer.DrillBuf.getBytes(DrillBuf.java:586)
> at io.netty.buffer.DrillBuf.getBytes(DrillBuf.java:586)
> at io.netty.buffer.DrillBuf.getBytes(DrillBuf.java:586)
> at 
> org.apache.drill.exec.vector.VarCharVector$Accessor.get(VarCharVector.java:443)
> at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getBytes(VarCharAccessor.java:125)
> at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getString(VarCharAccessor.java:146)
> at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getObject(VarCharAccessor.java:136)
> at 
> org.apache.drill.exec.vector.accessor.VarCharAccessor.getObject(VarCharAccessor.java:94)
> at 
> org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getObject(BoundCheckingAccessor.java:148)
> at 
> org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getObject(TypeConvertingSqlAccessor.java:795)
> at 
> org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:179)
> at 
> net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
> at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.getObject(DrillResultSetImpl.java:420)
> at sqlline.Rows$Row.<init>(Rows.java:157)
> at sqlline.IncrementalRows.hasNext(IncrementalRows.java:63)
> at 
> sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
> at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
> at sqlline.SqlLine.print(SqlLine.java:1593)
> at sqlline.Commands.execute(Commands.java:852)
> at sqlline.Commands.sql(Commands.java:751)
> at sqlline.SqlLine.dispatch(SqlLine.java:746)
> at sqlline.SqlLine.begin(SqlLine.java:621)
> at sqlline.SqlLine.start(SqlLine.java:375)
> at sqlline.SqlLine.main(SqlLine.java:268)
> {noformat}
> A CTAS on the same file with storage as Parquet results in:
> {noformat}
> Error: SYSTEM ERROR: IllegalArgumentException: length: -260 (expected: >= 0)
> Fragment 1:2
> [Error Id: 1807615e-4385-4f85-8402-5900aaa568e9 on es07:31010]
>   (java.lang.IllegalArgumentException) length: -260 (expected: >= 0)
> io.netty.buffer.AbstractByteBuf.checkIndex():1131
> io.netty.buffer.PooledUnsafeDirectByteBuf.nioBuffer():344
> io.netty.buffer.WrappedByteBuf.nioBuffer():727
> 

[jira] [Updated] (DRILL-3714) Query runs out of memory and remains in CANCELLATION_REQUESTED state until drillbit is restarted

2016-04-29 Thread Suresh Ollala (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Ollala updated DRILL-3714:
-
Reviewer: Khurram Faraaz

> Query runs out of memory and remains in CANCELLATION_REQUESTED state until 
> drillbit is restarted
> 
>
> Key: DRILL-3714
> URL: https://issues.apache.org/jira/browse/DRILL-3714
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Victoria Markman
>Assignee: Jacques Nadeau
>Priority: Critical
> Fix For: 1.7.0
>
> Attachments: Screen Shot 2015-08-26 at 10.36.33 AM.png, drillbit.log, 
> jstack.txt, query_profile_2a2210a7-7a78-c774-d54c-c863d0b77bb0.json
>
>
> This is a variation of DRILL-3705, the difference being Drill's behavior 
> when hitting the OOM condition.
> The query runs out of memory during execution and remains in 
> "CANCELLATION_REQUESTED" state until the drillbit is bounced.
> The client (sqlline in this case) never gets a response from the server.
> Reproduction details:
> Single node drillbit installation.
> DRILL_MAX_DIRECT_MEMORY="8G"
> DRILL_HEAP="4G"
> Run this query on TPCDS SF100 data set
> {code}
> SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) AS 
> TotalSpend FROM store_sales ss WHERE ss.ss_store_sk IS NOT NULL ORDER BY 1 
> LIMIT 10;
> {code}
> drillbit.log
> {code}
> 2015-08-26 16:54:58,469 [2a2210a7-7a78-c774-d54c-c863d0b77bb0:frag:3:22] INFO 
>  o.a.d.e.w.f.FragmentStatusReporter - 
> 2a2210a7-7a78-c774-d54c-c863d0b77bb0:3:22: State to report: RUNNING
> 2015-08-26 16:55:50,498 [BitServer-5] WARN  
> o.a.drill.exec.rpc.data.DataServer - Message of mode REQUEST of rpc type 3 
> took longer than 500ms.  Actual duration was 2569ms.
> 2015-08-26 16:56:31,086 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.88.133:31012 <--> /10.10.88.133:54554 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct 
> buffer memory
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618)
>  [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) 
> [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
> at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
> at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.7.0_71]
> at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
> ~[na:1.7.0_71]
> at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
> ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) 
> 

[jira] [Commented] (DRILL-4132) Ability to submit simple type of physical plan directly to EndPoint DrillBit for execution

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264557#comment-15264557
 ] 

ASF GitHub Bot commented on DRILL-4132:
---

Github user yufeldman commented on the pull request:

https://github.com/apache/drill/pull/368#issuecomment-215850679
  
Addressed review comments


> Ability to submit simple type of physical plan directly to EndPoint DrillBit 
> for execution
> --
>
> Key: DRILL-4132
> URL: https://issues.apache.org/jira/browse/DRILL-4132
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Execution - Flow, Execution - RPC, Query Planning & 
> Optimization
>Reporter: Yuliya Feldman
>Assignee: Yuliya Feldman
>
> Today Drill query execution is optimistic and stateful (at least due to data 
> exchanges): if any stage of query execution fails, the whole query fails. 
> If the query is just a simple scan, filter pushdown, and project, where no 
> data exchange happens between DrillBits, there is no need to fail the whole 
> query when one DrillBit fails, since the minor fragments running on that 
> DrillBit can be rerun on another DrillBit. There are probably multiple ways 
> to achieve this. This JIRA is to open discussion on: 
> 1. agreement that we need to support the above use case 
> 2. the means of achieving it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4132) Ability to submit simple type of physical plan directly to EndPoint DrillBit for execution

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264493#comment-15264493
 ] 

ASF GitHub Bot commented on DRILL-4132:
---

Github user yufeldman commented on a diff in the pull request:

https://github.com/apache/drill/pull/368#discussion_r61624612
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/client/DrillClient.java ---
@@ -321,6 +329,38 @@ public void close() {
 return listener.getResults();
   }
 
+  public DrillRpcFuture<QueryPlanFragments> planQuery(QueryType type, 
String query, boolean isSplitPlan) {
+GetQueryPlanFragments runQuery = 
GetQueryPlanFragments.newBuilder().setQuery(query).setType(type).setSplitPlan(isSplitPlan).build();
+return client.submitPlanQuery(runQuery);
+  }
+
+  public void runQuery(QueryType type, List<PlanFragment> planFragments, 
UserResultsListener resultsListener)
+  throws RpcException {
+// QueryType can be only executional
+checkArgument((QueryType.EXECUTION == type), "Only EXECUTIONAL type 
query is supported with PlanFragments");
+// setting Plan on RunQuery will be used for logging purposes and 
therefore can not be null
+// since there is no Plan string provided we will create a JsonArray 
out of individual fragment Plans
+ArrayNode jsonArray = objectMapper.createArrayNode();
+for (PlanFragment fragment : planFragments) {
+  try {
+jsonArray.add(objectMapper.readTree(fragment.getFragmentJson()));
+  } catch (IOException e) {
+logger.error("Exception while trying to read PlanFragment JSON for 
%s", fragment.getHandle().getQueryId(), e);
+throw new RpcException(e);
+  }
+}
+final String fragmentsToJsonString;
+try {
+  fragmentsToJsonString = objectMapper.writeValueAsString(jsonArray);
--- End diff --

It is not that easy, as on DrillClient there is no knowledge of 
DrillContext and subsequently PhysicalPlanReader. 
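The logging-plan assembly in the diff above can be sketched with stdlib types only. This is an illustrative stand-in, not the patch's code: the actual change uses Jackson's ObjectMapper/ArrayNode (which also validates each fragment's JSON via readTree), and the class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Sketch of the step that merges per-fragment plan JSON documents into a
// single JSON array string, since RunQuery's plan field (used for logging)
// cannot be null when only individual PlanFragments are provided.
public class FragmentPlanJson {

  // Join already-valid JSON documents into one JSON array. The real code
  // parses each fragment with Jackson and so also validates it; this
  // stdlib-only version assumes the inputs are well-formed JSON.
  static String toJsonArray(List<String> fragmentJsons) {
    StringJoiner joiner = new StringJoiner(",", "[", "]");
    for (String json : fragmentJsons) {
      joiner.add(json);
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    System.out.println(toJsonArray(Arrays.asList(
        "{\"pop\":\"fs-scan\"}", "{\"pop\":\"project\"}")));
    // prints [{"pop":"fs-scan"},{"pop":"project"}]
  }
}
```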


> Ability to submit simple type of physical plan directly to EndPoint DrillBit 
> for execution
> --
>
> Key: DRILL-4132
> URL: https://issues.apache.org/jira/browse/DRILL-4132
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Execution - Flow, Execution - RPC, Query Planning & 
> Optimization
>Reporter: Yuliya Feldman
>Assignee: Yuliya Feldman
>
> Today Drill query execution is optimistic and stateful (at least due to data 
> exchanges): if any stage of query execution fails, the whole query fails. 
> If the query is just a simple scan, filter pushdown, and project, where no 
> data exchange happens between DrillBits, there is no need to fail the whole 
> query when one DrillBit fails, since the minor fragments running on that 
> DrillBit can be rerun on another DrillBit. There are probably multiple ways 
> to achieve this. This JIRA is to open discussion on: 
> 1. agreement that we need to support the above use case 
> 2. the means of achieving it.





[jira] [Commented] (DRILL-4577) Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in

2016-04-29 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264337#comment-15264337
 ] 

Sean Hsuan-Yi Chu commented on DRILL-4577:
--

[~vkorukanti],
Here are a few points. Can you check whether they make sense before the code 
is checked in?
1. [Perf-wise] Without bulk loading, the performance is simply not acceptable.
2. [Protection by adding an option] By default, users would see the behavior 
they are used to.
3. [Comparison with other systems] I am not sure that showing just the table 
names is very harmful. For instance, when you type "show tables" in Hive, 
Hive returns all the table names, regardless of permissions.

> Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in
> ---
>
> Key: DRILL-4577
> URL: https://issues.apache.org/jira/browse/DRILL-4577
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.7.0
>
>
> A query such as 
> {code}
> select * from INFORMATION_SCHEMA.`TABLES` 
> {code}
> is converted into calls that fetch all tables from the storage plugins. 
> When users have Hive, the calls to the Hive metadata store would be: 
> 1) get_table
> 2) get_partitions
> However, the information regarding partitions is not used in this type of 
> query. Besides, a more efficient way to fetch the tables is to use the 
> get_multi_table call.
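The saving described above can be sketched with a toy in-memory metastore (FakeMetastore and its methods are hypothetical stand-ins for the Hive metastore client, not Drill or Hive APIs): fetching N tables via get_table plus get_partitions costs 2N round-trips, while a single get_multi_table-style bulk call costs one.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy metastore: every method call counts as one client/metastore round-trip.
class FakeMetastore {
  int roundTrips = 0;
  private final Map<String, String> tables = new LinkedHashMap<>();

  FakeMetastore(String... names) {
    for (String n : names) {
      tables.put(n, "schema-of-" + n);
    }
  }

  String getTable(String name) { roundTrips++; return tables.get(name); }

  List<String> getPartitions(String name) { roundTrips++; return Collections.emptyList(); }

  // One bulk call returning all requested tables at once.
  Map<String, String> getMultiTable(Collection<String> names) {
    roundTrips++;
    Map<String, String> out = new LinkedHashMap<>();
    for (String n : names) {
      out.put(n, tables.get(n));
    }
    return out;
  }

  Set<String> tableNames() { return tables.keySet(); }
}

class InfoSchemaFetch {
  // Old path: get_table + get_partitions per table, even though partitions
  // are never used by an INFORMATION_SCHEMA.`TABLES` query.
  static int fetchPerTable(FakeMetastore ms) {
    for (String n : new ArrayList<>(ms.tableNames())) {
      ms.getTable(n);
      ms.getPartitions(n);
    }
    return ms.roundTrips;
  }

  // New path: one bulk call, no partition fetches.
  static int fetchBulk(FakeMetastore ms) {
    ms.getMultiTable(ms.tableNames());
    return ms.roundTrips;
  }

  public static void main(String[] args) {
    System.out.println(fetchPerTable(new FakeMetastore("a", "b", "c"))); // 6
    System.out.println(fetchBulk(new FakeMetastore("a", "b", "c")));     // 1
  }
}
```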





[jira] [Commented] (DRILL-4573) Zero copy LIKE, REGEXP_MATCHES, SUBSTR

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264313#comment-15264313
 ] 

ASF GitHub Bot commented on DRILL-4573:
---

Github user hsuanyi commented on the pull request:

https://github.com/apache/drill/pull/458#issuecomment-215801663
  
Hi @jcmcote,
I see. But could you do us a favor? Regexp_replace()'s behavior changes after 
your patch:
https://issues.apache.org/jira/browse/DRILL-4645

Can you help fix this? And would you mind merging the missing piece into 
the new patch?


> Zero copy LIKE, REGEXP_MATCHES, SUBSTR
> --
>
> Key: DRILL-4573
> URL: https://issues.apache.org/jira/browse/DRILL-4573
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: jean-claude
>Priority: Minor
> Fix For: 1.7.0
>
> Attachments: DRILL-4573.patch.txt
>
>
> All the functions using java.util.regex.Matcher currently create Java String 
> objects to pass into matcher.reset().
> However, this makes an unnecessary copy of the bytes plus a Java String 
> object. The matcher accepts a CharSequence, so instead of making a copy we 
> can create an adapter from the DrillBuf to the CharSequence interface.
> Gains of 25% in execution speed are possible when scanning VARCHARs of 36 
> chars; the gain is proportional to the size of the VARCHAR.
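A minimal sketch of such an adapter, using java.nio.ByteBuffer in place of Drill's DrillBuf (ByteBufCharSequence is a hypothetical name; this version handles single-byte characters only, while a real adapter must also decode multi-byte UTF-8):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// CharSequence view over a ByteBuffer, so java.util.regex.Matcher can scan
// the bytes in place without first copying them into a java.lang.String.
final class ByteBufCharSequence implements CharSequence {
  private final ByteBuffer buf;
  private final int start;
  private final int end;

  ByteBufCharSequence(ByteBuffer buf, int start, int end) {
    this.buf = buf;
    this.start = start;
    this.end = end;
  }

  @Override public int length() { return end - start; }

  // Single-byte read: correct for ASCII/latin-1; UTF-8 needs real decoding.
  @Override public char charAt(int index) {
    return (char) (buf.get(start + index) & 0xFF);
  }

  @Override public CharSequence subSequence(int from, int to) {
    return new ByteBufCharSequence(buf, start + from, start + to);
  }

  @Override public String toString() {
    StringBuilder sb = new StringBuilder(length());
    for (int i = 0; i < length(); i++) {
      sb.append(charAt(i));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    ByteBuffer b = ByteBuffer.wrap("hello drill".getBytes(StandardCharsets.US_ASCII));
    Matcher m = Pattern.compile("dri.l").matcher(new ByteBufCharSequence(b, 0, b.limit()));
    System.out.println(m.find()); // true
  }
}
```

In a per-row loop the wrapper would be reset to new offsets rather than reallocated, which is where the copy-avoidance savings come from.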





[jira] [Created] (DRILL-4646) Make Hive impersonation work consistently under two modes for INFORMATION_SCHEMA query

2016-04-29 Thread Jinfeng Ni (JIRA)
Jinfeng Ni created DRILL-4646:
-

 Summary: Make Hive impersonation work consistently under two modes 
for INFORMATION_SCHEMA query
 Key: DRILL-4646
 URL: https://issues.apache.org/jira/browse/DRILL-4646
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jinfeng Ni


Per the discussion of DRILL-4577, for INFORMATION_SCHEMA query 

{code}
select TABLE_CATALOG, TABLE_SCHEMA , TABLE_NAME, TABLE_TYPE from
INFORMATION_SCHEMA.`TABLES` 
...
{code}

Drill's current behavior under storage-based and SQL-standard impersonation 
differs: 
1. Under storage-based mode, the above query returns only the tables the user 
has access to.
2. Under SQL-standard mode, the above query also returns tables the user does 
not have access to.

Either Drill should correct the behavior of SQL-standard mode, or it should do 
something to make the two modes behave consistently.







[jira] [Comment Edited] (DRILL-3474) Add implicit file columns support

2016-04-29 Thread Arina Ielchiieva (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264048#comment-15264048
 ] 

Arina Ielchiieva edited comment on DRILL-3474 at 4/29/16 1:31 PM:
--

The implementation will add support for four implicit file columns: filename, 
suffix, fqn, dirname.
They will be available when querying a file or directory, if referenced 
explicitly.


was (Author: arina):
Implementation will add support to four implicit file columns: filename, 
suffix, dfqn, dirname.
They will be available during querying file or directory if called explicitly.

> Add implicit file columns support
> -
>
> Key: DRILL-3474
> URL: https://issues.apache.org/jira/browse/DRILL-3474
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.1.0
>Reporter: Jim Scott
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> I could not find another ticket which talks about this ...
> The file name should be a column which can be selected or filtered when 
> querying a directory, just as dir0 and dir1 are.





[jira] [Commented] (DRILL-3474) Add implicit file columns support

2016-04-29 Thread Arina Ielchiieva (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264048#comment-15264048
 ] 

Arina Ielchiieva commented on DRILL-3474:
-

The implementation will add support for four implicit file columns: filename, 
suffix, dfqn, dirname.
They will be available when querying a file or directory, if referenced 
explicitly.

> Add implicit file columns support
> -
>
> Key: DRILL-3474
> URL: https://issues.apache.org/jira/browse/DRILL-3474
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.1.0
>Reporter: Jim Scott
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> I could not find another ticket which talks about this ...
> The file name should be a column which can be selected or filtered when 
> querying a directory, just as dir0 and dir1 are.





[jira] [Updated] (DRILL-3474) Add implicit file columns support

2016-04-29 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-3474:

Summary: Add implicit file columns support  (was: Filename should be an 
available column when querying a directory)

> Add implicit file columns support
> -
>
> Key: DRILL-3474
> URL: https://issues.apache.org/jira/browse/DRILL-3474
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.1.0
>Reporter: Jim Scott
>Assignee: Arina Ielchiieva
> Fix For: Future
>
>
> I could not find another ticket which talks about this ...
> The file name should be a column which can be selected or filtered when 
> querying a directory, just as dir0 and dir1 are.


