[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-23 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552425#comment-16552425
 ] 

Junjie Chen commented on HIVE-17593:


[~Ferd], can we merge this now?

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, 
> HIVE-17593.4.patch, HIVE-17593.5.patch, HIVE-17593.patch
>
>
> DataWritableWriter strips trailing spaces from CHAR values before writing, 
> but predicate generation does NOT do the same stripping, which should cause 
> missing data!
> In the current version this does not cause missing data, because the 
> predicate is not properly pushed down to Parquet due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING as 
> the same, which builds a predicate with trailing spaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-15 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544817#comment-16544817
 ] 

Junjie Chen commented on HIVE-17593:


[~Ferd], all unit tests passed; the checkstyle issue comes from the original 
code. Could you please have a look?

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, 
> HIVE-17593.4.patch, HIVE-17593.5.patch, HIVE-17593.patch
>
>


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-14 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Attachment: HIVE-17593.5.patch

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, 
> HIVE-17593.4.patch, HIVE-17593.5.patch, HIVE-17593.patch
>
>


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-12 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Attachment: HIVE-17593.4.patch

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, 
> HIVE-17593.4.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-12 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542455#comment-16542455
 ] 

Junjie Chen commented on HIVE-17593:


The previous unit test failure (vectorized_parquet_types.q) occurred because 
different length UDFs are used for CHAR.

In non-vectorized mode, GenericUDFLength calculates the length of the column. 
It converts the primitive value to a string using 
PrimitiveObjectInspectorUtils.getString, which ignores trailing spaces for the 
CHAR type.
In vectorized mode, however, StringLength calculates the length of the column. 
It treats the column as a byte array and does not consider the column type.
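
The difference between the two length paths can be sketched as follows. This 
is only an illustration under stated assumptions: the class and method names 
(CharLengthModes, typeAwareLength, rawLength) are hypothetical and are not the 
Hive UDF implementations; they only mimic the behaviour described above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; class and method names are illustrative and are not Hive APIs.
final class CharLengthModes {

    // Type-aware path: ignore trailing spaces before counting characters,
    // as the non-vectorized path is described to do for CHAR.
    static int typeAwareLength(String charValue) {
        int end = charValue.length();
        while (end > 0 && charValue.charAt(end - 1) == ' ') {
            end--;
        }
        return charValue.codePointCount(0, end);
    }

    // Byte-oriented path: count UTF-8 characters in the raw bytes without
    // looking at the column type, so padding spaces are counted too.
    static int rawLength(byte[] utf8) {
        int count = 0;
        for (byte b : utf8) {
            if ((b & 0xC0) != 0x80) { // skip UTF-8 continuation bytes
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String stored = "abc  "; // a CHAR(5) value padded with two spaces
        System.out.println(typeAwareLength(stored)); // prints 3
        System.out.println(rawLength(stored.getBytes(StandardCharsets.UTF_8))); // prints 5
    }
}
```

For a padded CHAR(5) value the two paths disagree (3 versus 5), which matches 
the kind of difference seen in the qtest output.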

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-05 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533302#comment-16533302
 ] 

Junjie Chen commented on HIVE-17593:


[~Ferd], I haven't run the full unit tests locally yet, so let me delete the 
patch first, since it would trigger the Hive build test.

As for HIVE-17261, it depends on this issue.




> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-05 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Attachment: (was: HIVE-17593.4.patch)

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-04 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Attachment: HIVE-17593.4.patch

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, 
> HIVE-17593.4.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-04 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533260#comment-16533260
 ] 

Junjie Chen commented on HIVE-17593:


[~Ferd], I may have misunderstood the definition. As the definition I quoted 
above says, length, comparison, and hashCode should ignore trailing spaces for 
HiveChar, so we should not change every LENGTH(column) to 5 in the qtest 
result. Furthermore, I checked HiveChar conversion in other places, such as 
PrimitiveObjectInspectorConverter.java and PrimitiveObjectInspectorUtils.java 
in the Hive serde2 package; they use the stripped value explicitly. 

So I think the easiest way is to change ConvertAstToSearchArg.java to use the 
stripped value for HiveChar as well.
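
A minimal sketch of that idea, assuming a type-aware boxing step: the names 
below (LiteralBoxing, ColumnKind, boxStringLiteral) are hypothetical and do not 
reproduce the real ConvertAstToSearchArg code. The sketch only shows a literal 
being stripped when the column is CHAR and left untouched otherwise.

```java
// Hypothetical sketch; not the actual Hive search-argument code.
final class LiteralBoxing {

    // Illustrative column kinds; the real code distinguishes many more types.
    enum ColumnKind { STRING, VARCHAR, CHAR }

    // Returns the literal to place in the predicate. For CHAR columns the
    // trailing spaces are removed so the literal matches the stripped values
    // that the writer stores; other kinds are passed through unchanged.
    static String boxStringLiteral(String literal, ColumnKind kind) {
        if (kind == ColumnKind.CHAR) {
            int end = literal.length();
            while (end > 0 && literal.charAt(end - 1) == ' ') {
                end--;
            }
            return literal.substring(0, end);
        }
        return literal;
    }

    public static void main(String[] args) {
        System.out.println("[" + boxStringLiteral("abc  ", ColumnKind.CHAR) + "]");   // prints [abc]
        System.out.println("[" + boxStringLiteral("abc  ", ColumnKind.STRING) + "]"); // prints [abc  ]
    }
}
```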

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-03 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532163#comment-16532163
 ] 

Junjie Chen commented on HIVE-17593:


It's my fault; I will update the LLAP side as well.

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-03 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532155#comment-16532155
 ] 

Junjie Chen commented on HIVE-17593:


Thanks [~Ferd], I think the last test report already covers my latest 
HIVE-17593.3.patch. 

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-03 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532138#comment-16532138
 ] 

Junjie Chen commented on HIVE-17593:


[~Ferd], yes, the previous qtest result used the stripped value for CHAR type 
verification, and I changed it to use the padded value according to the CHAR 
definition. 

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-03 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532092#comment-16532092
 ] 

Junjie Chen commented on HIVE-17593:


The failed tests are not related. 

[~Ferd], could you take a look?


> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-02 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Attachment: HIVE-17593.3.patch

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.3.patch, HIVE-17593.patch
>
>


[jira] [Issue Comment Deleted] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-02 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Comment: was deleted

(was: [~Ferd], I updated the qtest result; do you know why it still fails? 
)

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-02 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530650#comment-16530650
 ] 

Junjie Chen commented on HIVE-17593:


[~Ferd], I updated the qtest result; do you know why it still fails? 


> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.patch
>
>


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-02 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Attachment: HIVE-17593.2.patch

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.2.patch, HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-07-02 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16529513#comment-16529513
 ] 

Junjie Chen commented on HIVE-17593:


vectorized_parquet_types.q failed because of my patch; the qtest.out should 
also be changed according to the type definition:
Char types are similar to Varchar but they are fixed-length meaning that values 
shorter than the specified length value are padded with spaces but trailing 
spaces are not important during comparisons. The maximum length is fixed at 255.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-CharcharChar

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-06-29 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528362#comment-16528362
 ] 

Junjie Chen commented on HIVE-17593:


Thanks [~Ferd] for responding so quickly.

It depends on how HiveChar is defined and used in other places and other 
formats; Hive should use HiveChar consistently. The HiveChar/HiveCharWritable 
definition in HiveChar.java/HiveCharWritable.java is as follows:
/**
 * HiveChar.
 * String values will be padded to full char length.
 * Character legnth, comparison, hashCode should ignore trailing spaces.
 */

From this we know that the original value of a HiveChar should include the 
padding spaces, so boxLiteral in ConvertAstToSearchArg.java returns the padded 
value.
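
The semantics in that Javadoc can be sketched with a small value class. This is 
a simplified illustration, not Hive's HiveChar: the class name PaddedChar and 
its methods are hypothetical, and truncation of over-long input is an 
assumption made here to keep the sketch self-contained.

```java
// Hypothetical sketch of the documented semantics; not Hive's HiveChar class.
final class PaddedChar {
    private final String padded;

    // Stores the value padded with spaces to maxLength (assumption: longer
    // input is truncated to maxLength).
    PaddedChar(String value, int maxLength) {
        String truncated = value.length() > maxLength ? value.substring(0, maxLength) : value;
        StringBuilder sb = new StringBuilder(truncated);
        while (sb.length() < maxLength) {
            sb.append(' ');
        }
        this.padded = sb.toString();
    }

    // The stored value, including padding spaces.
    String paddedValue() {
        return padded;
    }

    // The value with trailing spaces removed.
    String strippedValue() {
        int end = padded.length();
        while (end > 0 && padded.charAt(end - 1) == ' ') {
            end--;
        }
        return padded.substring(0, end);
    }

    // Length, comparison, and hash code all ignore trailing spaces.
    int characterLength() {
        return strippedValue().length();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof PaddedChar
                && strippedValue().equals(((PaddedChar) other).strippedValue());
    }

    @Override
    public int hashCode() {
        return strippedValue().hashCode();
    }

    public static void main(String[] args) {
        PaddedChar a = new PaddedChar("abc", 5);
        PaddedChar b = new PaddedChar("abc", 10);
        System.out.println("[" + a.paddedValue() + "]"); // prints [abc  ]
        System.out.println(a.characterLength());          // prints 3
        System.out.println(a.equals(b));                  // prints true
    }
}
```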


> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.patch
>
>


[jira] [Comment Edited] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-06-29 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527519#comment-16527519
 ] 

Junjie Chen edited comment on HIVE-17593 at 6/29/18 11:55 AM:
--

In ConvertAstToSearchArg.java we can see that Hive uses the padded string of a 
HiveChar as the search argument, while the Parquet DataWritableWriter strips 
the HiveChar's trailing spaces, which causes the search to fail.  Actually, 
Hive should not strip trailing spaces for Parquet, since Parquet can use an 
encoding such as RLE to deal with them. So I updated it to use the padded 
value.

[~Ferd], please take a look at this.


was (Author: junjie):
In ConvertAstToSearchArg.java we can see that Hive uses the padded string of a 
HiveChar as the search argument, while the Parquet DataWritableWriter strips 
the HiveChar's trailing spaces, which causes the search to fail.  Actually, 
Hive should not strip trailing spaces for Parquet, since Parquet can use an 
encoding such as RLE to deal with them. So I updated it to use the padded 
value.

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.patch
>
>


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-06-29 Thread Junjie Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527519#comment-16527519
 ] 

Junjie Chen commented on HIVE-17593:


In ConvertAstToSearchArg.java we can see that Hive uses the padded string of a 
HiveChar as the search argument, while the Parquet DataWritableWriter strips 
the HiveChar's trailing spaces, which causes the search to fail.  Actually, 
Hive should not strip trailing spaces for Parquet, since Parquet can use an 
encoding such as RLE to deal with them. So I updated it to use the padded 
value.
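
The mismatch can be reproduced in miniature. The sketch below is an 
illustration under stated assumptions: CharPushdownMismatch and its methods are 
hypothetical stand-ins for the write path and for an equality predicate 
evaluated against stored values; they are not Hive or Parquet APIs.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the write path and a pushed-down equality predicate.
final class CharPushdownMismatch {

    // Pads a value with spaces to the declared CHAR length; values that are
    // already at least that long are returned unchanged.
    static String pad(String value, int charLength) {
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < charLength) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // Removes trailing spaces, as the writer is described to do.
    static String stripTrailingSpaces(String value) {
        int end = value.length();
        while (end > 0 && value.charAt(end - 1) == ' ') {
            end--;
        }
        return value.substring(0, end);
    }

    // Exact-match filter over the stored values.
    static List<String> filterEquals(List<String> stored, String literal) {
        return stored.stream().filter(literal::equals).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The writer stores the stripped form of a CHAR(5) value.
        List<String> stored = List.of(stripTrailingSpaces(pad("abc", 5)));

        // A padded literal finds nothing; a stripped literal finds the row.
        System.out.println(filterEquals(stored, pad("abc", 5)).size());                      // prints 0
        System.out.println(filterEquals(stored, stripTrailingSpaces(pad("abc", 5))).size()); // prints 1
    }
}
```

Either side can be changed to make the two agree: write the padded value, or 
strip the literal in the predicate.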

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.patch
>
>


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-06-29 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Fix Version/s: 3.1.0
Affects Version/s: 2.3.0
   Attachment: HIVE-17593.patch
 Target Version/s: 3.1.0
   Status: Patch Available  (was: Open)

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-17593.patch
>
>
> DataWritableWriter strip spaces for CHAR type before writing. While when 
> generating predicate, it does NOT do same striping which should cause data 
> missing!
> In current version, it doesn't cause data missing since predicate is not well 
> push down to parquet due to HIVE-17261.
> Please see ConvertAstTosearchArg.java, getTypes treats CHAR and STRING as 
> same which will build a predicate with tail spaces.





[jira] [Assigned] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2018-06-29 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-17593:
--

Assignee: Junjie Chen

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Major
>
> DataWritableWriter strip spaces for CHAR type before writing. While when 
> generating predicate, it does NOT do same striping which should cause data 
> missing!
> In current version, it doesn't cause data missing since predicate is not well 
> push down to parquet due to HIVE-17261.
> Please see ConvertAstTosearchArg.java, getTypes treats CHAR and STRING as 
> same which will build a predicate with tail spaces.





[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2017-09-25 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178697#comment-16178697
 ] 

Junjie Chen commented on HIVE-17593:


Hive strips trailing spaces from char(length) values and then stores the 
result in Parquet. Other Parquet readers may therefore read a stripped value 
that differs from the original.

public void write(Object value) {
  String v = inspector.getPrimitiveJavaObject(value).getStrippedValue();
  recordConsumer.addBinary(Binary.fromString(v));
}

[~Ferd], do you think this is a valid case? Shouldn't it store the real value? 

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Junjie Chen
>
> DataWritableWriter strip spaces for CHAR type before writing. While when 
> generating predicate, it does NOT do same striping which should cause data 
> missing!
> In current version, it doesn't cause data missing since predicate is not well 
> push down to parquet due to HIVE-17261.
> Please see ConvertAstTosearchArg.java, getTypes treats CHAR and STRING as 
> same which will build a predicate with tail spaces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2017-09-24 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17593:
---
Description: 
DataWritableWriter strips trailing spaces from CHAR values before writing, 
but the predicate generator does NOT do the same stripping, which should cause 
rows to go missing!

In the current version this does not cause missing data, because the predicate 
is not properly pushed down to Parquet due to HIVE-17261.

Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING as the 
same, which builds a predicate with trailing spaces.

  was:
DataWritableWriter strip spaces for CHAR type before writing. While when 
generating predicate, it does't do same striping which should cause data 
missing!

ConvertAstTosearchArg.java#getTypes treat CHAR and STRING as same.


> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Junjie Chen
>
> DataWritableWriter strip spaces for CHAR type before writing. While when 
> generating predicate, it does NOT do same striping which should cause data 
> missing!
> In current version, it doesn't cause data missing since predicate is not well 
> push down to parquet due to HIVE-17261.
> Please see ConvertAstTosearchArg.java, getTypes treats CHAR and STRING as 
> same which will build a predicate with tail spaces.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-15 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167454#comment-16167454
 ] 

Junjie Chen commented on HIVE-17261:


I think the length given in CREATE TABLE should specify the maximum length of 
the column. It looks like Hive does not write the cast values to Parquet. 
The following Parquet file dump shows no trailing spaces at the end of the 
values:
c = hello
v = world
d = ACvU
da = 57

c = apple
v = bee
d = AADc
da = 50

c = hello
v = world
d = ACvU
da = 57

c = apple
v = bee
d = AADc
da = 50


> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Fix For: 3.0.0
>
> Attachments: HIVE-17261.10.patch, HIVE-17261.11.patch, 
> HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.4.patch, 
> HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.7.patch, 
> HIVE-17261.8.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Comment Edited] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-15 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167401#comment-16167401
 ] 

Junjie Chen edited comment on HIVE-17261 at 9/15/17 6:33 AM:
-

The following insert statement stores values in Parquet without trailing 
spaces:
insert overwrite table newtypestbl select * from (select cast("apple" as 
char(10)), cast("bee" as varchar(10)), 0.22, cast("1970-02-20" as date) from 
src src1 union all select cast("hello" as char(10)), cast("world" as 
varchar(10)), 11.22, cast("1970-02-27" as date) from src src2 limit 10) 
uniontbl;

However, Hive passes the predicate {noformat}"eq(c, Binary{"apple "})"{noformat} 
to Parquet, so the records are filtered out in RecordReader#nextKeyValue().

So Hive should also remove trailing spaces from the predicate value.


was (Author: junjie):
The insert statement following store values in parquet without tail spaces. 
insert overwrite table newtypestbl select * from (select cast("apple" as 
char(10)), cast("bee" as varchar(10)), 0.22, cast("1970-02-20" as date) from 
src src1 union all select cast("hello" as char(10)), cast("world" as 
varchar(10)), 11.22, cast("1970-02-27" as date) from src src2 limit 10) 
uniontbl;

However hive pass predicate "eq(c, Binary{"apple"})" to parquet, so the 
records are filtered in RecordReader#nextKeyValue().

So hive should also remove spaces in tail for predicate.

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Fix For: 3.0.0
>
> Attachments: HIVE-17261.10.patch, HIVE-17261.11.patch, 
> HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.4.patch, 
> HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.7.patch, 
> HIVE-17261.8.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Comment Edited] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-15 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167401#comment-16167401
 ] 

Junjie Chen edited comment on HIVE-17261 at 9/15/17 6:29 AM:
-

The following insert statement stores values in Parquet without trailing 
spaces:
insert overwrite table newtypestbl select * from (select cast("apple" as 
char(10)), cast("bee" as varchar(10)), 0.22, cast("1970-02-20" as date) from 
src src1 union all select cast("hello" as char(10)), cast("world" as 
varchar(10)), 11.22, cast("1970-02-27" as date) from src src2 limit 10) 
uniontbl;

However, Hive passes the predicate "eq(c, Binary{"apple"})" to Parquet, so the 
records are filtered out in RecordReader#nextKeyValue().

So Hive should also remove trailing spaces from the predicate value.


was (Author: junjie):
The insert statement following store values in parquet without tail spaces. 
insert overwrite table newtypestbl select * from (select cast("apple" as 
char(10)), cast("bee" as varchar(10)), 0.22, cast("1970-02-20" as date) from 
src src1 union all select cast("hello" as char(10)), cast("world" as 
varchar(10)), 11.22, cast("1970-02-27" as date) from src src2 limit 10) 
uniontbl;

However hive pass predicate "eq(c, Binary{"apple "})" to parquet, so the 
records are filtered in RecordReader#nextKeyValue().

So hive should also remove spaces in tail for predicate.

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Fix For: 3.0.0
>
> Attachments: HIVE-17261.10.patch, HIVE-17261.11.patch, 
> HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.4.patch, 
> HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.7.patch, 
> HIVE-17261.8.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-15 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167401#comment-16167401
 ] 

Junjie Chen commented on HIVE-17261:


The following insert statement stores values in Parquet without trailing 
spaces:
insert overwrite table newtypestbl select * from (select cast("apple" as 
char(10)), cast("bee" as varchar(10)), 0.22, cast("1970-02-20" as date) from 
src src1 union all select cast("hello" as char(10)), cast("world" as 
varchar(10)), 11.22, cast("1970-02-27" as date) from src src2 limit 10) 
uniontbl;

However, Hive passes the predicate "eq(c, Binary{"apple "})" to Parquet, so the 
records are filtered out in RecordReader#nextKeyValue().

So Hive should also remove trailing spaces from the predicate value.
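
As a rough illustration of why every record is dropped, the sketch below models an equality predicate applied to the stored values. It is a toy model, not Parquet's filter API: filterEq and stripTrailingSpaces are hypothetical names, and the padded literal assumes the char(10) column from the insert statement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of an eq() predicate applied to stored column values; not Parquet's API.
class EqPredicateSketch {

  // Keep only the stored values that are exactly equal to the literal.
  static List<String> filterEq(List<String> stored, String literal) {
    List<String> kept = new ArrayList<>();
    for (String value : stored) {
      if (value.equals(literal)) {
        kept.add(value);
      }
    }
    return kept;
  }

  // Drop trailing spaces from a predicate literal.
  static String stripTrailingSpaces(String literal) {
    return literal.replaceAll(" +$", "");
  }

  public static void main(String[] args) {
    // Column c as stored in Parquet: no trailing spaces.
    List<String> stored = Arrays.asList("hello", "apple", "hello", "apple");
    String paddedLiteral = "apple     "; // assumed char(10) literal in the predicate

    // Padded literal matches nothing: prints 0.
    System.out.println(filterEq(stored, paddedLiteral).size());
    // Stripped literal matches both "apple" rows: prints 2.
    System.out.println(filterEq(stored, stripTrailingSpaces(paddedLiteral)).size());
  }
}
```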

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Fix For: 3.0.0
>
> Attachments: HIVE-17261.10.patch, HIVE-17261.11.patch, 
> HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.4.patch, 
> HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.7.patch, 
> HIVE-17261.8.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-12 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.11.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.10.patch, HIVE-17261.11.patch, 
> HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.4.patch, 
> HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.7.patch, 
> HIVE-17261.8.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-12 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.10.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.10.patch, HIVE-17261.2.patch, 
> HIVE-17261.3.patch, HIVE-17261.4.patch, HIVE-17261.5.patch, 
> HIVE-17261.6.patch, HIVE-17261.7.patch, HIVE-17261.8.patch, HIVE-17261.diff, 
> HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.8.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, 
> HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, 
> HIVE-17261.7.patch, HIVE-17261.8.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.7.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, 
> HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, 
> HIVE-17261.7.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-10 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.6.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, 
> HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.diff, 
> HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-06 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156286#comment-16156286
 ] 

Junjie Chen commented on HIVE-17261:


Thanks [~Ferd]
As for item 4, jobconf is a member variable, so it does not need to be passed 
explicitly.

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, 
> HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-06 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.5.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, 
> HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-09-05 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.4.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, 
> HIVE-17261.4.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Comment Edited] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-11 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122944#comment-16122944
 ] 

Junjie Chen edited comment on HIVE-17261 at 8/11/17 7:07 AM:
-

[~Ferd], I updated the original unit tests to apply the filter using the new APIs.


was (Author: junjie):
[~Ferd], Updated original unit tests to apply filter from parquet side.

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.diff, 
> HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-11 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122944#comment-16122944
 ] 

Junjie Chen commented on HIVE-17261:


[~Ferd], I updated the original unit tests to apply the filter on the Parquet side.

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.diff, 
> HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.3.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, HIVE-17261.diff, 
> HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: (was: HIVE-17261.3.patch)

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.3.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
> Attachments: HIVE-17261.2.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-10 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122708#comment-16122708
 ] 

Junjie Chen commented on HIVE-17261:


Actually, Hive uses two deprecated Parquet APIs: one is the ParquetInputSplit 
constructor, the other is filterRowGroup. They are deprecated because Parquet 
introduced a new dictionary filter. The key point here is how to leverage both 
the statistics filter and the dictionary filter; the existing code explicitly 
applies only the statistics filter on the Hive side. 

To apply both the statistics and dictionary filters, we can either explicitly 
change the filterRowGroup API call or pass the predicate through the job 
configuration to Parquet and filter on the Parquet side. The patch I provided 
passes the predicate and skips the explicit filtering on the Hive side.
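
As a rough illustration of why applying both filters matters, here is a 
simplified model, not Parquet's actual code. It assumes each row group carries 
min/max statistics and a dictionary of distinct values for one integer column, 
and that the predicate is "column = value". The names RowGroupModel and 
RowGroupFilterSketch are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a row group: min/max statistics plus the column dictionary. */
final class RowGroupModel {
    final String name;
    final int min;
    final int max;
    final Set<Integer> dictionary; // distinct values stored in this row group

    RowGroupModel(String name, int min, int max, Set<Integer> dictionary) {
        this.name = name;
        this.min = min;
        this.max = max;
        this.dictionary = dictionary;
    }
}

/** Simplified model of row-group pruning for the predicate "column = value". */
final class RowGroupFilterSketch {

    /** Statistics filter: the value can only be present if it lies within [min, max]. */
    static boolean statsMayContain(RowGroupModel group, int value) {
        return value >= group.min && value <= group.max;
    }

    /** Dictionary filter: the value can only be present if the dictionary lists it. */
    static boolean dictionaryMayContain(RowGroupModel group, int value) {
        return group.dictionary.contains(value);
    }

    /** Keeps only the row groups that pass both filters and returns their names. */
    static List<String> filter(List<RowGroupModel> groups, int value) {
        List<String> kept = new ArrayList<>();
        for (RowGroupModel group : groups) {
            if (statsMayContain(group, value) && dictionaryMayContain(group, value)) {
                kept.add(group.name);
            }
        }
        return kept;
    }
}
```

In this model a row group with min 1, max 10 and dictionary {1, 10} passes the 
statistics filter for value 5 but is dropped by the dictionary filter, which is 
the extra pruning that is lost when only the statistics filter runs.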

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.2.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-10 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.2.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.2.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-10 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: (was: HIVE-17261.2.patch)

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-10 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.2.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.2.patch, HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-10 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122594#comment-16122594
 ] 

Junjie Chen commented on HIVE-17261:


[~Ferd], I don't understand what you mean. end() is a private member function 
used by the deprecated constructor; why should I use it in the new one?

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-10 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121126#comment-16121126
 ] 

Junjie Chen commented on HIVE-17261:


Yes, 
[ParquetFileReader@filterRowGroups|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileReader.java#L647]
 does the job.

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-09 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121082#comment-16121082
 ] 

Junjie Chen commented on HIVE-17261:


Hive converts the search argument to a FilterPredicate and pushes it down to 
Parquet. Please see here: 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L151

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-09 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.patch

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff, HIVE-17261.patch
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-09 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121006#comment-16121006
 ] 

Junjie Chen commented on HIVE-17261:


Hi [~kellyzly]
Could you please have a look?

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Comment Edited] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-09 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120996#comment-16120996
 ] 

Junjie Chen edited comment on HIVE-17261 at 8/10/17 3:42 AM:
-

The patch only updates one Parquet-related function, so no unit test is added.


was (Author: junjie):
--- a/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java
@@ -131,15 +131,14 @@ protected ParquetInputSplit getSplit(
 filtedBlocks = splitGroup;
   }

+
   split = new ParquetInputSplit(finalPath,
-splitStart,
-splitLength,
-oldSplit.getLocations(),
-filtedBlocks,
-readContext.getRequestedSchema().toString(),
-fileMetaData.getSchema().toString(),
-fileMetaData.getKeyValueMetaData(),
-readContext.getReadSupportMetadata());
+  splitStart,
+  splitStart + splitLength,
+  splitLength,
+  oldSplit.getLocations(),
+  null);
+
   return split;
 } else {
   throw new IllegalArgumentException("Unknown split type: " + oldSplit);


> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-09 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Attachment: HIVE-17261.diff

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
> Attachments: HIVE-17261.diff
>
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-09 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-17261:
---
Target Version/s: 2.3.0
  Status: Patch Available  (was: Open)

--- a/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java
@@ -131,15 +131,14 @@ protected ParquetInputSplit getSplit(
 filtedBlocks = splitGroup;
   }

+
   split = new ParquetInputSplit(finalPath,
-splitStart,
-splitLength,
-oldSplit.getLocations(),
-filtedBlocks,
-readContext.getRequestedSchema().toString(),
-fileMetaData.getSchema().toString(),
-fileMetaData.getKeyValueMetaData(),
-readContext.getReadSupportMetadata());
+  splitStart,
+  splitStart + splitLength,
+  splitLength,
+  oldSplit.getLocations(),
+  null);
+
   return split;
 } else {
   throw new IllegalArgumentException("Unknown split type: " + oldSplit);


> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Assigned] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter

2017-08-09 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-17261:
--

Assignee: Junjie Chen

> Hive use deprecated ParquetInputSplit constructor which blocked parquet 
> dictionary filter
> -
>
> Key: HIVE-17261
> URL: https://issues.apache.org/jira/browse/HIVE-17261
> Project: Hive
>  Issue Type: Improvement
>  Components: Database/Schema
>Affects Versions: 2.2.0
>Reporter: Junjie Chen
>Assignee: Junjie Chen
>Priority: Minor
>
> Hive use deprecated ParquetInputSplit in 
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128]
> Please see interface definition in 
> [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80]
> Old interface set rowgroupoffset values which will lead to skip dictionary 
> filter in parquet.





[jira] [Assigned] (HIVE-11555) Beeline sends password in clear text if we miss -ssl=true flag in the connect string

2017-05-21 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-11555:
--

Assignee: (was: Junjie Chen)

> Beeline sends password in clear text if we miss -ssl=true flag in the connect 
> string
> 
>
> Key: HIVE-11555
> URL: https://issues.apache.org/jira/browse/HIVE-11555
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: bharath v
>
> {code}
> I used tcpdump to display the network traffic: 
> [root@fe01 ~]# beeline 
> Beeline version 0.13.1-cdh5.3.2 by Apache Hive 
> beeline> !connect jdbc:hive2://fe01.sectest.poc:1/default 
> Connecting to jdbc:hive2://fe01.sectest.poc:1/default 
> Enter username for jdbc:hive2://fe01.sectest.poc:1/default: tdaranyi 
> Enter password for jdbc:hive2://fe01.sectest.poc:1/default: * 
> (I entered "cleartext" as the password) 
> The tcpdump in a different window 
> tdara...@fe01.sectest.poc:~$ sudo tcpdump -n -X -i lo port 1 
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes 
> (...) 
> 10:25:16.329974 IP 192.168.32.102.54322 > 192.168.32.102.ndmp: Flags [P.], 
> seq 11:35, ack 1, win 512, options [nop,nop,TS val 2412851969 ecr 
> 2412851969], length 24 
> 0x: 4500 004c 3dd3 4000 4006 3abc c0a8 2066 E..L=.@.@.:f 
> 0x0010: c0a8 2066 d432 2710 714c 0edc b45c 9268 ...f.2'.qL...\.h 
> 0x0020: 8018 0200 c25b  0101 080a 8fd1 3301 .[3. 
> 0x0030: 8fd1 3301 0500  1300 7464 6172 616e ..3...tdaran 
> 0x0040: 7969 0063 6c65 6172 7465 7874 yi.cleartext 
> (...) 
> {code}
> We rely on the user supplied configuration to decide whether to open an SSL 
> socket or a Plain one. Instead we can negotiate this information from the HS2 
> and connect accordingly.
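
For illustration only, a much simpler client-side guard than the negotiation 
proposed above could inspect the connect string before prompting for a 
password. The sketch below assumes the HiveServer2 URL layout in which 
";key=value" session variables follow the database name, and that SSL is 
requested with ssl=true. The class name JdbcUrlSslCheck and its methods are 
hypothetical and are not part of Beeline.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: checks whether a HiveServer2 JDBC URL asks for SSL. */
final class JdbcUrlSslCheck {

    /** Collects the ';'-separated key=value session variables of the URL. */
    static Map<String, String> sessionVars(String url) {
        // Anything after '?' or '#' is a Hive conf/var list, not a session variable.
        String head = url.split("[?#]", 2)[0];
        String[] parts = head.split(";");
        Map<String, String> vars = new LinkedHashMap<>();
        // parts[0] is "jdbc:hive2://host:port/db"; the rest are key=value pairs.
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq > 0) {
                String key = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
                vars.put(key, parts[i].substring(eq + 1).trim());
            }
        }
        return vars;
    }

    /** Returns true only when the session variables contain ssl=true. */
    static boolean sslRequested(String url) {
        return "true".equalsIgnoreCase(sessionVars(url).get("ssl"));
    }
}
```

A client could call sslRequested(url) and print a warning, or refuse to prompt 
for a password, when it returns false. This does not replace negotiating the 
transport with HS2; it only makes the missing flag visible to the user.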



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-14372) Odd behavior with Beeline parsing server principal in Kerberized environment

2016-09-08 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15473060#comment-15473060
 ] 

Junjie Chen edited comment on HIVE-14372 at 9/8/16 7:23 AM:


Hi [~vihangk1]

Neither the JDK API createSaslClient (see: 
createSaslClient(String[] mechanisms, String authorizationId, String protocol, 
String serverName, Map<String,?> props, CallbackHandler cbh)) nor the 
underlying security provider com.sun.security.sasl.Provider (GssKrb5Client.java 
in com.sun.security.sasl.gsskerb.GssKrb5Client) accepts a realm parameter, 
since the Kerberos V5 mechanism maps the hostname to the canonical principal 
format in three ways (refer to [1] and [2]). For example, the underlying 
security provider reads your Kerberos configuration, krb5.conf, to derive the 
realm from the [domain_realm] section. 

Currently, although the Hive code checks whether there is a realm part, it 
doesn't use it at all. I think the realm check should be removed, in line with 
the Java API definition, and users can configure the realm in krb5.conf. What 
do you think?

[1]: https://web.mit.edu/kerberos/krb5-1.13/doc/admin/realm_config.html 
[2]: 
https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/single-signon.html
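
To make the point concrete, here is a minimal sketch of how a principal string 
ends up in the JDK call. It assumes the usual service/host@REALM layout. The 
class KerberosPrincipalSketch and its methods are hypothetical and are not the 
Hive code; only Sasl.createSaslClient is the real JDK API. Note that the parsed 
realm has no parameter to go into.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

/** Hypothetical helper showing which principal parts reach createSaslClient. */
final class KerberosPrincipalSketch {

    /** Splits "service/host@REALM" into {service, host, realm}; realm is "" if absent. */
    static String[] split(String principal) {
        int slash = principal.indexOf('/');
        if (slash <= 0) {
            throw new IllegalArgumentException("Expected service/host[@REALM]: " + principal);
        }
        int at = principal.indexOf('@', slash + 1);
        String service = principal.substring(0, slash);
        String host = at < 0 ? principal.substring(slash + 1) : principal.substring(slash + 1, at);
        String realm = at < 0 ? "" : principal.substring(at + 1);
        return new String[] {service, host, realm};
    }

    /** Builds a GSSAPI SASL client; needs a working Kerberos setup to be usable. */
    static SaslClient newGssapiClient(String principal) throws SaslException {
        String[] parts = split(principal);
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, "auth");
        // protocol = service name, serverName = host name.
        // parts[2] (the realm) is not passed anywhere: the API has no such parameter.
        return Sasl.createSaslClient(
            new String[] {"GSSAPI"}, null, parts[0], parts[1], props, null);
    }
}
```

With an example principal such as hive/hs2.example.org@EXAMPLE.COM, only "hive" 
and "hs2.example.org" are handed to the JDK, which fits the observation that a 
garbage realm in the connect string still works.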


was (Author: junjie):
Hi [~vihangk1]

Ether the JDK API createSaslClient do not accept the realm parameter,  see: 
createSaslClient(String[] mechanisms, String authorizationId, String protocol, 
String serverName, Map props, CallbackHandler cbh) or underlying 
security provider com.sun.security.sasl.Provider (GssKrb5Client.java in 
com.sun.security.sasl.gsskerb.GssKrb5Client) do not accept realm parameter, 
Since Kerberos V5 mechanism will map hostname to canonical principal format in 
three ways (refer to [1] and [2]). For example,  the underlying security 
provider will read your kerberos configuration krb5.conf to generate a realm 
through the [domain_realm] section. 

Currently, though the hive code check whether there is a realm part, it doesn't 
use it at all. I think the realm check should be remove according to java API 
definition, and user could configure realm in krb5.conf.  what do you think?

[1]: https://web.mit.edu/kerberos/krb5-1.13/doc/admin/realm_config.html 
[2]: 
https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/single-signon.html

> Odd behavior with Beeline parsing server principal in Kerberized environment
> 
>
> Key: HIVE-14372
> URL: https://issues.apache.org/jira/browse/HIVE-14372
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Junjie Chen
>
> Case 1:
> I can replace the realm with any garbage realm, and it still works.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
>  
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://c62-n3.intuit.test:1/> show tables;
> ---
> tab_name
> ---
> t1
> t2
> test
> ---
> 3 rows selected (1.749 seconds)
> 0: jdbc:hive2://c62-n3.intuit.test:1/>
> {code}
> Case 2:
> I can keep the garbage realm, but if I use a different hostname (notice I've 
> truncated it to c62-n3.intuit instead of c62-n3.intuit.test), it fails (as it 
> should) but the error message is not at all user-friendly.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC 
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> 13/06/10 08:34:29 ERROR transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - UNKNOWN_SERVER)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
> at 
> 

[jira] [Comment Edited] (HIVE-14372) Odd behavior with Beeline parsing server principal in Kerberized environment

2016-09-08 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15473060#comment-15473060
 ] 

Junjie Chen edited comment on HIVE-14372 at 9/8/16 7:21 AM:


Hi [~vihangk1]

Neither the JDK API createSaslClient (see: 
createSaslClient(String[] mechanisms, String authorizationId, String protocol, 
String serverName, Map<String,?> props, CallbackHandler cbh)) nor the 
underlying security provider com.sun.security.sasl.Provider (GssKrb5Client.java 
in com.sun.security.sasl.gsskerb.GssKrb5Client) accepts a realm parameter, 
since the Kerberos V5 mechanism maps the hostname to the canonical principal 
format in three ways (refer to [1] and [2]). For example, the underlying 
security provider reads your Kerberos configuration, krb5.conf, to derive the 
realm from the [domain_realm] section. 

Currently, although the Hive code checks whether there is a realm part, it 
doesn't use it at all. I think the realm check should be removed, in line with 
the Java API definition, and users can configure the realm in krb5.conf. What 
do you think?

[1]: https://web.mit.edu/kerberos/krb5-1.13/doc/admin/realm_config.html 
[2]: 
https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/single-signon.html


was (Author: junjie):
Hi ~Vihang Karajgaonkar

Ether the JDK API createSaslClient do not accept the realm parameter,  see: 
createSaslClient(String[] mechanisms, String authorizationId, String protocol, 
String serverName, Map props, CallbackHandler cbh) or underlying 
security provider com.sun.security.sasl.Provider (GssKrb5Client.java in 
com.sun.security.sasl.gsskerb.GssKrb5Client) do not accept realm parameter, 
Since Kerberos V5 mechanism will map hostname to canonical principal format in 
three ways (refer to [1] and [2]). For example,  the underlying security 
provider will read your kerberos configuration krb5.conf to generate a realm 
through the [domain_realm] section. 

Currently, though the hive code check whether there is a realm part, it doesn't 
use it at all. I think the realm check should be remove according to java API 
definition, and user could configure realm in krb5.conf.  what do you think?

[1]: https://web.mit.edu/kerberos/krb5-1.13/doc/admin/realm_config.html 
[2]: 
https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/single-signon.html

> Odd behavior with Beeline parsing server principal in Kerberized environment
> 
>
> Key: HIVE-14372
> URL: https://issues.apache.org/jira/browse/HIVE-14372
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Junjie Chen
>
> Case 1:
> I can replace the realm with any garbage realm, and it still works.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
>  
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://c62-n3.intuit.test:1/> show tables;
> ---
> tab_name
> ---
> t1
> t2
> test
> ---
> 3 rows selected (1.749 seconds)
> 0: jdbc:hive2://c62-n3.intuit.test:1/>
> {code}
> Case 2:
> I can keep the garbage realm, but if I use a different hostname (notice I've 
> truncated it to c62-n3.intuit instead of c62-n3.intuit.test), it fails (as it 
> should) but the error message is not at all user-friendly.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC 
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> 13/06/10 08:34:29 ERROR transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - UNKNOWN_SERVER)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
> at 
> 

[jira] [Commented] (HIVE-14372) Odd behavior with Beeline parsing server principal in Kerberized environment

2016-09-08 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15473060#comment-15473060
 ] 

Junjie Chen commented on HIVE-14372:


Hi ~Vihang Karajgaonkar

Neither the JDK API createSaslClient (see: 
createSaslClient(String[] mechanisms, String authorizationId, String protocol, 
String serverName, Map<String,?> props, CallbackHandler cbh)) nor the 
underlying security provider com.sun.security.sasl.Provider (GssKrb5Client.java 
in com.sun.security.sasl.gsskerb.GssKrb5Client) accepts a realm parameter, 
since the Kerberos V5 mechanism maps the hostname to the canonical principal 
format in three ways (refer to [1] and [2]). For example, the underlying 
security provider reads your Kerberos configuration, krb5.conf, to derive the 
realm from the [domain_realm] section. 

Currently, although the Hive code checks whether there is a realm part, it 
doesn't use it at all. I think the realm check should be removed, in line with 
the Java API definition, and users can configure the realm in krb5.conf. What 
do you think?

[1]: https://web.mit.edu/kerberos/krb5-1.13/doc/admin/realm_config.html 
[2]: 
https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/single-signon.html

> Odd behavior with Beeline parsing server principal in Kerberized environment
> 
>
> Key: HIVE-14372
> URL: https://issues.apache.org/jira/browse/HIVE-14372
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Junjie Chen
>
> Case 1:
> I can replace the realm with any garbage realm, and it still works.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
>  
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://c62-n3.intuit.test:1/> show tables;
> ---
> tab_name
> ---
> t1
> t2
> test
> ---
> 3 rows selected (1.749 seconds)
> 0: jdbc:hive2://c62-n3.intuit.test:1/>
> {code}
> Case 2:
> I can keep the garbage realm, but if I use a different hostname (notice I've 
> truncated it to c62-n3.intuit instead of c62-n3.intuit.test), it fails (as it 
> should) but the error message is not at all user-friendly.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC 
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> 13/06/10 08:34:29 ERROR transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - UNKNOWN_SERVER)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:156)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:96)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
> at java.sql.DriverManager.getConnection(DriverManager.java:582)
> at java.sql.DriverManager.getConnection(DriverManager.java:185)
> at 
> 

[jira] [Commented] (HIVE-14372) Odd behavior with Beeline parsing server principal in Kerberized environment

2016-08-30 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450614#comment-15450614
 ] 

Junjie Chen commented on HIVE-14372:


Hi Vihang Karajgaonkar,

I can reproduce case 1 and case 2, but cannot reproduce case 3. Can you run 
klist -k  to check whether the server hostname was added to some principal? 
Or could you please post the output of klist -k ? 

Furthermore, it would be better if you could set the beeline log level to 
debug and paste the output for case 1 and case 2. 

> Odd behavior with Beeline parsing server principal in Kerberized environment
> 
>
> Key: HIVE-14372
> URL: https://issues.apache.org/jira/browse/HIVE-14372
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Junjie Chen
>
> Case 1:
> I can replace the realm with any garbage realm, and it still works.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
>  
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://c62-n3.intuit.test:1/> show tables;
> ---
> tab_name
> ---
> t1
> t2
> test
> ---
> 3 rows selected (1.749 seconds)
> 0: jdbc:hive2://c62-n3.intuit.test:1/>
> {code}
> Case 2:
> I can keep the garbage realm, but if I use a different hostname (notice I've 
> truncated it to c62-n3.intuit instead of c62-n3.intuit.test), it fails (as it 
> should) but the error message is not at all user-friendly.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC 
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> 13/06/10 08:34:29 ERROR transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - UNKNOWN_SERVER)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:156)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:96)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
> at java.sql.DriverManager.getConnection(DriverManager.java:582)
> at java.sql.DriverManager.getConnection(DriverManager.java:185)
> at 
> org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:152)
> at 
> org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:193)
> at org.apache.hive.beeline.Commands.connect(Commands.java:965)
> at org.apache.hive.beeline.Commands.connect(Commands.java:896)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:66)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:755)
> at 

[jira] [Assigned] (HIVE-14372) Odd behavior with Beeline parsing server principal in Kerberized environment

2016-08-29 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-14372:
--

Assignee: Junjie Chen

> Odd behavior with Beeline parsing server principal in Kerberized environment
> 
>
> Key: HIVE-14372
> URL: https://issues.apache.org/jira/browse/HIVE-14372
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Junjie Chen
>
> Case 1:
> I can replace the realm with any garbage realm, and it still works.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
>  
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit.t...@abc.xyz:
>  
> Connected to: Hive (version 0.10.0)
> Driver: Hive (version 0.10.0-cdh4.2.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://c62-n3.intuit.test:1/> show tables;
> ---
> tab_name
> ---
> t1
> t2
> test
> ---
> 3 rows selected (1.749 seconds)
> 0: jdbc:hive2://c62-n3.intuit.test:1/>
> {code}
> Case 2:
> I can keep the garbage realm, but if I use a different hostname (notice I've 
> truncated it to c62-n3.intuit instead of c62-n3.intuit.test), it fails (as it 
> should) but the error message is not at all user-friendly.
> {code}
> [root@c62-n3 ~]# beeline
> Beeline version 0.10.0-cdh4.2.0 by Apache Hive
> beeline> !connect 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC 
> scan complete in 4ms
> Connecting to 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC
> Enter username for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> Enter password for 
> jdbc:hive2://c62-n3.intuit.test:1/;principal=hive/c62-n3.intuit@ABC: 
> 13/06/10 08:34:29 ERROR transport.TSaslTransport: SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Server not 
> found in Kerberos database (7) - UNKNOWN_SERVER)]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
> at 
> org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
> at 
> org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at 
> org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
> at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:156)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:96)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
> at java.sql.DriverManager.getConnection(DriverManager.java:582)
> at java.sql.DriverManager.getConnection(DriverManager.java:185)
> at 
> org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:152)
> at 
> org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:193)
> at org.apache.hive.beeline.Commands.connect(Commands.java:965)
> at org.apache.hive.beeline.Commands.connect(Commands.java:896)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:66)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:755)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:631)
> at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:380)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:364)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> 

[jira] [Assigned] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-12546:
--

Assignee: Junjie Chen

> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>Assignee: Junjie Chen
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-12546 started by Junjie Chen.
--
> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>Assignee: Junjie Chen
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.





[jira] [Work stopped] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-12546 stopped by Junjie Chen.
--
> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>Assignee: Junjie Chen
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.





[jira] [Comment Edited] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-10 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416378#comment-15416378
 ] 

Junjie Chen edited comment on HIVE-12546 at 8/11/16 2:22 AM:
-

Hi [~sershe],
I also tried with the OS X (10.11.6) terminal, and it works fine over ssh.

Can we close this?


was (Author: junjie):
Hi [~sershe],
I also tried with the OS X (10.11.6) terminal, and it works fine over ssh. 
Could you please specify which version of beeline and what ENV you were using? 
It would be better if you could dump the full ENV and elaborate on the steps 
to reproduce. 

> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.





[jira] [Commented] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-10 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15416378#comment-15416378
 ] 

Junjie Chen commented on HIVE-12546:


Hi [~sershe],
I also tried with the OS X (10.11.6) terminal, and it works fine over ssh. 
Could you please specify which version of beeline and what ENV you were using? 
It would be better if you could dump the full ENV and elaborate on the steps 
to reproduce. 

> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.





[jira] [Commented] (HIVE-11555) Beeline sends password in clear text if we miss -ssl=true flag in the connect string

2016-08-07 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411128#comment-15411128
 ] 

Junjie Chen commented on HIVE-11555:


Hi [~thejas], [~bharathv],
I'm not sure what needs to be done here. I tried connecting to MySQL without 
the useSSL option, and it shows the following: 

WARN: Establishing SSL connection without server's identity verification is not 
recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL 
connection must be established by default if explicit option isn't set. For 
compliance with existing applications not using SSL the verifyServerCertificate 
property is set to 'false'. You need either to explicitly disable SSL by 
setting useSSL=false, or set useSSL=true and provide truststore for server 
certificate verification.

So I would propose adopting the same policy as MySQL. Are you OK with this? 

Or were you asking for a secure mechanism over HTTP, like SASL?
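
One possible reading of that proposal, as a rough Java sketch rather than 
Beeline's real behaviour: warn when the connect string leaves ssl unset, 
instead of silently opening a plain socket. The class SslPolicySketch, its 
methods, the warning text and the example host are all hypothetical. The 
sketch only assumes the usual jdbc:hive2 layout, where session variables such 
as ssl=true follow the database name and are separated by ';'.

```java
import java.util.Optional;

class SslPolicySketch {

    /** Returns the explicit ssl setting from the URL, or empty if it is not set. */
    static Optional<Boolean> explicitSsl(String jdbcUrl) {
        // Drop the Hive conf ('?') and Hive var ('#') lists, keep session variables.
        String head = jdbcUrl.split("[?#]", 2)[0];
        String[] pieces = head.split(";");
        // pieces[0] is "jdbc:hive2://host:port/db"; the rest are key=value pairs.
        for (int i = 1; i < pieces.length; i++) {
            String[] kv = pieces[i].split("=", 2);
            if (kv.length == 2 && kv[0].trim().equalsIgnoreCase("ssl")) {
                return Optional.of(Boolean.parseBoolean(kv[1].trim()));
            }
        }
        return Optional.empty();
    }

    /** Returns a warning when ssl is left unset, mirroring the MySQL-style policy. */
    static Optional<String> warningFor(String jdbcUrl) {
        if (explicitSsl(jdbcUrl).isPresent()) {
            return Optional.empty();
        }
        return Optional.of("WARN: ssl is not set explicitly; credentials may be "
                + "sent in clear text. Set ssl=true, or ssl=false to silence this.");
    }

    public static void main(String[] args) {
        String url = "jdbc:hive2://hs2.example.com:10000/default";
        warningFor(url).ifPresent(System.out::println);
        warningFor(url + ";ssl=true").ifPresent(System.out::println);
    }
}
```

With this policy, a user who really wants a plain connection has to say so 
with ssl=false, so a forgotten flag at least produces a visible warning.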

> Beeline sends password in clear text if we miss -ssl=true flag in the connect 
> string
> 
>
> Key: HIVE-11555
> URL: https://issues.apache.org/jira/browse/HIVE-11555
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: bharath v
>Assignee: Junjie Chen
>
> {code}
> I used tcpdump to display the network traffic: 
> [root@fe01 ~]# beeline 
> Beeline version 0.13.1-cdh5.3.2 by Apache Hive 
> beeline> !connect jdbc:hive2://fe01.sectest.poc:1/default 
> Connecting to jdbc:hive2://fe01.sectest.poc:1/default 
> Enter username for jdbc:hive2://fe01.sectest.poc:1/default: tdaranyi 
> Enter password for jdbc:hive2://fe01.sectest.poc:1/default: * 
> (I entered "cleartext" as the password) 
> The tcpdump in a different window 
> tdara...@fe01.sectest.poc:~$ sudo tcpdump -n -X -i lo port 1 
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes 
> (...) 
> 10:25:16.329974 IP 192.168.32.102.54322 > 192.168.32.102.ndmp: Flags [P.], 
> seq 11:35, ack 1, win 512, options [nop,nop,TS val 2412851969 ecr 
> 2412851969], length 24 
> 0x: 4500 004c 3dd3 4000 4006 3abc c0a8 2066 E..L=.@.@.:f 
> 0x0010: c0a8 2066 d432 2710 714c 0edc b45c 9268 ...f.2'.qL...\.h 
> 0x0020: 8018 0200 c25b  0101 080a 8fd1 3301 .[3. 
> 0x0030: 8fd1 3301 0500  1300 7464 6172 616e ..3...tdaran 
> 0x0040: 7969 0063 6c65 6172 7465 7874 yi.cleartext 
> (...) 
> {code}
> We rely on the user supplied configuration to decide whether to open an SSL 
> socket or a Plain one. Instead we can negotiate this information from the HS2 
> and connect accordingly.





[jira] [Issue Comment Deleted] (HIVE-11555) Beeline sends password in clear text if we miss -ssl=true flag in the connect string

2016-08-04 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen updated HIVE-11555:
---
Comment: was deleted

(was: It should be simple if the ssl option is set to true by default.)

> Beeline sends password in clear text if we miss -ssl=true flag in the connect 
> string
> 
>
> Key: HIVE-11555
> URL: https://issues.apache.org/jira/browse/HIVE-11555
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: bharath v
>Assignee: Junjie Chen
>
> {code}
> I used tcpdump to display the network traffic: 
> [root@fe01 ~]# beeline 
> Beeline version 0.13.1-cdh5.3.2 by Apache Hive 
> beeline> !connect jdbc:hive2://fe01.sectest.poc:1/default 
> Connecting to jdbc:hive2://fe01.sectest.poc:1/default 
> Enter username for jdbc:hive2://fe01.sectest.poc:1/default: tdaranyi 
> Enter password for jdbc:hive2://fe01.sectest.poc:1/default: * 
> (I entered "cleartext" as the password) 
> The tcpdump in a different window 
> tdara...@fe01.sectest.poc:~$ sudo tcpdump -n -X -i lo port 1 
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes 
> (...) 
> 10:25:16.329974 IP 192.168.32.102.54322 > 192.168.32.102.ndmp: Flags [P.], 
> seq 11:35, ack 1, win 512, options [nop,nop,TS val 2412851969 ecr 
> 2412851969], length 24 
> 0x: 4500 004c 3dd3 4000 4006 3abc c0a8 2066 E..L=.@.@.:f 
> 0x0010: c0a8 2066 d432 2710 714c 0edc b45c 9268 ...f.2'.qL...\.h 
> 0x0020: 8018 0200 c25b  0101 080a 8fd1 3301 .[3. 
> 0x0030: 8fd1 3301 0500  1300 7464 6172 616e ..3...tdaran 
> 0x0040: 7969 0063 6c65 6172 7465 7874 yi.cleartext 
> (...) 
> {code}
> We rely on the user supplied configuration to decide whether to open an SSL 
> socket or a Plain one. Instead we can negotiate this information from the HS2 
> and connect accordingly.





[jira] [Commented] (HIVE-11555) Beeline sends password in clear text if we miss -ssl=true flag in the connect string

2016-08-02 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405253#comment-15405253
 ] 

Junjie Chen commented on HIVE-11555:


It should be simple if the ssl option is set to true by default.

> Beeline sends password in clear text if we miss -ssl=true flag in the connect 
> string
> 
>
> Key: HIVE-11555
> URL: https://issues.apache.org/jira/browse/HIVE-11555
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: bharath v
>Assignee: Junjie Chen
>
> {code}
> I used tcpdump to display the network traffic: 
> [root@fe01 ~]# beeline 
> Beeline version 0.13.1-cdh5.3.2 by Apache Hive 
> beeline> !connect jdbc:hive2://fe01.sectest.poc:1/default 
> Connecting to jdbc:hive2://fe01.sectest.poc:1/default 
> Enter username for jdbc:hive2://fe01.sectest.poc:1/default: tdaranyi 
> Enter password for jdbc:hive2://fe01.sectest.poc:1/default: * 
> (I entered "cleartext" as the password) 
> The tcpdump in a different window 
> tdara...@fe01.sectest.poc:~$ sudo tcpdump -n -X -i lo port 1 
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes 
> (...) 
> 10:25:16.329974 IP 192.168.32.102.54322 > 192.168.32.102.ndmp: Flags [P.], 
> seq 11:35, ack 1, win 512, options [nop,nop,TS val 2412851969 ecr 
> 2412851969], length 24 
> 0x: 4500 004c 3dd3 4000 4006 3abc c0a8 2066 E..L=.@.@.:f 
> 0x0010: c0a8 2066 d432 2710 714c 0edc b45c 9268 ...f.2'.qL...\.h 
> 0x0020: 8018 0200 c25b  0101 080a 8fd1 3301 .[3. 
> 0x0030: 8fd1 3301 0500  1300 7464 6172 616e ..3...tdaran 
> 0x0040: 7969 0063 6c65 6172 7465 7874 yi.cleartext 
> (...) 
> {code}
> We rely on the user supplied configuration to decide whether to open an SSL 
> socket or a Plain one. Instead we can negotiate this information from the HS2 
> and connect accordingly.





[jira] [Assigned] (HIVE-11555) Beeline sends password in clear text if we miss -ssl=true flag in the connect string

2016-08-02 Thread Junjie Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-11555:
--

Assignee: Junjie Chen

> Beeline sends password in clear text if we miss -ssl=true flag in the connect 
> string
> 
>
> Key: HIVE-11555
> URL: https://issues.apache.org/jira/browse/HIVE-11555
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.0
>Reporter: bharath v
>Assignee: Junjie Chen
>
> {code}
> I used tcpdump to display the network traffic: 
> [root@fe01 ~]# beeline 
> Beeline version 0.13.1-cdh5.3.2 by Apache Hive 
> beeline> !connect jdbc:hive2://fe01.sectest.poc:1/default 
> Connecting to jdbc:hive2://fe01.sectest.poc:1/default 
> Enter username for jdbc:hive2://fe01.sectest.poc:1/default: tdaranyi 
> Enter password for jdbc:hive2://fe01.sectest.poc:1/default: * 
> (I entered "cleartext" as the password) 
> The tcpdump in a different window 
> tdara...@fe01.sectest.poc:~$ sudo tcpdump -n -X -i lo port 1 
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode 
> listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes 
> (...) 
> 10:25:16.329974 IP 192.168.32.102.54322 > 192.168.32.102.ndmp: Flags [P.], 
> seq 11:35, ack 1, win 512, options [nop,nop,TS val 2412851969 ecr 
> 2412851969], length 24 
> 0x: 4500 004c 3dd3 4000 4006 3abc c0a8 2066 E..L=.@.@.:f 
> 0x0010: c0a8 2066 d432 2710 714c 0edc b45c 9268 ...f.2'.qL...\.h 
> 0x0020: 8018 0200 c25b  0101 080a 8fd1 3301 .[3. 
> 0x0030: 8fd1 3301 0500  1300 7464 6172 616e ..3...tdaran 
> 0x0040: 7969 0063 6c65 6172 7465 7874 yi.cleartext 
> (...) 
> {code}
> We rely on the user supplied configuration to decide whether to open an SSL 
> socket or a Plain one. Instead we can negotiate this information from the HS2 
> and connect accordingly.


