[jira] [Updated] (HIVE-21902) HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty response header

2019-06-19 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21902:
--
Issue Type: Bug  (was: Improvement)

> HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty 
> response header
> 
>
> Key: HIVE-21902
> URL: https://issues.apache.org/jira/browse/HIVE-21902
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21902.patch
>
>
> Some vulnerabilities are reported for the web server UI:
> X-Frame-Options or Content-Security-Policy: frame-ancestors HTTP headers are
> missing on port 10002.
> {code}
> GET / HTTP/1.1 
> Host: HOSTNAME:10002 
> Connection: Keep-Alive 
> X-XSS-Protection HTTP Header missing on port 10002. 
> X-Content-Type-Options HTTP Header missing on port 10002. 
> {code}
> After the proposed changes, the response includes the security headers:
> {code}
> HTTP/1.1 200 OK
> Date: Thu, 20 Jun 2019 05:29:59 GMT
> Content-Type: text/html;charset=utf-8
> X-Content-Type-Options: nosniff
> X-FRAME-OPTIONS: SAMEORIGIN
> X-XSS-Protection: 1; mode=block
> Set-Cookie: JSESSIONID=15kscuow9cmy7qms6dzaxllqt;Path=/
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Content-Length: 3824
> Server: Jetty(9.3.25.v20180904)
> {code}
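The check the scanner performs can be sketched as follows. This is a minimal, illustrative Python snippet (not the scanner's actual code, and the helper name is hypothetical) that reports which of the three security headers are absent from a response:

```python
# Sketch: report which security headers are absent from a response,
# mirroring the scan output quoted above. HTTP header names are
# case-insensitive, so the comparison lowercases both sides.
REQUIRED = ["X-Frame-Options", "X-XSS-Protection", "X-Content-Type-Options"]

def missing_security_headers(headers):
    """Return the required headers absent from a response-header dict."""
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED if h.lower() not in present]

before = {"Content-Type": "text/html;charset=utf-8"}
after = {
    "Content-Type": "text/html;charset=utf-8",
    "X-Content-Type-Options": "nosniff",
    "X-FRAME-OPTIONS": "SAMEORIGIN",
    "X-XSS-Protection": "1; mode=block",
}
print(missing_security_headers(before))  # all three reported missing
print(missing_security_headers(after))   # []
```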



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21902) HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty response header

2019-06-19 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21902:
--
Attachment: HIVE-21902.patch
Status: Patch Available  (was: Open)

> HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty 
> response header
> 
>
> Key: HIVE-21902
> URL: https://issues.apache.org/jira/browse/HIVE-21902
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21902.patch
>
>
> Some vulnerabilities are reported for the web server UI:
> X-Frame-Options or Content-Security-Policy: frame-ancestors HTTP headers are
> missing on port 10002.
> {code}
> GET / HTTP/1.1 
> Host: HOSTNAME:10002 
> Connection: Keep-Alive 
> X-XSS-Protection HTTP Header missing on port 10002. 
> X-Content-Type-Options HTTP Header missing on port 10002. 
> {code}
> After the proposed changes, the response includes the security headers:
> {code}
> HTTP/1.1 200 OK
> Date: Thu, 20 Jun 2019 05:29:59 GMT
> Content-Type: text/html;charset=utf-8
> X-Content-Type-Options: nosniff
> X-FRAME-OPTIONS: SAMEORIGIN
> X-XSS-Protection: 1; mode=block
> Set-Cookie: JSESSIONID=15kscuow9cmy7qms6dzaxllqt;Path=/
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Content-Length: 3824
> Server: Jetty(9.3.25.v20180904)
> {code}





[jira] [Assigned] (HIVE-21902) HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty response header

2019-06-19 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh reassigned HIVE-21902:
-

Assignee: Rajkumar Singh

> HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty 
> response header
> 
>
> Key: HIVE-21902
> URL: https://issues.apache.org/jira/browse/HIVE-21902
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>
> Some vulnerabilities are reported for the web server UI:
> X-Frame-Options or Content-Security-Policy: frame-ancestors HTTP headers are
> missing on port 10002.
> {code}
> GET / HTTP/1.1 
> Host: HOSTNAME:10002 
> Connection: Keep-Alive 
> X-XSS-Protection HTTP Header missing on port 10002. 
> X-Content-Type-Options HTTP Header missing on port 10002. 
> {code}
> After the proposed changes, the response includes the security headers:
> {code}
> HTTP/1.1 200 OK
> Date: Thu, 20 Jun 2019 05:29:59 GMT
> Content-Type: text/html;charset=utf-8
> X-Content-Type-Options: nosniff
> X-FRAME-OPTIONS: SAMEORIGIN
> X-XSS-Protection: 1; mode=block
> Set-Cookie: JSESSIONID=15kscuow9cmy7qms6dzaxllqt;Path=/
> Expires: Thu, 01 Jan 1970 00:00:00 GMT
> Content-Length: 3824
> Server: Jetty(9.3.25.v20180904)
> {code}





[jira] [Comment Edited] (HIVE-21848) Table property name definition between ORC and Parquet encryption

2019-06-19 Thread Gidon Gershinsky (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868263#comment-16868263
 ] 

Gidon Gershinsky edited comment on HIVE-21848 at 6/20/19 5:23 AM:
--

Yep, agreed, this seems to be the minimum common set. In any case, it looks 
like we are talking about two layers: one of Hive table properties, and the 
other of format-specific (ORC, Parquet) properties. They won't be the same, and 
Hive will need to translate from the former to the latter. E.g., most of the 
Hive properties for Parquet encryption will be "parquet.encrypt.abcd", which 
should be changed into "encryption.abcd" when calling the Parquet API. For the 
question of "which column is to be encrypted with which key", I personally 
prefer approach #2, but it can be easily translated by Hive into #1 for ORC 
files. If #1 is adopted for Hive, it can be easily translated into #2 for 
Parquet files.
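The prefix translation described above can be sketched as follows. This is illustrative only: the property names come from the comment, but the helper itself is hypothetical, not Hive or Parquet code:

```python
# Hypothetical helper illustrating the translation described above:
# Hive-level "parquet.encrypt.*" properties become "encryption.*" names
# when handed to the Parquet API; everything else passes through.
def to_parquet_property(name):
    prefix = "parquet.encrypt."
    if name.startswith(prefix):
        return "encryption." + name[len(prefix):]
    return name

print(to_parquet_property("parquet.encrypt.abcd"))  # encryption.abcd
print(to_parquet_property("orc.encrypt.pii"))       # unchanged
```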


was (Author: gershinsky):
Yep, agreed, this seems to be the minimum common set. In any case, it looks 
like we are talking about two layers: one of Hive table properties, and the 
other of format-specific (ORC, Parquet) properties. They won't be the same, and 
Hive will need to translate from the former to the latter. E.g., most of the 
Hive properties for Parquet encryption will be "parquet.encrypt.abcd", which 
should be changed into "encryption.abcd" for Parquet files. For the question of 
"which column is to be encrypted with which key", I personally prefer approach 
#2, but it can be easily translated by Hive into #1 for ORC files. If #1 is 
adopted for Hive, it can be easily translated into #2 for Parquet files.

> Table property name definition between ORC and Parquet encryption
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a superset of unified table property names 
> that can be used for both Parquet and ORC column encryption. There is no code 
> change needed for this Jira.
> *Background:*
> ORC-14 and PARQUET-1178 introduced column encryption to ORC and Parquet.
> Encryption can be configured through table properties, e.g. which columns
> are sensitive, which master key to use, the algorithm, etc. It is important
> that both Parquet and ORC can use unified names.
> According to the slides
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
> ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The
> Parquet community is still discussing several approaches; table properties
> are one of the options, but there is no detailed design of the property
> names yet.
> So it is a good time for the two communities to discuss and agree on a
> unified superset of table property names.
> *Proposal:*
> There are several encryption properties that need to be specified for a
> table. Here is the list; it is the superset of Parquet and ORC, so some
> items might not apply to both.
>  # PII columns, including nested columns
>  # Column key metadata, master key metadata
>  # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR.
> ORC might support AES_CTR.
>  # Footer encryption - Parquet allows the footer to be encrypted or plaintext
>  # Footer key metadata
> Here is the table properties proposal.
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet supports plaintext and
> encrypted footers. By default, the footer is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to
> the KMS to define what key metadata is. The metadata should have enough
> information for the KMS to figure out the corresponding key.|
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column
> name, for example ‘address.zipcode’. It is up to the KMS to define what key
> metadata is. The metadata should have enough information for the KMS to
> figure out the corresponding key.|
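As an illustration of how a consumer of these properties might group the per-column entries, here is a hedged sketch; the helper name and sample values are hypothetical, and only the property names come from the proposal above:

```python
# Hypothetical reader for the proposed properties: collect per-column
# key metadata from "encrypt_col_<column>" entries of a table-properties map.
def column_key_metadata(table_properties):
    prefix = "encrypt_col_"
    return {k[len(prefix):]: v
            for k, v in table_properties.items()
            if k.startswith(prefix)}

props = {
    "encrypt_algorithm": "aes_gcm",
    "encrypt_col_address.zipcode": "base64-key-metadata",
}
print(column_key_metadata(props))  # {'address.zipcode': 'base64-key-metadata'}
```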





[jira] [Commented] (HIVE-21848) Table property name definition between ORC and Parquet encryption

2019-06-19 Thread Gidon Gershinsky (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868263#comment-16868263
 ] 

Gidon Gershinsky commented on HIVE-21848:
-

Yep, agreed, this seems to be the minimum common set. In any case, it looks 
like we are talking about two layers: one of Hive table properties, and the 
other of format-specific (ORC, Parquet) properties. They won't be the same, and 
Hive will need to translate from the former to the latter. E.g., most of the 
Hive properties for Parquet encryption will be "parquet.encrypt.abcd", which 
should be changed into "encryption.abcd" for Parquet files. For the question of 
"which column is to be encrypted with which key", I personally prefer approach 
#2, but it can be easily translated by Hive into #1 for ORC files. If #1 is 
adopted for Hive, it can be easily translated into #2 for Parquet files.

> Table property name definition between ORC and Parquet encryption
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a superset of unified table property names 
> that can be used for both Parquet and ORC column encryption. There is no code 
> change needed for this Jira.
> *Background:*
> ORC-14 and PARQUET-1178 introduced column encryption to ORC and Parquet.
> Encryption can be configured through table properties, e.g. which columns
> are sensitive, which master key to use, the algorithm, etc. It is important
> that both Parquet and ORC can use unified names.
> According to the slides
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
> ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The
> Parquet community is still discussing several approaches; table properties
> are one of the options, but there is no detailed design of the property
> names yet.
> So it is a good time for the two communities to discuss and agree on a
> unified superset of table property names.
> *Proposal:*
> There are several encryption properties that need to be specified for a
> table. Here is the list; it is the superset of Parquet and ORC, so some
> items might not apply to both.
>  # PII columns, including nested columns
>  # Column key metadata, master key metadata
>  # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR.
> ORC might support AES_CTR.
>  # Footer encryption - Parquet allows the footer to be encrypted or plaintext
>  # Footer key metadata
> Here is the table properties proposal.
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet supports plaintext and
> encrypted footers. By default, the footer is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to
> the KMS to define what key metadata is. The metadata should have enough
> information for the KMS to figure out the corresponding key.|
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column
> name, for example ‘address.zipcode’. It is up to the KMS to define what key
> metadata is. The metadata should have enough information for the KMS to
> figure out the corresponding key.|





[jira] [Updated] (HIVE-21225) ACID: getAcidState() should cache a recursive dir listing locally

2019-06-19 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-21225:

Attachment: HIVE-21225.3.patch

> ACID: getAcidState() should cache a recursive dir listing locally
> -
>
> Key: HIVE-21225
> URL: https://issues.apache.org/jira/browse/HIVE-21225
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-21225.1.patch, HIVE-21225.2.patch, 
> HIVE-21225.3.patch, HIVE-21225.3.patch, async-pid-44-2.svg
>
>
> Currently getAcidState() makes 3 calls into the FS API that could be 
> answered by a single recursive listDir call, reusing the same data 
> to check for isRawFormat() and isValidBase().
> All delta operations for a single partition can go against a single listed 
> directory snapshot instead of interacting with the NameNode or ObjectStore 
> within the inner loop.
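A minimal sketch of the idea, list once and answer later checks from the cached snapshot, assuming nothing about Hive's actual API (class and path names here are illustrative):

```python
# Sketch: take one recursive directory listing up front, then answer
# file-existence checks from the in-memory snapshot instead of issuing
# further filesystem (NameNode/ObjectStore) calls in the inner loop.
import os
import tempfile

class DirSnapshot:
    def __init__(self, root):
        self.files = set()
        for dirpath, _dirs, names in os.walk(root):  # the single recursive listing
            self.files.update(os.path.join(dirpath, n) for n in names)

    def contains(self, path):
        return path in self.files  # answered from memory, no extra I/O

# Demo on a throwaway directory standing in for an ACID partition layout.
root = tempfile.mkdtemp()
delta = os.path.join(root, "delta_0000001_0000001")
os.makedirs(delta)
open(os.path.join(delta, "bucket_00000"), "w").close()

snap = DirSnapshot(root)
print(snap.contains(os.path.join(delta, "bucket_00000")))  # True
```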





[jira] [Updated] (HIVE-21225) ACID: getAcidState() should cache a recursive dir listing locally

2019-06-19 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-21225:

Status: Open  (was: Patch Available)

> ACID: getAcidState() should cache a recursive dir listing locally
> -
>
> Key: HIVE-21225
> URL: https://issues.apache.org/jira/browse/HIVE-21225
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-21225.1.patch, HIVE-21225.2.patch, 
> HIVE-21225.3.patch, async-pid-44-2.svg
>
>
> Currently getAcidState() makes 3 calls into the FS API that could be 
> answered by a single recursive listDir call, reusing the same data 
> to check for isRawFormat() and isValidBase().
> All delta operations for a single partition can go against a single listed 
> directory snapshot instead of interacting with the NameNode or ObjectStore 
> within the inner loop.





[jira] [Commented] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868258#comment-16868258
 ] 

Hive QA commented on HIVE-21787:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972279/HIVE-21787.14.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16170 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17665/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17665/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17665/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972279 - PreCommit-HIVE-Build

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.13.patch, HIVE-21787.14.patch, 
> HIVE-21787.2.patch, HIVE-21787.3.patch, HIVE-21787.4.patch, 
> HIVE-21787.5.patch, HIVE-21787.6.patch, HIVE-21787.7.patch, 
> HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> The metastore currently uses a blacklist/whitelist to specify patterns of 
> tables to load into the cache. The cache is loaded in a one-shot "prewarm" 
> and updated by a background thread. This is not a very efficient design.
> In this feature, we enhance the table cache with LRU eviction to improve 
> cache utilization.





[jira] [Updated] (HIVE-21841) Leader election in HMS to run housekeeping tasks.

2019-06-19 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-21841:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

[^HIVE-21841.09.patch] committed to master. Thanks [~ashutosh.bapat].

> Leader election in HMS to run housekeeping tasks.
> -
>
> Key: HIVE-21841
> URL: https://issues.apache.org/jira/browse/HIVE-21841
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21841.01.patch, HIVE-21841.02.patch, 
> HIVE-21841.04.patch, HIVE-21841.05.patch, HIVE-21841.06.patch, 
> HIVE-21841.07.patch, HIVE-21841.08.patch, HIVE-21841.09.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> HMS performs housekeeping tasks. When there are multiple HMS instances, we 
> need to elect a leader HMS that will carry out those housekeeping tasks. 
> These tasks include execution of compaction tasks, auto-discovery of 
> partitions for external tables, generation of compaction tasks, the repl 
> thread, etc.
> Note that, though the code for compaction tasks, auto-discovery of partitions 
> etc. is in Hive, the actual tasks are initiated by an HMS configured to do 
> so. So, leader election is required only for HMS and not for HS2.
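As an illustrative sketch only (not Hive's implementation), one common way to pick a single leader among peer instances is "smallest acquired sequence number wins", as in ZooKeeper-style leader election recipes; the instance names and numbers below are invented:

```python
# Sketch of sequence-number leader election: each HMS instance acquires a
# monotonically increasing number (e.g. an ephemeral sequential znode), and
# the holder of the smallest number runs the housekeeping tasks.
def elect_leader(candidates):
    """candidates: mapping of instance name -> sequence number it acquired."""
    return min(candidates, key=candidates.get)

hms = {"hms-1": 12, "hms-2": 7, "hms-3": 30}
print(elect_leader(hms))  # hms-2 runs the housekeeping tasks
```

If the leader dies, its sequence node disappears and the next-smallest holder takes over, which is why this pattern avoids a split-brain among metastores.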





[jira] [Commented] (HIVE-21841) Leader election in HMS to run housekeeping tasks.

2019-06-19 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868256#comment-16868256
 ] 

mahesh kumar behera commented on HIVE-21841:


+1 

[^HIVE-21841.09.patch] looks fine to me

> Leader election in HMS to run housekeeping tasks.
> -
>
> Key: HIVE-21841
> URL: https://issues.apache.org/jira/browse/HIVE-21841
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21841.01.patch, HIVE-21841.02.patch, 
> HIVE-21841.04.patch, HIVE-21841.05.patch, HIVE-21841.06.patch, 
> HIVE-21841.07.patch, HIVE-21841.08.patch, HIVE-21841.09.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> HMS performs housekeeping tasks. When there are multiple HMS instances, we 
> need to elect a leader HMS that will carry out those housekeeping tasks. 
> These tasks include execution of compaction tasks, auto-discovery of 
> partitions for external tables, generation of compaction tasks, the repl 
> thread, etc.
> Note that, though the code for compaction tasks, auto-discovery of partitions 
> etc. is in Hive, the actual tasks are initiated by an HMS configured to do 
> so. So, leader election is required only for HMS and not for HS2.





[jira] [Updated] (HIVE-21899) Utils.getCanonicalHostName() may return IP address depending on DNS infra

2019-06-19 Thread KWON BYUNGCHANG (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated HIVE-21899:
---
Affects Version/s: 2.4.0
   3.0.0
   3.1.0
   3.1.1

> Utils.getCanonicalHostName() may return IP address depending on DNS infra
> -
>
> Key: HIVE-21899
> URL: https://issues.apache.org/jira/browse/HIVE-21899
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Security
>Affects Versions: 3.0.0, 2.4.0, 3.1.0, 3.1.1
>Reporter: KWON BYUNGCHANG
>Priority: Major
> Attachments: HIVE-21899.001.patch
>
>
> If there is no PTR record for hostname A in DNS, 
> org.apache.hive.jdbc.Utils.getCanonicalHostName("A") returns the IP address.
> Connecting to a secured HS2 or HMS then fails, because a Kerberos service 
> ticket for HS2 or HMS cannot be obtained using an IP address.
> A workaround is adding hostname A and its IP to /etc/hosts, but that is 
> inconvenient.
> Below is the krb5 debug log;
> note {{Server not found in Kerberos database}} and 
> {{hive/10.1@example.com}}
> {code}
> Picked up JAVA_TOOL_OPTIONS: -Dsun.security.krb5.debug=true
> Connecting to 
> jdbc:hive2://zk1.example.com:2181,zk2.example.com:2181,zk.example.com:2181/default;principal=hive/_h...@example.com;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> Java config name: /etc/krb5.conf
> Loaded from Java config
> Java config name: /etc/krb5.conf
> Loaded from Java config
> >>> KdcAccessibility: reset
> >>> KdcAccessibility: reset
> >>>DEBUG   client principal is mag...@example.com
> >>>DEBUG  server principal is 
> >>>krbtgt/example@example.com
> >>>DEBUG  key type: 18
> >>>DEBUG  auth time: Thu Jun 20 12:46:45 JST 2019
> >>>DEBUG  start time: Thu Jun 20 12:46:45 JST 2019
> >>>DEBUG  end time: Fri Jun 21 12:46:43 JST 2019
> >>>DEBUG  renew_till time: Thu Jun 27 12:46:43 JST 2019
> >>> CCacheInputStream: readFlags()  FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
> Found ticket for mag...@example.com to go to krbtgt/example@example.com 
> expiring on Fri Jun 21 12:46:43 JST 2019
> Entered Krb5Context.initSecContext with state=STATE_NEW
> Found ticket for mag...@example.com to go to krbtgt/example@example.com 
> expiring on Fri Jun 21 12:46:43 JST 2019
> Service ticket not found in the subject
> >>> Credentials acquireServiceCreds: same realm
> Using builtin default etypes for default_tgs_enctypes
> default etypes for default_tgs_enctypes: 
> >>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
> >>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
> >>> KrbKdcReq send: kdc=kerberos.example.com UDP:88, timeout=3, number of 
> >>> retries =3, #bytes=661
> >>> KDCCommunication: kdc=kerberos.example.com UDP:88, timeout=3,Attempt 
> >>> =1, #bytes=661
> >>> KrbKdcReq send: #bytes read=171
> >>> KdcAccessibility: remove kerberos.example.com
> >>> KDCRep: init() encoding tag is 126 req type is 13
> >>>KRBError:
>  cTime is Wed Dec 16 00:15:05 JST 1998 913734905000
>  sTime is Thu Jun 20 12:50:30 JST 2019 156100263
>  suSec is 659395
>  error code is 7
>  error Message is Server not found in Kerberos database
>  cname is mag...@example.com
>  sname is hive/10.1@example.com
>  msgType is 30
> KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
> at sun.security.krb5.KrbTgsRep.(KrbTgsRep.java:73)
> at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
> at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
> at 
> sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
> at 
> sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
> at 
> sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
> {code}
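The failure mode and an obvious guard can be sketched as follows. This is illustrative Python, not the attached Java patch: if reverse resolution yields a bare IP string, keep the original hostname instead of letting the IP end up in the Kerberos service principal (as in the truncated principal in the log above):

```python
# Sketch: when canonicalization falls back to the IP (no PTR record),
# keep the hostname the user supplied so the Kerberos principal is
# built from a name the KDC actually knows.
import ipaddress

def safe_canonical_host(original, resolved):
    try:
        ipaddress.ip_address(resolved)
        return original   # reverse lookup returned an IP: keep the hostname
    except ValueError:
        return resolved   # a real canonical hostname

print(safe_canonical_host("hs2.example.com", "10.1.2.3"))        # hs2.example.com
print(safe_canonical_host("hs2.example.com", "hs2.example.com")) # hs2.example.com
```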





[jira] [Updated] (HIVE-21899) Utils.getCanonicalHostName() may return IP address depending on DNS infra

2019-06-19 Thread KWON BYUNGCHANG (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KWON BYUNGCHANG updated HIVE-21899:
---
Attachment: HIVE-21899.001.patch
Status: Patch Available  (was: Open)

I have attached a patch. Please review it.

> Utils.getCanonicalHostName() may return IP address depending on DNS infra
> -
>
> Key: HIVE-21899
> URL: https://issues.apache.org/jira/browse/HIVE-21899
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore, Security
>Reporter: KWON BYUNGCHANG
>Priority: Major
> Attachments: HIVE-21899.001.patch
>
>
> If there is no PTR record for hostname A in DNS, 
> org.apache.hive.jdbc.Utils.getCanonicalHostName("A") returns the IP address.
> Connecting to a secured HS2 or HMS then fails, because a Kerberos service 
> ticket for HS2 or HMS cannot be obtained using an IP address.
> A workaround is adding hostname A and its IP to /etc/hosts, but that is 
> inconvenient.
> Below is the krb5 debug log;
> note {{Server not found in Kerberos database}} and 
> {{hive/10.1@example.com}}
> {code}
> Picked up JAVA_TOOL_OPTIONS: -Dsun.security.krb5.debug=true
> Connecting to 
> jdbc:hive2://zk1.example.com:2181,zk2.example.com:2181,zk.example.com:2181/default;principal=hive/_h...@example.com;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> Java config name: /etc/krb5.conf
> Loaded from Java config
> Java config name: /etc/krb5.conf
> Loaded from Java config
> >>> KdcAccessibility: reset
> >>> KdcAccessibility: reset
> >>>DEBUG   client principal is mag...@example.com
> >>>DEBUG  server principal is 
> >>>krbtgt/example@example.com
> >>>DEBUG  key type: 18
> >>>DEBUG  auth time: Thu Jun 20 12:46:45 JST 2019
> >>>DEBUG  start time: Thu Jun 20 12:46:45 JST 2019
> >>>DEBUG  end time: Fri Jun 21 12:46:43 JST 2019
> >>>DEBUG  renew_till time: Thu Jun 27 12:46:43 JST 2019
> >>> CCacheInputStream: readFlags()  FORWARDABLE; RENEWABLE; INITIAL; PRE_AUTH;
> Found ticket for mag...@example.com to go to krbtgt/example@example.com 
> expiring on Fri Jun 21 12:46:43 JST 2019
> Entered Krb5Context.initSecContext with state=STATE_NEW
> Found ticket for mag...@example.com to go to krbtgt/example@example.com 
> expiring on Fri Jun 21 12:46:43 JST 2019
> Service ticket not found in the subject
> >>> Credentials acquireServiceCreds: same realm
> Using builtin default etypes for default_tgs_enctypes
> default etypes for default_tgs_enctypes: 
> >>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
> >>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
> >>> KrbKdcReq send: kdc=kerberos.example.com UDP:88, timeout=3, number of 
> >>> retries =3, #bytes=661
> >>> KDCCommunication: kdc=kerberos.example.com UDP:88, timeout=3,Attempt 
> >>> =1, #bytes=661
> >>> KrbKdcReq send: #bytes read=171
> >>> KdcAccessibility: remove kerberos.example.com
> >>> KDCRep: init() encoding tag is 126 req type is 13
> >>>KRBError:
>  cTime is Wed Dec 16 00:15:05 JST 1998 913734905000
>  sTime is Thu Jun 20 12:50:30 JST 2019 156100263
>  suSec is 659395
>  error code is 7
>  error Message is Server not found in Kerberos database
>  cname is mag...@example.com
>  sname is hive/10.1@example.com
>  msgType is 30
> KrbException: Server not found in Kerberos database (7) - LOOKING_UP_SERVER
> at sun.security.krb5.KrbTgsRep.(KrbTgsRep.java:73)
> at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:251)
> at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:262)
> at 
> sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:308)
> at 
> sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:126)
> at 
> sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:458)
> {code}





[jira] [Work logged] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21869?focusedWorklogId=263516=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263516
 ]

ASF GitHub Bot logged work on HIVE-21869:
-

Author: ASF GitHub Bot
Created on: 20/Jun/19 03:55
Start Date: 20/Jun/19 03:55
Worklog Time Spent: 10m 
  Work Description: b-slim commented on issue #677: HIVE-21869 Clean up 
Kafka storage handler examples
URL: https://github.com/apache/hive/pull/677#issuecomment-503831538
 
 
   did quick pass and left minor comments. thanks for the contribution
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263516)
Time Spent: 50m  (was: 40m)

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch, HIVE-21869.2.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-21894) Hadoop credential password storage for the Kafka Storage handler when security is SSL

2019-06-19 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868233#comment-16868233
 ] 

slim bouguerra commented on HIVE-21894:
---

Sounds like a very legitimate use case! I would love to hear more about how 
the Hadoop credentials can be added.

> Hadoop credential password storage for the Kafka Storage handler when 
> security is SSL
> -
>
> Key: HIVE-21894
> URL: https://issues.apache.org/jira/browse/HIVE-21894
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Priority: Major
> Fix For: 4.0.0
>
>
> The Kafka storage handler assumes that if the Hive service is configured with 
> Kerberos, then the destination Kafka cluster is also secured with the same 
> Kerberos realm or a trust of realms. The security configuration of the Kafka 
> client can be overridden because Kafka client configs are additive, but the 
> only way to specify SSL and the keystore/truststore credentials is via 
> plain-text table properties.
> This ticket proposes adding Hadoop credential security to the Kafka storage 
> handler in support of SSL secured Kafka clusters.  
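The proposed lookup order can be sketched as follows. This is illustrative only: the key name and the shape of the credential store are assumptions for the example, not the handler's real configuration keys:

```python
# Sketch of the proposal: resolve a sensitive value from a credential
# store first (e.g. a Hadoop JCEKS provider), falling back to the
# plain-text table property only if the store has no entry.
def resolve_password(credential_store, table_properties, key):
    value = credential_store.get(key)      # secure source, preferred
    if value is None:
        value = table_properties.get(key)  # plain-text fallback
    return value

store = {"kafka.ssl.keystore.password": "secret"}
props = {"kafka.ssl.keystore.password": "plaintext-in-ddl"}
print(resolve_password(store, props, "kafka.ssl.keystore.password"))  # secret
```

The point of the ordering is that a DDL statement never has to carry the real password once the credential store holds it.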





[jira] [Work logged] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21869?focusedWorklogId=263512=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263512
 ]

ASF GitHub Bot logged work on HIVE-21869:
-

Author: ASF GitHub Bot
Created on: 20/Jun/19 03:53
Start Date: 20/Jun/19 03:53
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #677: HIVE-21869 Clean 
up Kafka storage handler examples
URL: https://github.com/apache/hive/pull/677#discussion_r295621867
 
 

 ##
 File path: kafka-handler/README.md
 ##
 @@ -47,71 +76,139 @@ In addition to the user defined payload schema Kafka 
Storage Handler will append
 
 List the table properties and all the partition/offsets information for the 
topic. 
 ```sql
-Describe extended kafka_table;
+DESCRIBE EXTENDED 
+  kafka_table;
 ```
 
-Count the number of records with Kafka record timestamp within the last 10 
minutes interval.
+Count the number of records with where the record timestamp is within the last 
10 minutes of query execution time.
 
 Review comment:
   with where does not sound correct to me
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263512)
Time Spent: 40m  (was: 0.5h)

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch, HIVE-21869.2.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21869?focusedWorklogId=263510&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263510
 ]

ASF GitHub Bot logged work on HIVE-21869:
-

Author: ASF GitHub Bot
Created on: 20/Jun/19 03:52
Start Date: 20/Jun/19 03:52
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #677: HIVE-21869 Clean 
up Kafka storage handler examples
URL: https://github.com/apache/hive/pull/677#discussion_r295621369
 
 

 ##
 File path: kafka-handler/README.md
 ##
 @@ -1,42 +1,71 @@
 # Kafka Storage Handler Module
 
-Storage Handler that allows user to Connect/Analyse/Transform Kafka topics.
-The workflow is as follow,  first the user will create an external table that 
is a view over one Kafka topic,
-then the user will be able to run any SQL query including write back to the 
same table or different kafka backed table.
+Storage Handler that allows users to connect/analyze/transform Kafka topics.
+The workflow is as follows:
+- First, the user will create an external table that is a view over one Kafka 
topic
+- Second, the user will be able to run any SQL query including write back to 
the same table or different Kafka backed table
+
+## Kafka Management
+
+Kafka Java client version: 2.x
+
+This handler does not commit offsets of topic partition reads either using the 
intrinsic Kafka capability or in an external
+storage.  This means a query over a Kafka topic backed table will be a full 
topic read unless partitions are filtered
+manually, via SQL, by the methods described below. In the ETL section, a 
method for storing topic offsets in Hive tables
+is provided for tracking consumer position but this is not a part of the 
handler itself.
 
 ## Usage
 
 ### Create Table
-Use following statement to create table:
+Use the following statement to create a table:
+
 ```sql
-CREATE EXTERNAL TABLE kafka_table
-(`timestamp` timestamp , `page` string,  `newPage` boolean,
- added int, deleted bigint, delta double)
-STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
-TBLPROPERTIES
-("kafka.topic" = "test-topic", "kafka.bootstrap.servers"="localhost:9092");
+CREATE EXTERNAL TABLE 
+  kafka_table (
+`timestamp` TIMESTAMP,
+`page` STRING,
+`newPage` BOOLEAN,
+`added` INT, 
+`deleted` BIGINT,
+`delta` DOUBLE)
+STORED BY 
+  'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
+TBLPROPERTIES ( 
+  "kafka.topic" = "test-topic",
+  "kafka.bootstrap.servers" = "localhost:9092");
 ```
-Table property `kafka.topic` is the Kafka Topic to connect to and 
`kafka.bootstrap.servers` is the Broker connection string.
+
+The table property `kafka.topic` is the Kafka topic to connect to and 
`kafka.bootstrap.servers` is the Kafka broker connection string.
 Both properties are mandatory.
-On the write path if such a topic does not exists the topic will be created if 
Kafka broker admin policy allow such operation.
+On the write path if such a topic does not exist the topic will be created if 
Kafka broker admin policy allows for 
+auto topic creation.
+
+By default the serializer and deserializer is JSON, specifically 
`org.apache.hadoop.hive.serde2.JsonSerDe`.
+
+If you want to change the serializer/deserializer classes you can update the 
TBLPROPERTIES with SQL syntax `ALTER TABLE`.
 
-By default the serializer and deserializer is Json 
`org.apache.hadoop.hive.serde2.JsonSerDe`.
-If you want to switch serializer/deserializer classes you can use alter table.
 ```sql
-ALTER TABLE kafka_table SET TBLPROPERTIES 
("kafka.serde.class"="org.apache.hadoop.hive.serde2.avro.AvroSerDe");
-``` 
-List of supported Serializer Deserializer:
+ALTER TABLE 
+  kafka_table 
+SET TBLPROPERTIES (
+  "kafka.serde.class" = "org.apache.hadoop.hive.serde2.avro.AvroSerDe");
+```
+ 
+List of supported serializers and deserializers:
 
-|Supported Serializer Deserializer|
+|Supported Serializers and Deserializers|
 |-|
 |org.apache.hadoop.hive.serde2.JsonSerDe|
 |org.apache.hadoop.hive.serde2.OpenCSVSerde|
-|org.apache.hadoop.hive.serde2.avro.AvroSerDe|
+|org.apache.hadoop.hive.serde2.avro.AvroSerDe*|
 |org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe|
 |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe|
 
- Table definition 
-In addition to the user defined payload schema Kafka Storage Handler will 
append additional columns allowing user to query the Kafka metadata fields:
+`*` This is just Apache Avro and not Confluent's serialization with schema 
registry integration.
 
 Review comment:
   nit...not sure if this is really needed.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:

[jira] [Work logged] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21869?focusedWorklogId=263509&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263509
 ]

ASF GitHub Bot logged work on HIVE-21869:
-

Author: ASF GitHub Bot
Created on: 20/Jun/19 03:51
Start Date: 20/Jun/19 03:51
Worklog Time Spent: 10m 
  Work Description: b-slim commented on pull request #677: HIVE-21869 Clean 
up Kafka storage handler examples
URL: https://github.com/apache/hive/pull/677#discussion_r295620689
 
 

 ##
 File path: kafka-handler/README.md
 ##
 @@ -1,42 +1,71 @@
 # Kafka Storage Handler Module
 
-Storage Handler that allows user to Connect/Analyse/Transform Kafka topics.
-The workflow is as follow,  first the user will create an external table that 
is a view over one Kafka topic,
-then the user will be able to run any SQL query including write back to the 
same table or different kafka backed table.
+Storage Handler that allows users to connect/analyze/transform Kafka topics.
+The workflow is as follows:
+- First, the user will create an external table that is a view over one Kafka 
topic
+- Second, the user will be able to run any SQL query including write back to 
the same table or different Kafka backed table
+
+## Kafka Management
+
+Kafka Java client version: 2.x
+
+This handler does not commit offsets of topic partition reads either using the 
intrinsic Kafka capability or in an external
+storage.  This means a query over a Kafka topic backed table will be a full 
topic read unless partitions are filtered
+manually, via SQL, by the methods described below. In the ETL section, a 
method for storing topic offsets in Hive tables
+is provided for tracking consumer position but this is not a part of the 
handler itself.
 
 ## Usage
 
 ### Create Table
-Use following statement to create table:
+Use the following statement to create a table:
+
 ```sql
-CREATE EXTERNAL TABLE kafka_table
-(`timestamp` timestamp , `page` string,  `newPage` boolean,
- added int, deleted bigint, delta double)
-STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
-TBLPROPERTIES
-("kafka.topic" = "test-topic", "kafka.bootstrap.servers"="localhost:9092");
+CREATE EXTERNAL TABLE 
+  kafka_table (
+`timestamp` TIMESTAMP,
+`page` STRING,
+`newPage` BOOLEAN,
+`added` INT, 
+`deleted` BIGINT,
+`delta` DOUBLE)
+STORED BY 
+  'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
+TBLPROPERTIES ( 
+  "kafka.topic" = "test-topic",
+  "kafka.bootstrap.servers" = "localhost:9092");
 ```
-Table property `kafka.topic` is the Kafka Topic to connect to and 
`kafka.bootstrap.servers` is the Broker connection string.
+
+The table property `kafka.topic` is the Kafka topic to connect to and 
`kafka.bootstrap.servers` is the Kafka broker connection string.
 Both properties are mandatory.
-On the write path if such a topic does not exists the topic will be created if 
Kafka broker admin policy allow such operation.
+On the write path if such a topic does not exist the topic will be created if 
Kafka broker admin policy allows for 
+auto topic creation.
+
+By default the serializer and deserializer is JSON, specifically 
`org.apache.hadoop.hive.serde2.JsonSerDe`.
+
+If you want to change the serializer/deserializer classes you can update the 
TBLPROPERTIES with SQL syntax `ALTER TABLE`.
 
-By default the serializer and deserializer is Json 
`org.apache.hadoop.hive.serde2.JsonSerDe`.
-If you want to switch serializer/deserializer classes you can use alter table.
 ```sql
-ALTER TABLE kafka_table SET TBLPROPERTIES 
("kafka.serde.class"="org.apache.hadoop.hive.serde2.avro.AvroSerDe");
-``` 
-List of supported Serializer Deserializer:
+ALTER TABLE 
+  kafka_table 
+SET TBLPROPERTIES (
+  "kafka.serde.class" = "org.apache.hadoop.hive.serde2.avro.AvroSerDe");
+```
+ 
+List of supported serializers and deserializers:
 
-|Supported Serializer Deserializer|
+|Supported Serializers and Deserializers|
 |-|
 |org.apache.hadoop.hive.serde2.JsonSerDe|
 |org.apache.hadoop.hive.serde2.OpenCSVSerde|
-|org.apache.hadoop.hive.serde2.avro.AvroSerDe|
+|org.apache.hadoop.hive.serde2.avro.AvroSerDe*|
 
 Review comment:
   not sure what is the `*` at the end
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263509)
Time Spent: 20m  (was: 10m)

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: 

[jira] [Updated] (HIVE-21764) REPL DUMP should detect and bootstrap any rename table events where old table was excluded but renamed table is included.

2019-06-19 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-21764:
---
Status: Patch Available  (was: Open)

> REPL DUMP should detect and bootstrap any rename table events where old table 
> was excluded but renamed table is included.
> -
>
> Key: HIVE-21764
> URL: https://issues.apache.org/jira/browse/HIVE-21764
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21764.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> REPL DUMP fetches the events from the NOTIFICATION_LOG table based on a regular 
> expression plus an inclusion/exclusion list. So, in the case of a rename table 
> event, the event will be ignored if the old table doesn't match the pattern, even 
> though the new table should be bootstrapped. REPL DUMP should have a mechanism to 
> detect such tables and automatically bootstrap them during incremental 
> replication. Also, if the renamed table is excluded from the replication policy, 
> then the old table needs to be dropped at the target as well. 
> There are 4 scenarios that need to be handled.
>  # Both the new name and the old name satisfy the table name pattern filter.
>  ## No need to do anything. The incremental event for the rename should take care 
> of the replication.
>  # Neither name satisfies the table name pattern filter.
>  ## Both names are outside the scope of the policy and thus nothing needs to be 
> done.
>  # The new name satisfies the pattern but the old name does not.
>  ## The table will not be present at the target.
>  ## The rename event handler for dump should detect this case and add the new 
> table name to the list of tables to bootstrap.
>  ## All the events related to the table (new name) should be ignored.
>  ## If there is a drop event for the table (with the new name), then remove the 
> table from the list of tables to be bootstrapped.
>  ## In case of a rename (double rename):
>  ### If the new name satisfies the table pattern, then add the new name to the 
> list of tables to be bootstrapped and remove the old name from the list of tables 
> to be bootstrapped.
>  ### If the new name does not satisfy, then just remove the table name from the 
> list of tables to be bootstrapped.
>  # The new name does not satisfy the pattern but the old name does.
>  ## Change the rename event to a drop event.
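Scenario 3 is the subtle one; a minimal illustration (the replication policy syntax, table names, and database name are illustrative, not taken from the ticket):

```sql
-- Hypothetical policy that replicates only tables matching 't.*'.
REPL DUMP src.'t.*';

-- 'orders' is outside the policy, so its events were never dumped.
-- After this rename the table is inside the policy, but the target has
-- no prior state for it, so the next incremental REPL DUMP must
-- bootstrap 't_orders' rather than rely on the rename event alone.
ALTER TABLE orders RENAME TO t_orders;
```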



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21764) REPL DUMP should detect and bootstrap any rename table events where old table was excluded but renamed table is included.

2019-06-19 Thread mahesh kumar behera (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-21764:
---
Attachment: HIVE-21764.01.patch

> REPL DUMP should detect and bootstrap any rename table events where old table 
> was excluded but renamed table is included.
> -
>
> Key: HIVE-21764
> URL: https://issues.apache.org/jira/browse/HIVE-21764
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21764.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> REPL DUMP fetches the events from the NOTIFICATION_LOG table based on a regular 
> expression plus an inclusion/exclusion list. So, in the case of a rename table 
> event, the event will be ignored if the old table doesn't match the pattern, even 
> though the new table should be bootstrapped. REPL DUMP should have a mechanism to 
> detect such tables and automatically bootstrap them during incremental 
> replication. Also, if the renamed table is excluded from the replication policy, 
> then the old table needs to be dropped at the target as well. 
> There are 4 scenarios that need to be handled.
>  # Both the new name and the old name satisfy the table name pattern filter.
>  ## No need to do anything. The incremental event for the rename should take care 
> of the replication.
>  # Neither name satisfies the table name pattern filter.
>  ## Both names are outside the scope of the policy and thus nothing needs to be 
> done.
>  # The new name satisfies the pattern but the old name does not.
>  ## The table will not be present at the target.
>  ## The rename event handler for dump should detect this case and add the new 
> table name to the list of tables to bootstrap.
>  ## All the events related to the table (new name) should be ignored.
>  ## If there is a drop event for the table (with the new name), then remove the 
> table from the list of tables to be bootstrapped.
>  ## In case of a rename (double rename):
>  ### If the new name satisfies the table pattern, then add the new name to the 
> list of tables to be bootstrapped and remove the old name from the list of tables 
> to be bootstrapped.
>  ### If the new name does not satisfy, then just remove the table name from the 
> list of tables to be bootstrapped.
>  # The new name does not satisfy the pattern but the old name does.
>  ## Change the rename event to a drop event.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868230#comment-16868230
 ] 

slim bouguerra commented on HIVE-21895:
---

[~kristopherkane] you need to set the Patch Available flag in order to trigger the 
tests.

thanks

> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21895.1.patch
>
>
> The Kafka client version is 2.2 and there are deprecated methods used like
> {code:java}
> producer.close(0, TimeUnit){code}
> in SimpleKafkaWriter
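For reference, Kafka 2.x deprecates `close(long, TimeUnit)` in favor of the `Duration`-based overload. A hedged sketch of what the fix in `SimpleKafkaWriter` might look like (the surrounding `producer` variable is assumed; this is not the actual patch):

```java
// Deprecated since Kafka 2.0:
// producer.close(0, TimeUnit.MILLISECONDS);

// Replacement using the Duration-based overload:
producer.close(java.time.Duration.ZERO);
```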



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868219#comment-16868219
 ] 

Hive QA commented on HIVE-21787:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
13s{color} | {color:blue} standalone-metastore/metastore-server in master has 
184 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
45s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
41s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 41s{color} 
| {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 6 new + 44 unchanged - 165 fixed = 50 total (was 209) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} standalone-metastore/metastore-server generated 0 
new + 179 unchanged - 5 fixed = 179 total (was 184) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
42s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17665/dev-support/hive-personality.sh
 |
| git revision | master / cd42db4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17665/yetus/patch-mvninstall-itests_hive-unit.txt
 |
| compile | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17665/yetus/patch-compile-itests_hive-unit.txt
 |
| javac | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17665/yetus/patch-compile-itests_hive-unit.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17665/yetus/diff-checkstyle-standalone-metastore_metastore-server.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17665/yetus/patch-findbugs-itests_hive-unit.txt
 |
| modules | C: standalone-metastore/metastore-server itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17665/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
> 

[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868204#comment-16868204
 ] 

Hive QA commented on HIVE-21891:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:red}-1{color} | {color:red} @author {color} | {color:red}  0m  
0s{color} | {color:red} The patch appears to contain 5 @author tags which the 
community has agreed to not allow in code contributions. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
7s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
25s{color} | {color:blue} contrib in master has 10 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} hcatalog/core in master has 28 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
46s{color} | {color:blue} itests/util in master has 44 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
47s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} ql: The patch generated 0 new + 1306 unchanged - 3 
fixed = 1306 total (was 1309) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch contrib passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch core passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
37s{color} | {color:green} root: The patch generated 0 new + 2185 unchanged - 3 
fixed = 2185 total (was 2188) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch util passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 82m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868202#comment-16868202
 ] 

Hive QA commented on HIVE-21891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972276/HIVE-21891.04.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16168 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17664/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17664/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17664/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972276 - PreCommit-HIVE-Build

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to have everything cut into smaller, more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding the use of fully qualified class names where both the old and the new 
> classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of registering, now DDLTask finds the DDLOperations and 
> registers them itself.
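The "operation-agnostic task plus a registry of operations" design can be sketched as follows. This is a generic illustration, not Hive's actual code: the class and method names (`DdlRegistrySketch`, `DdlDesc`, `execute`) are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of the registry pattern the refactor describes:
 *  the task stays agnostic and dispatches each request (desc) to its operation. */
public class DdlRegistrySketch {
    interface DdlDesc {}                                  // marker for immutable request objects
    static class ShowTablesDesc implements DdlDesc {}     // one class per operation...
    static class DropTableDesc implements DdlDesc {}      // ...grouped by operation kind

    // Maps a desc class to the operation that executes it; operations register here.
    static final Map<Class<? extends DdlDesc>, Function<DdlDesc, Integer>> OPS = new HashMap<>();
    static {
        OPS.put(ShowTablesDesc.class, d -> 0);  // pretend: list tables, return success code
        OPS.put(DropTableDesc.class, d -> 0);   // pretend: drop table, return success code
    }

    /** The task itself knows nothing about individual DDL operations. */
    static int execute(DdlDesc desc) {
        Function<DdlDesc, Integer> op = OPS.get(desc.getClass());
        if (op == null) {
            throw new IllegalStateException("no operation registered for " + desc.getClass());
        }
        return op.apply(desc);
    }

    public static void main(String[] args) {
        System.out.println(execute(new ShowTablesDesc())); // prints 0
    }
}
```

Each new operation is then one small class plus one registry entry, instead of another branch in a 5000-line switch.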



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21764) REPL DUMP should detect and bootstrap any rename table events where old table was excluded but renamed table is included.

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21764:
--
Labels: DR Replication pull-request-available  (was: DR Replication)

> REPL DUMP should detect and bootstrap any rename table events where old table 
> was excluded but renamed table is included.
> -
>
> Key: HIVE-21764
> URL: https://issues.apache.org/jira/browse/HIVE-21764
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, Replication, pull-request-available
>
> REPL DUMP fetches the events from the NOTIFICATION_LOG table based on a regular 
> expression plus an inclusion/exclusion list. So, in the case of a rename table 
> event, the event is ignored if the old table name doesn't match the pattern, even 
> though the renamed table should be bootstrapped. REPL DUMP should have a 
> mechanism to detect such tables and automatically bootstrap them with incremental 
> replication. Also, if the renamed table is excluded from the replication policy, 
> then the old table needs to be dropped at the target as well.
> There are 4 scenarios that need to be handled:
>  # Both the new name and the old name satisfy the table name pattern filter.
>  ## No need to do anything. The incremental event for the rename takes care of 
> the replication.
>  # Neither name satisfies the table name pattern filter.
>  ## Both names are outside the scope of the policy, so nothing needs to be done.
>  # The new name satisfies the pattern but the old name does not.
>  ## The table will not be present at the target.
>  ## The rename event handler for dump should detect this case and add the new 
> table name to the list of tables to bootstrap.
>  ## All the events related to the table (new name) should be ignored.
>  ## If there is a drop event for the table (with the new name), then remove the 
> table from the list of tables to be bootstrapped.
>  ## In case of a rename (double rename):
>  ### If the new name satisfies the table pattern, then add the new name to the 
> list of tables to be bootstrapped and remove the old name from that list.
>  ### If the new name does not satisfy the pattern, then just remove the table 
> name from the list of tables to be bootstrapped.
>  # The new name does not satisfy the pattern but the old name does.
>  ## Change the rename event to a drop event.
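The four scenarios above can be sketched as a single decision function. This is a rough illustration under assumed names (handleRename, a plain Set for the bootstrap list), not Hive's actual replication API:

```java
import java.util.HashSet;
import java.util.Set;

public class RenameEventHandlerSketch {
    enum Action { USE_INCREMENTAL_EVENT, IGNORE, BOOTSTRAP_NEW_TABLE, DROP_OLD_TABLE }

    static Action handleRename(boolean oldInScope, boolean newInScope,
                               Set<String> bootstrapList, String oldName, String newName) {
        if (oldInScope && newInScope) {
            return Action.USE_INCREMENTAL_EVENT;   // scenario 1: plain incremental rename
        }
        if (!oldInScope && !newInScope) {
            bootstrapList.remove(oldName);         // covers a double rename out of scope
            return Action.IGNORE;                  // scenario 2: out of policy scope
        }
        if (newInScope) {                          // scenario 3: table enters the policy
            bootstrapList.remove(oldName);         // double rename: drop the older entry
            bootstrapList.add(newName);            // bootstrap the table under its new name
            return Action.BOOTSTRAP_NEW_TABLE;
        }
        return Action.DROP_OLD_TABLE;              // scenario 4: rename becomes a drop
    }

    public static void main(String[] args) {
        Set<String> bootstrap = new HashSet<>();
        System.out.println(handleRename(false, true, bootstrap, "t_old", "t_new"));
        System.out.println(bootstrap); // [t_new]
    }
}
```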



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21764) REPL DUMP should detect and bootstrap any rename table events where old table was excluded but renamed table is included.

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21764?focusedWorklogId=263495=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263495
 ]

ASF GitHub Bot logged work on HIVE-21764:
-

Author: ASF GitHub Bot
Created on: 20/Jun/19 02:28
Start Date: 20/Jun/19 02:28
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #679: HIVE-21764 : 
REPL DUMP should detect and bootstrap any rename table events where old table 
was excluded but renamed table is included.
URL: https://github.com/apache/hive/pull/679
 
 
   …
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263495)
Time Spent: 10m
Remaining Estimate: 0h

> REPL DUMP should detect and bootstrap any rename table events where old table 
> was excluded but renamed table is included.
> -
>
> Key: HIVE-21764
> URL: https://issues.apache.org/jira/browse/HIVE-21764
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: DR, Replication, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> REPL DUMP fetches the events from the NOTIFICATION_LOG table based on a regular 
> expression plus an inclusion/exclusion list. So, in the case of a rename table 
> event, the event is ignored if the old table name doesn't match the pattern, even 
> though the renamed table should be bootstrapped. REPL DUMP should have a 
> mechanism to detect such tables and automatically bootstrap them with incremental 
> replication. Also, if the renamed table is excluded from the replication policy, 
> then the old table needs to be dropped at the target as well.
> There are 4 scenarios that need to be handled:
>  # Both the new name and the old name satisfy the table name pattern filter.
>  ## No need to do anything. The incremental event for the rename takes care of 
> the replication.
>  # Neither name satisfies the table name pattern filter.
>  ## Both names are outside the scope of the policy, so nothing needs to be done.
>  # The new name satisfies the pattern but the old name does not.
>  ## The table will not be present at the target.
>  ## The rename event handler for dump should detect this case and add the new 
> table name to the list of tables to bootstrap.
>  ## All the events related to the table (new name) should be ignored.
>  ## If there is a drop event for the table (with the new name), then remove the 
> table from the list of tables to be bootstrapped.
>  ## In case of a rename (double rename):
>  ### If the new name satisfies the table pattern, then add the new name to the 
> list of tables to be bootstrapped and remove the old name from that list.
>  ### If the new name does not satisfy the pattern, then just remove the table 
> name from the list of tables to be bootstrapped.
>  # The new name does not satisfy the pattern but the old name does.
>  ## Change the rename event to a drop event.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An updated HIVE-21787:
--
Attachment: HIVE-21787.14.patch

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.13.patch, HIVE-21787.14.patch, 
> HIVE-21787.2.patch, HIVE-21787.3.patch, HIVE-21787.4.patch, 
> HIVE-21787.5.patch, HIVE-21787.6.patch, HIVE-21787.7.patch, 
> HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses a black/white list to specify patterns of tables to 
> load into the cache. The cache is loaded in one shot ("prewarm") and updated by 
> a background thread. This is not a very efficient design. 
> In this feature, we enhance the table cache with LRU eviction to improve cache 
> utilization.
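An LRU-evicting table cache can be sketched with a size-bounded, access-ordered LinkedHashMap. This is a minimal illustration of the eviction idea only, not Hive's actual SharedCache implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruTableCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruTableCacheSketch(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder=true keeps entries in LRU order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;  // evict the least-recently-used entry on overflow
    }

    public static void main(String[] args) {
        LruTableCacheSketch<String, String> cache = new LruTableCacheSketch<>(2);
        cache.put("db.t1", "meta1");
        cache.put("db.t2", "meta2");
        cache.get("db.t1");           // touch t1 so t2 becomes least recently used
        cache.put("db.t3", "meta3");  // exceeds capacity, evicts db.t2
        System.out.println(cache.containsKey("db.t2")); // false
    }
}
```

A production cache would also need thread safety and an eviction callback to release per-table resources; a Guava Cache with a removal listener is a common alternative.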



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868161#comment-16868161
 ] 

Hive QA commented on HIVE-21787:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972274/HIVE-21787.13.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17663/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17663/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17663/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12972274/HIVE-21787.13.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972274 - PreCommit-HIVE-Build

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.13.patch, HIVE-21787.2.patch, 
> HIVE-21787.3.patch, HIVE-21787.4.patch, HIVE-21787.5.patch, 
> HIVE-21787.6.patch, HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses a black/white list to specify patterns of tables to 
> load into the cache. The cache is loaded in one shot ("prewarm") and updated by 
> a background thread. This is not a very efficient design. 
> In this feature, we enhance the table cache with LRU eviction to improve cache 
> utilization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868160#comment-16868160
 ] 

Hive QA commented on HIVE-21787:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972274/HIVE-21787.13.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16170 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17662/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17662/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17662/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972274 - PreCommit-HIVE-Build

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.13.patch, HIVE-21787.2.patch, 
> HIVE-21787.3.patch, HIVE-21787.4.patch, HIVE-21787.5.patch, 
> HIVE-21787.6.patch, HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses a black/white list to specify patterns of tables to 
> load into the cache. The cache is loaded in one shot ("prewarm") and updated by 
> a background thread. This is not a very efficient design. 
> In this feature, we enhance the table cache with LRU eviction to improve cache 
> utilization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868143#comment-16868143
 ] 

Hive QA commented on HIVE-21787:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
11s{color} | {color:blue} standalone-metastore/metastore-server in master has 
184 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
41s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
43s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 43s{color} 
| {color:red} hive-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
18s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 6 new + 44 unchanged - 165 fixed = 50 total (was 209) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
22s{color} | {color:red} standalone-metastore/metastore-server generated 1 new 
+ 179 unchanged - 5 fixed = 180 total (was 184) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
35s{color} | {color:red} hive-unit in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  Invocation of toString on SharedCache$TableWrapper.getSdHash() in 
org.apache.hadoop.hive.metastore.cache.SharedCache$1.onRemoval(RemovalNotification)
  At SharedCache.java:in 
org.apache.hadoop.hive.metastore.cache.SharedCache$1.onRemoval(RemovalNotification)
  At SharedCache.java:[line 207] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17662/dev-support/hive-personality.sh
 |
| git revision | master / cd42db4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| mvninstall | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17662/yetus/patch-mvninstall-itests_hive-unit.txt
 |
| compile | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17662/yetus/patch-compile-itests_hive-unit.txt
 |
| javac | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17662/yetus/patch-compile-itests_hive-unit.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17662/yetus/diff-checkstyle-standalone-metastore_metastore-server.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17662/yetus/new-findbugs-standalone-metastore_metastore-server.html
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17662/yetus/patch-findbugs-itests_hive-unit.txt
 |
| 

[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: (was: HIVE-21891.04.patch)

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to have everything cut into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue of some operations being handled by DDLTask which 
> are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding fully qualified class names where both the old and the new classes 
> were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the old 
> DDLDesc. Instead of the operations registering themselves, DDLTask now finds 
> the DDLOperations and registers them itself.
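The discovery-and-self-registration idea in Step #12 can be sketched roughly as follows; the class and operation names here are hypothetical, not Hive's actual DDL framework:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class DdlRegistrySketch {
    // Hypothetical stand-ins for DDLDesc subclasses (one immutable desc per operation).
    interface DdlDesc {}
    static class CreateTableDesc implements DdlDesc {}
    static class DropTableDesc implements DdlDesc {}

    // The task builds the desc-class -> operation mapping itself, so it stays
    // agnostic to the individual operations; adding one means adding a map entry.
    private static final Map<Class<? extends DdlDesc>, Function<DdlDesc, String>> OPS =
            new HashMap<>();
    static {
        OPS.put(CreateTableDesc.class, d -> "create-table");
        OPS.put(DropTableDesc.class, d -> "drop-table");
    }

    static String execute(DdlDesc desc) {
        Function<DdlDesc, String> op = OPS.get(desc.getClass());
        if (op == null) {
            throw new IllegalArgumentException("no operation registered for " + desc);
        }
        return op.apply(desc);
    }

    public static void main(String[] args) {
        System.out.println(execute(new CreateTableDesc())); // create-table
    }
}
```

In practice the discovery step would scan the operations package (or an annotation) rather than a hard-coded static block.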



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: HIVE-21891.04.patch

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to have everything cut into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue of some operations being handled by DDLTask which 
> are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding fully qualified class names where both the old and the new classes 
> were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the old 
> DDLDesc. Instead of the operations registering themselves, DDLTask now finds 
> the DDLOperations and registers them itself.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868132#comment-16868132
 ] 

Hive QA commented on HIVE-21891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972266/HIVE-21891.04.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17661/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17661/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17661/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-06-20 00:07:47.067
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-17661/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-06-20 00:07:47.071
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at cd42db4 HIVE-21892: Trusted domain authentication should look at 
X-Forwarded-For header as well (Prasanth Jayachandran reviewed by Jason Dere, 
Ashutosh Bapat)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at cd42db4 HIVE-21892: Trusted domain authentication should look at 
X-Forwarded-For header as well (Prasanth Jayachandran reviewed by Jason Dere, 
Ashutosh Bapat)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-06-20 00:07:48.990
+ rm -rf ../yetus_PreCommit-HIVE-Build-17661
+ mkdir ../yetus_PreCommit-HIVE-Build-17661
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-17661
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-17661/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: git apply -p0
/data/hiveptest/working/scratch/build.patch:7982: trailing whitespace.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.ddl.DDLTask. Table tstsrc is not locked 
/data/hiveptest/working/scratch/build.patch:7992: trailing whitespace.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.ddl.DDLTask. Table tstsrcpart is not locked 
/data/hiveptest/working/scratch/build.patch:8022: trailing whitespace.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.ddl.DDLTask. Database lockneg1 is not locked 
warning: 3 lines add whitespace errors.
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]   
[ERROR]   The project  
(/data/hiveptest/working/apache-github-source-source/ql/pom.xml) has 1 error
[ERROR] Non-parseable POM 
/data/hiveptest/working/apache-github-source-source/ql/pom.xml: end tag name 
 must be the same as start tag  from line 760 
(position: TEXT seen ...\n... @764:18)  @ line 764, 
column 18 -> [Help 2]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] 
http://cwiki.apache.org/confluence/display/MAVEN/ModelParseException
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-17661
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972266 - 

[jira] [Commented] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868131#comment-16868131
 ] 

Hive QA commented on HIVE-21869:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972264/HIVE-21869.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16168 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17660/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17660/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17660/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972264 - PreCommit-HIVE-Build

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch, HIVE-21869.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21898) Wrong result with IN correlated subquery with aggregate in SELECT

2019-06-19 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868126#comment-16868126
 ] 

Vineet Garg commented on HIVE-21898:


Hive currently throws a runtime error for queries which are correlated, produce 
an empty result set, and contain a COUNT aggregate in the subquery. The above 
query should follow the same behavior.

> Wrong result with IN correlated subquery with aggregate in SELECT
> -
>
> Key: HIVE-21898
> URL: https://issues.apache.org/jira/browse/HIVE-21898
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> {code:sql}
> select 
> p_size in
>   (select min(p_size)
>from (select p_mfgr, p_size from part) a
>where a.p_mfgr = b.p_name
>   ) from part b limit 1
> {code}
> Expected result: *null*
>  Actual result: *false*
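The expected *null* follows from SQL's three-valued logic: an ungrouped MIN over an empty input still produces a single row containing NULL, and `x IN (NULL)` evaluates to UNKNOWN rather than FALSE. A small sketch of that semantics in plain Java (using a nullable Boolean for UNKNOWN; this illustrates the standard semantics, not Hive's evaluator):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ThreeValuedInSketch {
    // SQL IN under three-valued logic: TRUE if some row equals the value,
    // UNKNOWN (null here) if no row matches but a row is NULL, else FALSE.
    static Boolean sqlIn(Integer value, List<Integer> rows) {
        boolean sawNull = false;
        for (Integer r : rows) {
            if (r == null) {
                sawNull = true;
            } else if (r.equals(value)) {
                return Boolean.TRUE;
            }
        }
        return sawNull ? null : Boolean.FALSE;
    }

    // An ungrouped MIN always produces exactly one row; over empty input it is NULL.
    static List<Integer> minOver(List<Integer> input) {
        Integer min = input.stream().filter(x -> x != null)
                           .min(Integer::compare).orElse(null);
        return Collections.singletonList(min);
    }

    public static void main(String[] args) {
        // Correlated filter matched nothing -> MIN over empty input -> (NULL)
        System.out.println(sqlIn(10, minOver(Collections.emptyList()))); // null, not false
        System.out.println(sqlIn(10, Arrays.asList(5, 10)));             // true
    }
}
```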



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21898) Wrong result with IN correlated subquery with aggregate in SELECT

2019-06-19 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-21898:
--


> Wrong result with IN correlated subquery with aggregate in SELECT
> -
>
> Key: HIVE-21898
> URL: https://issues.apache.org/jira/browse/HIVE-21898
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> {code:sql}
> select 
> p_size in
>   (select min(p_size)
>from (select p_mfgr, p_size from part) a
>where a.p_mfgr = b.p_name
>   ) from part b limit 1
> {code}
> Expected result: null
> Actual result: false



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21898) Wrong result with IN correlated subquery with aggregate in SELECT

2019-06-19 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21898:
---
Description: 
{code:sql}
select 
p_size in
(select min(p_size)
 from (select p_mfgr, p_size from part) a
 where a.p_mfgr = b.p_name
) from part b limit 1
{code}
Expected result: *null*
 Actual result: *false*

  was:
{code:sql}
select 
p_size in
(select min(p_size)
 from (select p_mfgr, p_size from part) a
 where a.p_mfgr = b.p_name
) from part b limit 1
{code}

Expected result: null
Actual result: false


> Wrong result with IN correlated subquery with aggregate in SELECT
> -
>
> Key: HIVE-21898
> URL: https://issues.apache.org/jira/browse/HIVE-21898
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> {code:sql}
> select 
> p_size in
>   (select min(p_size)
>from (select p_mfgr, p_size from part) a
>where a.p_mfgr = b.p_name
>   ) from part b limit 1
> {code}
> Expected result: *null*
>  Actual result: *false*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An updated HIVE-21787:
--
Attachment: HIVE-21787.13.patch

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.13.patch, HIVE-21787.2.patch, 
> HIVE-21787.3.patch, HIVE-21787.4.patch, HIVE-21787.5.patch, 
> HIVE-21787.6.patch, HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses a black/white list to specify patterns of tables to 
> load into the cache. The cache is loaded in one shot ("prewarm") and updated by 
> a background thread. This is not a very efficient design. 
> In this feature, we enhance the table cache with LRU eviction to improve cache 
> utilization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An updated HIVE-21787:
--
Attachment: (was: HIVE-21787.12.patch)

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.2.patch, HIVE-21787.3.patch, 
> HIVE-21787.4.patch, HIVE-21787.5.patch, HIVE-21787.6.patch, 
> HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses a black/white list to specify patterns of tables to 
> load into the cache. The cache is loaded in one shot ("prewarm") and updated by 
> a background thread. This is not a very efficient design. 
> In this feature, we enhance the table cache with LRU eviction to improve cache 
> utilization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An updated HIVE-21787:
--
Attachment: HIVE-21787.12.patch

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.12.patch, HIVE-21787.2.patch, 
> HIVE-21787.3.patch, HIVE-21787.4.patch, HIVE-21787.5.patch, 
> HIVE-21787.6.patch, HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses a black/white list to specify patterns of tables to 
> load into the cache. The cache is loaded in one shot ("prewarm") and updated by 
> a background thread. This is not a very efficient design. 
> In this feature, we enhance the table cache with LRU eviction to improve cache 
> utilization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-21897) Setting serde / serde properties for partitions

2019-06-19 Thread Miklos Gergely (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868104#comment-16868104
 ] 

Miklos Gergely edited comment on HIVE-21897 at 6/19/19 11:40 PM:
-

[~mithun] after executing those commands:

DESCRIBE EXTENDED foobar;

 
{code:java}
+-----------------------------+-------------+----------+
|          col_name           |  data_type  | comment  |
+-----------------------------+-------------+----------+
| foo                         | string      |          |
| bar                         | string      |          |
| dt                          | string      |          |
|                             | NULL        | NULL     |
| # Partition Information     | NULL        | NULL     |
| # col_name                  | data_type   | comment  |
| dt                          | string      |          |
|                             | NULL        | NULL     |
| Detailed Table Information  | Table(tableName:foobar, dbName:default, owner:hive, createTime:1560986681, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:foo, type:string, comment:null), FieldSchema(name:bar, type:string, comment:null), FieldSchema(name:dt, type:string, comment:null)], location:hdfs://hive-on-tezt-1.vpc.cloudera.com:8020/warehouse/tablespace/managed/hive/foobar, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[FieldSchema(name:dt, type:string, comment:null)], parameters:{last_modified_time=1560986885, totalSize=0, numRows=0, rawDataSize=0, transactional_properties=insert_only, COLUMN_STATS_ACCURATE={\"BASIC_STATS\":\"true\"}, numFiles=0, numPartitions=2, transient_lastDdlTime=1560986885, bucketing_version=2, last_modified_by=hive, transactional=true}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, rewriteEnabled:false, catName:hive, ownerType:USER, writeId:0) |          |
+-----------------------------+-------------+----------+
{code}
SHOW CREATE TABLE foobar;

 
{code:java}
+------------------------------------------------------+
|                    createtab_stmt                    |
+------------------------------------------------------+
| CREATE TABLE `foobar`(                               |
|   `foo` string,                                      |
|   `bar` string)                                      |
| PARTITIONED BY (                                     |
|   `dt` string)                                       |
| ROW FORMAT SERDE                                     |
|   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'        |
| STORED AS INPUTFORMAT                                |
|   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
| OUTPUTFORMAT                                         |
|   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
| LOCATION                                             |
|   'hdfs://hive-on-tezt-1.vpc.cloudera.com:8020/warehouse/tablespace/managed/hive/foobar' |
| TBLPROPERTIES (                                      |
|   'bucketing_version'='2',                           |
|   'last_modified_by'='hive',                         |
|   'last_modified_time'='1560986885',                 |
|   'transactional'='true',                            |
|   'transactional_properties'='insert_only',          |
|   'transient_lastDdlTime'='1560986885')              |
+------------------------------------------------------+
{code}
So it seems the table has only one SerDe, not one per partition. Do we want to 
allow a different SerDe per partition? If we do, it needs planning and code 
changes; for now we may stick to one SerDe per table.


was (Author: mgergely):
[~mithun] after executing those commands:

DESCRIBE EXTENDED foobar;

 
{code:java}
+-----------------------------+-------------+----------+
|          col_name           |  data_type  | comment  |
+-----------------------------+-------------+----------+
| foo                         | string      |          

[jira] [Commented] (HIVE-21897) Setting serde / serde properties for partitions

2019-06-19 Thread Miklos Gergely (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868108#comment-16868108
 ] 

Miklos Gergely commented on HIVE-21897:
---

[~mithun] just to be sure, I've inserted two rows, one with dt=1 and one with 
dt=2, and checked the files in HDFS. They are both ORC files.

> Setting serde / serde properties for partitions
> ---
>
> Key: HIVE-21897
> URL: https://issues.apache.org/jira/browse/HIVE-21897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties]
>  the SerDe and the SerDe properties can be set for a partition too, so
>  
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET SERDE 
> 'serde.class.name';{code}
> is a valid statement. In fact it is not rejected, but it does nothing at 
> all: the execution succeeds, yet everything remains the same. 
> The same is true for setting the SerDe properties:
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET 
> SERDEPROPERTIES ('property_name'='property_value');{code}
> is also a valid statement that does nothing.
> I suggest modifying the parser to reject these statements. A SerDe belongs 
> to a table, not to a partition.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21897) Setting serde / serde properties for partitions

2019-06-19 Thread Miklos Gergely (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868104#comment-16868104
 ] 

Miklos Gergely commented on HIVE-21897:
---

[~mithun] after executing those commands:

DESCRIBE EXTENDED foobar;

 
{code:java}
+-----------------------------+-------------+----------+
|          col_name           |  data_type  | comment  |
+-----------------------------+-------------+----------+
| foo                         | string      |          |
| bar                         | string      |          |
| dt                          | string      |          |
|                             | NULL        | NULL     |
| # Partition Information     | NULL        | NULL     |
| # col_name                  | data_type   | comment  |
| dt                          | string      |          |
|                             | NULL        | NULL     |
| Detailed Table Information  | Table(tableName:foobar, dbName:default, owner:hive, createTime:1560986681, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:foo, type:string, comment:null), FieldSchema(name:bar, type:string, comment:null), FieldSchema(name:dt, type:string, comment:null)], location:hdfs://hive-on-tezt-1.vpc.cloudera.com:8020/warehouse/tablespace/managed/hive/foobar, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.orc.OrcSerde, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[FieldSchema(name:dt, type:string, comment:null)], parameters:{last_modified_time=1560986885, totalSize=0, numRows=0, rawDataSize=0, transactional_properties=insert_only, COLUMN_STATS_ACCURATE={\"BASIC_STATS\":\"true\"}, numFiles=0, numPartitions=2, transient_lastDdlTime=1560986885, bucketing_version=2, last_modified_by=hive, transactional=true}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, rewriteEnabled:false, catName:hive, ownerType:USER, writeId:0) |          |
+-----------------------------+-------------+----------+
{code}
SHOW CREATE TABLE foobar;

 
{code:java}
+------------------------------------------------------+
|                    createtab_stmt                    |
+------------------------------------------------------+
| CREATE TABLE `foobar`(                               |
|   `foo` string,                                      |
|   `bar` string)                                      |
| PARTITIONED BY (                                     |
|   `dt` string)                                       |
| ROW FORMAT SERDE                                     |
|   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'        |
| STORED AS INPUTFORMAT                                |
|   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
| OUTPUTFORMAT                                         |
|   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
| LOCATION                                             |
|   'hdfs://hive-on-tezt-1.vpc.cloudera.com:8020/warehouse/tablespace/managed/hive/foobar' |
| TBLPROPERTIES (                                      |
|   'bucketing_version'='2',                           |
|   'last_modified_by'='hive',                         |
|   'last_modified_time'='1560986885',                 |
|   'transactional'='true',                            |
|   'transactional_properties'='insert_only',          |
|   'transient_lastDdlTime'='1560986885')              |
+------------------------------------------------------+
{code}
So it seems the table has only one SerDe, not one per partition. Do we want to 
allow a different SerDe per partition? If we do, it needs planning and code 
changes; for now we may stick to one SerDe per table.

> Setting serde / serde properties for partitions
> ---
>
> Key: HIVE-21897
> URL: https://issues.apache.org/jira/browse/HIVE-21897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> 

[jira] [Commented] (HIVE-21897) Setting serde / serde properties for partitions

2019-06-19 Thread Mithun Radhakrishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868092#comment-16868092
 ] 

Mithun Radhakrishnan commented on HIVE-21897:
-

At the risk of muddying the waters, I'd consider {{AvroSerDe}}, which relies on 
the table/SerDe settings for {{"avro.schema.url"}} and 
{{"avro.schema.literal"}}.
When an Avro table's schema changes, old partitions might reference an older 
schema-literal SerDe parameter value than newer partitions do.

I could be wrong, but we might want to reevaluate the assumption that SerDe 
settings should apply uniformly across all partitions in a table.

> Setting serde / serde properties for partitions
> ---
>
> Key: HIVE-21897
> URL: https://issues.apache.org/jira/browse/HIVE-21897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties]
>  the SerDe and the SerDe properties can be set for a partition too, so
>  
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET SERDE 
> 'serde.class.name';{code}
> is a valid statement. In fact it is not rejected, but it does nothing at 
> all: the execution succeeds, yet everything remains the same. 
> The same is true for setting the SerDe properties:
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET 
> SERDEPROPERTIES ('property_name'='property_value');{code}
> is also a valid statement that does nothing.
> I suggest modifying the parser to reject these statements. A SerDe belongs 
> to a table, not to a partition.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21897) Setting serde / serde properties for partitions

2019-06-19 Thread Mithun Radhakrishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868085#comment-16868085
 ] 

Mithun Radhakrishnan commented on HIVE-21897:
-

bq. SerDe is for a table, and not for a partition.

Pardon me, but wouldn't a SerDe be exercised per partition?

{code:sql}
CREATE TABLE foobar ( foo STRING, bar STRING ) PARTITIONED BY (dt STRING) 
STORED AS TEXTFILE;
ALTER TABLE foobar ADD PARTITION ( dt='1' ); -- SerDe == LazySimpleSerDe.
ALTER TABLE foobar SET FILEFORMAT ORCFILE;
ALTER TABLE foobar ADD PARTITION ( dt='2' ); -- SerDe == OrcSerDe.
{code}

{{foobar(dt='1')}} should use {{LazySimpleSerDe}}, while {{foobar(dt='2')}} 
would use {{OrcSerDe}}, when each is read.

> Setting serde / serde properties for partitions
> ---
>
> Key: HIVE-21897
> URL: https://issues.apache.org/jira/browse/HIVE-21897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties]
>  the SerDe and the SerDe properties can be set for a partition too, so
>  
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET SERDE 
> 'serde.class.name';{code}
> is a valid statement. In fact it is not rejected, but it does nothing at 
> all: the execution succeeds, yet everything remains the same. 
> The same is true for setting the SerDe properties:
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET 
> SERDEPROPERTIES ('property_name'='property_value');{code}
> is also a valid statement that does nothing.
> I suggest modifying the parser to reject these statements. A SerDe belongs 
> to a table, not to a partition.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868069#comment-16868069
 ] 

Hive QA commented on HIVE-21869:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
49s{color} | {color:blue} Maven dependency ordering for branch {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 160 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  2m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17660/dev-support/hive-personality.sh
 |
| git revision | master / cd42db4 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17660/yetus/whitespace-eol.txt
 |
| modules | C: kafka-handler . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17660/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch, HIVE-21869.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: HIVE-21891.04.patch

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to split everything into more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package were called DDLTask2 and DDLWork2, 
> thus avoiding the use of fully qualified class names where both the old and 
> the new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of being registered explicitly, the DDLOperations are 
> now found and registered by DDLTask itself.
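The dispatch design described in the quoted issue (one class per operation, an operation-agnostic task, operations registered against their desc types) can be sketched generically. All class names below are illustrative stand-ins, not Hive's actual ones.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: each DDL operation registers a handler keyed by
// its immutable desc class, and the task dispatches on the desc type
// without knowing any concrete operation.
interface DdlDesc {}

class ShowTablesDesc implements DdlDesc {}

public class DdlDispatch {
    private static final Map<Class<? extends DdlDesc>, Function<DdlDesc, Integer>> OPS =
        new HashMap<>();

    static {
        // each operation contributes a handler for its own desc class
        OPS.put(ShowTablesDesc.class, desc -> 0); // 0 = success code
    }

    static int execute(DdlDesc desc) {
        Function<DdlDesc, Integer> op = OPS.get(desc.getClass());
        if (op == null) {
            throw new IllegalArgumentException("no operation for " + desc.getClass());
        }
        return op.apply(desc);
    }

    public static void main(String[] args) {
        System.out.println(DdlDispatch.execute(new ShowTablesDesc())); // prints 0
    }
}
```

The payoff of this shape is that adding an operation touches only the new operation class and its registration, never the dispatching task.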



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: (was: HIVE-21891.04.patch)

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to split everything into more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package were called DDLTask2 and DDLWork2, 
> thus avoiding the use of fully qualified class names where both the old and 
> the new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of being registered explicitly, the DDLOperations are 
> now found and registered by DDLTask itself.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21897) Setting serde / serde properties for partitions

2019-06-19 Thread Miklos Gergely (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868066#comment-16868066
 ] 

Miklos Gergely commented on HIVE-21897:
---

[~ashutoshc] please either approve this modification, or let me know what 
should happen if a user wants to set the SerDe / SerDe properties of a 
partition, and I'll implement it.

[~muleho...@gmail.com], likely we'll need to modify the documentation.

> Setting serde / serde properties for partitions
> ---
>
> Key: HIVE-21897
> URL: https://issues.apache.org/jira/browse/HIVE-21897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties]
>  the SerDe and the SerDe properties can be set for a partition too, so
>  
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET SERDE 
> 'serde.class.name';{code}
> is a valid statement. In fact it is not rejected, but it does nothing at 
> all: the execution succeeds, yet everything remains the same. 
> The same is true for setting the SerDe properties:
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET 
> SERDEPROPERTIES ('property_name'='property_value');{code}
> is also a valid statement that does nothing.
> I suggest modifying the parser to reject these statements. A SerDe belongs 
> to a table, not to a partition.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21897) Setting serde / serde properties for partitions

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-21897:
-


> Setting serde / serde properties for partitions
> ---
>
> Key: HIVE-21897
> URL: https://issues.apache.org/jira/browse/HIVE-21897
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties]
>  the SerDe and the SerDe properties can be set for a partition too, so
>  
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET SERDE 
> 'serde.class.name';{code}
> is a valid statement. In fact it is not rejected, but it does nothing at 
> all: the execution succeeds, yet everything remains the same. 
> The same is true for setting the SerDe properties:
> {code:java}
> ALTER TABLE table PARTITION (partition_col='partition_value') SET 
> SERDEPROPERTIES ('property_name'='property_value');{code}
> is also a valid statement that does nothing.
> I suggest modifying the parser to reject these statements. A SerDe belongs 
> to a table, not to a partition.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16868061#comment-16868061
 ] 

Hive QA commented on HIVE-21891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972255/HIVE-21891.04.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17659/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17659/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17659/

Messages:
{noformat}
 This message was trimmed, see log for full details 
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AbstractConstraintEventHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AddPartitionHandler.java:67
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AddPartitionHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterPartitionHandler.java:100
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterPartitionHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterTableHandler.java:88
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterTableHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CommitTxnHandler.java:113
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CommitTxnHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CreateTableHandler.java:54
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CreateTableHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/EventHandler.java:41
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/EventHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/InsertHandler.java:58
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/InsertHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdatePartColStatHandler.java:52
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdatePartColStatHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdateTableColStatHandler.java:37
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdateTableColStatHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/io/TableSerializer.java:52
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/io/TableSerializer.java' 
cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/DumpMetaData.java:19
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/DumpMetaData.java' 
cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java:146
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java' cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRegExp.java:23
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRegExp.java' 
cleanly.
error: patch failed: 
ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestReplDumpTask.java:139
Falling back to three-way merge...
Applied patch to 
'ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestReplDumpTask.java' cleanly.
error: patch failed: 
ql/src/test/queries/clientpositive/auto_sortmerge_join_1.q:5
Falling back to three-way merge...
Applied patch to 'ql/src/test/queries/clientpositive/auto_sortmerge_join_1.q' 
cleanly.
error: patch failed: 
ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q:4
Falling back to three-way merge...
Applied patch to 'ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q' 
cleanly.
error: patch failed: 
ql/src/test/queries/clientpositive/auto_sortmerge_join_12.q:5
Falling back to three-way merge...
Applied patch to 'ql/src/test/queries/clientpositive/auto_sortmerge_join_12.q' 

[jira] [Updated] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21895:
---
Affects Version/s: (was: 3.1.1)

> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21895.1.patch
>
>
> The Kafka client version is 2.2, and deprecated methods are used, such as
> {code:java}
> producer.close(0, TimeUnit){code}
> in SimpleKafkaWriter
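In Kafka 2.x clients, the `close(long, TimeUnit)` overload is deprecated in favor of `close(Duration)`. The stub below is a hypothetical stand-in for `org.apache.kafka.clients.producer.KafkaProducer` (so it compiles without a Kafka dependency) and only illustrates the signature migration; it is not the Hive kafka-handler code.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

// ProducerStub is a hypothetical stand-in for KafkaProducer, used only
// to show the deprecated overload next to its replacement.
class ProducerStub {
    boolean closedWithDuration = false;

    @Deprecated
    void close(long timeout, TimeUnit unit) {
        // old overload, deprecated in Kafka 2.x clients
    }

    void close(Duration timeout) {
        // preferred replacement overload
        closedWithDuration = true;
    }
}

public class CloseMigration {
    public static void main(String[] args) {
        ProducerStub producer = new ProducerStub();
        // before: producer.close(0, TimeUnit.MILLISECONDS);
        producer.close(Duration.ofMillis(0)); // after
        System.out.println("closed via Duration overload: " + producer.closedWithDuration);
    }
}
```

The migration is mechanical: replace the `(long, TimeUnit)` pair with a single `Duration` of the same length.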



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21895:
---
Affects Version/s: (was: 3.1.0)

> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21895.1.patch
>
>
> The Kafka client version is 2.2, and deprecated methods are used, such as
> {code:java}
> producer.close(0, TimeUnit){code}
> in SimpleKafkaWriter



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane reassigned HIVE-21895:
--

Assignee: Kristopher Kane

> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 3.1.0, 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21895.1.patch
>
>
> The Kafka client version is 2.2 and there are deprecated methods used like
> {code:java}
> producer.close(0, TimeUnit){code}
> in SimpleKafkaWriter



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21869:
---
Attachment: HIVE-21869.2.patch

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch, HIVE-21869.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-19 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21892:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the reviews!

> Trusted domain authentication should look at X-Forwarded-For header as well
> ---
>
> Key: HIVE-21892
> URL: https://issues.apache.org/jira/browse/HIVE-21892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21892.1.patch, HIVE-21892.2.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-21783 added trusted domain authentication. However, it looks only at 
> request.getRemoteAddr(), which works in most cases where there are no 
> intermediate forward/reverse proxies. In trusted domain scenarios, if there 
> are intermediate proxies, each proxy typically appends its own IP address to 
> the "X-Forwarded-For" header. The X-Forwarded-For value will look like 
> clientIp -> proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
> represents the real client IP address. For such scenarios, add a config to 
> optionally look at the X-Forwarded-For header, when available, to determine 
> the real client IP. 
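
The left-most-entry rule described above can be sketched as a small parser (class and method names are illustrative, not from the patch; the actual servlet splits the header the same way):

```java
import java.util.Arrays;
import java.util.List;

public class XffParser {
    // X-Forwarded-For arrives as "clientIp, proxyIp1, proxyIp2"; the
    // left-most entry is the original client. Entries are trimmed
    // because proxies commonly insert a space after each comma.
    static String leftmostClient(String xff) {
        if (xff == null || xff.isEmpty()) {
            return null;
        }
        List<String> addrs = Arrays.asList(xff.split(","));
        return addrs.get(0).trim();
    }

    public static void main(String[] args) {
        System.out.println(leftmostClient("10.0.0.5, 192.168.1.1, 192.168.1.2"));
        // prints "10.0.0.5"
    }
}
```

Note that this value is only trustworthy when every hop is a trusted proxy, which is why the patch gates it behind a config flag.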



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21895:
---
Attachment: HIVE-21895.1.patch

> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 3.1.0, 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21895.1.patch
>
>
> The Kafka client version is 2.2 and there are deprecated methods used like
> {code:java}
> producer.close(0, TimeUnit){code}
> in SimpleKafkaWriter



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21892?focusedWorklogId=263369&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263369
 ]

ASF GitHub Bot logged work on HIVE-21892:
-

Author: ASF GitHub Bot
Created on: 19/Jun/19 22:28
Start Date: 19/Jun/19 22:28
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on issue #678: HIVE-21892: Trusted 
domain authentication should look at X-Forwarded-For header as well
URL: https://github.com/apache/hive/pull/678#issuecomment-503770022
 
 
   Patch merged to master. Closing PR. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263369)
Time Spent: 1h 10m  (was: 1h)

> Trusted domain authentication should look at X-Forwarded-For header as well
> ---
>
> Key: HIVE-21892
> URL: https://issues.apache.org/jira/browse/HIVE-21892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21892.1.patch, HIVE-21892.2.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HIVE-21783 added trusted domain authentication. However, it looks only at 
> request.getRemoteAddr(), which works in most cases where there are no 
> intermediate forward/reverse proxies. In trusted domain scenarios, if there 
> are intermediate proxies, each proxy typically appends its own IP address to 
> the "X-Forwarded-For" header. The X-Forwarded-For value will look like 
> clientIp -> proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
> represents the real client IP address. For such scenarios, add a config to 
> optionally look at the X-Forwarded-For header, when available, to determine 
> the real client IP. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21892?focusedWorklogId=263370&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263370
 ]

ASF GitHub Bot logged work on HIVE-21892:
-

Author: ASF GitHub Bot
Created on: 19/Jun/19 22:28
Start Date: 19/Jun/19 22:28
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on pull request #678: HIVE-21892: 
Trusted domain authentication should look at X-Forwarded-For header as well
URL: https://github.com/apache/hive/pull/678
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263370)
Time Spent: 1h 20m  (was: 1h 10m)

> Trusted domain authentication should look at X-Forwarded-For header as well
> ---
>
> Key: HIVE-21892
> URL: https://issues.apache.org/jira/browse/HIVE-21892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21892.1.patch, HIVE-21892.2.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> HIVE-21783 added trusted domain authentication. However, it looks only at 
> request.getRemoteAddr(), which works in most cases where there are no 
> intermediate forward/reverse proxies. In trusted domain scenarios, if there 
> are intermediate proxies, each proxy typically appends its own IP address to 
> the "X-Forwarded-For" header. The X-Forwarded-For value will look like 
> clientIp -> proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
> represents the real client IP address. For such scenarios, add a config to 
> optionally look at the X-Forwarded-For header, when available, to determine 
> the real client IP. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-19 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-21892:
-
Attachment: HIVE-21892.2.patch

> Trusted domain authentication should look at X-Forwarded-For header as well
> ---
>
> Key: HIVE-21892
> URL: https://issues.apache.org/jira/browse/HIVE-21892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21892.1.patch, HIVE-21892.2.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-21783 added trusted domain authentication. However, it looks only at 
> request.getRemoteAddr(), which works in most cases where there are no 
> intermediate forward/reverse proxies. In trusted domain scenarios, if there 
> are intermediate proxies, each proxy typically appends its own IP address to 
> the "X-Forwarded-For" header. The X-Forwarded-For value will look like 
> clientIp -> proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
> represents the real client IP address. For such scenarios, add a config to 
> optionally look at the X-Forwarded-For header, when available, to determine 
> the real client IP. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21892?focusedWorklogId=263368&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263368
 ]

ASF GitHub Bot logged work on HIVE-21892:
-

Author: ASF GitHub Bot
Created on: 19/Jun/19 22:22
Start Date: 19/Jun/19 22:22
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on pull request #678: HIVE-21892: 
Trusted domain authentication should look at X-Forwarded-For header as well
URL: https://github.com/apache/hive/pull/678#discussion_r295548052
 
 

 ##
 File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 ##
 @@ -3486,6 +3486,13 @@ private static void 
populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 " it is empty, which means that all the connections to HiveServer2 are 
authenticated. " +
 "When it is non-empty, the client has to provide a Hive user name. Any 
password, if " +
 "provided, will not be used when authentication is skipped."),
+
HIVE_SERVER2_TRUSTED_DOMAIN_USE_XFF_HEADER("hive.server2.trusted.domain.use.xff.header",
 false,
 
 Review comment:
   adding it by default would let clients spoof via XFF headers, and some 
proxies might not sanitize it correctly. In most cases, proxies will use the 
client's IP to connect to HS2 (reverse proxies), in which case we can just use 
request.getRemoteHost() and not rely on XFF. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263368)
Time Spent: 1h  (was: 50m)

> Trusted domain authentication should look at X-Forwarded-For header as well
> ---
>
> Key: HIVE-21892
> URL: https://issues.apache.org/jira/browse/HIVE-21892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21892.1.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-21783 added trusted domain authentication. However, it looks only at 
> request.getRemoteAddr(), which works in most cases where there are no 
> intermediate forward/reverse proxies. In trusted domain scenarios, if there 
> are intermediate proxies, each proxy typically appends its own IP address to 
> the "X-Forwarded-For" header. The X-Forwarded-For value will look like 
> clientIp -> proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
> represents the real client IP address. For such scenarios, add a config to 
> optionally look at the X-Forwarded-For header, when available, to determine 
> the real client IP. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-19 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21892?focusedWorklogId=263366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263366
 ]

ASF GitHub Bot logged work on HIVE-21892:
-

Author: ASF GitHub Bot
Created on: 19/Jun/19 22:21
Start Date: 19/Jun/19 22:21
Worklog Time Spent: 10m 
  Work Description: prasanthj commented on pull request #678: HIVE-21892: 
Trusted domain authentication should look at X-Forwarded-For header as well
URL: https://github.com/apache/hive/pull/678#discussion_r295547689
 
 

 ##
 File path: 
service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java
 ##
 @@ -150,16 +150,35 @@ protected void doPost(HttpServletRequest request, 
HttpServletResponse response)
   LOG.info("Could not validate cookie sent, will try to generate a new 
cookie");
 }
   }
+
+  // Set the thread local ip address
+  SessionManager.setIpAddress(clientIpAddress);
+
+  // get forwarded hosts address
+  String forwarded_for = request.getHeader(X_FORWARDED_FOR);
+  if (forwarded_for != null) {
+LOG.debug("{}:{}", X_FORWARDED_FOR, forwarded_for);
+List<String> forwardedAddresses = 
Arrays.asList(forwarded_for.split(","));
+SessionManager.setForwardedAddresses(forwardedAddresses);
+  } else {
+SessionManager.setForwardedAddresses(Collections.emptyList());
+  }
+
   // If the cookie based authentication is not enabled or the request does 
not have a valid
   // cookie, use authentication depending on the server setup.
   if (clientUserName == null) {
 String trustedDomain = HiveConf.getVar(hiveConf, 
ConfVars.HIVE_SERVER2_TRUSTED_DOMAIN).trim();
-
+final boolean useXff = HiveConf.getBoolVar(hiveConf, 
ConfVars.HIVE_SERVER2_TRUSTED_DOMAIN_USE_XFF_HEADER);
+if (useXff && !trustedDomain.isEmpty() &&
+  SessionManager.getForwardedAddresses() != null && 
!SessionManager.getForwardedAddresses().isEmpty()) {
+  clientIpAddress = SessionManager.getForwardedAddresses().get(0);
 
 Review comment:
   will add
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 263366)
Time Spent: 50m  (was: 40m)

> Trusted domain authentication should look at X-Forwarded-For header as well
> ---
>
> Key: HIVE-21892
> URL: https://issues.apache.org/jira/browse/HIVE-21892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21892.1.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HIVE-21783 added trusted domain authentication. However, it looks only at 
> request.getRemoteAddr(), which works in most cases where there are no 
> intermediate forward/reverse proxies. In trusted domain scenarios, if there 
> are intermediate proxies, each proxy typically appends its own IP address to 
> the "X-Forwarded-For" header. The X-Forwarded-For value will look like 
> clientIp -> proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
> represents the real client IP address. For such scenarios, add a config to 
> optionally look at the X-Forwarded-For header, when available, to determine 
> the real client IP. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21892) Trusted domain authentication should look at X-Forwarded-For header as well

2019-06-19 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868048#comment-16868048
 ] 

Jason Dere commented on HIVE-21892:
---

+1

> Trusted domain authentication should look at X-Forwarded-For header as well
> ---
>
> Key: HIVE-21892
> URL: https://issues.apache.org/jira/browse/HIVE-21892
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21892.1.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-21783 added trusted domain authentication. However, it looks only at 
> request.getRemoteAddr(), which works in most cases where there are no 
> intermediate forward/reverse proxies. In trusted domain scenarios, if there 
> are intermediate proxies, each proxy typically appends its own IP address to 
> the "X-Forwarded-For" header. The X-Forwarded-For value will look like 
> clientIp -> proxyIp1 -> proxyIp2. The left-most IP address in X-Forwarded-For 
> represents the real client IP address. For such scenarios, add a config to 
> optionally look at the X-Forwarded-For header, when available, to determine 
> the real client IP. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: HIVE-21891.04.patch

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to cut everything into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue of having some operations handled by DDLTask 
> which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding the use of fully qualified class names where both the old and the 
> new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of registering, DDLTask now finds the DDLOperations and 
> registers them itself.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: (was: HIVE-21891.04.patch)

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to cut everything into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue of having some operations handled by DDLTask 
> which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding the use of fully qualified class names where both the old and the 
> new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of registering, DDLTask now finds the DDLOperations and 
> registers them itself.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: (was: HIVE-21891.04.patch)

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to cut everything into more manageable 
> classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * for now, ignore the issue of having some operations handled by DDLTask 
> which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding the use of fully qualified class names where both the old and the 
> new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of registering, DDLTask now finds the DDLOperations and 
> registers them itself.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868022#comment-16868022
 ] 

Hive QA commented on HIVE-21891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972245/HIVE-21891.04.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17658/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17658/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17658/

Messages:
{noformat}
 This message was trimmed, see log for full details 
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AbstractConstraintEventHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AddPartitionHandler.java:67
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AddPartitionHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterPartitionHandler.java:100
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterPartitionHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterTableHandler.java:88
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterTableHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CommitTxnHandler.java:113
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CommitTxnHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CreateTableHandler.java:54
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/CreateTableHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/EventHandler.java:41
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/EventHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/InsertHandler.java:58
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/InsertHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdatePartColStatHandler.java:52
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdatePartColStatHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdateTableColStatHandler.java:37
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdateTableColStatHandler.java'
 cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/io/TableSerializer.java:52
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/io/TableSerializer.java' 
cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/DumpMetaData.java:19
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/DumpMetaData.java' 
cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java:146
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java' cleanly.
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRegExp.java:23
Falling back to three-way merge...
Applied patch to 
'ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRegExp.java' 
cleanly.
error: patch failed: 
ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestReplDumpTask.java:139
Falling back to three-way merge...
Applied patch to 
'ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestReplDumpTask.java' cleanly.
error: patch failed: 
ql/src/test/queries/clientpositive/auto_sortmerge_join_1.q:5
Falling back to three-way merge...
Applied patch to 'ql/src/test/queries/clientpositive/auto_sortmerge_join_1.q' 
cleanly.
error: patch failed: 
ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q:4
Falling back to three-way merge...
Applied patch to 'ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q' 
cleanly.
error: patch failed: 
ql/src/test/queries/clientpositive/auto_sortmerge_join_12.q:5
Falling back to three-way merge...
Applied patch to 'ql/src/test/queries/clientpositive/auto_sortmerge_join_12.q' 

[jira] [Commented] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868005#comment-16868005
 ] 

Hive QA commented on HIVE-21787:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972237/HIVE-21787.11.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16170 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17657/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17657/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17657/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972237 - PreCommit-HIVE-Build

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.2.patch, HIVE-21787.3.patch, 
> HIVE-21787.4.patch, HIVE-21787.5.patch, HIVE-21787.6.patch, 
> HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses a black/white list to specify patterns of tables to 
> load into the cache. The cache is loaded in a one-shot "prewarm" and updated 
> by a background thread. This is not a very efficient design. 
> In this feature, we try to enhance the table cache with LRU eviction to 
> improve cache utilization.
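
The eviction idea can be illustrated with a minimal LRU cache sketch (assumption: this is illustrative only and is not the actual CachedStore implementation in the patch). LinkedHashMap in access order evicts the least-recently-used entry once capacity is exceeded:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU table cache: access-order LinkedHashMap plus a
// removeEldestEntry override that caps the size.
public class LruTableCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruTableCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true -> LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry when over capacity.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruTableCache<String, String> cache = new LruTableCache<>(2);
        cache.put("db.t1", "meta1");
        cache.put("db.t2", "meta2");
        cache.get("db.t1");          // touch t1 so t2 becomes eldest
        cache.put("db.t3", "meta3"); // evicts db.t2
        System.out.println(cache.containsKey("db.t2")); // prints "false"
    }
}
```

A production cache would also need thread safety and size accounting by table metadata weight, which a plain LinkedHashMap does not provide.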



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21895:
---
Description: 
The Kafka client version is 2.2 and there are deprecated methods used like
{code:java}
producer.close(0, TimeUnit){code}
in SimpleKafkaWriter

  was:
The Kafka client version is 2.x and there are deprecated methods used like
{code:java}
producer.close(0, TimeUnit){code}
in SimpleKafkaWriter


> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 3.1.0, 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
>
> The Kafka client version is 2.2 and there are deprecated methods used like
> {code:java}
> producer.close(0, TimeUnit){code}
> in SimpleKafkaWriter
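For reference, a minimal sketch of the migration (assuming the Duration-based overload that replaced close(long, TimeUnit) in Kafka clients 2.0; the producer calls appear only as comments so this snippet does not depend on the kafka-clients jar):

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

// Shows the equivalent Duration for an old (timeout, unit) pair:
//   old: producer.close(0, TimeUnit.MILLISECONDS);  // deprecated
//   new: producer.close(Duration.ZERO);
class CloseMigration {
    static Duration toDuration(long timeout, TimeUnit unit) {
        // Convert the legacy pair into the Duration the new overload expects
        return Duration.ofMillis(unit.toMillis(timeout));
    }
}
```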



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21895:
---
Description: 
The Kafka client version is 2.x and there are deprecated methods used like
{code:java}
producer.close(0, TimeUnit){code}
in SimpleKafkaWriter

  was:
The Kafka client version is 2.x and there are deprecated methods used like
{code:java}
producer.close(){code}
in SimpleKafkaWriter


> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 3.1.0, 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
>
> The Kafka client version is 2.x and there are deprecated methods used like
> {code:java}
> producer.close(0, TimeUnit){code}
> in SimpleKafkaWriter



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21872) Bucketed tables that load data from data/files/auto_sortmerge_join should be tagged as 'bucketing_version'='1'

2019-06-19 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21872:
---
Fix Version/s: (was: 3.1.2)

> Bucketed tables that load data from data/files/auto_sortmerge_join should be 
> tagged as 'bucketing_version'='1'
> --
>
> Key: HIVE-21872
> URL: https://issues.apache.org/jira/browse/HIVE-21872
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21872.01.patch, HIVE-21872.01.patch, 
> HIVE-21872.01.patch, HIVE-21872.patch
>
>
> It is incorrect to use version 2, since the data files were created with old 
> hash function.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21896) SHOW FUNCTIONS / SHOW FUNCTIONS LIKE - clarify

2019-06-19 Thread Miklos Gergely (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867979#comment-16867979
 ] 

Miklos Gergely commented on HIVE-21896:
---

[~ashutoshc] please decide what to accept as proper DDL, and what is not, then 
I'll do the code changes, and [~muleho...@gmail.com] can do the documentation 
changes.

> SHOW FUNCTIONS / SHOW FUNCTIONS LIKE - clarify
> --
>
> Key: HIVE-21896
> URL: https://issues.apache.org/jira/browse/HIVE-21896
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowFunctions]
>  the currently available functions can be listed like this:
> {code:java}
> SHOW FUNCTIONS <pattern>;{code}
> If the user executes this command, they will get the correct list of 
> functions, but they will also see this on the standard output:
> {code:java}
> SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead.{code}
> If the user uses the
> {code:java}
> SHOW FUNCTIONS LIKE <pattern>;{code}
> command, then they will receive the exact same result (though through 
> different code paths). The only difference is that one can get all the function 
> names with "SHOW FUNCTIONS;", while "SHOW FUNCTIONS LIKE;" returns an 
> exception, so in this case the pattern is mandatory.
> So there should be a decision if we still accept "SHOW FUNCTIONS" without the 
> "LIKE". My suggestion is to accept it only if there is no pattern. so "SHOW 
> FUNCTIONS;" is ok, without deprecation message, but "SHOW FUNCTIONS 
> " should throw an exception.
> Whatever we decide, we should document it appropriately.
> cc [~krishahn]
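To make the suggestion concrete, the proposed behavior would look like this (illustrative only; 'xpath*' is just an example pattern, and the error case reflects the proposal above, not current Hive behavior):

```sql
SHOW FUNCTIONS;                 -- OK: lists all functions, no deprecation warning
SHOW FUNCTIONS LIKE 'xpath*';   -- OK: lists functions matching the pattern
SHOW FUNCTIONS 'xpath*';        -- would throw an exception under the proposal
```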



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21896) SHOW FUNCTIONS / SHOW FUNCTIONS LIKE - clarify

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21896:
--
Description: 
According to 
[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowFunctions]
 the currently available functions can be listed like this:
{code:java}
SHOW FUNCTIONS <pattern>;{code}
If the user executes this command, they will get the correct list of functions, 
but they will also see this on the standard output:
{code:java}
SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead.{code}
If the user uses the
{code:java}
SHOW FUNCTIONS LIKE <pattern>;{code}
command, then they will receive the exact same result (though through different 
code paths). The only difference is that one can get all the function names with 
"SHOW FUNCTIONS;", while "SHOW FUNCTIONS LIKE;" returns an exception, so in 
this case the pattern is mandatory.

So there should be a decision if we still accept "SHOW FUNCTIONS" without the 
"LIKE". My suggestion is to accept it only if there is no pattern. so "SHOW 
FUNCTIONS;" is ok, without deprecation message, but "SHOW FUNCTIONS " 
should throw an exception.

Whatever we decide, we should document it appropriately.

cc [~krishahn]

  was:
According to the 
[documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowFunctions]
 the currently available functions can be listed like this:
{code:java}
SHOW FUNCTIONS <pattern>;{code}
If the user executes this command, they will get the correct list of functions, 
but they will also see this on the standard output:
{code:java}
SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead.{code}
If the user uses the
{code:java}
SHOW FUNCTIONS LIKE <pattern>;{code}
command, then they will receive the exact same result (though through different 
code paths). The only difference is that one can get all the function names with 
"SHOW FUNCTIONS;", while "SHOW FUNCTIONS LIKE;" returns an exception, so in 
this case the pattern is mandatory.

So there should be a decision if we still accept "SHOW FUNCTIONS" without the 
"LIKE". My suggestion is to accept it only if there is no pattern. so "SHOW 
FUNCTIONS;" is ok, without deprecation message, but "SHOW FUNCTIONS " 
should throw an exception.

Whatever we decide, we should document it appropriately.

cc [~krishahn]


> SHOW FUNCTIONS / SHOW FUNCTIONS LIKE - clarify
> --
>
> Key: HIVE-21896
> URL: https://issues.apache.org/jira/browse/HIVE-21896
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowFunctions]
>  the currently available functions can be listed like this:
> {code:java}
> SHOW FUNCTIONS <pattern>;{code}
> If the user executes this command, they will get the correct list of 
> functions, but they will also see this on the standard output:
> {code:java}
> SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead.{code}
> If the user uses the
> {code:java}
> SHOW FUNCTIONS LIKE <pattern>;{code}
> command, then they will receive the exact same result (though through 
> different code paths). The only difference is that one can get all the function 
> names with "SHOW FUNCTIONS;", while "SHOW FUNCTIONS LIKE;" returns an 
> exception, so in this case the pattern is mandatory.
> So there should be a decision if we still accept "SHOW FUNCTIONS" without the 
> "LIKE". My suggestion is to accept it only if there is no pattern. so "SHOW 
> FUNCTIONS;" is ok, without deprecation message, but "SHOW FUNCTIONS 
> " should throw an exception.
> Whatever we decide, we should document it appropriately.
> cc [~krishahn]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21896) SHOW FUNCTIONS / SHOW FUNCTIONS LIKE - clarify

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely reassigned HIVE-21896:
-


> SHOW FUNCTIONS / SHOW FUNCTIONS LIKE - clarify
> --
>
> Key: HIVE-21896
> URL: https://issues.apache.org/jira/browse/HIVE-21896
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
>
> According to the 
> [documentation|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowFunctions]
>  the currently available functions can be listed like this:
> {code:java}
> SHOW FUNCTIONS <pattern>;{code}
> If the user executes this command, they will get the correct list of 
> functions, but they will also see this on the standard output:
> {code:java}
> SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead.{code}
> If the user uses the
> {code:java}
> SHOW FUNCTIONS LIKE <pattern>;{code}
> command, then they will receive the exact same result (though through 
> different code paths). The only difference is that one can get all the function 
> names with "SHOW FUNCTIONS;", while "SHOW FUNCTIONS LIKE;" returns an 
> exception, so in this case the pattern is mandatory.
> So there should be a decision if we still accept "SHOW FUNCTIONS" without the 
> "LIKE". My suggestion is to accept it only if there is no pattern. so "SHOW 
> FUNCTIONS;" is ok, without deprecation message, but "SHOW FUNCTIONS 
> " should throw an exception.
> Whatever we decide, we should document it appropriately.
> cc [~krishahn]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21869:
---
Component/s: kafka integration

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21895) Kafka Storage handler uses deprecated Kafka client methods

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21895:
---
Component/s: kafka integration

> Kafka Storage handler uses deprecated Kafka client methods
> --
>
> Key: HIVE-21895
> URL: https://issues.apache.org/jira/browse/HIVE-21895
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 3.1.0, 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Priority: Minor
> Fix For: 4.0.0
>
>
> The Kafka client version is 2.x and there are deprecated methods used like
> {code:java}
> producer.close(){code}
> in SimpleKafkaWriter



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21848) Table property name definition between ORC and Parquet encryption

2019-06-19 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867960#comment-16867960
 ] 

Xinli Shang commented on HIVE-21848:


[~gershinsky], thanks for sharing the timing concern. How about we continue 
the discussion of this topic (unifying ORC and Parquet encryption properties) 
with a smaller scope, "the minimum common set of both encryption properties"? 
Parquet- or ORC-specific settings can be defined in their own domains and 
excluded from this topic. The motivation for starting the discussion now is 
that some companies might already be deploying. They will run into a bad 
situation if they have to revert their design later in production when they 
sync with upstream, and it could be even worse if they have to translate the 
encrypted data due to the later change. The discussion here may run for some 
time; while we discuss in the Parquet community in parallel, we can keep the 
unifying discussion in mind. 

The minimum common set should answer the question of *which* column is to be 
encrypted with *which* key. Anything else? 

So far, based on the above discussion, we have several proposals:

| |Format Example|Pros|Cons|
|1|"encrypt.with.pii" = "col1,col2", "encrypt.with.credit" = "col3"| | |
|2|"encryption.column.keys" = "col1:pii,col2:pii,col3:credit"| | |
|3|"encrypt_col_col1" = "pii", "encrypt_col_col2" = "pii", "encrypt_col_col3" = "credit"| |Compared with #2, this is not as compact, as it needs multiple entries.|

> Table property name definition between ORC and Parquet encryption
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a superset of unified table property names 
> that can be used for both Parquet and ORC column encryption. There is no code 
> change needed for this Jira.
> *Background:*
> ORC-14 and Parquet-1178 introduced column encryption to ORC and Parquet. To 
> configure the encryption, e.g. which column is sensitive, what master key to 
> be used, algorithm, etc, table properties can be used. It is important that 
> both Parquet and ORC can use unified names.
> According to the slide 
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
>  ORC use table properties like orc.encrypt.pii, orc.encrypt.credit. While in 
> the Parquet community, it is still discussing to provide several ways and 
> using table properties is one of the options, while there is no detailed 
> design of the table property names yet.
> So it is a good time to discuss within two communities to have unified table 
> names as a superset.
> *Proposal:*
> There are several encryption properties that need to be specified for a 
> table. Here is the list. This is the superset of Parquet and ORC. Some of 
> them might not apply to both.
>  # PII columns including nest columns
>  # Column key metadata, master key metadata
>  # Encryption algorithm, for example, Parquet support AES_GCM and AES_CTR. 
> ORC might support AES_CTR.
>  # Encryption footer - Parquet allow footer to be encrypted or plaintext
>  # Footer key metadata
> Here is the table properties proposal.  
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet support plaintext and encrypted 
> footer. By default, it is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to 
> the KMS to define what key metadata is. The metadata should have enough 
> information to figure out the corresponding key by the KMS.  |
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column 
> name for example, ‘address.zipcode’. 
>  
> It is up to the KMS to define what key metadata is. The metadata should have 
> enough information to figure out the corresponding key by the KMS.|
>  
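As an illustration, the proposed properties might be applied to a table like this (a sketch only: the table and column names are hypothetical, the angle-bracket placeholders stand in for real base64 key metadata, and the property names are taken from the proposal above):

```sql
CREATE TABLE customer (name STRING, address STRUCT<street:STRING, zipcode:STRING>)
STORED AS PARQUET
TBLPROPERTIES (
  'encrypt_algorithm' = 'aes_gcm',
  'encrypt_footer_plaintext' = 'false',
  'encrypt_footer_key_metadata' = '<base64 footer key metadata>',
  'encrypt_col_address.zipcode' = '<base64 column key metadata>'
);
```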



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867956#comment-16867956
 ] 

Hive QA commented on HIVE-21787:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 16s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-17657/patches/PreCommit-HIVE-Build-17657.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17657/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.2.patch, HIVE-21787.3.patch, 
> HIVE-21787.4.patch, HIVE-21787.5.patch, HIVE-21787.6.patch, 
> HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> Metastore currently uses black/white lists to specify patterns of tables to 
> load into the cache. The cache is loaded in a one-shot "prewarm" and updated by a 
> background thread. This is not a very efficient design. 
> In this feature, we try to enhance the table cache with LRU eviction to improve 
> cache utilization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: HIVE-21891.04.patch

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to cut everything into smaller, more 
> focused classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database DDL, table DDL, etc.), so 
> that the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim time when there were two DDLTask and DDLWork classes in the 
> code base the new ones in the new package were called DDLTask2 and DDLWork2 
> thus avoiding the usage of fully qualified class names where both the old and 
> the new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of registering, now DDLTask finds the DDLOperations, and 
> registers them itself.
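The structure described above can be sketched as follows (illustrative stand-ins only: the DDLDesc/DDLOperation names mirror the description, but the bodies are simplified and the registry is filled manually here, whereas the text says DDLTask discovers the operations itself):

```java
import java.util.HashMap;
import java.util.Map;

// Immutable request object describing one DDL operation
interface DDLDesc { }

// One class per operation; DDLTask stays agnostic to the concrete operations
interface DDLOperation<T extends DDLDesc> {
    int execute(T desc);
}

class ShowDatabasesDesc implements DDLDesc { }

class ShowDatabasesOperation implements DDLOperation<ShowDatabasesDesc> {
    public int execute(ShowDatabasesDesc desc) {
        return 0; // 0 = success
    }
}

class DDLTask {
    // Maps a desc type to its operation; in the real code this mapping is
    // built by discovery rather than by manual registration
    private static final Map<Class<? extends DDLDesc>, DDLOperation<?>> REGISTRY =
            new HashMap<>();
    static {
        REGISTRY.put(ShowDatabasesDesc.class, new ShowDatabasesOperation());
    }

    @SuppressWarnings("unchecked")
    static <T extends DDLDesc> int run(T desc) {
        DDLOperation<T> op = (DDLOperation<T>) REGISTRY.get(desc.getClass());
        if (op == null) {
            throw new IllegalArgumentException("no operation for " + desc.getClass());
        }
        return op.execute(desc);
    }
}
```

The point of the shape is that adding a new DDL operation means adding one desc class and one operation class, without touching DDLTask.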



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867947#comment-16867947
 ] 

Hive QA commented on HIVE-21891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972235/HIVE-21891.04.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17656/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17656/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17656/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-06-19 19:48:40.331
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-17656/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-06-19 19:48:40.334
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at f7b1809 HIVE-19661 : switch Hive UDFs to use Re2J regex engine 
(Rajkumar Singh via Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at f7b1809 HIVE-19661 : switch Hive UDFs to use Re2J regex engine 
(Rajkumar Singh via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-06-19 19:48:40.991
+ rm -rf ../yetus_PreCommit-HIVE-Build-17656
+ mkdir ../yetus_PreCommit-HIVE-Build-17656
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-17656
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-17656/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: ql/pom.xml:756
Falling back to three-way merge...
Applied patch to 'ql/pom.xml' with conflicts.
Going to apply patch with: git apply -p0
/data/hiveptest/working/scratch/build.patch:7982: trailing whitespace.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.ddl.DDLTask. Table tstsrc is not locked 
/data/hiveptest/working/scratch/build.patch:7992: trailing whitespace.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.ddl.DDLTask. Table tstsrcpart is not locked 
/data/hiveptest/working/scratch/build.patch:8022: trailing whitespace.
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.ddl.DDLTask. Database lockneg1 is not locked 
error: patch failed: ql/pom.xml:756
Falling back to three-way merge...
Applied patch to 'ql/pom.xml' with conflicts.
U ql/pom.xml
warning: 3 lines add whitespace errors.
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-17656
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972235 - PreCommit-HIVE-Build

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to cut everything into smaller, more 
> focused classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package 

[jira] [Commented] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867945#comment-16867945
 ] 

Hive QA commented on HIVE-21869:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972226/HIVE-21869.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17655/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17655/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17655/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-06-19 19:47:19.133
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-17655/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-06-19 19:47:19.143
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at f7b1809 HIVE-19661 : switch Hive UDFs to use Re2J regex engine 
(Rajkumar Singh via Ashutosh Chauhan)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at f7b1809 HIVE-19661 : switch Hive UDFs to use Re2J regex engine 
(Rajkumar Singh via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-06-19 19:47:19.908
+ rm -rf ../yetus_PreCommit-HIVE-Build-17655
+ mkdir ../yetus_PreCommit-HIVE-Build-17655
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-17655
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-17655/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/kafka-handler/README.md: does not exist in index
error: patch failed: kafka-handler/README.md:1
Falling back to three-way merge...
error: patch failed: kafka-handler/README.md:1
error: kafka-handler/README.md: patch does not apply
error: patch failed: README.md:1
Falling back to three-way merge...
error: patch failed: README.md:1
error: README.md: patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-17655
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972226 - PreCommit-HIVE-Build

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21841) Leader election in HMS to run housekeeping tasks.

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867944#comment-16867944
 ] 

Hive QA commented on HIVE-21841:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972208/HIVE-21841.09.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16171 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17654/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17654/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17654/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972208 - PreCommit-HIVE-Build

> Leader election in HMS to run housekeeping tasks.
> -
>
> Key: HIVE-21841
> URL: https://issues.apache.org/jira/browse/HIVE-21841
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21841.01.patch, HIVE-21841.02.patch, 
> HIVE-21841.04.patch, HIVE-21841.05.patch, HIVE-21841.06.patch, 
> HIVE-21841.07.patch, HIVE-21841.08.patch, HIVE-21841.09.patch
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> HMS performs housekeeping tasks. When there are multiple HMSes we need to 
> have a leader HMS elected which will carry out those housekeeping tasks. 
> These tasks include execution of compaction tasks, auto-discovering 
> partitions for external tables, generation of compaction tasks, repl thread 
> etc.
> Note that, though the code for compaction tasks, auto-discovery of partitions 
> etc. is in Hive, the actual tasks are initiated by an HMS configured to do 
> so. So, leader election is required only for HMS and not for HS2.
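A minimal sketch of the idea (purely illustrative: it assumes a config-based scheme where each HMS compares its host name to a configured leader host, and the actual HIVE-21841 implementation may elect the leader differently):

```java
// Only the instance whose host matches the configured leader host would
// start the housekeeping threads (compaction, partition discovery, etc.).
class LeaderCheck {
    static boolean isLeader(String myHost, String configuredLeaderHost) {
        // An empty/unset config meaning "every instance runs housekeeping"
        // is an assumption of this sketch; it preserves single-HMS behavior.
        if (configuredLeaderHost == null || configuredLeaderHost.isEmpty()) {
            return true;
        }
        return myHost.equalsIgnoreCase(configuredLeaderHost);
    }
}
```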



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21841) Leader election in HMS to run housekeeping tasks.

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867926#comment-16867926
 ] 

Hive QA commented on HIVE-21841:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
44s{color} | {color:blue} standalone-metastore/metastore-common in master has 
31 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
11s{color} | {color:blue} standalone-metastore/metastore-server in master has 
184 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
9s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
44s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
18s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  9m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 47m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17654/dev-support/hive-personality.sh
 |
| git revision | master / f7b1809 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17654/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| modules | C: standalone-metastore/metastore-common 
standalone-metastore/metastore-server ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17654/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Leader election in HMS to run housekeeping tasks.
> -
>
> Key: HIVE-21841
> URL: https://issues.apache.org/jira/browse/HIVE-21841
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 4.0.0
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21841.01.patch, HIVE-21841.02.patch, 
> HIVE-21841.04.patch, HIVE-21841.05.patch, 

[jira] [Commented] (HIVE-21547) Temp Tables: Use stORC format for temporary tables

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867888#comment-16867888
 ] 

Hive QA commented on HIVE-21547:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972199/HIVE-21547.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 16168 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_nullscan] 
(batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_stats2] (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_nonpart] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_part2] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_part] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_sizebug] 
(batchId=89)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
 (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ctas] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[default_constraint]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_4]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_llap_nonvector]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_stats]
 (batchId=182)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ctas] (batchId=115)
org.apache.hadoop.hive.llap.cache.TestBuddyAllocator.testMTT[2] (batchId=350)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=322)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testToAcidConversionMultiBucket 
(batchId=322)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=322)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testToAcidConversionMultiBucket
 (batchId=322)
org.apache.hadoop.hive.ql.io.orc.TestVectorizedOrcAcidRowBatchReader.testDeleteEventFilteringOn
 (batchId=313)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testTableProperties 
(batchId=244)
org.apache.hive.streaming.TestStreaming.testFileDumpDeltaFilesWithoutStreamingOptimizations
 (batchId=348)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17653/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17653/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17653/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972199 - PreCommit-HIVE-Build

> Temp Tables: Use stORC format for temporary tables
> --
>
> Key: HIVE-21547
> URL: https://issues.apache.org/jira/browse/HIVE-21547
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-21547.1.patch, HIVE-21547.2.patch
>
>
> Using the st(reaming)ORC format 
> (hive.exec.orc.delta.streaming.optimizations.enabled=true) has massive 
> performance advantages when creating data-sets which will not be stored 
> long-term.
> The format is compatible with ORC for vectorization and other features, while 
> being cheaper to write out to the filesystem.
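As the description notes, the optimization is controlled by a session-level setting; enabling it for a short-lived table might look like the following (illustrative only; the table and query are hypothetical, and the patch's intent is to apply this automatically for temporary tables):

```sql
-- Illustrative only: enable the streaming-ORC writer optimizations for the
-- session, then materialize a short-lived temporary table with them.
SET hive.exec.orc.delta.streaming.optimizations.enabled=true;
CREATE TEMPORARY TABLE tmp_results STORED AS ORC
AS SELECT id, amount FROM sales WHERE dt = '2019-06-19';
```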





[jira] [Updated] (HIVE-19661) switch Hive UDFs to use Re2J regex engine

2019-06-19 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19661:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Raj!

> switch Hive UDFs to use Re2J regex engine
> -
>
> Key: HIVE-19661
> URL: https://issues.apache.org/jira/browse/HIVE-19661
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19661.01.patch, HIVE-19661.02.patch, 
> HIVE-19661.03.patch, HIVE-19661.patch
>
>
> The Java regex engine can be very slow in some cases, e.g. 
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458
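The slowdown referenced above is catastrophic backtracking. A small self-contained demonstration (not code from the patch) of a pattern that java.util.regex evaluates in exponential time, while an RE2-style engine such as Re2J handles it in linear time:

```java
import java.util.regex.Pattern;

// Demonstrates catastrophic backtracking in java.util.regex (illustrative,
// not code from the patch). The nested quantifier in "(a+)+b" forces the
// backtracking engine to try exponentially many partitions of the input
// before concluding there is no match; Re2J's automaton-based matching
// runs the same pattern in time linear in the input length.
public class BacktrackDemo {
    public static void main(String[] args) {
        Pattern pathological = Pattern.compile("(a+)+b");
        String input = "aaaaaaaaaaaaaaaaaaaa"; // 20 'a's and no 'b'
        long start = System.nanoTime();
        boolean matched = pathological.matcher(input).matches();
        long micros = (System.nanoTime() - start) / 1_000;
        // Every additional 'a' roughly doubles the running time here.
        System.out.println("matched=" + matched + ", took " + micros + "us");
    }
}
```

With ~20 characters the failure is still fast; lengthen the input by a handful of characters and the match attempt takes seconds, which is the class of behavior the switch to Re2J avoids.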





[jira] [Updated] (HIVE-21787) Metastore table cache LRU eviction

2019-06-19 Thread Sam An (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam An updated HIVE-21787:
--
Attachment: HIVE-21787.11.patch

> Metastore table cache LRU eviction
> --
>
> Key: HIVE-21787
> URL: https://issues.apache.org/jira/browse/HIVE-21787
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Sam An
>Assignee: Sam An
>Priority: Major
> Attachments: HIVE-21787.1.patch, HIVE-21787.10.patch, 
> HIVE-21787.11.patch, HIVE-21787.2.patch, HIVE-21787.3.patch, 
> HIVE-21787.4.patch, HIVE-21787.5.patch, HIVE-21787.6.patch, 
> HIVE-21787.7.patch, HIVE-21787.8.patch, HIVE-21787.9.patch
>
>
> The Metastore currently uses black/white lists to specify patterns of tables 
> to load into the cache. The cache is loaded in a one-shot "prewarm" and 
> updated by a background thread. This is not a very efficient design.
> In this feature, we try to enhance the table cache with LRU eviction to 
> improve cache utilization.
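The LRU behavior being proposed can be sketched with an access-ordered LinkedHashMap; this is a generic illustration, not the patch's actual CachedStore code, and the key/value strings are placeholders:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic LRU cache sketch (not the actual CachedStore implementation).
// accessOrder=true makes iteration order follow recency of access, and
// removeEldestEntry evicts the least recently used entry past capacity.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true => access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("db1.t1", "meta1");
        cache.put("db1.t2", "meta2");
        cache.get("db1.t1");          // touch t1 so t2 becomes the eldest
        cache.put("db1.t3", "meta3"); // evicts t2, the least recently used
        System.out.println(cache.keySet()); // prints [db1.t1, db1.t3]
    }
}
```

Unlike the prewarm-everything-matching-a-pattern approach, an eviction policy like this keeps memory bounded while retaining the tables that are actually being accessed.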





[jira] [Commented] (HIVE-21547) Temp Tables: Use stORC format for temporary tables

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867861#comment-16867861
 ] 

Hive QA commented on HIVE-21547:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
55s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
34s{color} | {color:blue} common in master has 62 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
12s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} The patch common passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} ql: The patch generated 0 new + 37 unchanged - 1 
fixed = 37 total (was 38) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17653/dev-support/hive-personality.sh
 |
| git revision | master / 9451d3a |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17653/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Temp Tables: Use stORC format for temporary tables
> --
>
> Key: HIVE-21547
> URL: https://issues.apache.org/jira/browse/HIVE-21547
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-21547.1.patch, HIVE-21547.2.patch
>
>
> Using the st(reaming)ORC format 
> (hive.exec.orc.delta.streaming.optimizations.enabled=true) has massive 
> performance advantages when creating data-sets which will not be stored 
> long-term.
> The format is compatible with ORC for vectorization and other features, while 
> being cheaper to write out to the filesystem.





[jira] [Updated] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Miklos Gergely (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-21891:
--
Attachment: HIVE-21891.04.patch

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch, HIVE-21891.04.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to have everything cut into more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding the use of fully qualified class names where both the old and the 
> new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of registering, DDLTask now finds the DDLOperations and 
> registers them itself.





[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867830#comment-16867830
 ] 

Hive QA commented on HIVE-21891:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972193/HIVE-21891.03.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16168 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17652/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17652/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17652/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972193 - PreCommit-HIVE-Build

> Break up DDLTask - cleanup
> --
>
> Key: HIVE-21891
> URL: https://issues.apache.org/jira/browse/HIVE-21891
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-ddl
> Fix For: 4.0.0
>
> Attachments: HIVE-21891.01.patch, HIVE-21891.02.patch, 
> HIVE-21891.03.patch
>
>
> DDLTask was a huge class, more than 5000 lines long. The related DDLWork was 
> also a huge class, which had a field for each DDL operation it supported. The 
> goal was to refactor these in order to have everything cut into more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
>  * have a separate class for each operation
>  * have a package for each operation group (database ddl, table ddl, etc), so 
> the number of classes under a package is more manageable
>  * make all the requests (DDLDesc subclasses) immutable - most of them are now
>  * DDLTask should be agnostic to the actual operations
>  * right now let's ignore the issue of having some operations handled by 
> DDLTask which are not actual DDL operations (lock, unlock, desc...)
> In the interim, while there were two DDLTask and DDLWork classes in the code 
> base, the new ones in the new package were called DDLTask2 and DDLWork2, thus 
> avoiding the use of fully qualified class names where both the old and the 
> new classes were in use.
> Step #12: rename DDLTask2 and DDLWork2, now that they are alone. Remove the 
> old DDLDesc. Instead of registering, DDLTask now finds the DDLOperations and 
> registers them itself.





[jira] [Updated] (HIVE-21894) Hadoop credential password storage for the Kafka Storage handler when security is SSL

2019-06-19 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21894:
---
Component/s: kafka integration

> Hadoop credential password storage for the Kafka Storage handler when 
> security is SSL
> -
>
> Key: HIVE-21894
> URL: https://issues.apache.org/jira/browse/HIVE-21894
> Project: Hive
>  Issue Type: Improvement
>  Components: kafka integration
>Affects Versions: 4.0.0, 3.1.1
>Reporter: Kristopher Kane
>Priority: Major
> Fix For: 4.0.0
>
>
> The Kafka storage handler assumes that if the Hive service is configured with 
> Kerberos, then the destination Kafka cluster is also secured with the same 
> Kerberos realm or trust of realms. The security configuration of the Kafka 
> client can be overridden through the additive operations of the Kafka client 
> configs, but the only way to specify SSL and the keystore/truststore 
> user/pass is via plain-text table properties.
> This ticket proposes adding Hadoop credential security to the Kafka storage 
> handler in support of SSL-secured Kafka clusters.





[jira] [Commented] (HIVE-21891) Break up DDLTask - cleanup

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867799#comment-16867799
 ] 

Hive QA commented on HIVE-21891:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:red}-1{color} | {color:red} @author {color} | {color:red}  0m  
1s{color} | {color:red} The patch appears to contain 5 @author tags which the 
community has agreed to not allow in code contributions. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 9s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
8s{color} | {color:blue} ql in master has 2253 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
23s{color} | {color:blue} contrib in master has 10 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} hcatalog/core in master has 28 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
49s{color} | {color:blue} itests/util in master has 44 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} ql: The patch generated 0 new + 1306 unchanged - 3 
fixed = 1306 total (was 1309) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch contrib passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} The patch core passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} The patch util passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 44m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-17652/dev-support/hive-personality.sh
 |
| git revision | master / 7416fac |
| Default Java | 1.8.0_111 |
| @author | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-17652/yetus/author-tags.txt |
| findbugs | v3.0.0 |

[jira] [Updated] (HIVE-21872) Bucketed tables that load data from data/files/auto_sortmerge_join should be tagged as 'bucketing_version'='1'

2019-06-19 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21872:
---
   Resolution: Fixed
Fix Version/s: 3.1.2
   3.2.0
   4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master, branch-3, branch-3.1. Thanks for reviewing, [~vgarg]!

> Bucketed tables that load data from data/files/auto_sortmerge_join should be 
> tagged as 'bucketing_version'='1'
> --
>
> Key: HIVE-21872
> URL: https://issues.apache.org/jira/browse/HIVE-21872
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0, 3.2.0, 3.1.2
>
> Attachments: HIVE-21872.01.patch, HIVE-21872.01.patch, 
> HIVE-21872.01.patch, HIVE-21872.patch
>
>
> It is incorrect to use version 2, since the data files were created with the 
> old hash function.





[jira] [Comment Edited] (HIVE-21848) Table property name definition between ORC and Parquet encryption

2019-06-19 Thread Gidon Gershinsky (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867785#comment-16867785
 ] 

Gidon Gershinsky edited comment on HIVE-21848 at 6/19/19 4:26 PM:
--

[~sha...@uber.com], a few comments:

for either footer or columns, key metadata should not be passed as a property. 
Instead, it should be derived from the properties (such as key names, wrapping 
method, KMS type, etc).

on the other hand, a few substantial properties are missing in your list (like 
key names, token, etc)

actually, we have a draft that already defines the Parquet encryption 
properties, please have a look at

[https://docs.google.com/document/d/1boH6HPkG0ZhgxcaRkGk3QpZ8X_J91uXZwVGwYN45St4/edit?usp=sharing]

It has not been reviewed by the community yet, so it's a bit early to try to 
unify ORC and Parquet properties. We might find in the end that the differences 
are bigger than the commonalities. But in any case, I think this exercise of 
finding the common ground is helpful; it's just a bit early at this point.


was (Author: gershinsky):
[~sha...@uber.com], a few comments:
 * for either footer or columns, key metadata should not be passed as a 
property. Instead, it should be derived from the properties (such as key names, 
wrapping method, KMS type, etc).
 * on the other hand, a few substantial properties are missing in your list 
(like KMS client type, token, etc)
 * actually, we have a draft that already defines the Parquet encryption 
properties, please have a look at

> Table property name definition between ORC and Parquet encryption
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a superset of unified table property names 
> that can be used for both Parquet and ORC column encryption. There is no code 
> change needed for this Jira.
> *Background:*
> ORC-14 and Parquet-1178 introduced column encryption to ORC and Parquet. To 
> configure the encryption (e.g. which columns are sensitive, which master key 
> to use, which algorithm), table properties can be used. It is important that 
> both Parquet and ORC can use unified names.
> According to the slides at 
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
>  ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The 
> Parquet community is still discussing several approaches, of which table 
> properties are one option, and there is no detailed design of the table 
> property names yet.
> So it is a good time for the two communities to discuss a unified superset of 
> table property names.
> *Proposal:*
> There are several encryption properties that need to be specified for a 
> table. Here is the list, a superset for Parquet and ORC; some of the 
> properties might not apply to both.
>  # PII columns, including nested columns
>  # Column key metadata, master key metadata
>  # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR. 
> ORC might support AES_CTR.
>  # Encryption footer - Parquet allows the footer to be encrypted or plaintext
>  # Footer key metadata
> Here is the table properties proposal.  
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet supports plaintext and 
> encrypted footers. By default, the footer is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to 
> the KMS to define what key metadata is. The metadata should have enough 
> information for the KMS to figure out the corresponding key.  |
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column 
> name, for example ‘address.zipcode’. 
>  
> It is up to the KMS to define what key metadata is. The metadata should have 
> enough information for the KMS to figure out the corresponding key.|
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21848) Table property name definition between ORC and Parquet encryption

2019-06-19 Thread Gidon Gershinsky (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867785#comment-16867785
 ] 

Gidon Gershinsky commented on HIVE-21848:
-

[~sha...@uber.com], a few comments:
 * For either the footer or columns, key metadata should not be passed as a 
property. Instead, it should be derived from other properties (such as key names, 
wrapping method, KMS type, etc.).
 * On the other hand, a few substantial properties are missing from your list 
(such as the KMS client type, token, etc.).
 * Actually, we have a draft that already defines the Parquet encryption 
properties; please have a look at

> Table property name definition between ORC and Parquet encryption
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>
> The goal of this Jira is to define a superset of unified table property names 
> that can be used for both Parquet and ORC column encryption. There is no code 
> change needed for this Jira.
> *Background:*
> ORC-14 and Parquet-1178 introduced column encryption to ORC and Parquet. To 
> configure encryption (e.g., which columns are sensitive, which master key to 
> use, which algorithm), table properties can be used. It is important that 
> both Parquet and ORC can use unified names.
> According to the slides 
> [https://www.slideshare.net/oom65/fine-grain-access-control-for-big-data-orc-column-encryption-137308692],
>  ORC uses table properties like orc.encrypt.pii and orc.encrypt.credit. The 
> Parquet community is still discussing several approaches; using table 
> properties is one of the options, but there is no detailed design of the 
> property names yet.
> So it is a good time for the two communities to discuss and agree on a 
> unified superset of table property names.
> *Proposal:*
> There are several encryption properties that need to be specified for a 
> table. Here is the list; it is the superset of Parquet and ORC properties, 
> and some of them might not apply to both.
>  # PII columns, including nested columns
>  # Column key metadata, master key metadata
>  # Encryption algorithm; for example, Parquet supports AES_GCM and AES_CTR, 
> while ORC might support AES_CTR
>  # Encryption footer: Parquet allows the footer to be encrypted or plaintext
>  # Footer key metadata
> Here is the table properties proposal.  
> |*Table Property Name*|*Value*|*Notes*|
> |encrypt_algorithm|aes_ctr, aes_gcm|The algorithm to be used for encryption.|
> |encrypt_footer_plaintext|true, false|Parquet supports plaintext and encrypted 
> footers. By default, the footer is encrypted.|
> |encrypt_footer_key_metadata|base64 string of footer key metadata|It is up to 
> the KMS to define what key metadata is. The metadata should have enough 
> information for the KMS to identify the corresponding key.|
> |encrypt_col_xxx|base64 string of column key metadata|‘xxx’ is the column 
> name, for example ‘address.zipcode’. 
>  
> It is up to the KMS to define what key metadata is. The metadata should have 
> enough information for the KMS to identify the corresponding key.|
>  
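
To make the proposal above concrete, here is a small Python sketch of how a client could assemble such a property map. This is illustrative only: the function name and the raw key-metadata bytes are hypothetical, and the real content of the key metadata is up to the KMS, as the proposal notes.

```python
import base64

def encryption_tblproperties(pii_columns, footer_key_meta, algorithm="aes_gcm"):
    """Build a table-property map following the proposed naming scheme.

    pii_columns maps a (possibly nested) column name, e.g. 'address.zipcode',
    to the raw key-metadata bytes for that column's key.
    """
    props = {
        "encrypt_algorithm": algorithm,
        # Parquet defaults to an encrypted footer in the proposal.
        "encrypt_footer_plaintext": "false",
        "encrypt_footer_key_metadata": base64.b64encode(footer_key_meta).decode("ascii"),
    }
    for col, key_meta in pii_columns.items():
        # 'encrypt_col_xxx', where 'xxx' is the column name.
        props["encrypt_col_" + col] = base64.b64encode(key_meta).decode("ascii")
    return props

props = encryption_tblproperties(
    {"address.zipcode": b"zip-key-v1"}, footer_key_meta=b"footer-key-v1")
print(props)
```

The resulting map could then be supplied as table properties (e.g. TBLPROPERTIES in a CREATE TABLE statement) when defining the table.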



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21867) Sort semijoin conditions to accelerate query processing

2019-06-19 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867779#comment-16867779
 ] 

Jesus Camacho Rodriguez commented on HIVE-21867:


This patch needs to be rebased on top of HIVE-21857.

> Sort semijoin conditions to accelerate query processing
> ---
>
> Key: HIVE-21867
> URL: https://issues.apache.org/jira/browse/HIVE-21867
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21867.02.patch, HIVE-21867.patch
>
>
> The problem was tackled for CBO in HIVE-21857. Semijoin filters are 
> introduced later in the planning phase. Follow a similar approach to sort 
> them, aiming to accelerate filter evaluation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21890) Fix alter_partition_change_col.q qtest inclusion in minillaplocal.query.files

2019-06-19 Thread Alan Gates (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-21890:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Patch committed to master.

> Fix alter_partition_change_col.q qtest inclusion in minillaplocal.query.files
> -
>
> Key: HIVE-21890
> URL: https://issues.apache.org/jira/browse/HIVE-21890
> Project: Hive
>  Issue Type: Bug
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21890.01.patch, HIVE-21890.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-20833 introduced alter_partition_change_col.q under 
> minillaplocal.query.files; however, it was listed only as 
> {{alter_partition_change_col}}, without the {{.q}} postfix.
> Looking at recent precommit tests, it appears this test never gets called.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21869) Clean up the Kafka storage handler readme and examples

2019-06-19 Thread Kristopher Kane (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kristopher Kane updated HIVE-21869:
---
Attachment: HIVE-21869.1.patch
Status: Patch Available  (was: Open)

> Clean up the Kafka storage handler readme and examples
> --
>
> Key: HIVE-21869
> URL: https://issues.apache.org/jira/browse/HIVE-21869
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Kristopher Kane
>Assignee: Kristopher Kane
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21869.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21741) Backport HIVE-20221 & related fix HIVE-20833 to branch-3: Increase column width for partition_params

2019-06-19 Thread Alan Gates (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-21741:
--
   Resolution: Fixed
Fix Version/s: (was: 3.1.2)
   Status: Resolved  (was: Patch Available)

Patch applied to branch-3.

> Backport HIVE-20221 & related fix HIVE-20833 to branch-3: Increase column 
> width for partition_params
> 
>
> Key: HIVE-21741
> URL: https://issues.apache.org/jira/browse/HIVE-21741
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Standalone Metastore
>Affects Versions: 3.1.1
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
> Attachments: HIVE-21741.01.branch-3.patch, 
> HIVE-21741.01.branch-3.patch, HIVE-21741.02.branch-3.patch, 
> HIVE-21741.branch-3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is an umbrella for backporting HIVE-20221 & the related fix of 
> HIVE-20833 to branch-3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-21848) Table property name definition between ORC and Parquet encryption

2019-06-19 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867767#comment-16867767
 ] 

Xinli Shang edited comment on HIVE-21848 at 6/19/19 3:45 PM:
-

Thanks, Owen! I just have slightly different thinking about "*encrypt.with.pii*" = 
"*col1,col2*". In case a company needs "*encrypt.with.abc*" = "*col3,col4*" and 
'*abc*' is not predefined in Hive/ORC/Parquet, does it mean they need to change 
the code of Hive/ORC/Parquet? This is a real usage pattern in production.
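
For illustration, one way to support arbitrary category names without predefining them is to treat everything after a fixed property prefix as a user-defined category. This is a hypothetical sketch of that idea, not Hive's, ORC's, or Parquet's actual behavior:

```python
PREFIX = "encrypt.with."

def parse_encrypt_categories(tblproperties):
    """Map each user-defined category (the part after the prefix) to its columns.

    No category names are hardcoded, so a property like 'encrypt.with.abc'
    works without any code change."""
    categories = {}
    for name, value in tblproperties.items():
        if name.startswith(PREFIX):
            category = name[len(PREFIX):]
            categories[category] = [c.strip() for c in value.split(",")]
    return categories

props = {
    "encrypt.with.pii": "col1,col2",
    "encrypt.with.abc": "col3, col4",   # 'abc' is not predefined anywhere
    "orc.compress": "ZLIB",             # unrelated property, ignored
}
print(parse_encrypt_categories(props))
```

With this scheme, only the prefix is fixed; the categories themselves are data, so adding 'abc' requires no change to Hive/ORC/Parquet code.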


was (Author: sha...@uber.com):
Thanks Owen! I Just have a slight different thinking for "*encrypt.with.pii*" = 
"*col1,col2"*.  In case of that a company needs "*encrypt.with.abc*" = 
"*col3,col4*" and '*abc*' is not predefined in Hive/ORC/Parquet, does it mean 
they need to change code?This is realy usage in production.  

> Table property name definition between ORC and Parquet encryption
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21848) Table property name definition between ORC and Parquet encryption

2019-06-19 Thread Xinli Shang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867767#comment-16867767
 ] 

Xinli Shang commented on HIVE-21848:


Thanks, Owen! I just have slightly different thinking about "*encrypt.with.pii*" = 
"*col1,col2*". In case a company needs "*encrypt.with.abc*" = "*col3,col4*" and 
'*abc*' is not predefined in Hive/ORC/Parquet, does it mean they need to change 
the code? This is a real usage pattern in production.

> Table property name definition between ORC and Parquet encryption
> 
>
> Key: HIVE-21848
> URL: https://issues.apache.org/jira/browse/HIVE-21848
> Project: Hive
>  Issue Type: Task
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
> Fix For: 3.0.0
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21890) Fix alter_partition_change_col.q qtest inclusion in minillaplocal.query.files

2019-06-19 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867762#comment-16867762
 ] 

Hive QA commented on HIVE-21890:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12972187/HIVE-21890.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 16168 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/17651/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/17651/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-17651/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12972187 - PreCommit-HIVE-Build

> Fix alter_partition_change_col.q qtest inclusion in minillaplocal.query.files
> -
>
> Key: HIVE-21890
> URL: https://issues.apache.org/jira/browse/HIVE-21890
> Project: Hive
>  Issue Type: Bug
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21890.01.patch, HIVE-21890.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

