from:"Szehon Ho \(JIRA\)"

[jira] [Commented] (HIVE-14117) HS2 UI: List of recent queries shows most recent query last

2016-06-28 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353450#comment-15353450
 ] 

Szehon Ho commented on HIVE-14117:
--

nice idea, +1

> HS2 UI: List of recent queries shows most recent query last
> ---
>
> Key: HIVE-14117
> URL: https://issues.apache.org/jira/browse/HIVE-14117
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-14117.1.patch
>
>
> It's more useful to see the latest one first in your "last n queries" view.
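
A minimal sketch of the ordering requested above, assuming a hypothetical QueryEntry holder with an end timestamp (these names are illustrative and are not the HS2 web UI's actual classes). It sorts a copy of the list so the most recent query comes first:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical holder for the "last n queries" view; not Hive's actual class.
final class RecentQueries {
    static final class QueryEntry {
        final String queryId;
        final long endTimeMillis;

        QueryEntry(String queryId, long endTimeMillis) {
            this.queryId = queryId;
            this.endTimeMillis = endTimeMillis;
        }
    }

    // Returns a new list ordered newest first; the input list is left untouched.
    static List<QueryEntry> newestFirst(List<QueryEntry> entries) {
        List<QueryEntry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong((QueryEntry e) -> e.endTimeMillis).reversed());
        return sorted;
    }
}
```

Sorting a copy keeps whatever structure records the queries in arrival order unchanged, so only the view is reversed.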



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14063) beeline to auto connect to the HiveServer2

2016-06-29 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355658#comment-15355658
 ] 

Szehon Ho commented on HIVE-14063:
--

Cool.  To clarify, can we see what this conf file will look like?  And what is 
its relationship with the newly added beeline.properties?  Which settings go 
where, and how do the two interact if both are present and the same properties 
are defined in both?

> beeline to auto connect to the HiveServer2
> --
>
> Key: HIVE-14063
> URL: https://issues.apache.org/jira/browse/HIVE-14063
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
>
> Currently one has to give a jdbc:hive2 URL in order for Beeline to connect to a 
> HiveServer2 instance. It would be great if Beeline could get the info somehow 
> (from a properties file at a well-known location?) and connect automatically 
> if the user doesn't specify such a URL. If the properties file is not present, 
> then Beeline would expect the user to provide the URL and credentials using 
> !connect or ./beeline -u .. commands.
> While Beeline is flexible (being a mere JDBC client), most environments would 
> have just a single HS2. Requiring users to connect to it manually via either 
> "beeline ~/.propsfile" or -u or !connect statements degrades the user 
> experience.
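
A minimal sketch of the fallback described above, under stated assumptions: the issue does not define the file format, so the property keys hs2.host and hs2.port are hypothetical, and loading the file itself (for example with Properties.load) is left to the caller. When no host is configured the method returns empty, which corresponds to the user having to supply the URL via -u or !connect.

```java
import java.util.Optional;
import java.util.Properties;

// Sketch only: the property keys below are hypothetical, not Beeline's real ones.
final class BeelineAutoConnect {
    static final String HOST_KEY = "hs2.host";   // assumed key name
    static final String PORT_KEY = "hs2.port";   // assumed key name
    static final String DEFAULT_PORT = "10000";  // HiveServer2's default Thrift port

    // Builds a jdbc:hive2 URL from the properties, or returns empty when no
    // host is configured, in which case the user must connect explicitly.
    static Optional<String> defaultUrl(Properties props) {
        String host = props.getProperty(HOST_KEY);
        if (host == null || host.trim().isEmpty()) {
            return Optional.empty();
        }
        String port = props.getProperty(PORT_KEY, DEFAULT_PORT);
        return Optional.of("jdbc:hive2://" + host.trim() + ":" + port.trim());
    }
}
```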




[jira] [Updated] (HIVE-14754) Track the queries execution lifecycle times

2017-02-07 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-14754:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master.  Thanks Barna for the contribution!

> Track the queries execution lifecycle times
> ---
>
> Key: HIVE-14754
> URL: https://issues.apache.org/jira/browse/HIVE-14754
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Affects Versions: 2.2.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14754.1.patch, HIVE-14754.2.patch, HIVE-14754.patch
>
>
> We should be able to track the nr. of queries being compiled/executed at any 
> given time, as well as the duration of the execution and compilation phase.
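
A minimal sketch of what such tracking could look like for one phase (compilation or execution), assuming a hypothetical PhaseTracker class. It is not the code from the attached patches and does not use Hive's metrics API; it only illustrates counting in-flight queries and accumulating phase durations.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical tracker for one query phase; one instance per phase.
final class PhaseTracker {
    private final AtomicInteger active = new AtomicInteger();
    private final AtomicLong completed = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();

    // Call when a query enters the phase; returns the start time to pass to end().
    long begin() {
        active.incrementAndGet();
        return System.nanoTime();
    }

    // Call when the query leaves the phase, whether it succeeded or failed.
    void end(long startNanos) {
        totalNanos.addAndGet(System.nanoTime() - startNanos);
        completed.incrementAndGet();
        active.decrementAndGet();
    }

    int activeCount() { return active.get(); }
    long completedCount() { return completed.get(); }
    long totalDurationNanos() { return totalNanos.get(); }
}
```

An average phase duration follows as totalDurationNanos() divided by completedCount() once at least one query has completed.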



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-14754) Track the queries execution lifecycle times

2017-02-03 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851289#comment-15851289
 ] 

Szehon Ho commented on HIVE-14754:
--

+1, glad to see new metrics!  Just a question: where does the 1028 come from (it 
seems a bit arbitrary), and is it configurable?

> Track the queries execution lifecycle times
> ---
>
> Key: HIVE-14754
> URL: https://issues.apache.org/jira/browse/HIVE-14754
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Affects Versions: 2.2.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-14754.1.patch, HIVE-14754.2.patch, HIVE-14754.patch
>
>
> We should be able to track the nr. of queries being compiled/executed at any 
> given time, as well as the duration of the execution and compilation phase.




[jira] [Commented] (HIVE-14754) Track the queries execution lifecycle times

2017-02-09 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859308#comment-15859308
 ] 

Szehon Ho commented on HIVE-14754:
--

Sorry, I did not catch the typo.  Yes, we should fix it before the release if we 
can.

> Track the queries execution lifecycle times
> ---
>
> Key: HIVE-14754
> URL: https://issues.apache.org/jira/browse/HIVE-14754
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Affects Versions: 2.2.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14754.1.patch, HIVE-14754.2.patch, HIVE-14754.patch
>
>
> We should be able to track the nr. of queries being compiled/executed at any 
> given time, as well as the duration of the execution and compilation phase.




[jira] [Commented] (HIVE-14775) Investigate IOException usage in Metrics APIs

2016-09-28 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530366#comment-15530366
 ] 

Szehon Ho commented on HIVE-14775:
--

Makes sense, +1 on the latest patch from me.

> Investigate IOException usage in Metrics APIs
> -
>
> Key: HIVE-14775
> URL: https://issues.apache.org/jira/browse/HIVE-14775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2, Metastore
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> A large number of metrics APIs seem to declare that they throw IOException 
> needlessly (incrementCounter, decrementCounter, etc.).
> This is not only misleading, but it also fills the code with unnecessary catch 
> blocks that are never reached.
> We should investigate whether these exceptions are thrown at all, and remove 
> them if they are truly unused.
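
To illustrate the point, here is a minimal in-memory counter sketch, assuming a hypothetical InMemoryCounters class rather than Hive's actual Metrics implementations. Nothing in it performs I/O, so declaring IOException on these methods would only force callers to write catch blocks that can never run. Whether the real implementations are in the same position is exactly what the issue proposes to investigate.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a metrics backend; not Hive's Metrics interface.
final class InMemoryCounters {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // No "throws IOException": the update is a pure in-memory operation.
    long incrementCounter(String name) {
        return counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }

    long decrementCounter(String name) {
        return counters.computeIfAbsent(name, k -> new AtomicLong()).decrementAndGet();
    }
}
```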




[jira] [Commented] (HIVE-14776) Skip 'distcp' call when copying data from HDFS to S3

2016-09-16 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497699#comment-15497699
 ] 

Szehon Ho commented on HIVE-14776:
--

I'm curious about this.  Distcp parallelizes the copy, so if the file/dir is 
very splittable, then in theory it should be faster than a single thread, even 
with the overhead of the temporary location?  I understand that for some small 
files it will be slower.

Orthogonally, I thought distcp actually puts the file in a temporary location 
on the local filesystem before uploading to S3, not in a temporary location on 
S3.

> Skip 'distcp' call when copying data from HDFS to S3
> 
>
> Key: HIVE-14776
> URL: https://issues.apache.org/jira/browse/HIVE-14776
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-14776.1.patch, HIVE-14776.2.patch
>
>
> Hive uses 'distcp' to copy files in parallel between HDFS encryption zones 
> when the file to copy is larger than the {{hive.exec.copyfile.maxsize}} 
> threshold. This 'distcp' is also executed when copying to S3, but there it 
> causes slower copies.
> We should not invoke distcp when copying to blobstore systems.
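
The proposed rule can be sketched as a small decision function. This is an assumption-laden illustration, not the attached patch: the CopyStrategy class, the hard-coded list of blobstore schemes, and the parameter names are hypothetical, and the size threshold stands in for the value of hive.exec.copyfile.maxsize.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; a real list of schemes would likely come from configuration.
final class CopyStrategy {
    private static final Set<String> BLOBSTORE_SCHEMES =
        new HashSet<>(Arrays.asList("s3", "s3a", "s3n"));

    static boolean isBlobStore(String scheme) {
        return scheme != null && BLOBSTORE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    // distcp is used only for large files whose destination is not a blobstore.
    static boolean shouldUseDistCp(String destScheme, long fileSizeBytes, long maxSizeBytes) {
        if (isBlobStore(destScheme)) {
            return false;
        }
        return fileSizeBytes > maxSizeBytes;
    }
}
```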




[jira] [Commented] (HIVE-14713) LDAP Authentication Provider should be covered with unit tests

2016-09-22 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514737#comment-15514737
 ] 

Szehon Ho commented on HIVE-14713:
--

I think there is a 24-hour wait after the last +1 before a patch gets merged 
(at least last time I checked).  Feel free to ping again if it is forgotten.

> LDAP Authentication Provider should be covered with unit tests
> --
>
> Key: HIVE-14713
> URL: https://issues.apache.org/jira/browse/HIVE-14713
> Project: Hive
>  Issue Type: Test
>  Components: Authentication, Tests
>Affects Versions: 2.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-14713.1.patch, HIVE-14713.2.patch, 
> HIVE-14713.3.patch
>
>
> Currently LdapAuthenticationProviderImpl class is not covered with unit 
> tests. To make this class testable some minor refactoring will be required.




[jira] [Commented] (HIVE-14775) Investigate IOException usage in Metrics APIs

2016-09-27 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527043#comment-15527043
 ] 

Szehon Ho commented on HIVE-14775:
--

Yes, I definitely appreciate the cleanup; I never had time to investigate.  Do 
we know what scenario led to the JMXException?  I had only some minor comments, 
which I left on RB.

> Investigate IOException usage in Metrics APIs
> -
>
> Key: HIVE-14775
> URL: https://issues.apache.org/jira/browse/HIVE-14775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2, Metastore
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>
> A large number of metrics APIs seem to declare that they throw IOException 
> needlessly (incrementCounter, decrementCounter, etc.).
> This is not only misleading, but it also fills the code with unnecessary catch 
> blocks that are never reached.
> We should investigate whether these exceptions are thrown at all, and remove 
> them if they are truly unused.




[jira] [Commented] (HIVE-14984) Hive-WebUI access results in Request is a replay (34) attack

2016-10-20 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592474#comment-15592474
 ] 

Szehon Ho commented on HIVE-14984:
--

Thanks a lot Barna.  FYI [~jxiang]

> Hive-WebUI access results in Request is a replay (34) attack
> 
>
> Key: HIVE-14984
> URL: https://issues.apache.org/jira/browse/HIVE-14984
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Venkat Sambath
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-14984.patch
>
>
> When trying to access kerberized webui of HS2, The following error is received
> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request 
> is a replay (34))
> While this is not happening for RM webui (checked if kerberos webui is 
> enabled)
> To reproduce the issue 
> Try running
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://:10002/
> from any cluster nodes
> or 
> Try accessing the URL from a VM with windows machine and firefox browser to 
> replicate the issue
> The following workaround helped, but need a permanent solution for the bug
> Workaround:
> =
> First access the index.html directly and then actual URL of webui
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://:10002/index.html
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://:10002
> In browser:
> First access
> http://:10002/index.html
> then
> http://:10002




[jira] [Commented] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2

2016-10-20 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592480#comment-15592480
 ] 

Szehon Ho commented on HIVE-14753:
--

Sorry for the delay, +1 from me.

> Track the number of open/closed/abandoned sessions in HS2
> -
>
> Key: HIVE-14753
> URL: https://issues.apache.org/jira/browse/HIVE-14753
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14753.1.patch, HIVE-14753.2.patch, 
> HIVE-14753.3.patch, HIVE-14753.patch
>
>
> We should be able to track the nr. of sessions since the startup of the HS2 
> instance as well as the average lifetime of a session.




[jira] [Updated] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2

2016-10-24 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-14753:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, thanks Barna for the contribution!

> Track the number of open/closed/abandoned sessions in HS2
> -
>
> Key: HIVE-14753
> URL: https://issues.apache.org/jira/browse/HIVE-14753
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14753.1.patch, HIVE-14753.2.patch, 
> HIVE-14753.3.patch, HIVE-14753.patch
>
>
> We should be able to track the nr. of sessions since the startup of the HS2 
> instance as well as the average lifetime of a session.




[jira] [Commented] (HIVE-13517) Hive logs in Spark Executor and Driver should show thread-id.

2016-10-31 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623323#comment-15623323
 ] 

Szehon Ho commented on HIVE-13517:
--

Yes, if the thread name is there, that is great.  

When I last checked the Spark Executor and Driver logs, I thought they were 
mixed, with no indication of the thread.  I don't have an environment to check 
that right now; do you see the thread name in those logs now?

> Hive logs in Spark Executor and Driver should show thread-id.
> -
>
> Key: HIVE-13517
> URL: https://issues.apache.org/jira/browse/HIVE-13517
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Szehon Ho
>Assignee: liyunzhang_intel
>
> In Spark, there might be more than one task running in one executor. 
> Similarly, there may be more than one thread running in Driver.
> This makes debugging through the logs a nightmare. It would be great if there 
> could be thread-ids in the logs.




[jira] [Commented] (HIVE-15102) Hiveptest is killing nodes where IP is reused after previous node termination

2016-11-01 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626046#comment-15626046
 ] 

Szehon Ho commented on HIVE-15102:
--

+1

I think Brock wrote the original code, so he might know more, but yes, it does 
look like a bug to me.  My only small comment is that you can annotate this 
method with @VisibleForTesting.

> Hiveptest is killing nodes where IP is reused after previous node termination
> -
>
> Key: HIVE-15102
> URL: https://issues.apache.org/jira/browse/HIVE-15102
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-15102.1.patch
>
>
> NO PRECOMMIT TESTS
> The Hiveptest framework has a background thread that runs every hour and 
> attempts to kill zombie nodes that are no longer being used by the test 
> execution. 
> These killed nodes are kept in a list of terminated nodes, and the next time 
> the background thread runs, it attempts to kill all those nodes again 
> because Hiveptest considers them zombie nodes.
> The problem is that cloud providers can assign the same IP addresses to new 
> nodes, and when the background thread runs, it kills those nodes even though 
> they may still be in use by Hiveptest.
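
One way to avoid the problem, sketched under assumptions: the TerminatedHosts class and its method names are hypothetical, and the attached patch may take a different approach. The idea is to drop an IP from the terminated list as soon as a newly provisioned node reports that address, so the hourly cleanup no longer treats it as a zombie.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical bookkeeping for the cleanup thread; not Hiveptest's actual class.
final class TerminatedHosts {
    private final Set<String> terminatedIps = new HashSet<>();

    // Record a node that the cleanup thread has killed.
    synchronized void markTerminated(String ip) {
        terminatedIps.add(ip);
    }

    // A new node was provisioned with this IP, so it must not be killed again.
    synchronized void markProvisioned(String ip) {
        terminatedIps.remove(ip);
    }

    // Snapshot of the addresses the next cleanup run may still try to kill.
    synchronized Set<String> killCandidates() {
        return Collections.unmodifiableSet(new HashSet<>(terminatedIps));
    }
}
```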




[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail

2016-12-12 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15741636#comment-15741636
 ] 

Szehon Ho commented on HIVE-15385:
--

Sorry for the late reply.  Glad it's figured out; thanks, guys, for taking care of it.

> Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., 
> false) causes queries to fail
> --
>
> Key: HIVE-15385
> URL: https://issues.apache.org/jira/browse/HIVE-15385
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch
>
>
> According to 
> https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive,
>  failure to inherit permissions should not cause queries to fail.
> It looks like this was the case until HIVE-13716, which added some code to 
> use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set 
> permissions instead of shelling out and running {{-chgrp -R ...}}.
> When shelling out, the return status of each command is ignored, so if there 
> are any failures when inheriting permissions, a warning is logged, but the 
> query still succeeds.
> However, when invoking the {{FileSystem}} API, any failures will be propagated 
> up to the caller, and the query will fail.
> This is problematic because {{setFullFileStatus}} shells out when the 
> {{recursive}} parameter is set to {{true}}, and when it is {{false}} it invokes 
> the {{FileSystem}} API. So the behavior is inconsistent depending on the 
> value of {{recursive}}.
> We should decide whether permission inheritance failures should fail queries, 
> and then ensure the code consistently follows that decision.
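
A minimal sketch of one of the two possible decisions (failures are logged, never propagated), so that the API path behaves like the shell path. The PermissionInheritance class and the FsCall interface are hypothetical stand-ins for calls such as fs.setOwner or fs.setPermission; this is not the code from the attached patches, and a warnings list stands in for a logger.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical wrapper; FsCall stands in for a FileSystem permission call.
final class PermissionInheritance {
    @FunctionalInterface
    interface FsCall {
        void run() throws IOException;
    }

    // Runs the call and reports success; a failure becomes a warning
    // instead of an exception, so the surrounding query can continue.
    static boolean tryInherit(FsCall call, List<String> warnings) {
        try {
            call.run();
            return true;
        } catch (IOException e) {
            warnings.add("Unable to inherit permissions: " + e.getMessage());
            return false;
        }
    }
}
```

If the opposite decision were taken (inheritance failures fail the query), the shell path would instead need to check each command's return status.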




[jira] [Commented] (HIVE-15330) Bump JClouds version to 2.0.0 on Hive/Ptest

2016-12-06 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725516#comment-15725516
 ] 

Szehon Ho commented on HIVE-15330:
--

+1, sounds good to me

> Bump JClouds version to 2.0.0 on Hive/Ptest
> ---
>
> Key: HIVE-15330
> URL: https://issues.apache.org/jira/browse/HIVE-15330
> Project: Hive
>  Issue Type: Task
>  Components: Hive, Testing Infrastructure
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-15330.1.patch
>
>
> NO PRECOMMIT TESTS
> JClouds 2.0.0 fixes several issues with Google Compute Engine API. 




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-06-06 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503320#comment-16503320
 ] 

Szehon Ho commented on HIVE-19767:
--

[~thejas] Sorry to bother you, but as you are the original author of the 
hiveconf support in HiveServer2, do you think this change makes sense, and is 
this the way you would go about it?  There seems to be a bit of legacy code that 
sets them via environment variables, which I did not want to touch.

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-06-11 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508239#comment-16508239
 ] 

Szehon Ho commented on HIVE-19767:
--

[~stakiar], [~vihangk1] would you guys mind taking a look at this patch?  
Thanks!

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Assigned] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-06-01 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-19767:



> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-06-01 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Attachment: HIVE-19767.patch

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-06-01 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Status: Patch Available  (was: Open)

Simple fix attempt, as this seems to be done in HiveCLI: 
[https://github.com/apache/hive/blob/master/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java#L727]

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-06-01 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Affects Version/s: 1.2.2
   3.0.0
   2.3.2

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Assigned] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2017-12-29 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-18347:


Assignee: Szehon Ho

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2017-12-29 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Attachment: HIVE-18347.1.patch

This patch adds a pluggable Hive Metastore URI resolver hook that is called from 
HiveMetastoreClient to resolve Metastore URIs, and implements one that we use in 
production.  That implementation connects to our Marathon-based Consul service 
to look up a particular Consul-registered service, and reads a consul-based 
scheme for hive.metastore.uris.

One can imagine other schemes, for example ZooKeeper-based Metastore 
registration; those could also be implemented via this plugin.
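
A rough sketch of the shape such a hook could take, with loud assumptions: the interface name MetastoreUriResolver, its method signature, and the consul://service-name URI form are hypothetical and are not taken from the patch, and an in-memory map stands in for the real Consul lookup.

```java
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical hook interface; the name and signature are illustrative only.
interface MetastoreUriResolver {
    List<URI> resolve(URI configured);
}

// Stand-in for a Consul-backed resolver: a static map replaces the real lookup.
final class ServiceRegistryResolver implements MetastoreUriResolver {
    private final Map<String, List<URI>> registry;

    ServiceRegistryResolver(Map<String, List<URI>> registry) {
        this.registry = registry;
    }

    @Override
    public List<URI> resolve(URI configured) {
        // URIs with any other scheme (for example thrift://) pass through unchanged.
        if (!"consul".equals(configured.getScheme())) {
            return Collections.singletonList(configured);
        }
        // The authority part is treated as the registered service name.
        List<URI> found = registry.get(configured.getAuthority());
        return found == null ? Collections.<URI>emptyList() : found;
    }
}
```

A ZooKeeper-based variant would implement the same interface and replace only the lookup.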

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-18347.1.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2017-12-29 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Attachment: HIVE-18347.2.patch

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2017-12-29 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Status: Patch Available  (was: Open)

This is my first patch in a while; I hope I did it correctly :)

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2017-12-29 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Component/s: Metastore

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2017-12-29 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Attachment: HIVE-18347.3.patch

Fix compilation of just the 'contrib' module by adding an explicit dependency 
on metastore.

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-01-15 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326348#comment-16326348
 ] 

Szehon Ho commented on HIVE-18347:
--

[~alangates] [~vihangk1] any thoughts on whether this is a useful contribution 
to Hive?  The Hive deployment in our organization (Criteo) uses the open-source 
tool Consul for service discovery and state of Hive Metastores, but I am not 
sure about the community guidelines on adding support for outside tools like 
this.  That said, this patch does allow plugging in other service discovery 
mechanisms.

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-12338) Add webui to HiveServer2

2018-01-15 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326339#comment-16326339
 ] 

Szehon Ho commented on HIVE-12338:
--

Hey, not yet, but I think it is pretty easy to do.  I created HIVE-13457 but 
have not had time to work on it yet.

> Add webui to HiveServer2
> 
>
> Key: HIVE-12338
> URL: https://issues.apache.org/jira/browse/HIVE-12338
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Major
> Attachments: HIVE-12338.1.patch, HIVE-12338.2.patch, 
> HIVE-12338.3.patch, HIVE-12338.4.patch, hs2-conf.png, hs2-logs.png, 
> hs2-metrics.png, hs2-webui.png
>
>
> A web ui for HiveServer2 can show some useful information such as:
>  
> 1. Sessions,
> 2. Queries that are executing on the HS2, their states, starting time, etc.




[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-01-18 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16330785#comment-16330785
 ] 

Szehon Ho commented on HIVE-18347:
--

After reading the discussion on HIVE-18449, this still looks like a valid 
approach, as it keeps the randomness of selection but allows a custom resolver 
to resolve the URI.  [~thejas] [~vihangk1]  Would this patch be OK as is, or is 
it better to have just the resolver hook, leaving the Consul implementation for 
our own repository?  (I'm OK with doing the latter.)
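
To make the "resolver hook plus random selection" idea concrete, here is a minimal sketch.  It is only an illustration under assumptions: the names UriResolver, FixedResolver and MetastoreUriPicker are hypothetical and are not the classes in the patch, and the fixed list stands in for whatever a Consul lookup would return.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical hook: expands one configured URI into concrete Metastore URIs.
interface UriResolver {
    List<URI> resolve(URI configured);
}

// Hypothetical resolver returning a fixed list; a real one might query Consul.
class FixedResolver implements UriResolver {
    private final List<URI> backends;

    FixedResolver(List<URI> backends) {
        this.backends = backends;
    }

    @Override
    public List<URI> resolve(URI configured) {
        return new ArrayList<>(backends);
    }
}

class MetastoreUriPicker {
    // Resolve every configured URI through the hook, then shuffle the result
    // so that the random selection stays outside the hook.
    static List<URI> candidates(List<URI> configured, UriResolver resolver, Random rnd) {
        List<URI> out = new ArrayList<>();
        for (URI u : configured) {
            out.addAll(resolver.resolve(u));
        }
        Collections.shuffle(out, rnd);
        return out;
    }
}
```

The point of the split is that the hook only answers "which Metastores exist right now", while the ordering policy remains in the caller.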

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-18449) Add configurable policy for choosing the HMS URI from hive.metastore.uris

2018-01-17 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328757#comment-16328757
 ] 

Szehon Ho commented on HIVE-18449:
--

Jumping in from HIVE-18347 on Vihang's pointer: I think it would be great to 
have a pluggable way to get hive.metastore.uris, as I tried to do there.

In our data center we run Metastores on Mesos, where they can be restarted 
automatically across nodes depending on load, and we use Consul to discover 
them dynamically.  Consul also allows us to do some tricks, like not returning 
a certain Metastore if it is loaded.  Currently the list of metastore URIs as 
read by HS2 is static, which would force us to restart every HS2 instance each 
time a Metastore is added, removed, or moved.

> Add configurable policy for choosing the HMS URI from hive.metastore.uris
> -
>
> Key: HIVE-18449
> URL: https://issues.apache.org/jira/browse/HIVE-18449
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Sahil Takiar
>Assignee: Janaki Lahorani
>Priority: Major
>
> HIVE-10815 added logic to randomly choose a HMS URI from 
> {{hive.metastore.uris}}. It would be nice if there was a configurable policy 
> that determined how a URI is chosen from this list - e.g. one option can be 
> to randomly pick a URI, another option can be to choose the first URI in the 
> list (which was the behavior prior to HIVE-10815).
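
A configurable policy of the kind described above could look roughly like the following sketch.  The class name UriSelection and the policy names RANDOM and SEQUENTIAL are assumptions made for illustration; they are not taken from any patch on this issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class UriSelection {
    // Hypothetical policy names: RANDOM shuffles, SEQUENTIAL keeps list order.
    enum Policy { RANDOM, SEQUENTIAL }

    // Split a comma-separated URI list and order it according to the policy.
    // The caller would then try the URIs in the returned order.
    static List<String> order(String uris, Policy policy, Random rnd) {
        List<String> list = new ArrayList<>();
        for (String s : uris.split(",")) {
            String t = s.trim();
            if (!t.isEmpty()) {
                list.add(t);
            }
        }
        if (policy == Policy.RANDOM) {
            Collections.shuffle(list, rnd);
        }
        return list;
    }
}
```

With SEQUENTIAL the first URI in the configured list is always tried first, which matches the pre-HIVE-10815 behavior mentioned in the description.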




[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-02-02 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350780#comment-16350780
 ] 

Szehon Ho commented on HIVE-18347:
--

Hi Vihang, thanks for the review!  Sorry about the late response.  About the 
comments: I was under the impression (at least from Thejas's comment about 
HIVE-18449) that we wanted to keep the selection policy as a separate knob 
under Hive's control, and not handled via this hook, which is what the review 
comment is suggesting?  I think HIVE-18449 already gives another knob for 
choosing a predefined selection mechanism, so I think we shouldn't incorporate 
all that into a default hook that can be overridden.  What do you think?

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-02-05 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Attachment: HIVE-18347.5.patch

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-02-05 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352752#comment-16352752
 ] 

Szehon Ho commented on HIVE-18347:
--

Thanks a lot.  Also I guess HIVE-18449 conflicts with this, so uploading 
another try to rebase it.

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-02-06 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354253#comment-16354253
 ] 

Szehon Ho commented on HIVE-18347:
--

[~vihangk1] would you like to take another look at my rebased patch?  Thanks 
in advance.

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM

2018-02-06 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354065#comment-16354065
 ] 

Szehon Ho commented on HIVE-18541:
--

Hi Oleksiy, thanks for the patch.  I made some review comments.

> Secure HS2 web UI with PAM
> --
>
> Key: HIVE-18541
> URL: https://issues.apache.org/jira/browse/HIVE-18541
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18541.1.patch
>
>
> Secure HS2 web UI with PAM. Add two new properties
>  * hive.server2.webui.use.pam
>  * Default value: false
>  * Description: If true, the HiveServer2 WebUI will be secured with PAM
>  * hive.server2.webui.pam.authenticator
>  * Default value: org.apache.hive.http.security.PamAuthenticator
>  * Description: Class for PAM authentication




[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM

2018-02-14 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364450#comment-16364450
 ] 

Szehon Ho commented on HIVE-18541:
--

Hi, I am sorry for the late reply.  I am mostly OK with the latest patch, 
although I would rather not allow a configuration where PAM is enabled but 
HTTPS is not, as it's not recommended (that is, exit rather than log a 
warning).  Would you have an issue with this?
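
A fail-fast check along those lines might look like this sketch.  The class WebUiSecurityCheck is hypothetical, a plain Map stands in for the real configuration object, and the SSL property name hive.server2.webui.use.ssl is an assumption on my part rather than something stated in this thread.

```java
import java.util.Map;

class WebUiSecurityCheck {
    // Reject the configuration outright when PAM is on but HTTPS is off,
    // instead of only logging a warning and continuing.
    static void validate(Map<String, String> conf) {
        boolean usePam = Boolean.parseBoolean(
            conf.getOrDefault("hive.server2.webui.use.pam", "false"));
        // Assumed property name for the web UI HTTPS switch.
        boolean useSsl = Boolean.parseBoolean(
            conf.getOrDefault("hive.server2.webui.use.ssl", "false"));
        if (usePam && !useSsl) {
            throw new IllegalArgumentException(
                "PAM for the web UI requires HTTPS to be enabled");
        }
    }
}
```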

As for the pluggable PAM authenticator, it adds complexity to Hive that I do 
not see as necessary.  It could be a core piece of security, so I do not see 
any need to make it pluggable beyond the version reviewed and maintained by 
the community.  Was there another motivation other than enabling the unit 
test?  (It seems you have now found a way to test without exposing this as 
configuration.) 

 

Thanks again for the patch

> Secure HS2 web UI with PAM
> --
>
> Key: HIVE-18541
> URL: https://issues.apache.org/jira/browse/HIVE-18541
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, 
> HIVE-18541.5.patch
>
>
> Secure HS2 web UI with PAM. Add two new properties
>  * hive.server2.webui.use.pam
>  * Default value: false
>  * Description: If true, the HiveServer2 WebUI will be secured with PAM
>  * hive.server2.webui.pam.authenticator
>  * Default value: org.apache.hive.http.security.PamAuthenticator
>  * Description: Class for PAM authentication




[jira] [Updated] (HIVE-18347) Allow pluggable dynamic lookup of Hive Metastores from HiveServer2

2018-02-07 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Summary: Allow pluggable dynamic lookup of Hive Metastores from HiveServer2 
 (was: Allow dynamic lookup of Hive Metastores via Consul)

> Allow pluggable dynamic lookup of Hive Metastores from HiveServer2
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Updated] (HIVE-18347) Allow pluggable dynamic lookup of Hive Metastores from HiveServer2

2018-02-07 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master.  Thanks [~vihangk1] for the review, and [~thejas] for the help!

Will add information about open-source Criteo-hosted Consul Resolver once it is 
available.

> Allow pluggable dynamic lookup of Hive Metastores from HiveServer2
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM

2018-02-15 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365392#comment-16365392
 ] 

Szehon Ho commented on HIVE-18541:
--

OK, sorry, it is hacky, but how about using the hive.in.test flag?  (It is not 
clean, but there is already some in-test logic in that class.)

> Secure HS2 web UI with PAM
> --
>
> Key: HIVE-18541
> URL: https://issues.apache.org/jira/browse/HIVE-18541
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, 
> HIVE-18541.5.patch
>
>
> Secure HS2 web UI with PAM. Add two new properties
>  * hive.server2.webui.use.pam
>  * Default value: false
>  * Description: If true, the HiveServer2 WebUI will be secured with PAM
>  * hive.server2.webui.pam.authenticator
>  * Default value: org.apache.hive.http.security.PamAuthenticator
>  * Description: Class for PAM authentication




[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM

2018-02-15 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365658#comment-16365658
 ] 

Szehon Ho commented on HIVE-18541:
--

+1

> Secure HS2 web UI with PAM
> --
>
> Key: HIVE-18541
> URL: https://issues.apache.org/jira/browse/HIVE-18541
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, 
> HIVE-18541.5.patch, HIVE-18541.6.patch
>
>
> Secure HS2 web UI with PAM. Add two new properties
>  * hive.server2.webui.use.pam
>  * Default value: false
>  * Description: If true, the HiveServer2 WebUI will be secured with PAM
>  * hive.server2.webui.pam.authenticator
>  * Default value: org.apache.hive.http.security.PamAuthenticator
>  * Description: Class for PAM authentication




[jira] [Commented] (HIVE-18746) add_months should validate the date first

2018-02-20 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370730#comment-16370730
 ] 

Szehon Ho commented on HIVE-18746:
--

Patch looks good but the related test seems to be failing, can you take a look?

> add_months should validate the date first
> -
>
> Key: HIVE-18746
> URL: https://issues.apache.org/jira/browse/HIVE-18746
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Subhasis Gorai
>Assignee: Kryvenko Igor
>Priority: Minor
> Attachments: HIVE-18746.patch
>
>
> hive (sbg_hvc_ods)> select add_months('2017-02-28', 1);
> OK
> _c0
> 2017-03-31
> Time taken: 0.107 seconds, Fetched: 1 row(s)
> hive (sbg_hvc_ods)> select add_months('2017-02-29', 1);
> OK
> _c0
> 2017-04-01
> Time taken: 0.084 seconds, Fetched: 1 row(s)
> hive (sbg_hvc_ods)>
>  
> '2017-02-29' is an invalid date.
>  
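
The validation step being asked for can be sketched with java.time, as below.  This is an illustration only: the class StrictAddMonths is hypothetical, returning null for invalid input is an assumption about the desired behavior, and plusMonths does not reproduce Hive's end-of-month handling shown in the first query.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

class StrictAddMonths {
    // "uuuu" (proleptic year) is used instead of "yyyy" because strict
    // resolution of year-of-era would also require an era field.
    private static final DateTimeFormatter STRICT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd")
            .withResolverStyle(ResolverStyle.STRICT);

    // Validate the date first; return null when it does not exist on the
    // calendar (for example 2017-02-29) instead of rolling it over.
    static String addMonths(String date, int months) {
        try {
            return LocalDate.parse(date, STRICT).plusMonths(months).toString();
        } catch (DateTimeParseException e) {
            return null;
        }
    }
}
```

A lenient parser is what turns '2017-02-29' into March 1st before the month is added; strict resolution rejects it up front.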




[jira] [Commented] (HIVE-17300) WebUI query plan graphs

2018-02-20 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370529#comment-16370529
 ] 

Szehon Ho commented on HIVE-17300:
--

Hello [~klcopp] and [~pvary], I just stumbled across this Jira and it looks 
like a really cool addition; I am sorry to have missed it.  I will add a link 
to the webui uber Jira to make it easier to find.  I would love to get it 
committed; would you want to rebase it? 

Also, do you know why we need a flag to configure whether to update MR 
stats?  Is there some kind of performance implication if we just did it all the 
time?

> WebUI query plan graphs
> ---
>
> Key: HIVE-17300
> URL: https://issues.apache.org/jira/browse/HIVE-17300
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-17300.3.patch, HIVE-17300.4.patch, 
> HIVE-17300.5.patch, HIVE-17300.patch, complete_success.png, 
> full_mapred_stats.png, graph_with_mapred_stats.png, last_stage_error.png, 
> last_stage_running.png, non_mapred_task_selected.png
>
>
> Hi all,
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below. 
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info. 
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> I’d love to hear any and all input from the community about this feature: do 
> you think it’s useful, and is there anything important I’m missing?
> Thanks,
> Karen Coppage
> Review request: https://reviews.apache.org/r/61663/
> Any input is welcome!




[jira] [Updated] (HIVE-17300) WebUI query plan graphs

2018-02-20 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-17300:
-
Issue Type: Sub-task  (was: Improvement)
Parent: HIVE-12338

> WebUI query plan graphs
> ---
>
> Key: HIVE-17300
> URL: https://issues.apache.org/jira/browse/HIVE-17300
> Project: Hive
>  Issue Type: Sub-task
>  Components: Web UI
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-17300.3.patch, HIVE-17300.4.patch, 
> HIVE-17300.5.patch, HIVE-17300.patch, complete_success.png, 
> full_mapred_stats.png, graph_with_mapred_stats.png, last_stage_error.png, 
> last_stage_running.png, non_mapred_task_selected.png
>
>
> Hi all,
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below. 
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info. 
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> I’d love to hear any and all input from the community about this feature: do 
> you think it’s useful, and is there anything important I’m missing?
> Thanks,
> Karen Coppage
> Review request: https://reviews.apache.org/r/61663/
> Any input is welcome!




[jira] [Updated] (HIVE-18541) Secure HS2 web UI with PAM

2018-02-20 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18541:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, thanks Oleksiy for the patch.

> Secure HS2 web UI with PAM
> --
>
> Key: HIVE-18541
> URL: https://issues.apache.org/jira/browse/HIVE-18541
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, 
> HIVE-18541.5.patch, HIVE-18541.6.patch, HIVE-18541.7.patch, HIVE-18541.8.patch
>
>
> Secure HS2 web UI with PAM. Add  property
>  * {{hive.server2.webui.use.pam}}
>  * Default value: {{false}}
>  * Description: If {{true}}, the HiveServer2 WebUI will be secured with PAM




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-07-30 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562162#comment-16562162
 ] 

Szehon Ho commented on HIVE-19767:
--

Hi Aihua, thanks for looking at this patch!  I can set it in the session, but 
to me it would be nice to set some permanent properties for the whole 
HiveServer2, not tied to a session.  As to your suggestion, I would like to 
start a standalone HiveServer2 and not a beeline with an embedded HiveServer2.  
In our use case, we have some custom listener plugins that take in some 
properties not listed in HiveConf.  What do you think?

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.
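
The behavior requested here amounts to accepting every key=value pair passed on the command line, whether or not the key is a known Hive property.  A minimal sketch, assuming a hypothetical HiveconfArgs helper and a plain map in place of the server configuration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HiveconfArgs {
    // Collect every "--hiveconf key=value" pair, without filtering on
    // whether the key is a property the server already knows about.
    static Map<String, String> parse(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if ("--hiveconf".equals(args[i])) {
                String kv = args[++i];
                int eq = kv.indexOf('=');
                if (eq > 0) {
                    props.put(kv.substring(0, eq), kv.substring(eq + 1));
                }
            }
        }
        return props;
    }
}
```

Applying the resulting map to the server-wide configuration, rather than to a single session, is what would make "set a;" return a=b as in the HiveCLI example.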




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-07-30 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562507#comment-16562507
 ] 

Szehon Ho commented on HIVE-19767:
--

yes will do, many thanks for having a look!

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Commented] (HIVE-20254) CheckNonCombinablePathCallable is buggy

2018-07-30 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562715#comment-16562715
 ] 

Szehon Ho commented on HIVE-20254:
--

Looks like it is resolved by HIVE-13968

> CheckNonCombinablePathCallable is buggy
> ---
>
> Key: HIVE-20254
> URL: https://issues.apache.org/jira/browse/HIVE-20254
> Project: Hive
>  Issue Type: Bug
>Reporter: Qinghui Xu
>Priority: Major
>
> CombineHiveInputFormat lets people avoid combining some of their inputs (by 
> implementing AvoidSplitCombination).
> We spotted a problem with this when our query tried to read a lot of 
> partitions (more than 100). When there are more than 100 input paths, the 
> combinability check is run in parallel:
>  * the input path array is divided into several chunks (each chunk with no 
> more than 100 paths)
>  * each chunk is submitted to a CheckNonCombinablePathCallable
>  * each CheckNonCombinablePathCallable returns a set of indices for the 
> paths that must not be combined
> The problem is that CheckNonCombinablePathCallable returns a set of relative 
> indices (the index inside the chunk) instead of absolute indices. The 
> returned indices are therefore always smaller than 100, so paths at position 
> 100 or beyond in the array are never taken into account when deciding which 
> inputs to avoid combining.
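
The index mix-up described above can be reproduced in a few lines. The sketch below is not Hive's code: `ChunkIndexSketch`, `checkChunk`, `checkAll`, and the `nonCombinable` predicate are hypothetical stand-ins, and the chunks are checked sequentially rather than through an executor to keep the example short. The only point is the difference between returning `i - start` (relative to the chunk) and returning `i` (the absolute position in the full path array).

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntPredicate;

/** Hypothetical sketch of the chunked check; not Hive's implementation. */
final class ChunkIndexSketch {
    static final int CHUNK_SIZE = 100;

    private ChunkIndexSketch() {}

    /** Checks paths in [start, end) and returns the indices that must not be combined. */
    static Set<Integer> checkChunk(int start, int end, IntPredicate nonCombinable,
                                   boolean useAbsoluteIndex) {
        Set<Integer> result = new TreeSet<>();
        for (int i = start; i < end; i++) {
            if (nonCombinable.test(i)) {
                // Relative variant: i - start is always smaller than CHUNK_SIZE.
                // Absolute variant: i identifies the path in the whole array.
                result.add(useAbsoluteIndex ? i : i - start);
            }
        }
        return result;
    }

    /** Splits [0, pathCount) into chunks and merges the per-chunk results. */
    static Set<Integer> checkAll(int pathCount, IntPredicate nonCombinable,
                                 boolean useAbsoluteIndex) {
        Set<Integer> merged = new TreeSet<>();
        for (int start = 0; start < pathCount; start += CHUNK_SIZE) {
            int end = Math.min(start + CHUNK_SIZE, pathCount);
            merged.addAll(checkChunk(start, end, nonCombinable, useAbsoluteIndex));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Pretend that only the paths at positions 5 and 250 must not be combined.
        IntPredicate nonCombinable = i -> i == 5 || i == 250;
        System.out.println("relative: " + checkAll(300, nonCombinable, false));
        System.out.println("absolute: " + checkAll(300, nonCombinable, true));
    }
}
```

In the relative variant, path 250 is reported as index 50, so the wrong path is flagged and the intended one is missed.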




[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-27 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560102#comment-16560102
 ] 

Szehon Ho commented on HIVE-20153:
--

Thanks, Aihua, for the fix.  Yes, I can test it. I am out of town at the moment, 
so I need to wait until I get back, and hope I can do it sometime next week.  If 
you don't want to wait, feel free to go ahead; I can comment with my findings 
afterwards.

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-20153.1.patch, Screen Shot 2018-07-12 at 6.41.28 
> PM.png
>
>
> While playing with Hive 2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side, where they worked 
> before in Hive 1. 
> In many queries, we have to double the mapper memory settings (in our 
> particular case, mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it harder to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.
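
One way to picture the overhead: if every aggregation buffer allocates a set for distinct tracking up front, a job with millions of buffers creates millions of empty sets even when nothing ever uses them. The sketch below contrasts that with allocating the set on first use. It is an illustration only and not necessarily the approach taken in the attached patch; the class names are hypothetical, and only the field name `uniqueObjects` comes from the description above.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; these are not Hive's aggregation buffer classes. */
final class LazyBufferSketch {

    private LazyBufferSketch() {}

    /** Allocates the set up front, whether or not distinct tracking is ever used. */
    static final class EagerCountBuffer {
        long count;
        final Set<Object> uniqueObjects = new HashSet<>();
    }

    /** Allocates the set only when the first distinct value arrives. */
    static final class LazyCountBuffer {
        long count;
        Set<Object> uniqueObjects; // stays null for a plain count()

        void add(Object value, boolean distinct) {
            if (value == null) {
                return; // count(expr) ignores nulls
            }
            if (distinct) {
                if (uniqueObjects == null) {
                    uniqueObjects = new HashSet<>();
                }
                if (!uniqueObjects.add(value)) {
                    return; // already seen, do not count it again
                }
            }
            count++;
        }
    }

    public static void main(String[] args) {
        LazyCountBuffer plain = new LazyCountBuffer();
        plain.add("x", false);
        plain.add("x", false);
        System.out.println(plain.count + ", set allocated: " + (plain.uniqueObjects != null));

        LazyCountBuffer distinct = new LazyCountBuffer();
        distinct.add("x", true);
        distinct.add("x", true);
        System.out.println(distinct.count + ", set allocated: " + (distinct.uniqueObjects != null));
    }
}
```

The trade-off is a null check on each distinct insert in exchange for not paying for an empty set per buffer.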




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-10 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Attachment: (was: HIVE-19767.3.patch)

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-10 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Attachment: HIVE-19767.3.patch

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-10 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577009#comment-16577009
 ] 

Szehon Ho commented on HIVE-19767:
--

OK, I wonder if this is what you had in mind?

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-10 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Attachment: HIVE-19767.3.patch

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Commented] (HIVE-20396) Test HS2 open_connection metrics

2018-08-16 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582434#comment-16582434
 ] 

Szehon Ho commented on HIVE-20396:
--

+1

> Test HS2 open_connection metrics
> 
>
> Key: HIVE-20396
> URL: https://issues.apache.org/jira/browse/HIVE-20396
> Project: Hive
>  Issue Type: Test
>  Components: HiveServer2
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-20396.patch
>
>
> HiveServer2 is emitting metrics _default.General.open_connections_ in both 
> binary and http mode. These metrics should be tested.




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-16 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582430#comment-16582430
 ] 

Szehon Ho commented on HIVE-19767:
--

Thanks for the review!  Yes, I guess it's more or less read-only in our use case; 
I could fix it in a later patch if it becomes an issue.

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, 
> HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-17 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks, Aihua, for the review!

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, 
> HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-13457) Create HS2 REST API endpoints for monitoring information

2018-08-17 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-13457:
-
Status: Patch Available  (was: Open)

> Create HS2 REST API endpoints for monitoring information
> 
>
> Key: HIVE-13457
> URL: https://issues.apache.org/jira/browse/HIVE-13457
> Project: Hive
>  Issue Type: Improvement
>Reporter: Szehon Ho
>Assignee: Pawel Szostek
>Priority: Major
> Attachments: HIVE-13457.patch
>
>
> Similar to what is exposed in the HS2 web UI in HIVE-12338, it would be nice 
> if other UIs, such as admin tools or Hue, could access and display this 
> information as well.  Hence, we will create some REST endpoints to expose it.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-13 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Status: Open  (was: Patch Available)

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.2, 3.0.0, 1.2.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, 
> HIVE-19767.4.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-13 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Attachment: HIVE-19767.4.patch

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, 
> HIVE-19767.4.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-13 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Status: Patch Available  (was: Open)

OK, I cleaned up the patch, as the setting on HS2's hiveconf instance is 
unnecessary now.  [~aihuaxu], is this the way you had in mind?  Thanks!

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.2, 3.0.0, 1.2.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, 
> HIVE-19767.4.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-10 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575934#comment-16575934
 ] 

Szehon Ho commented on HIVE-19767:
--

[~aihuaxu] do you mind taking another look?

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-09 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575110#comment-16575110
 ] 

Szehon Ho commented on HIVE-19767:
--

OK, I attached another patch removing the (now) redundant code.

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-09 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Attachment: HIVE-19767.2.patch

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-13 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-19767:
-
Attachment: HIVE-19767.5.patch

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, 
> HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-08-13 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578589#comment-16578589
 ] 

Szehon Ho commented on HIVE-19767:
--

Very minor fix for findBugs.

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, 
> HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (like mapred properties 
> or spark properties to control underlying execution engine, or custom 
> properties understood by custom listeners)
> It is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient when testing HS2 with 
> different configuration to be able to use --hiveconf to change on the fly.




[jira] [Commented] (HIVE-13457) Create HS2 REST API endpoints for monitoring information

2018-08-28 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595153#comment-16595153
 ] 

Szehon Ho commented on HIVE-13457:
--

Actually, can you fix the checkstyle and findbugs warnings?

> Create HS2 REST API endpoints for monitoring information
> 
>
> Key: HIVE-13457
> URL: https://issues.apache.org/jira/browse/HIVE-13457
> Project: Hive
>  Issue Type: Improvement
>Reporter: Szehon Ho
>Assignee: Pawel Szostek
>Priority: Major
> Attachments: HIVE-13457.3.patch, HIVE-13457.4.patch, 
> HIVE-13457.5.patch, HIVE-13457.patch, HIVE-13457.patch
>
>
> Similar to what is exposed in the HS2 web UI in HIVE-12338, it would be nice 
> if other UIs, such as admin tools or Hue, could access and display this 
> information as well.  Hence, we will create some REST endpoints to expose it.




[jira] [Commented] (HIVE-13457) Create HS2 REST API endpoints for monitoring information

2018-08-28 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595141#comment-16595141
 ] 

Szehon Ho commented on HIVE-13457:
--

Nice +1

> Create HS2 REST API endpoints for monitoring information
> 
>
> Key: HIVE-13457
> URL: https://issues.apache.org/jira/browse/HIVE-13457
> Project: Hive
>  Issue Type: Improvement
>Reporter: Szehon Ho
>Assignee: Pawel Szostek
>Priority: Major
> Attachments: HIVE-13457.3.patch, HIVE-13457.4.patch, 
> HIVE-13457.5.patch, HIVE-13457.patch, HIVE-13457.patch
>
>
> Similar to what is exposed in the HS2 web UI in HIVE-12338, it would be nice 
> if other UIs, such as admin tools or Hue, could access and display this 
> information as well.  Hence, we will create some REST endpoints to expose it.




[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-13 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542758#comment-16542758
 ] 

Szehon Ho commented on HIVE-20153:
--

Hello Aihua, nice to see you too, and thanks for looking at it! 

Yes, in fact they are all hashmaps with 0 items.

I can't get jxray to work on Mac, but I shared the heap dump on my Drive. Does 
that work?  

[https://drive.google.com/open?id=1nKe43ybfgEEe0yQvtsyQPVyxghGa5X2A]

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Assignee: Aihua Xu
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive 2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side, where they worked 
> before in Hive 1. 
> In many queries, we have to double the mapper memory settings (in our 
> particular case, mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it harder to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20153:
-
Description: 
While playing with Hive 2, we noticed that queries with a lot of count() and 
sum() aggregations run out of memory on the Hadoop side much faster than in 
Hive 1.  In many queries, we have to double the memory.

Taking a heap dump, we see that one of the main culprits is the field 
'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
support window functions.

  was:While playing with Hive 2, we noticed that queries with a lot of count() 
and sum() aggregations run out of memory on the Hadoop side much faster than in 
Hive 1.  Taking a heap dump, we see that one of the main culprits is the field 
'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
support window functions.


> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
>
> While playing with Hive 2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side much faster than in 
> Hive 1.  In many queries, we have to double the memory.
>  
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20153:
-
Attachment: Screen Shot 2018-07-12 at 6.41.28 PM.png

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side much faster than in 
> Hive1. In many queries, we have to double the memory.
>  
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541929#comment-16541929
 ] 

Szehon Ho commented on HIVE-20153:
--

[~aihuaxu] do you think there is some way to improve this?  (I haven't yet 
looked at this code closely enough to understand it deeply.)  It seems to 
consume memory regardless of whether it's used in the window function or not.

The query is something like (generalizing the table):

select count(distinct), count(), count(), count(), min(), min(), max(), max(), 
min(), max() from table group by field;

Also, I attach for reference the heap dump of a mapper that was killed with an 
OOM: there are 3 million GenericUDAFCountEvaluator instances, each with a 
hashmap. I also don't know whether that is weird or not.

 

 

!Screen Shot 2018-07-12 at 6.41.28 PM.png!

 

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side much faster than in 
> Hive1. In many queries, we have to double the memory.
>  
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Comment Edited] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541929#comment-16541929
 ] 

Szehon Ho edited comment on HIVE-20153 at 7/12/18 4:49 PM:
---

[~aihuaxu] do you think there is some way to improve this?  (I haven't yet 
looked at this code closely enough to understand it deeply.)  It seems to 
consume memory whether it's used in the window function or not.

The query is something like (generalizing the table):

select count(distinct), count(), count(), count(), min(), min(), max(), max(), 
min(), max() from table group by field;

Also, I attach for reference the heap dump of a mapper that was killed with an 
OOM: there are 3 million GenericUDAFCountEvaluator instances, each with a 
hashset of uniqueObjects.

 

 

!Screen Shot 2018-07-12 at 6.41.28 PM.png!

 


was (Author: szehon):
[~aihuaxu] do you think there is some way to improve this?  (I haven't yet 
looked at this code closely enough to understand it deeply.)  It seems to 
consume memory regardless of whether it's used in the window function or not.

The query is something like (generalizing the table):

select count(distinct), count(), count(), count(), min(), min(), max(), max(), 
min(), max() from table group by field;

Also, I attach for reference the heap dump of a mapper that was killed with an 
OOM: there are 3 million GenericUDAFCountEvaluator instances, each with a 
hashmap. I also don't know whether that is weird or not.

 

 

!Screen Shot 2018-07-12 at 6.41.28 PM.png!

 

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side much faster than in 
> Hive1. In many queries, we have to double the memory.
>  
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Comment Edited] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541929#comment-16541929
 ] 

Szehon Ho edited comment on HIVE-20153 at 7/12/18 4:55 PM:
---

[~aihuaxu] do you think there is some way to improve this?  (I haven't yet 
looked at this code closely enough to understand it deeply.)  It seems to 
consume memory whether it's used in the window function or not.

The query is something like (generalizing the table):

select count(distinct), count(), count(), count(), min(), min(), max(), max(), 
min(), max() from table group by field;

Also, I attach for reference the heap dump of a mapper that was killed with an 
OOM: there are 3 million GenericUDAFCountEvaluator instances, each with a 
'uniqueObjects' hashSet (each hashSet in turn containing a hashMap).
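
The allocation pattern described above can be sketched in a few lines. This is 
only an illustration, not Hive's actual GenericUDAFCount code: the class 
CountBufferSketch and its methods are hypothetical names. It contrasts a 
per-group buffer that always allocates a 'uniqueObjects' set with one that 
allocates it only when distinct values must be tracked, which is one possible 
direction rather than a confirmed fix.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a per-group aggregation buffer; not Hive's class.
class CountBufferSketch {
    long count;
    Set<Object> uniqueObjects; // only needed when de-duplicating values

    // Eager variant: every buffer pays for a HashSet (backed by a HashMap).
    static CountBufferSketch eager() {
        CountBufferSketch b = new CountBufferSketch();
        b.uniqueObjects = new HashSet<>();
        return b;
    }

    // Lazy variant: the set stays null until a distinct value must be tracked.
    static CountBufferSketch lazy() {
        return new CountBufferSketch();
    }

    // Counts non-null values; in distinct mode a repeated value is ignored.
    void add(Object value, boolean distinct) {
        if (value == null) {
            return;
        }
        if (distinct) {
            if (uniqueObjects == null) {
                uniqueObjects = new HashSet<>();
            }
            if (!uniqueObjects.add(value)) {
                return; // already counted
            }
        }
        count++;
    }
}
```

With millions of group-by keys, the eager variant creates one HashSet per 
buffer even for a plain count(), which matches the shape of the heap dump; the 
lazy variant leaves the field null in that case.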

 

 

!Screen Shot 2018-07-12 at 6.41.28 PM.png!

 


was (Author: szehon):
[~aihuaxu] do you think there is some way to improve this?  (I haven't yet 
looked at this code closely enough to understand it deeply.)  It seems to 
consume memory whether it's used in the window function or not.

The query is something like (generalizing the table):

select count(distinct), count(), count(), count(), min(), min(), max(), max(), 
min(), max() from table group by field;

Also, I attach for reference the heap dump of a mapper that was killed with an 
OOM: there are 3 million GenericUDAFCountEvaluator instances, each with a 
hashset of uniqueObjects.

 

 

!Screen Shot 2018-07-12 at 6.41.28 PM.png!

 

> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side much faster than in 
> Hive1. In many queries, we have to double the memory.
>  
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20153:
-
Description: 
While playing with Hive2, we noticed that queries with a lot of count() and 
sum() aggregations run out of memory on the Hadoop side much faster than in 
Hive1. In many queries, we have to double the memory (in our particular case, 
mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M).

Taking a heap dump, we see that one of the main culprits is the field 
'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
support window functions.

  was:
While playing with Hive2, we noticed that queries with a lot of count() and 
sum() aggregations run out of memory on the Hadoop side much faster than in 
Hive1. In many queries, we have to double the memory.

Taking a heap dump, we see that one of the main culprits is the field 
'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
support window functions.


> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side much faster than in 
> Hive1. In many queries, we have to double the memory (in our particular case, 
> mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M).
>  
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

2018-07-12 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20153:
-
Description: 
While playing with Hive2, we noticed that queries with a lot of count() and 
sum() aggregations run out of memory on the Hadoop side where they worked 
before in Hive1. 

In many queries, we have to double the mapper memory settings (in our 
particular case, mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
makes it not so easy to upgrade to Hive 2.

Taking a heap dump, we see that one of the main culprits is the field 
'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
support window functions.

  was:
While playing with Hive2, we noticed that queries with a lot of count() and 
sum() aggregations run out of memory on the Hadoop side much faster than in 
Hive1. In many queries, we have to double the memory (in our particular case, 
mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M).

Taking a heap dump, we see that one of the main culprits is the field 
'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
support window functions.


> Count and Sum UDF consume more memory in Hive 2+
> 
>
> Key: HIVE-20153
> URL: https://issues.apache.org/jira/browse/HIVE-20153
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 2.3.2
>Reporter: Szehon Ho
>Priority: Major
> Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While playing with Hive2, we noticed that queries with a lot of count() and 
> sum() aggregations run out of memory on the Hadoop side where they worked 
> before in Hive1. 
> In many queries, we have to double the mapper memory settings (in our 
> particular case, mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which 
> makes it not so easy to upgrade to Hive 2.
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.




[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties

2018-07-09 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536990#comment-16536990
 ] 

Szehon Ho commented on HIVE-19767:
--

[~thejas] any chance to take a look at this patch?

> HiveServer2 should take hiveconf for non Hive properties
> 
>
> Key: HIVE-19767
> URL: https://issues.apache.org/jira/browse/HIVE-19767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.2, 3.0.0, 2.3.2
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-19767.patch
>
>
> The -hiveconf command line option works in HiveServer2 with properties in 
> HiveConf.java, but not so well with other properties (such as mapred or 
> Spark properties that control the underlying execution engine, or custom 
> properties understood by custom listeners).
> This is inconsistent with HiveCLI.
> HiveCLI behavior:
> {noformat}
> ./bin/hive --hiveconf a=b
> hive> set a;
> a=b {noformat}
> HiveServer2 behavior:
> {noformat}
> ./bin/hiveserver2 --hiveconf a=b
> beeline> set a;
> +-----------------+
> |       set       |
> +-----------------+
> | a is undefined  |
> +-----------------+{noformat}
> Although it is possible to set up hive-site.xml or even mapred-site.xml to 
> fill in the relevant properties, it is more convenient, when testing HS2 with 
> different configurations, to be able to use --hiveconf to change them on the 
> fly.
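
A minimal sketch of the requested behavior, assuming the server simply 
collects every --hiveconf key=value pair without checking the key against the 
properties known to HiveConf.java. The class HiveconfArgsSketch and its parse 
method are hypothetical and are not taken from the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; illustrates the idea only, not HiveServer2's parser.
class HiveconfArgsSketch {
    // Collects every "--hiveconf key=value" pair, whether or not the key is a
    // property that Hive itself defines.
    static Map<String, String> parse(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if (!"--hiveconf".equals(args[i])) {
                continue;
            }
            String pair = args[++i];
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed entries in this sketch
            }
            overrides.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return overrides;
    }
}
```

Under this assumption, a custom key such as a=b would be kept alongside engine 
settings and could then be applied to the session configuration.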




[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-01-17 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328727#comment-16328727
 ] 

Szehon Ho commented on HIVE-18347:
--

Thanks for the response guys.  What Vihang says makes sense; I am fine with 
not bundling the Consul implementation in Hive and putting it in our own open 
source repository instead.  It would be great if HIVE-18449 can help this use 
case by offering a generic hook mechanism for pluggable load-balancing between 
HMS instances!
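
To make the idea of a pluggable hook concrete, here is a minimal sketch. The 
interface MetastoreSelectorSketch and the round-robin implementation are 
hypothetical names chosen for illustration; they are not the API proposed in 
HIVE-18449 or in the attached patches, and the thrift:// hosts in the usage 
note are made up.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical hook contract: given the currently known HMS instances,
// choose the one to use for the next request.
interface MetastoreSelectorSketch {
    URI pick(List<URI> candidates);
}

// One possible policy: plain round-robin over the candidates.
class RoundRobinSelectorSketch implements MetastoreSelectorSketch {
    private final AtomicInteger next = new AtomicInteger();

    @Override
    public URI pick(List<URI> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no metastore instances available");
        }
        // floorMod keeps the index valid even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), candidates.size());
        return candidates.get(i);
    }
}
```

A service-discovery backed implementation (for example one that asks Consul 
for the live instances) could live outside Hive and plug in behind the same 
interface, for instance cycling between thrift://hms-1:9083 and 
thrift://hms-2:9083.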

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-01-22 Thread Szehon Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-18347:
-
Attachment: HIVE-18347.4.patch

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul

2018-01-22 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334604#comment-16334604
 ] 

Szehon Ho commented on HIVE-18347:
--

Thanks for the message, [~thejas]!  Included a review request for a new patch 
that just has the hook.

> Allow dynamic lookup of Hive Metastores via Consul
> --
>
> Key: HIVE-18347
> URL: https://issues.apache.org/jira/browse/HIVE-18347
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, 
> HIVE-18347.3.patch, HIVE-18347.4.patch
>
>
> In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos 
> as dynamic services for scalability and flexibility.
> In this architecture, we would like to allow HiveServer2 to dynamically load 
> balance between Metastores (which may be scaled up and down or to different 
> nodes) for different requests.




[jira] [Commented] (HIVE-17300) WebUI query plan graphs

2018-10-08 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641598#comment-16641598
 ] 

Szehon Ho commented on HIVE-17300:
--

Sorry guys, I was on vacation :).  Really glad to see this patch in Hive; it is 
a lot of effort and a great contribution.

> WebUI query plan graphs
> ---
>
> Key: HIVE-17300
> URL: https://issues.apache.org/jira/browse/HIVE-17300
> Project: Hive
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, features, patch
> Fix For: 4.0.0
>
> Attachments: HIVE-17300.10.patch, HIVE-17300.10.patch, 
> HIVE-17300.10.patch, HIVE-17300.3.patch, HIVE-17300.4.patch, 
> HIVE-17300.5.patch, HIVE-17300.6.patch, HIVE-17300.7.patch, 
> HIVE-17300.7.patch, HIVE-17300.8.patch, HIVE-17300.8.patch, 
> HIVE-17300.8.patch, HIVE-17300.8.patch, HIVE-17300.9.patch, HIVE-17300.patch, 
> complete_success.png, full_mapred_stats.png, graph_with_mapred_stats.png, 
> last_stage_error.png, last_stage_running.png, non_mapred_task_selected.png
>
>
> Hi all,
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below. 
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info. 
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> I’d love to hear any and all input from the community about this feature: do 
> you think it’s useful, and is there anything important I’m missing?
> Thanks,
> Karen Coppage
> Review request: https://reviews.apache.org/r/61663/
> Any input is welcome!




[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2018-10-24 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20789:
-
Attachment: HIVE-20789.2.patch

> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
> URL: https://issues.apache.org/jira/browse/HIVE-20789
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-20789.2.patch, HIVE-20789.patch
>
>
> We have had a scenario in which health checks sending 0 bytes to HiveServer2 
> sockets would DDoS HiveServer2: if for some reason they hang or otherwise 
> don't send a TCP FIN, all HiveServer2 Thrift thread-pool threads end up 
> blocked reading the socket.
> This is the stack (we are running an older version of Hive here):
> {noformat}
> "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> - locked <23781b74> (a java.io.BufferedInputStream)
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
> at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
> at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
> at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
> at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Eventually HiveServer2 has no more free threads left.
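
The failure mode and the effect of a read timeout can be reproduced with plain 
java.net sockets. This is a minimal sketch under that assumption: it does not 
use Thrift or HiveServer2 classes, and ReadTimeoutSketch and its method are 
hypothetical names. A client connects and sends nothing; the server side sets 
SO_TIMEOUT so the blocked read is abandoned instead of holding a thread 
indefinitely.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; stands in for a server worker reading a request.
class ReadTimeoutSketch {
    // Returns true if reading from a silent client gives up after timeoutMillis.
    static boolean silentClientTimesOut(int timeoutMillis) {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient =
                 new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Without this call, read() below would block indefinitely.
            accepted.setSoTimeout(timeoutMillis);
            try {
                accepted.getInputStream().read();
                return false; // the client sent data or closed the connection
            } catch (SocketTimeoutException e) {
                return true; // the worker is released instead of blocking
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling ReadTimeoutSketch.silentClientTimesOut(200) exercises the case from 
the stack above: the read sits in SocketInputStream.read until the timeout 
fires, after which the thread can close the connection and go back to the 
pool.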




[jira] [Commented] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2018-10-24 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662256#comment-16662256
 ] 

Szehon Ho commented on HIVE-20789:
--

Thanks a lot for taking a look!  Sorry I made a mistake as I was not used to 
the confs being split :).  I left just the one on MetastoreConf, can you see if 
that is right?

> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
> URL: https://issues.apache.org/jira/browse/HIVE-20789
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-20789.2.patch, HIVE-20789.patch
>
>
> We have had a scenario in which health checks sending 0 bytes to HiveServer2 
> sockets would DDoS HiveServer2: if for some reason they hang or otherwise 
> don't send a TCP FIN, all HiveServer2 Thrift thread-pool threads end up 
> blocked reading the socket.
> This is the stack (we are running an older version of Hive here):
> {noformat}
> "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> - locked <23781b74> (a java.io.BufferedInputStream)
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
> at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
> at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
> at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
> at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Eventually HiveServer2 has no more free threads left.




[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-23 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20786:
-
Status: Patch Available  (was: Open)

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.patch, hive_build_error.log
>
>
> When executing
> {code}
> mvn clean install -DskipTests
> {code}
> Build Failed:
> {code}
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Hive Storage API 2.7.0-SNAPSHOT  SUCCESS [  5.299 s]
> [INFO] Hive 4.0.0-SNAPSHOT  SUCCESS [  0.750 s]
> [INFO] Hive Classifications ... SUCCESS [  1.057 s]
> [INFO] Hive Shims Common .. SUCCESS [  3.882 s]
> [INFO] Hive Shims 0.23  SUCCESS [  5.020 s]
> [INFO] Hive Shims Scheduler ... SUCCESS [  2.587 s]
> [INFO] Hive Shims . SUCCESS [  2.038 s]
> [INFO] Hive Common  SUCCESS [  6.921 s]
> [INFO] Hive Service RPC ... SUCCESS [  3.503 s]
> [INFO] Hive Serde . SUCCESS [  6.322 s]
> [INFO] Hive Standalone Metastore .. FAILURE [  0.557 s]
> [INFO] Hive Standalone Metastore Common Code .. SKIPPED
> [INFO] Hive Metastore . SKIPPED
> [INFO] Hive Vector-Code-Gen Utilities . SKIPPED
> [INFO] Hive Llap Common ... SKIPPED
> [INFO] Hive Llap Client ... SKIPPED
> [INFO] Hive Llap Tez .. SKIPPED
> [INFO] Hive Spark Remote Client ... SKIPPED
> [INFO] Hive Metastore Server .. SKIPPED
> [INFO] Hive Query Language  SKIPPED
> [INFO] Hive Llap Server ... SKIPPED
> [INFO] Hive Service ... SKIPPED
> [INFO] Hive Accumulo Handler .. SKIPPED
> [INFO] Hive JDBC .. SKIPPED
> [INFO] Hive Beeline ... SKIPPED
> [INFO] Hive CLI ... SKIPPED
> [INFO] Hive Contrib ... SKIPPED
> [INFO] Hive Druid Handler . SKIPPED
> [INFO] Hive HBase Handler . SKIPPED
> [INFO] Hive JDBC Handler .. SKIPPED
> [INFO] Hive HCatalog .. SKIPPED
> [INFO] Hive HCatalog Core . SKIPPED
> [INFO] Hive HCatalog Pig Adapter .. SKIPPED
> [INFO] Hive HCatalog Server Extensions  SKIPPED
> [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED
> [INFO] Hive HCatalog Webhcat .. SKIPPED
> [INFO] Hive HCatalog Streaming  SKIPPED
> [INFO] Hive HPL/SQL ... SKIPPED
> [INFO] Hive Streaming . SKIPPED
> [INFO] Hive Llap External Client .. SKIPPED
> [INFO] Hive Shims Aggregator .. SKIPPED
> [INFO] Hive Kryo Registrator .. SKIPPED
> [INFO] Hive TestUtils . SKIPPED
> [INFO] Hive Kafka Storage Handler . SKIPPED
> [INFO] Hive Packaging . SKIPPED
> [INFO] Hive Metastore Tools ... SKIPPED
> [INFO]

[jira] [Assigned] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-23 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-20786:


Assignee: Szehon Ho

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.patch, hive_build_error.log
>
>
> When executing
> {code}
> mvn clean install -DskipTests
> {code}
> Build Failed:
> {code}
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Hive Storage API 2.7.0-SNAPSHOT  SUCCESS [  5.299 s]
> [INFO] Hive 4.0.0-SNAPSHOT  SUCCESS [  0.750 s]
> [INFO] Hive Classifications ... SUCCESS [  1.057 s]
> [INFO] Hive Shims Common .. SUCCESS [  3.882 s]
> [INFO] Hive Shims 0.23  SUCCESS [  5.020 s]
> [INFO] Hive Shims Scheduler ... SUCCESS [  2.587 s]
> [INFO] Hive Shims . SUCCESS [  2.038 s]
> [INFO] Hive Common  SUCCESS [  6.921 s]
> [INFO] Hive Service RPC ... SUCCESS [  3.503 s]
> [INFO] Hive Serde . SUCCESS [  6.322 s]
> [INFO] Hive Standalone Metastore .. FAILURE [  0.557 s]
> [INFO] Hive Standalone Metastore Common Code .. SKIPPED
> [INFO] Hive Metastore . SKIPPED
> [INFO] Hive Vector-Code-Gen Utilities . SKIPPED
> [INFO] Hive Llap Common ... SKIPPED
> [INFO] Hive Llap Client ... SKIPPED
> [INFO] Hive Llap Tez .. SKIPPED
> [INFO] Hive Spark Remote Client ... SKIPPED
> [INFO] Hive Metastore Server .. SKIPPED
> [INFO] Hive Query Language  SKIPPED
> [INFO] Hive Llap Server ... SKIPPED
> [INFO] Hive Service ... SKIPPED
> [INFO] Hive Accumulo Handler .. SKIPPED
> [INFO] Hive JDBC .. SKIPPED
> [INFO] Hive Beeline ... SKIPPED
> [INFO] Hive CLI ... SKIPPED
> [INFO] Hive Contrib ... SKIPPED
> [INFO] Hive Druid Handler . SKIPPED
> [INFO] Hive HBase Handler . SKIPPED
> [INFO] Hive JDBC Handler .. SKIPPED
> [INFO] Hive HCatalog .. SKIPPED
> [INFO] Hive HCatalog Core . SKIPPED
> [INFO] Hive HCatalog Pig Adapter .. SKIPPED
> [INFO] Hive HCatalog Server Extensions  SKIPPED
> [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED
> [INFO] Hive HCatalog Webhcat .. SKIPPED
> [INFO] Hive HCatalog Streaming  SKIPPED
> [INFO] Hive HPL/SQL ... SKIPPED
> [INFO] Hive Streaming . SKIPPED
> [INFO] Hive Llap External Client .. SKIPPED
> [INFO] Hive Shims Aggregator .. SKIPPED
> [INFO] Hive Kryo Registrator .. SKIPPED
> [INFO] Hive TestUtils . SKIPPED
> [INFO] Hive Kafka Storage Handler . SKIPPED
> [INFO] Hive Packaging . SKIPPED
> [INFO] Hive Metastore Tools ... SKIPPED
> [INFO] Hive

[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-23 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20786:
-
Attachment: HIVE-20786.patch

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.patch, hive_build_error.log
>
>
> When executing
> {code}
> mvn clean install -DskipTests
> {code}
> Build Failed:
> {code}
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Hive Storage API 2.7.0-SNAPSHOT  SUCCESS [  5.299 s]
> [INFO] Hive 4.0.0-SNAPSHOT  SUCCESS [  0.750 s]
> [INFO] Hive Classifications ... SUCCESS [  1.057 s]
> [INFO] Hive Shims Common .. SUCCESS [  3.882 s]
> [INFO] Hive Shims 0.23  SUCCESS [  5.020 s]
> [INFO] Hive Shims Scheduler ... SUCCESS [  2.587 s]
> [INFO] Hive Shims . SUCCESS [  2.038 s]
> [INFO] Hive Common  SUCCESS [  6.921 s]
> [INFO] Hive Service RPC ... SUCCESS [  3.503 s]
> [INFO] Hive Serde . SUCCESS [  6.322 s]
> [INFO] Hive Standalone Metastore .. FAILURE [  0.557 s]
> [INFO] Hive Standalone Metastore Common Code .. SKIPPED
> [INFO] Hive Metastore . SKIPPED
> [INFO] Hive Vector-Code-Gen Utilities . SKIPPED
> [INFO] Hive Llap Common ... SKIPPED
> [INFO] Hive Llap Client ... SKIPPED
> [INFO] Hive Llap Tez .. SKIPPED
> [INFO] Hive Spark Remote Client ... SKIPPED
> [INFO] Hive Metastore Server .. SKIPPED
> [INFO] Hive Query Language  SKIPPED
> [INFO] Hive Llap Server ... SKIPPED
> [INFO] Hive Service ... SKIPPED
> [INFO] Hive Accumulo Handler .. SKIPPED
> [INFO] Hive JDBC .. SKIPPED
> [INFO] Hive Beeline ... SKIPPED
> [INFO] Hive CLI ... SKIPPED
> [INFO] Hive Contrib ... SKIPPED
> [INFO] Hive Druid Handler . SKIPPED
> [INFO] Hive HBase Handler . SKIPPED
> [INFO] Hive JDBC Handler .. SKIPPED
> [INFO] Hive HCatalog .. SKIPPED
> [INFO] Hive HCatalog Core . SKIPPED
> [INFO] Hive HCatalog Pig Adapter .. SKIPPED
> [INFO] Hive HCatalog Server Extensions  SKIPPED
> [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED
> [INFO] Hive HCatalog Webhcat .. SKIPPED
> [INFO] Hive HCatalog Streaming  SKIPPED
> [INFO] Hive HPL/SQL ... SKIPPED
> [INFO] Hive Streaming . SKIPPED
> [INFO] Hive Llap External Client .. SKIPPED
> [INFO] Hive Shims Aggregator .. SKIPPED
> [INFO] Hive Kryo Registrator .. SKIPPED
> [INFO] Hive TestUtils . SKIPPED
> [INFO] Hive Kafka Storage Handler . SKIPPED
> [INFO] Hive Packaging . SKIPPED
> [INFO] Hive Metastore Tools ... SKIPPED
> [INFO] Hive Metastore Tools common libraries

[jira] [Commented] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-23 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660999#comment-16660999
 ] 

Szehon Ho commented on HIVE-20786:
--

Yes, I've been hitting this error for a while now.  Uploading a patch that 
seems to solve it for me, though I wasn't following the earlier discussion, 
so I'm not sure why it was deliberately set to gnu before.
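
For context on the "is too big" wording in the issue title, here is a minimal 
sketch of where a limit like that can come from. It assumes the failure stems 
from the classic ustar tar header, which stores uid and gid as 7 octal digits. 
The class name, method names, and the sample gid are hypothetical and are not 
taken from the patch or from the build log.

```java
public class TarFieldLimit {
    // Assumption: classic ustar headers keep uid/gid in 7 octal digits.
    static final int OCTAL_DIGITS = 7;

    /** Largest value that fits in the given number of octal digits. */
    static long maxOctalValue(int digits) {
        // Each octal digit carries 3 bits.
        return (1L << (3 * digits)) - 1;
    }

    /** True if the id can be written into a 7-digit octal header field. */
    static boolean fitsInUstarField(long id) {
        return id >= 0 && id <= maxOctalValue(OCTAL_DIGITS);
    }

    public static void main(String[] args) {
        long hypotheticalGid = 500_000_000L; // made-up large gid for illustration
        System.out.println("limit=" + maxOctalValue(OCTAL_DIGITS));
        System.out.println("fits(20)=" + fitsInUstarField(20));
        System.out.println("fits(" + hypotheticalGid + ")=" + fitsInUstarField(hypotheticalGid));
    }
}
```

If that assumption holds, any archive mode that sticks to the fixed-width 
header field has to reject such ids, while a mode that can carry larger 
numbers does not hit the limit.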

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Priority: Major
>  Labels: maven
> Attachments: hive_build_error.log
>
>

[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2018-10-23 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20789:
-
Status: Patch Available  (was: Open)

We noticed that the TSocket instances used by the Thrift thread pool are 
purposely created with a clientTimeout of 0 (that is, no read timeout).  It 
makes sense to make this configurable to prevent DDOS.
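
As a rough illustration of what a non-zero client timeout buys, the sketch 
below uses only plain java.net sockets, not Thrift and not the attached patch. 
In the JDK a socket timeout of 0 means a read blocks indefinitely; a positive 
value makes the read throw SocketTimeoutException, so the worker thread can 
give up on a silent client. The class name, method names, and the 200 ms value 
are assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {

    /**
     * Accepts one connection and tries to read a single byte from it.
     * Returns "timeout" if the client stayed silent past timeoutMillis,
     * "eof" if the client closed the connection, otherwise "data".
     */
    static String readOnce(ServerSocket server, int timeoutMillis) throws IOException {
        try (Socket client = server.accept()) {
            client.setSoTimeout(timeoutMillis); // 0 would mean "wait forever"
            InputStream in = client.getInputStream();
            try {
                int b = in.read();
                return b < 0 ? "eof" : "data";
            } catch (SocketTimeoutException e) {
                return "timeout";
            }
        }
    }

    /** Simulates a health check that connects, sends nothing, and keeps the socket open. */
    static String probeWithSilentClient(int timeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket silent = new Socket(loopback, server.getLocalPort())) {
            return readOnce(server, timeoutMillis);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(probeWithSilentClient(200));
    }
}
```

With the timeout set, the read on the silent connection ends after roughly 
the configured interval instead of holding the thread for as long as the 
client keeps the socket open.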

> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
> URL: https://issues.apache.org/jira/browse/HIVE-20789
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-20789.patch
>
>
> We have had a scenario where health checks that send 0 bytes to HiveServer2 
> sockets DDOS HiveServer2: if for some reason they hang or otherwise don't 
> send a TCP FIN, all HiveServer2 Thrift thread-pool threads block reading the 
> socket.
> This is the stack (we are running an older version of Hive here):
> {noformat}
> "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> - locked <23781b74> (a java.io.BufferedInputStream)
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
> at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
> at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
> at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
> at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Eventually HiveServer2 has no more free threads left.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2018-10-23 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20789:
-
Attachment: HIVE-20789.patch

> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
> URL: https://issues.apache.org/jira/browse/HIVE-20789
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-20789.patch
>
>




[jira] [Assigned] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2018-10-23 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-20789:


Assignee: Szehon Ho

> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
> URL: https://issues.apache.org/jira/browse/HIVE-20789
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>




[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2018-10-23 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20789:
-
Description: 
We have had a scenario where health checks that send 0 bytes to HiveServer2 
sockets DDOS HiveServer2: if for some reason they hang or otherwise don't 
send a TCP FIN, all HiveServer2 Thrift thread-pool threads block reading the 
socket.

This is the stack (we are running an older version of Hive here):
{noformat}
"HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <23781b74> (a java.io.BufferedInputStream)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}

Eventually HiveServer2 has no more free threads left.

  was:
We have had a scenario where health checks that send 0 bytes to HiveServer2 
sockets DDOS HiveServer2: if they don't send a TCP FIN, they continually 
cause all HiveServer2 Thrift thread-pool threads to block at this stack (we 
are running an older version of Hive here, so ignore the line numbers):
{noformat}
"HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <23781b74> (a java.io.BufferedInputStream)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}

Eventually HiveServer2 has no more free threads left.


> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
>

[jira] [Commented] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-31 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670502#comment-16670502
 ] 

Szehon Ho commented on HIVE-20786:
--

Hey Vihang, for some reason when I just change it in packaging/pom.xml like 
that, it gives an error:
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-assembly-plugin:2.3:single (assemble) on project 
hive-packaging: Failed to create assembly: Error creating assembly archive bin: 
posix is not a legal value for this attribute -> [Help 1]

I need to debug it further.

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.patch, hive_build_error.log
>
>

[jira] [Commented] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-31 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670912#comment-16670912
 ] 

Szehon Ho commented on HIVE-20786:
--

So the problem was that the top-level packaging module was using an old 
version of the Maven assembly plugin, from before posix was even supported.  
This latest patch builds, but it also upgrades maven.assembly.plugin to the 
same version used in standalone-metastore.

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.patch, hive_build_error.log
>
>

[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-31 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20786:
-
Attachment: (was: HIVE-20789.2.patch)

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.patch, hive_build_error.log
>
>

[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-31 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20786:
-
Attachment: HIVE-20786.2.patch

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.2.patch, HIVE-20786.patch, 
> hive_build_error.log
>
>
> When executing
> {code}
> mvn clean install -DskipTests
> {code}
> Build Failed:
> {code}
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Hive Storage API 2.7.0-SNAPSHOT  SUCCESS [  5.299 s]
> [INFO] Hive 4.0.0-SNAPSHOT  SUCCESS [  0.750 s]
> [INFO] Hive Classifications ... SUCCESS [  1.057 s]
> [INFO] Hive Shims Common .. SUCCESS [  3.882 s]
> [INFO] Hive Shims 0.23  SUCCESS [  5.020 s]
> [INFO] Hive Shims Scheduler ... SUCCESS [  2.587 s]
> [INFO] Hive Shims . SUCCESS [  2.038 s]
> [INFO] Hive Common  SUCCESS [  6.921 s]
> [INFO] Hive Service RPC ... SUCCESS [  3.503 s]
> [INFO] Hive Serde . SUCCESS [  6.322 s]
> [INFO] Hive Standalone Metastore .. FAILURE [  0.557 s]
> [INFO] Hive Standalone Metastore Common Code .. SKIPPED
> [INFO] Hive Metastore . SKIPPED
> [INFO] Hive Vector-Code-Gen Utilities . SKIPPED
> [INFO] Hive Llap Common ... SKIPPED
> [INFO] Hive Llap Client ... SKIPPED
> [INFO] Hive Llap Tez .. SKIPPED
> [INFO] Hive Spark Remote Client ... SKIPPED
> [INFO] Hive Metastore Server .. SKIPPED
> [INFO] Hive Query Language  SKIPPED
> [INFO] Hive Llap Server ... SKIPPED
> [INFO] Hive Service ... SKIPPED
> [INFO] Hive Accumulo Handler .. SKIPPED
> [INFO] Hive JDBC .. SKIPPED
> [INFO] Hive Beeline ... SKIPPED
> [INFO] Hive CLI ... SKIPPED
> [INFO] Hive Contrib ... SKIPPED
> [INFO] Hive Druid Handler . SKIPPED
> [INFO] Hive HBase Handler . SKIPPED
> [INFO] Hive JDBC Handler .. SKIPPED
> [INFO] Hive HCatalog .. SKIPPED
> [INFO] Hive HCatalog Core . SKIPPED
> [INFO] Hive HCatalog Pig Adapter .. SKIPPED
> [INFO] Hive HCatalog Server Extensions  SKIPPED
> [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED
> [INFO] Hive HCatalog Webhcat .. SKIPPED
> [INFO] Hive HCatalog Streaming  SKIPPED
> [INFO] Hive HPL/SQL ... SKIPPED
> [INFO] Hive Streaming . SKIPPED
> [INFO] Hive Llap External Client .. SKIPPED
> [INFO] Hive Shims Aggregator .. SKIPPED
> [INFO] Hive Kryo Registrator .. SKIPPED
> [INFO] Hive TestUtils . SKIPPED
> [INFO] Hive Kafka Storage Handler . SKIPPED
> [INFO] Hive Packaging . SKIPPED
> [INFO] Hive Metastore Tools ...

[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big

2018-10-31 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20786:
-
Attachment: HIVE-20789.2.patch

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Attachments: HIVE-20786.patch, HIVE-20789.2.patch, 
> hive_build_error.log
>
>

[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big

2018-11-15 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20786:
-
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Committed to master, thanks Vihang for the review

> Maven Build Failed with group id is too big 
> 
>
> Key: HIVE-20786
> URL: https://issues.apache.org/jira/browse/HIVE-20786
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
> Environment:  
> OS: MacOS 10.13.6
> Java:
> {code}
> java version "1.8.0_192"
> Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)
> {code}
> Maven:
> {code}
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-18T02:33:14+08:00)
> Maven home: /usr/local/Cellar/maven/3.5.4/libexec
> Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_CN, platform encoding: UTF-8
> OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac"
> {code}
>  
>  
>Reporter: PENG Zhengshuai
>Assignee: Szehon Ho
>Priority: Major
>  Labels: maven
> Fix For: 4.0.0
>
> Attachments: HIVE-20786.2.patch, HIVE-20786.patch, 
> hive_build_error.log
>
>

[jira] [Commented] (HIVE-17300) WebUI query plan graphs

2018-09-25 Thread Szehon Ho (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627772#comment-16627772
 ] 

Szehon Ho commented on HIVE-17300:
--

Hi Karen, sorry about the StringUtils comment; I missed on my end that it's 
already imported.

For the OperationLog, I saw that it's accessible from a ThreadLocal; I wonder 
if that will work.
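
Whether the ThreadLocal route works probably depends on which thread does the reading. The following is only a minimal sketch of that concern, not Hive code: `OperationLogSketch`, `CURRENT`, and both helper methods are hypothetical names invented for illustration. It shows that a value stored in a ThreadLocal is visible on the thread that set it, and is not visible from a different thread.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ThreadLocalLogDemo {
    /** Hypothetical stand-in for a per-operation log object (not the Hive class). */
    static final class OperationLogSketch {
        final String operationId;
        OperationLogSketch(String operationId) { this.operationId = operationId; }
    }

    // Assumption: the log is registered per thread, as the comment above suggests.
    private static final ThreadLocal<OperationLogSketch> CURRENT = new ThreadLocal<>();

    /** Sets the log and reads it back on the same thread. */
    static String idSeenOnSameThread(String operationId) {
        CURRENT.set(new OperationLogSketch(operationId));
        try {
            OperationLogSketch log = CURRENT.get();
            return log == null ? null : log.operationId;
        } finally {
            CURRENT.remove(); // avoid leaking the value to later work on this thread
        }
    }

    /** Sets the log on the calling thread, then reads it from a different thread. */
    static String idSeenOnOtherThread(String operationId) {
        CURRENT.set(new OperationLogSketch(operationId));
        try {
            AtomicReference<String> seen = new AtomicReference<>("unset");
            Thread reader = new Thread(() -> {
                OperationLogSketch log = CURRENT.get(); // a new thread starts with no value
                seen.set(log == null ? null : log.operationId);
            });
            reader.start();
            reader.join();
            return seen.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reader", e);
        } finally {
            CURRENT.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println("same thread:  " + idSeenOnSameThread("op-1"));
        System.out.println("other thread: " + idSeenOnOtherThread("op-1"));
    }
}
```

So if the code that needs the log runs on the same thread as the operation, the lookup should find it; if it runs on a separate request thread, the reference would have to be passed along some other way.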

> WebUI query plan graphs
> ---
>
> Key: HIVE-17300
> URL: https://issues.apache.org/jira/browse/HIVE-17300
> Project: Hive
>  Issue Type: Sub-task
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, features, patch
> Attachments: HIVE-17300.3.patch, HIVE-17300.4.patch, 
> HIVE-17300.5.patch, HIVE-17300.6.patch, HIVE-17300.7.patch, 
> HIVE-17300.7.patch, HIVE-17300.8.patch, HIVE-17300.8.patch, 
> HIVE-17300.8.patch, HIVE-17300.8.patch, HIVE-17300.9.patch, HIVE-17300.patch, 
> complete_success.png, full_mapred_stats.png, graph_with_mapred_stats.png, 
> last_stage_error.png, last_stage_running.png, non_mapred_task_selected.png
>
>
> Hi all,
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below. 
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info. 
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> I’d love to hear any and all input from the community about this feature: do 
> you think it’s useful, and is there anything important I’m missing?
> Thanks,
> Karen Coppage
> Review request: https://reviews.apache.org/r/61663/
> Any input is welcome!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-21033) Forgetting to close operation cuts off any more HiveServer2 output

2018-12-27 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-21033:
-
Attachment: HIVE-21033.5.patch

> Forgetting to close operation cuts off any more HiveServer2 output
> --
>
> Key: HIVE-21033
> URL: https://issues.apache.org/jira/browse/HIVE-21033
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-21033.2.patch, HIVE-21033.3.patch, 
> HIVE-21033.4.patch, HIVE-21033.5.patch, HIVE-21033.patch
>
>
> We had a custom client that did not close its operations until the end of 
> the session.  That is a mistake in the client, but it reveals something of a 
> vulnerability in HiveServer2.
> This happens if you have a session with (1) a HiveCommandOperation and (2) a 
> SQLOperation and don't close them right after.  For example, a session that 
> runs the operations (set a=b; select * from foobar; ). 
> When the SQLOperation runs, it sets SessionState.out and err to System.out 
> and System.err. Ref:  
> [SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139]
> Then the client closes the session, or disconnects, which triggers 
> closeSession() on the Thrift side.  In this case, closeSession closes all 
> the operations, starting with the HiveCommandOperation.  This closes all the 
> streams, which are System.out and System.err as set by the SQLOperation 
> earlier.  Ref: 
> [HiveCommandOperation#tearDownSessionIO|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java#L101]
>  
> After this, no more HiveServer2 output appears, because System.out and 
> System.err are closed.
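
The failure mode described above can be reproduced outside Hive. The sketch below is a simplified illustration, not the HiveServer2 code and not the attached patch: the `Operation` class, its `closeOnTearDown` flag, and the in-memory stream standing in for System.out are all assumptions made so the demo does not close the real process stdout. It shows that once one holder of a shared PrintStream closes it, later writes from anyone else are lost. Flushing instead of closing is shown only as one possible mitigation.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class SharedStreamDemo {
    /** Hypothetical operation that holds a reference to the session's output stream. */
    static final class Operation {
        private final PrintStream sessionOut;
        private final boolean closeOnTearDown;

        Operation(PrintStream sessionOut, boolean closeOnTearDown) {
            this.sessionOut = sessionOut;
            this.closeOnTearDown = closeOnTearDown;
        }

        /** Tear-down either closes the stream it was handed or only flushes it. */
        void tearDown() {
            if (closeOnTearDown) {
                sessionOut.close();
            } else {
                sessionOut.flush();
            }
        }
    }

    /** Writes one line and reports whether the stream accepted it. */
    static boolean canStillWrite(PrintStream stream) {
        stream.println("server log line");
        return !stream.checkError(); // PrintStream records failures instead of throwing
    }

    /** Returns true if the shared stream is still usable after both operations tear down. */
    static boolean serverOutputSurvives(boolean closeOnTearDown) {
        // Stand-in for System.out: a process-wide stream shared by every operation.
        PrintStream processOut = new PrintStream(new ByteArrayOutputStream(), true);
        Operation setCommand = new Operation(processOut, closeOnTearDown);
        Operation query = new Operation(processOut, closeOnTearDown);
        setCommand.tearDown();
        query.tearDown();
        return canStillWrite(processOut);
    }

    public static void main(String[] args) {
        System.out.println("close on tear-down, output survives: " + serverOutputSurvives(true));
        System.out.println("flush on tear-down, output survives: " + serverOutputSurvives(false));
    }
}
```

The general point is that a component should only close a stream it created itself; a stream borrowed from the process, such as System.out, outlives any single operation or session.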




[jira] [Updated] (HIVE-21033) Sudden disconnect for a session with set and SQL operation cuts off any more HiveServer2 output

2018-12-12 Thread Szehon Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-21033:
-
Description: 
It's a bit tricky to reproduce, but we were able to do it (unfortunately) with 
our custom client, which did not close the operation or session in the 
error case.  It may also happen with any client that simply disconnects in the 
middle of this operation.

Basically, you have a session with both a HiveCommandOperation and a 
SQLOperation.  For example, a session that runs the operations (set a=b; 
select * from foobar; ). 

The SQLOperation runs last and sets SessionState.out and err to System.out 
and System.err. Ref:  
[SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139]

Then the client terminates without closing the session. (In our case, a 
SemanticException triggered it.)  deleteContext is then called, which closes 
the session.  Ref: 
[ThriftBinaryCLIService#deleteContext|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java#L141]

The session closes all the operations, starting with the HiveCommandOperation.  
That operation closes all the streams, which are System.out and System.err as 
set by the SQLOperation earlier.  Ref: 
[HiveCommandOperation#tearDownSessionIO|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java#L101]
 

After this, no more HiveServer2 output appears, because System.out and 
System.err are closed.

  was:
It's a bit tricky to reproduce, but we were able to do it (unfortunately) with 
our custom client, which did not close the session in the error case.  
It may also happen with any client that simply disconnects in the middle of 
this operation.

Basically, you have a session with both a HiveCommandOperation and a 
SQLOperation.  For example, a session that runs the operations (set a=b; 
select * from foobar; ). 

The SQLOperation runs last and sets SessionState.out and err to System.out 
and System.err. Ref:  
[SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139]

Then the client terminates without closing the session. (In our case, a 
SemanticException triggered it.)  deleteContext is then called, which closes 
the session.  Ref: 
[ThriftBinaryCLIService#deleteContext|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java#L141]

The session closes all the operations, starting with the HiveCommandOperation.  
That operation closes all the streams, which are System.out and System.err as 
set by the SQLOperation earlier.  Ref: 
[HiveCommandOperation#tearDownSessionIO|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java#L101]
 

After this, no more HiveServer2 output appears, because System.out and 
System.err are closed.


> Sudden disconnect for a session with set and SQL operation cuts off any more 
> HiveServer2 output
> ---
>
> Key: HIVE-21033
> URL: https://issues.apache.org/jira/browse/HIVE-21033
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Priority: Major
>
> It's a bit tricky to reproduce, but we were able to do it (unfortunately) with 
> our custom client, which did not close the operation or session in the 
> error case.  It may also happen with any client that simply disconnects in 
> the middle of this operation.
> Basically, you have a session with both a HiveCommandOperation and a 
> SQLOperation.  For example, a session that runs the operations (set a=b; 
> select * from foobar; ). 
> The SQLOperation runs last and sets SessionState.out and err to System.out 
> and System.err. Ref:  
> [SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139]
> Then the client terminates without closing the session. (In our case, a 
> SemanticException triggered it.)  deleteContext is then called, which closes 
> the session.  Ref: 
> [ThriftBinaryCLIService#deleteContext|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java#L141]
> The session closes all the operations, starting with the HiveCommandOperation.  
> That operation closes all the streams, which are System.out and System.err as 
> set by the SQLOperation earlier.  Ref: 
>
