[jira] [Commented] (HIVE-14117) HS2 UI: List of recent queries shows most recent query last
[ https://issues.apache.org/jira/browse/HIVE-14117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353450#comment-15353450 ] Szehon Ho commented on HIVE-14117: -- nice idea, +1 > HS2 UI: List of recent queries shows most recent query last > --- > > Key: HIVE-14117 > URL: https://issues.apache.org/jira/browse/HIVE-14117 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-14117.1.patch > > > It's more useful to see the latest one first in your "last n queries" view. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14063) beeline to auto connect to the HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355658#comment-15355658 ] Szehon Ho commented on HIVE-14063: -- Cool. To clarify, can we see what this conf file will look like? And what is its relationship with the newly added beeline.properties: what kind of things go where, and how do the two interact if both are present and the same properties are defined in both? > beeline to auto connect to the HiveServer2 > -- > > Key: HIVE-14063 > URL: https://issues.apache.org/jira/browse/HIVE-14063 > Project: Hive > Issue Type: Improvement > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > > Currently one has to give a jdbc:hive2 url in order for Beeline to connect to a > hiveserver2 instance. It would be great if Beeline could get the info somehow > (from a properties file at a well-known location?) and connect automatically > if the user doesn't specify such a url. If the properties file is not present, > then beeline would expect the user to provide the url and credentials using > !connect or ./beeline -u .. commands > While Beeline is flexible (being a mere JDBC client), most environments would > have just a single HS2. Having users manually connect to it via either > "beeline ~/.propsfile" or -u or !connect statements degrades the user > experience. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14754) Track the queries execution lifecycle times
[ https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-14754: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master. Thanks Barna for the contribution! > Track the queries execution lifecycle times > --- > > Key: HIVE-14754 > URL: https://issues.apache.org/jira/browse/HIVE-14754 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Affects Versions: 2.2.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-14754.1.patch, HIVE-14754.2.patch, HIVE-14754.patch > > > We should be able to track the nr. of queries being compiled/executed at any > given time, as well as the duration of the execution and compilation phase. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-14754) Track the queries execution lifecycle times
[ https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851289#comment-15851289 ] Szehon Ho commented on HIVE-14754: -- +1, glad to see new metrics! Just a question: where does the 1028 come from (it seems a bit arbitrary), and is it configurable? > Track the queries execution lifecycle times > --- > > Key: HIVE-14754 > URL: https://issues.apache.org/jira/browse/HIVE-14754 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Affects Versions: 2.2.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Attachments: HIVE-14754.1.patch, HIVE-14754.2.patch, HIVE-14754.patch > > > We should be able to track the nr. of queries being compiled/executed at any > given time, as well as the duration of the execution and compilation phase. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-14754) Track the queries execution lifecycle times
[ https://issues.apache.org/jira/browse/HIVE-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859308#comment-15859308 ] Szehon Ho commented on HIVE-14754: -- Sorry I did not catch the typo, yea we should fix it before the release if we can. > Track the queries execution lifecycle times > --- > > Key: HIVE-14754 > URL: https://issues.apache.org/jira/browse/HIVE-14754 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Affects Versions: 2.2.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-14754.1.patch, HIVE-14754.2.patch, HIVE-14754.patch > > > We should be able to track the nr. of queries being compiled/executed at any > given time, as well as the duration of the execution and compilation phase. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HIVE-14775) Investigate IOException usage in Metrics APIs
[ https://issues.apache.org/jira/browse/HIVE-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15530366#comment-15530366 ] Szehon Ho commented on HIVE-14775: -- Makes sense, +1 on latest patch to me. > Investigate IOException usage in Metrics APIs > - > > Key: HIVE-14775 > URL: https://issues.apache.org/jira/browse/HIVE-14775 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2, Metastore >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > A large number of metrics APIs seem to declare to throw IOExceptions > needlessly. (incrementCounter, decrementCounter etc.) > This is not only misleading but it fills up the code with unnecessary catch > blocks never to be reached. > We should investigate if these exceptions are thrown at all, and remove them > if it is truly unused. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
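For readers skimming this thread, the cleanup described in HIVE-14775 can be pictured with a minimal sketch. This is not the actual Hive Metrics interface: `SimpleMetrics` and `InMemoryMetrics` are hypothetical names used only for illustration, and the only assumption is that a counter update held in memory has no I/O path that could fail.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical interface (not Hive's): no method declares a checked
// exception, so callers need no unreachable catch blocks.
interface SimpleMetrics {
    long incrementCounter(String name);
    long decrementCounter(String name);
}

// In-memory implementation: counter updates touch only a concurrent map.
class InMemoryMetrics implements SimpleMetrics {
    private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    @Override
    public long incrementCounter(String name) {
        return counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    }

    @Override
    public long decrementCounter(String name) {
        return counters.computeIfAbsent(name, k -> new AtomicLong()).decrementAndGet();
    }
}
```

With a signature like this, a call site shrinks to a single statement such as `metrics.incrementCounter("open_connections")`, with no try/catch around it.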
[jira] [Commented] (HIVE-14776) Skip 'distcp' call when copying data from HDFS to S3
[ https://issues.apache.org/jira/browse/HIVE-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15497699#comment-15497699 ] Szehon Ho commented on HIVE-14776: -- I'm curious about this. Distcp parallelizes the copy, so if the file/dir is very splittable then in theory it should be faster than a single thread, even with the overhead of the temporary location? I understand that for some small files it will be slower. And just orthogonally, I thought distcp actually puts the file in a temporary local file before uploading to S3, not in a temporary location on S3. > Skip 'distcp' call when copying data from HDFS to S3 > > > Key: HIVE-14776 > URL: https://issues.apache.org/jira/browse/HIVE-14776 > Project: Hive > Issue Type: Sub-task > Components: Hive >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-14776.1.patch, HIVE-14776.2.patch > > > Hive uses 'distcp' to copy files in parallel between HDFS encryption zones > when the file to copy is larger than the {{hive.exec.copyfile.maxsize}} > threshold. This 'distcp' is also executed when copying to S3, but it is causing > slower copies. > We should not invoke distcp when copying to blobstore systems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
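The decision proposed in HIVE-14776 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the attached patches: `CopyStrategy`, `isBlobStore`, and the list of blob store schemes are hypothetical, and the size threshold is passed in as a plain number standing in for `hive.exec.copyfile.maxsize`.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, for illustration only.
class CopyStrategy {
    // Assumed set of blob store URI schemes; the real list may differ.
    private static final Set<String> BLOBSTORE_SCHEMES =
            new HashSet<>(Arrays.asList("s3", "s3a", "s3n"));

    static boolean isBlobStore(URI destination) {
        String scheme = destination.getScheme();
        return scheme != null && BLOBSTORE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    // Use distcp only for files above the threshold, and never when the
    // destination is a blob store.
    static boolean shouldUseDistCp(long fileSizeBytes, long maxSizeBytes, URI destination) {
        if (isBlobStore(destination)) {
            return false;
        }
        return fileSizeBytes > maxSizeBytes;
    }
}
```

Whether skipping distcp is actually faster for large, splittable inputs is exactly the question raised in the comment above, so the blob store check is the part most open to debate.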
[jira] [Commented] (HIVE-14713) LDAP Authentication Provider should be covered with unit tests
[ https://issues.apache.org/jira/browse/HIVE-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15514737#comment-15514737 ] Szehon Ho commented on HIVE-14713: -- I think there is a 24 hour wait after the last +1 to get merged (at least last time I checked). Feel free to ping again if it is forgotten. > LDAP Authentication Provider should be covered with unit tests > -- > > Key: HIVE-14713 > URL: https://issues.apache.org/jira/browse/HIVE-14713 > Project: Hive > Issue Type: Test > Components: Authentication, Tests >Affects Versions: 2.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy > Attachments: HIVE-14713.1.patch, HIVE-14713.2.patch, > HIVE-14713.3.patch > > > Currently LdapAuthenticationProviderImpl class is not covered with unit > tests. To make this class testable some minor refactoring will be required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14775) Investigate IOException usage in Metrics APIs
[ https://issues.apache.org/jira/browse/HIVE-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527043#comment-15527043 ] Szehon Ho commented on HIVE-14775: -- Yea definitely appreciate the cleanup, never had time to investigate. Do we know what scenario led to JMXException? I only had some minor comments, left on RB > Investigate IOException usage in Metrics APIs > - > > Key: HIVE-14775 > URL: https://issues.apache.org/jira/browse/HIVE-14775 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2, Metastore >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > > A large number of metrics APIs seem to declare to throw IOExceptions > needlessly. (incrementCounter, decrementCounter etc.) > This is not only misleading but it fills up the code with unnecessary catch > blocks never to be reached. > We should investigate if these exceptions are thrown at all, and remove them > if it is truly unused. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14984) Hive-WebUI access results in Request is a replay (34) attack
[ https://issues.apache.org/jira/browse/HIVE-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592474#comment-15592474 ] Szehon Ho commented on HIVE-14984: -- Thanks a lot Barna. FYI [~jxiang] > Hive-WebUI access results in Request is a replay (34) attack > > > Key: HIVE-14984 > URL: https://issues.apache.org/jira/browse/HIVE-14984 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.0 >Reporter: Venkat Sambath >Assignee: Barna Zsombor Klara > Attachments: HIVE-14984.patch > > > When trying to access the kerberized webui of HS2, the following error is received > GSSException: Failure unspecified at GSS-API level (Mechanism level: Request > is a replay (34)) > This does not happen for the RM webui (checked if kerberos webui is > enabled) > To reproduce the issue > Try running > curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt > http://:10002/ > from any cluster node > or > Try accessing the URL from a Windows VM with the Firefox browser to > replicate the issue > The following workaround helped, but a permanent solution for the bug is needed > Workaround: > = > First access the index.html directly and then the actual URL of the webui > curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt > http://:10002/index.html > curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt > http://:10002 > In browser: > First access > http://:10002/index.html > then > http://:10002 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2
[ https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592480#comment-15592480 ] Szehon Ho commented on HIVE-14753: -- Sorry for delay, +1 to me > Track the number of open/closed/abandoned sessions in HS2 > - > > Key: HIVE-14753 > URL: https://issues.apache.org/jira/browse/HIVE-14753 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-14753.1.patch, HIVE-14753.2.patch, > HIVE-14753.3.patch, HIVE-14753.patch > > > We should be able to track the nr. of sessions since the startup of the HS2 > instance as well as the average lifetime of a session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2
[ https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-14753: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to master, thanks Barna for the contribution! > Track the number of open/closed/abandoned sessions in HS2 > - > > Key: HIVE-14753 > URL: https://issues.apache.org/jira/browse/HIVE-14753 > Project: Hive > Issue Type: Sub-task > Components: Hive, HiveServer2 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara > Fix For: 2.2.0 > > Attachments: HIVE-14753.1.patch, HIVE-14753.2.patch, > HIVE-14753.3.patch, HIVE-14753.patch > > > We should be able to track the nr. of sessions since the startup of the HS2 > instance as well as the average lifetime of a session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13517) Hive logs in Spark Executor and Driver should show thread-id.
[ https://issues.apache.org/jira/browse/HIVE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623323#comment-15623323 ] Szehon Ho commented on HIVE-13517: -- Yea if the thread name is there, that is great. I thought last time when I checked the Spark Executor and Driver logs that they were mixed, and there was no indication about the thread. I don't have an environment right now to check that, do you see the thread name now in those logs? > Hive logs in Spark Executor and Driver should show thread-id. > - > > Key: HIVE-13517 > URL: https://issues.apache.org/jira/browse/HIVE-13517 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.2.1, 2.0.0 >Reporter: Szehon Ho >Assignee: liyunzhang_intel > > In Spark, there might be more than one task running in one executor. > Similarly, there may be more than one thread running in Driver. > This makes debugging through the logs a nightmare. It would be great if there > could be thread-ids in the logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15102) Hiveptest is killing nodes where IP is reused after previous node termination
[ https://issues.apache.org/jira/browse/HIVE-15102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626046#comment-15626046 ] Szehon Ho commented on HIVE-15102: -- +1 I think Brock wrote the original code so he might know more, but yes it does look like a bug to me. My only small comment is that you can annotate this method with the @VisibleForTesting annotation. > Hiveptest is killing nodes where IP is reused after previous node termination > - > > Key: HIVE-15102 > URL: https://issues.apache.org/jira/browse/HIVE-15102 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.2.0 >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-15102.1.patch > > > NO PRECOMMIT TESTS > The Hiveptest framework has a background thread that runs every hour, and > attempts to kill zombie nodes that are not being used by the test execution > anymore. > These killed nodes are kept in a list of terminated nodes, and next time the > background thread is executed, it will attempt to kill all those nodes again > because Hiveptest considers them zombie nodes. > The problem is that cloud providers can give you the same IP numbers for new > nodes, and when the background thread runs, it will kill those nodes that may > still be in use by Hiveptest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
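The IP reuse problem described in HIVE-15102 can be illustrated with a small sketch. This shows one possible way to avoid the bug, not necessarily what HIVE-15102.1.patch does: `TerminatedNodeTracker` and its method names are hypothetical, and the idea is simply to forget a terminated IP as soon as the cloud provider hands it out again.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker, for illustration only.
class TerminatedNodeTracker {
    // IPs of nodes that were killed and may need to be killed again if they linger.
    private final Set<String> terminatedIps = ConcurrentHashMap.newKeySet();

    void markTerminated(String ip) {
        terminatedIps.add(ip);
    }

    // Called when a new node is created. The provider may have recycled
    // the IP of a previously terminated node, so drop it from the kill list.
    void onNodeCreated(String ip) {
        terminatedIps.remove(ip);
    }

    // The hourly background thread would consult this before killing a node.
    boolean shouldKill(String ip) {
        return terminatedIps.contains(ip);
    }
}
```

Keying on a provider-assigned node id instead of the IP would be another option, assuming such an id is available to the framework.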
[jira] [Commented] (HIVE-15385) Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., false) causes queries to fail
[ https://issues.apache.org/jira/browse/HIVE-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15741636#comment-15741636 ] Szehon Ho commented on HIVE-15385: -- Sorry for late reply, glad it's figured out, thanks guys for taking care of it. > Failure to inherit permissions when running HdfsUtils.setFullFileStatus(..., > false) causes queries to fail > -- > > Key: HIVE-15385 > URL: https://issues.apache.org/jira/browse/HIVE-15385 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Fix For: 2.2.0 > > Attachments: HIVE-15385.1.patch, HIVE-15385.2.patch > > > According to > https://cwiki.apache.org/confluence/display/Hive/Permission+Inheritance+in+Hive, > failure to inherit permissions should not cause queries to fail. > It looks like this was the case until HIVE-13716, which added some code to > use {{fs.setOwner}}, {{fs.setAcl}}, and {{fs.setPermission}} to set > permissions instead of shelling out and running {{-chgrp -R ...}}. > When shelling out, the return status of each command is ignored, so if there > are any failures when inheriting permissions, a warning is logged, but the > query still succeeds. > However, when invoking the {{FileSystem}} API, any failures will be propagated > up to the caller, and the query will fail. > This is problematic because {{setFullFileStatus}} shells out when the > {{recursive}} parameter is set to {{true}}, and when it is false it invokes > the {{FileSystem}} API. So the behavior is inconsistent depending on the > value of {{recursive}}. > We should decide whether permission inheritance failures should fail queries, > and then ensure the code consistently follows that decision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
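One way to make the two code paths in HIVE-15385 behave the same is to treat every permission call as best effort. The sketch below assumes the "log a warning and continue" behavior described on the wiki page is the one chosen. It does not use the Hadoop API: `BestEffortPermissions` and `PermissionAction` are hypothetical names, and the action passed in stands in for calls such as `fs.setOwner` or `fs.setPermission`.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper, for illustration only.
class BestEffortPermissions {
    private static final Logger LOG = Logger.getLogger(BestEffortPermissions.class.getName());

    // Stand-in for a FileSystem call that may throw IOException.
    @FunctionalInterface
    interface PermissionAction {
        void run() throws IOException;
    }

    // Runs the action; on failure logs a warning and returns false instead
    // of propagating the exception, so the calling query does not fail.
    static boolean tryApply(String description, PermissionAction action) {
        try {
            action.run();
            return true;
        } catch (IOException e) {
            LOG.log(Level.WARNING, "Unable to inherit permissions (" + description + "), continuing", e);
            return false;
        }
    }
}
```

If the opposite decision were taken (failures should fail the query), the shell-out path would instead need to check the return status of each command.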
[jira] [Commented] (HIVE-15330) Bump JClouds version to 2.0.0 on Hive/Ptest
[ https://issues.apache.org/jira/browse/HIVE-15330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15725516#comment-15725516 ] Szehon Ho commented on HIVE-15330: -- +1, sounds good to me > Bump JClouds version to 2.0.0 on Hive/Ptest > --- > > Key: HIVE-15330 > URL: https://issues.apache.org/jira/browse/HIVE-15330 > Project: Hive > Issue Type: Task > Components: Hive, Testing Infrastructure >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-15330.1.patch > > > NO PRECOMMIT TESTS > JClouds 2.0.0 fixes several issues with Google Compute Engine API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503320#comment-16503320 ] Szehon Ho commented on HIVE-19767: -- [~thejas] Sorry to bother , but as you are the original author of adding hiveconf to hiveserver2, do you think this change makes sense and if this is the way you would go about it? Seems there is a bit of legacy code of setting them via environment variables that I did not want to touch. > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
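The behavior requested in HIVE-19767 amounts to copying every command-line property into the server configuration, whether or not the key is known to HiveConf.java. The sketch below is an illustration only, not the attached patch: `HiveConfOverlay` is a hypothetical class, the argument parsing is deliberately simplified, and a plain `Map` stands in for the real configuration object.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, for illustration only.
class HiveConfOverlay {
    // Collects "--hiveconf key=value" (or "-hiveconf key=value") pairs.
    static Properties parse(String[] args) {
        Properties props = new Properties();
        for (int i = 0; i + 1 < args.length; i++) {
            if ("--hiveconf".equals(args[i]) || "-hiveconf".equals(args[i])) {
                String kv = args[++i];
                int eq = kv.indexOf('=');
                if (eq > 0) {
                    props.setProperty(kv.substring(0, eq), kv.substring(eq + 1));
                }
            }
        }
        return props;
    }

    // Applies every property, including keys the server does not know
    // about (mapred, spark, or custom listener properties).
    static void apply(Properties commandLine, Map<String, String> conf) {
        for (String key : commandLine.stringPropertyNames()) {
            conf.put(key, commandLine.getProperty(key));
        }
    }
}
```

Applied to the example in the description, `parse(new String[] {"--hiveconf", "a=b"})` followed by `apply` would leave `a=b` in the map, which is the result `set a;` shows in HiveCLI.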
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508239#comment-16508239 ] Szehon Ho commented on HIVE-19767: -- [~stakiar] , [~vihangk1] would you guys mind taking a look at this patch? Thanks! > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho reassigned HIVE-19767: > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Attachment: HIVE-19767.patch > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Status: Patch Available (was: Open) Simple fix attempt, as this seems to be done in HiveCLI: [https://github.com/apache/hive/blob/master/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java#L727] > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Affects Version/s: 1.2.2 3.0.0 2.3.2 > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho reassigned HIVE-18347: Assignee: Szehon Ho > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature >Reporter: Szehon Ho >Assignee: Szehon Ho > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Attachment: HIVE-18347.1.patch This patch adds a pluggable Hive Metastore URI Resolver hook that is called to resolve Metastore URIs from HiveMetastoreClient, and implements one that we use in production. This connects to our Marathon-based Consul service for lookup of a particular Consul-registered service, and will read a consul-based scheme for hive.metastore.uris. One can imagine other schemes, for example Zookeeper-based Metastore registration; those could also be implemented via this plugin. > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-18347.1.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
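The resolver hook described for HIVE-18347 could look roughly like the sketch below. None of these names come from the patch: `MetastoreUriResolver` and `StaticRegistryResolver` are hypothetical, the `consul://<service-name>` URI form is an assumption about what a "consul-based scheme" might look like, and an in-memory map stands in for a real Consul lookup.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical hook: turns one configured URI into the concrete
// Metastore URIs a client should try.
interface MetastoreUriResolver {
    List<URI> resolve(URI configured);
}

// Stand-in for a service-discovery backed resolver. The registry maps a
// service name to "host:port" entries; a real one would query Consul.
class StaticRegistryResolver implements MetastoreUriResolver {
    private final Map<String, List<String>> registry;

    StaticRegistryResolver(Map<String, List<String>> registry) {
        this.registry = registry;
    }

    @Override
    public List<URI> resolve(URI configured) {
        // URIs with any other scheme (for example thrift://) pass through unchanged.
        if (!"consul".equals(configured.getScheme())) {
            return Collections.singletonList(configured);
        }
        List<String> instances =
                registry.getOrDefault(configured.getHost(), Collections.emptyList());
        List<URI> resolved = new ArrayList<>();
        for (String hostAndPort : instances) {
            resolved.add(URI.create("thrift://" + hostAndPort));
        }
        return resolved;
    }
}
```

A Zookeeper-based resolver, as mentioned in the comment, would be another implementation of the same interface with a different lookup behind it.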
[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Attachment: HIVE-18347.2.patch > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Status: Patch Available (was: Open) This is my first patch in a while, I hope I did it correctly :) > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Component/s: Metastore > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Attachment: HIVE-18347.3.patch Fix compiling of just the 'contrib' module by adding explicit dependency to metastore. > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326348#comment-16326348 ] Szehon Ho commented on HIVE-18347: -- [~alangates] [~vihangk1] any thoughts on whether this is a useful contribution to Hive? The Hive deployment in our organization (Criteo) uses the open source tool Consul to do service discovery/state for Hive Metastores, but I am not sure about the community guidelines on adding support for outside tools like this. That said, this patch does allow plugging in other service discovery mechanisms. > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-12338) Add webui to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326339#comment-16326339 ] Szehon Ho commented on HIVE-12338: -- Hey not yet, but I think it is pretty easy to do. I had made HIVE-13457 but have not had time to do this yet. > Add webui to HiveServer2 > > > Key: HIVE-12338 > URL: https://issues.apache.org/jira/browse/HIVE-12338 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Major > Attachments: HIVE-12338.1.patch, HIVE-12338.2.patch, > HIVE-12338.3.patch, HIVE-12338.4.patch, hs2-conf.png, hs2-logs.png, > hs2-metrics.png, hs2-webui.png > > > A web ui for HiveServer2 can show some useful information such as: > > 1. Sessions, > 2. Queries that are executing on the HS2, their states, starting time, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16330785#comment-16330785 ] Szehon Ho commented on HIVE-18347: -- After reading the discussion on HIVE-18449, this still looks like a valid approach, as it keeps the randomness of selection but allows a custom resolver to resolve the URI. [~thejas] [~vihangk1] Would this patch be OK as is, or is it better to have just the resolver hook, leaving the Consul implementation for our own repository? (I'm OK with doing the latter.) > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18449) Add configurable policy for choosing the HMS URI from hive.metastore.uris
[ https://issues.apache.org/jira/browse/HIVE-18449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328757#comment-16328757 ] Szehon Ho commented on HIVE-18449: -- Jumping in from HIVE-18347 on Vihang's pointers, I think it would be great to have a pluggable way to get hive.metastore.uris, as I tried to do there. In our data center we have implemented Metastore on Mesos that can be restarted automatically across nodes depending on load, and we use Consul to dynamically discover them. Consul also allows us to do some tricks, like not returning a certain metastore if it is loaded. Currently the list of metastore.uris as read by HS2 is static and would force us to restart all the HS2 instances every time a Metastore is added, removed, or moved. > Add configurable policy for choosing the HMS URI from hive.metastore.uris > - > > Key: HIVE-18449 > URL: https://issues.apache.org/jira/browse/HIVE-18449 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Sahil Takiar >Assignee: Janaki Lahorani >Priority: Major > > HIVE-10815 added logic to randomly choose a HMS URI from > {{hive.metastore.uris}}. It would be nice if there was a configurable policy > that determined how a URI is chosen from this list - e.g. one option can be > to randomly pick a URI, another option can be to choose the first URI in the > list (which was the behavior prior to HIVE-10815). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
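The two selection options named in the HIVE-18449 description (pick a random URI, or pick the first one in the list) could be sketched roughly as follows. This is only an illustration, not the Hive implementation: the class `UriSelectionSketch`, the `Policy` enum, and the `choose` method are hypothetical names, and the `thrift://` URIs in the usage note are made-up examples.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch; none of these names come from the Hive code base.
final class UriSelectionSketch {

    /** The two policies mentioned in the issue description. */
    enum Policy { FIRST, RANDOM }

    /**
     * Chooses one metastore URI from the configured list.
     * FIRST mirrors the behavior described as "prior to HIVE-10815";
     * RANDOM mirrors the behavior described as added by HIVE-10815.
     */
    static String choose(List<String> uris, Policy policy, Random random) {
        if (uris.isEmpty()) {
            throw new IllegalArgumentException("no metastore URIs configured");
        }
        switch (policy) {
            case FIRST:
                return uris.get(0);
            case RANDOM:
                return uris.get(random.nextInt(uris.size()));
            default:
                throw new IllegalStateException("unknown policy: " + policy);
        }
    }
}
```

For example, `UriSelectionSketch.choose(Arrays.asList("thrift://a:9083", "thrift://b:9083"), UriSelectionSketch.Policy.FIRST, new Random())` always returns the first entry, while `Policy.RANDOM` may return either one. A resolver hook like the one discussed in HIVE-18347 would presumably supply the list itself, leaving the policy as a separate choice.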
[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350780#comment-16350780 ] Szehon Ho commented on HIVE-18347: -- Hi Vihang, thanks for the review! Sorry about the late response. About the comments, I was under the impression (at least from Thejas's comment about HIVE-18449) that we wanted to keep the selection policy as a separate knob under Hive's control, and not via this hook, which is what the review comment is suggesting? I think HIVE-18449 already seems to give another knob offering some predefined selection mechanisms, so I think we shouldn't incorporate all that into a defaultHook that can be overridden. What do you think? > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Attachment: HIVE-18347.5.patch > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352752#comment-16352752 ] Szehon Ho commented on HIVE-18347: -- Thanks a lot. Also I guess HIVE-18449 conflicts with this, so uploading another try to rebase it. > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354253#comment-16354253 ] Szehon Ho commented on HIVE-18347: -- [~vihangk1] do you want to have another review on my rebased patch ? thanks in advance > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM
[ https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354065#comment-16354065 ] Szehon Ho commented on HIVE-18541: -- Hi Oleksiy, thanks for the patch. I made some review comments. > Secure HS2 web UI with PAM > -- > > Key: HIVE-18541 > URL: https://issues.apache.org/jira/browse/HIVE-18541 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Oleksiy Sayankin >Assignee: Oleksiy Sayankin >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-18541.1.patch > > > Secure HS2 web UI with PAM. Add two new properties > * hive.server2.webui.use.pam > * Default value: false > * Description: If true, the HiveServer2 WebUI will be secured with PAM > * hive.server2.webui.pam.authenticator > * Default value: org.apache.hive.http.security.PamAuthenticator > * Description: Class for PAM authentication -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM
[ https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364450#comment-16364450 ] Szehon Ho commented on HIVE-18541: -- Hi, I am sorry for the late reply. I am mostly OK with the latest patch, although I would rather not allow a configuration where PAM is enabled but HTTPS is not, as it's not recommended (exit rather than log a warning). Would you have an issue with this? As for the pluggable PAM Authenticator, it just adds complexity to Hive that I do not see as necessary. It seems like it could be a core piece of security, so I do not see any need to make it pluggable beyond the version reviewed and maintained by the community. Was there another motivation other than enabling the unit test? (It seems you have now found a way to test without exposing this as configuration.) Thanks again for the patch. > Secure HS2 web UI with PAM > -- > > Key: HIVE-18541 > URL: https://issues.apache.org/jira/browse/HIVE-18541 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Oleksiy Sayankin >Assignee: Oleksiy Sayankin >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, > HIVE-18541.5.patch > > > Secure HS2 web UI with PAM. Add two new properties > * hive.server2.webui.use.pam > * Default value: false > * Description: If true, the HiveServer2 WebUI will be secured with PAM > * hive.server2.webui.pam.authenticator > * Default value: org.apache.hive.http.security.PamAuthenticator > * Description: Class for PAM authentication -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18347) Allow pluggable dynamic lookup of Hive Metastores from HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Summary: Allow pluggable dynamic lookup of Hive Metastores from HiveServer2 (was: Allow dynamic lookup of Hive Metastores via Consul) > Allow pluggable dynamic lookup of Hive Metastores from HiveServer2 > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18347) Allow pluggable dynamic lookup of Hive Metastores from HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master, thanks [~vihangk1] for review, and [~thejas] for help ! Will add information about open-source Criteo-hosted Consul Resolver once it is available. > Allow pluggable dynamic lookup of Hive Metastores from HiveServer2 > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch, HIVE-18347.5.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM
[ https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365392#comment-16365392 ] Szehon Ho commented on HIVE-18541: -- OK sorry it is hacky but how about using the hive.in.test flag? (It is not clean, but there are already some intest stuff in that class) > Secure HS2 web UI with PAM > -- > > Key: HIVE-18541 > URL: https://issues.apache.org/jira/browse/HIVE-18541 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Oleksiy Sayankin >Assignee: Oleksiy Sayankin >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, > HIVE-18541.5.patch > > > Secure HS2 web UI with PAM. Add two new properties > * hive.server2.webui.use.pam > * Default value: false > * Description: If true, the HiveServer2 WebUI will be secured with PAM > * hive.server2.webui.pam.authenticator > * Default value: org.apache.hive.http.security.PamAuthenticator > * Description: Class for PAM authentication -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18541) Secure HS2 web UI with PAM
[ https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365658#comment-16365658 ] Szehon Ho commented on HIVE-18541: -- +1 > Secure HS2 web UI with PAM > -- > > Key: HIVE-18541 > URL: https://issues.apache.org/jira/browse/HIVE-18541 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Oleksiy Sayankin >Assignee: Oleksiy Sayankin >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, > HIVE-18541.5.patch, HIVE-18541.6.patch > > > Secure HS2 web UI with PAM. Add two new properties > * hive.server2.webui.use.pam > * Default value: false > * Description: If true, the HiveServer2 WebUI will be secured with PAM > * hive.server2.webui.pam.authenticator > * Default value: org.apache.hive.http.security.PamAuthenticator > * Description: Class for PAM authentication -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18746) add_months should validate the date first
[ https://issues.apache.org/jira/browse/HIVE-18746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370730#comment-16370730 ] Szehon Ho commented on HIVE-18746: -- Patch looks good but the related test seems to be failing, can you take a look? > add_months should validate the date first > - > > Key: HIVE-18746 > URL: https://issues.apache.org/jira/browse/HIVE-18746 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Subhasis Gorai >Assignee: Kryvenko Igor >Priority: Minor > Attachments: HIVE-18746.patch > > > hive (sbg_hvc_ods)> select add_months('2017-02-28', 1); > OK > _c0 > 2017-03-31 > Time taken: 0.107 seconds, Fetched: 1 row(s) > hive (sbg_hvc_ods)> select add_months('2017-02-29', 1); > OK > _c0 > 2017-04-01 > Time taken: 0.084 seconds, Fetched: 1 row(s) > hive (sbg_hvc_ods)> > > '2017-02-29' is an invalid date. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
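To illustrate the kind of validation the HIVE-18746 report asks for, here is a minimal sketch of strict date parsing in plain Java. It is not the attached patch and does not reproduce add_months itself; the class `StrictDateSketch` and the method `parseStrict` are hypothetical names. It relies on `LocalDate.parse`, whose default ISO formatter resolves dates strictly, so a non-existent day such as February 29 in a non-leap year is rejected instead of rolling over into March.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical sketch; not taken from the HIVE-18746 patch.
final class StrictDateSketch {

    /**
     * Parses a yyyy-MM-dd string and returns an empty Optional when the
     * text is not a real calendar date (for example 2017-02-29).
     */
    static Optional<LocalDate> parseStrict(String text) {
        try {
            // LocalDate.parse uses DateTimeFormatter.ISO_LOCAL_DATE,
            // which resolves dates strictly.
            return Optional.of(LocalDate.parse(text));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```

With this check in front of the month arithmetic, '2017-02-28' would be accepted and '2017-02-29' rejected before any result is computed. How the UDF should report the rejection (NULL or an error) is left open here, since the thread does not say.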
[jira] [Commented] (HIVE-17300) WebUI query plan graphs
[ https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370529#comment-16370529 ] Szehon Ho commented on HIVE-17300: -- Hello, [~klcopp] and [~pvary], I just stumbled across this Jira and it looks like a really cool addition, I am sorry to have missed it. I will add a link to the webui uber Jira to make it easier to find. Would love to get it committed, would you want to rebase it? Also do you know why we need to have a flag to configure whether to update MR stats? Is there some kind performance implication if we just did all the time? > WebUI query plan graphs > --- > > Key: HIVE-17300 > URL: https://issues.apache.org/jira/browse/HIVE-17300 > Project: Hive > Issue Type: Improvement > Components: Web UI >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-17300.3.patch, HIVE-17300.4.patch, > HIVE-17300.5.patch, HIVE-17300.patch, complete_success.png, > full_mapred_stats.png, graph_with_mapred_stats.png, last_stage_error.png, > last_stage_running.png, non_mapred_task_selected.png > > > Hi all, > I’m working on a feature of the Hive WebUI Query Plan tab that would provide > the option to display the query plan as a nice graph (scroll down for > screenshots). If you click on one of the graph’s stages, the plan for that > stage appears as text below. > Stages are color-coded if they have a status (Success, Error, Running), and > the rest are grayed out. Coloring is based on status already available in the > WebUI, under the Stages tab. > There is an additional option to display stats for MapReduce tasks. This > includes the job’s ID, tracking URL (where the logs are found), and mapper > and reducer numbers/progress, among other info. > The library I’m using for the graph is called vis.js (http://visjs.org/). It > has an Apache license, and the only necessary file to be included from this > library is about 700 KB. 
> I tried to keep server-side changes minimal, and graph generation is taken > care of by the client. Plans with more than a given number of stages > (default: 25) won't be displayed in order to preserve resources. > I’d love to hear any and all input from the community about this feature: do > you think it’s useful, and is there anything important I’m missing? > Thanks, > Karen Coppage > Review request: https://reviews.apache.org/r/61663/ > Any input is welcome! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17300) WebUI query plan graphs
[ https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-17300: - Issue Type: Sub-task (was: Improvement) Parent: HIVE-12338 > WebUI query plan graphs > --- > > Key: HIVE-17300 > URL: https://issues.apache.org/jira/browse/HIVE-17300 > Project: Hive > Issue Type: Sub-task > Components: Web UI >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Attachments: HIVE-17300.3.patch, HIVE-17300.4.patch, > HIVE-17300.5.patch, HIVE-17300.patch, complete_success.png, > full_mapred_stats.png, graph_with_mapred_stats.png, last_stage_error.png, > last_stage_running.png, non_mapred_task_selected.png > > > Hi all, > I’m working on a feature of the Hive WebUI Query Plan tab that would provide > the option to display the query plan as a nice graph (scroll down for > screenshots). If you click on one of the graph’s stages, the plan for that > stage appears as text below. > Stages are color-coded if they have a status (Success, Error, Running), and > the rest are grayed out. Coloring is based on status already available in the > WebUI, under the Stages tab. > There is an additional option to display stats for MapReduce tasks. This > includes the job’s ID, tracking URL (where the logs are found), and mapper > and reducer numbers/progress, among other info. > The library I’m using for the graph is called vis.js (http://visjs.org/). It > has an Apache license, and the only necessary file to be included from this > library is about 700 KB. > I tried to keep server-side changes minimal, and graph generation is taken > care of by the client. Plans with more than a given number of stages > (default: 25) won't be displayed in order to preserve resources. > I’d love to hear any and all input from the community about this feature: do > you think it’s useful, and is there anything important I’m missing? 
> Thanks, > Karen Coppage > Review request: https://reviews.apache.org/r/61663/ > Any input is welcome! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18541) Secure HS2 web UI with PAM
[ https://issues.apache.org/jira/browse/HIVE-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18541: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to master, thanks Oleksiy for the patch. > Secure HS2 web UI with PAM > -- > > Key: HIVE-18541 > URL: https://issues.apache.org/jira/browse/HIVE-18541 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Oleksiy Sayankin >Assignee: Oleksiy Sayankin >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-18541.1.patch, HIVE-18541.2.patch, > HIVE-18541.5.patch, HIVE-18541.6.patch, HIVE-18541.7.patch, HIVE-18541.8.patch > > > Secure HS2 web UI with PAM. Add property > * {{hive.server2.webui.use.pam}} > * Default value: {{false}} > * Description: If {{true}}, the HiveServer2 WebUI will be secured with PAM -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562162#comment-16562162 ] Szehon Ho commented on HIVE-19767: -- Hi Aihua, thanks for looking at this patch! I can set it in the session, but to me it would be nice to set some permanent properties for the whole HiveServer2, not tied to a session, and as to your suggestion, I would like to start HiveServer2 and not a beeline with embedded HiveServer2. In our use-case, we have some custom listener plugins that take in some properties not listed in HiveConf, what do you think? > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562507#comment-16562507 ] Szehon Ho commented on HIVE-19767: -- yes will do, many thanks for having a look! > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20254) CheckNonCombinablePathCallable is buggy
[ https://issues.apache.org/jira/browse/HIVE-20254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562715#comment-16562715 ] Szehon Ho commented on HIVE-20254: -- Looks like it is resolved by HIVE-13968 > CheckNonCombinablePathCallable is buggy > --- > > Key: HIVE-20254 > URL: https://issues.apache.org/jira/browse/HIVE-20254 > Project: Hive > Issue Type: Bug >Reporter: Qinghui Xu >Priority: Major > > CombineHiveInputFormat provides the possibility for people to avoid combining > some parts of their inputs (by implementing AvoidSplitCombination) > We spotted a problem with that when our query tries to read a lot of partitions > (more than 100). In fact, when there are more than 100 input paths, the check > of combinability is run in parallel: > * dividing the input path array into several chunks (each chunk with no more > than 100 paths) > * submit each chunk to a CheckNonCombinablePathCallable > * each CheckNonCombinablePathCallable will return a set of indices for the > paths to not be combined > The problem is that CheckNonCombinablePathCallable returns a set of relative > indices (the index inside the chunk) instead of absolute indices. This means > that the returned indices are always smaller than 100, thus all the paths in > the array with position bigger than 100 are never taken into account for > avoiding combining input. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
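The relative-versus-absolute index problem described in HIVE-20254 can be reproduced in a few lines. The sketch below is a simplified, single-threaded illustration and not the Hive source: the class `ChunkedIndexSketch` and its methods are hypothetical, and the predicate stands in for the AvoidSplitCombination check. Only the chunk size of 100 is taken from the report.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.IntPredicate;

// Hypothetical sketch of the indexing issue; not the Hive implementation.
final class ChunkedIndexSketch {
    static final int CHUNK_SIZE = 100; // chunk size mentioned in the report

    /** Buggy variant: records the position inside the chunk. */
    static Set<Integer> relativeIndices(int start, int length, IntPredicate nonCombinable) {
        Set<Integer> result = new TreeSet<>();
        for (int i = 0; i < length; i++) {
            if (nonCombinable.test(start + i)) {
                result.add(i); // loses the chunk offset
            }
        }
        return result;
    }

    /** Fixed variant: records the position inside the whole path array. */
    static Set<Integer> absoluteIndices(int start, int length, IntPredicate nonCombinable) {
        Set<Integer> result = new TreeSet<>();
        for (int i = 0; i < length; i++) {
            if (nonCombinable.test(start + i)) {
                result.add(start + i); // keeps the chunk offset
            }
        }
        return result;
    }

    /** Splits [0, total) into chunks and merges the per-chunk results. */
    static Set<Integer> collect(int total, IntPredicate nonCombinable, boolean absolute) {
        Set<Integer> all = new TreeSet<>();
        for (int start = 0; start < total; start += CHUNK_SIZE) {
            int length = Math.min(CHUNK_SIZE, total - start);
            all.addAll(absolute
                ? absoluteIndices(start, length, nonCombinable)
                : relativeIndices(start, length, nonCombinable));
        }
        return all;
    }
}
```

With 250 paths of which only path 150 is non-combinable, the buggy variant reports index 50 (the position inside the second chunk), so the wrong path is excluded from combining and path 150 is combined anyway. The fixed variant reports 150.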
[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560102#comment-16560102 ] Szehon Ho commented on HIVE-20153: -- Thanks Aihua for the fix. Yes I can test it, I am out of town at the moment so need to wait to get back, and hope I can do it sometime next week. If you don't want to wait, feel free to go ahead, I can comment my findings afterwards. > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Assignee: Aihua Xu >Priority: Major > Attachments: HIVE-20153.1.patch, Screen Shot 2018-07-12 at 6.41.28 > PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on Hadoop side where they worked before > in Hive1. > In many queries, we have to double the Mapper Memory settings (in our > particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which > makes it not so easy to upgrade to Hive 2. > Taking a heap dump, we see one of the main culprits is the field 'uniqueObjects' > in GenericUDAFSum and GenericUDAFCount, which was added to support Window > functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Attachment: (was: HIVE-19767.3.patch) > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Attachment: HIVE-19767.3.patch > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577009#comment-16577009 ] Szehon Ho commented on HIVE-19767: -- OK, I wonder if this is like what you mean? > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20396) Test HS2 open_connection metrics
[ https://issues.apache.org/jira/browse/HIVE-20396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582434#comment-16582434 ] Szehon Ho commented on HIVE-20396: -- +1 > Test HS2 open_connection metrics > > > Key: HIVE-20396 > URL: https://issues.apache.org/jira/browse/HIVE-20396 > Project: Hive > Issue Type: Test > Components: HiveServer2 >Reporter: Laszlo Pinter >Assignee: Laszlo Pinter >Priority: Minor > Fix For: 4.0.0 > > Attachments: HIVE-20396.patch > > > HiveServer2 is emitting metrics _default.General.open_connections_ in both > binary and http mode. These metrics should be tested. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582430#comment-16582430 ] Szehon Ho commented on HIVE-19767: -- Thanks for the review! Yes, I guess it's more or less read-only in our use case; I could fix it in a later patch if it becomes an issue. > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, > HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly.
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Committed to master, thanks Aihua for review! > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, > HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
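The behavior requested in HIVE-19767 can be illustrated with a small sketch: every `--hiveconf key=value` pair on the command line is collected, whether or not the key is a property declared in HiveConf.java. This is only an illustration under stated assumptions, not the committed patch; the class `HiveConfArgsSketch` and its `parse` method are hypothetical names that do not exist in Hive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive: collects every --hiveconf key=value
// pair from the command line, whether or not the key is a known Hive property.
final class HiveConfArgsSketch {
    private HiveConfArgsSketch() {}

    static Map<String, String> parse(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        // Stop one short of the end: a trailing flag with no value is ignored.
        for (int i = 0; i < args.length - 1; i++) {
            // Accept both spellings that appear in the issue: -hiveconf and --hiveconf.
            if (args[i].equals("--hiveconf") || args[i].equals("-hiveconf")) {
                String pair = args[++i];
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("expected key=value, got: " + pair);
                }
                // Keep the pair as-is; no check against a list of known properties.
                overrides.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        Map<String, String> conf = parse(new String[] {"--hiveconf", "a=b"});
        System.out.println("a=" + conf.get("a"));
    }
}
```

In this sketch, an arbitrary key such as `a` survives parsing in the same way a Hive-specific key would, which is the consistency with HiveCLI that the issue asks for.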
[jira] [Updated] (HIVE-13457) Create HS2 REST API endpoints for monitoring information
[ https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-13457: - Status: Patch Available (was: Open) > Create HS2 REST API endpoints for monitoring information > > > Key: HIVE-13457 > URL: https://issues.apache.org/jira/browse/HIVE-13457 > Project: Hive > Issue Type: Improvement >Reporter: Szehon Ho >Assignee: Pawel Szostek >Priority: Major > Attachments: HIVE-13457.patch > > > Similar to what is exposed in HS2 webui in HIVE-12338, it would be nice if > other UI's like admin tools or Hue can access and display this information as > well. Hence, we will create some REST endpoints to expose this information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Status: Open (was: Patch Available) > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.2, 3.0.0, 1.2.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, > HIVE-19767.4.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Attachment: HIVE-19767.4.patch > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, > HIVE-19767.4.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Status: Patch Available (was: Open) OK, I cleaned up the patch, since the setting on HS2's HiveConf instance is now unnecessary. [~aihuaxu], is this what you had in mind? Thanks! > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.2, 3.0.0, 1.2.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, > HIVE-19767.4.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly.
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575934#comment-16575934 ] Szehon Ho commented on HIVE-19767: -- [~aihuaxu] do you mind taking another look? > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575110#comment-16575110 ] Szehon Ho commented on HIVE-19767: -- OK, I attached another patch removing the (now) redundant code. > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Attachment: HIVE-19767.2.patch > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-19767: - Attachment: HIVE-19767.5.patch > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, > HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578589#comment-16578589 ] Szehon Ho commented on HIVE-19767: -- Very minor fix for findBugs. > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Improvement >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.2.patch, HIVE-19767.3.patch, > HIVE-19767.4.patch, HIVE-19767.5.patch, HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-13457) Create HS2 REST API endpoints for monitoring information
[ https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595153#comment-16595153 ] Szehon Ho commented on HIVE-13457: -- Actually can you fix the checkstyle and findbugs? > Create HS2 REST API endpoints for monitoring information > > > Key: HIVE-13457 > URL: https://issues.apache.org/jira/browse/HIVE-13457 > Project: Hive > Issue Type: Improvement >Reporter: Szehon Ho >Assignee: Pawel Szostek >Priority: Major > Attachments: HIVE-13457.3.patch, HIVE-13457.4.patch, > HIVE-13457.5.patch, HIVE-13457.patch, HIVE-13457.patch > > > Similar to what is exposed in HS2 webui in HIVE-12338, it would be nice if > other UI's like admin tools or Hue can access and display this information as > well. Hence, we will create some REST endpoints to expose this information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-13457) Create HS2 REST API endpoints for monitoring information
[ https://issues.apache.org/jira/browse/HIVE-13457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595141#comment-16595141 ] Szehon Ho commented on HIVE-13457: -- Nice +1 > Create HS2 REST API endpoints for monitoring information > > > Key: HIVE-13457 > URL: https://issues.apache.org/jira/browse/HIVE-13457 > Project: Hive > Issue Type: Improvement >Reporter: Szehon Ho >Assignee: Pawel Szostek >Priority: Major > Attachments: HIVE-13457.3.patch, HIVE-13457.4.patch, > HIVE-13457.5.patch, HIVE-13457.patch, HIVE-13457.patch > > > Similar to what is exposed in HS2 webui in HIVE-12338, it would be nice if > other UI's like admin tools or Hue can access and display this information as > well. Hence, we will create some REST endpoints to expose this information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542758#comment-16542758 ] Szehon Ho commented on HIVE-20153: -- Hello Aihua, nice to see you too, and thanks for looking at it! Yes, in fact they are all HashMaps with 0 items. I can't get JXRay to work on Mac, but I shared the heap dump on my Drive. Does this link work? [https://drive.google.com/open?id=1nKe43ybfgEEe0yQvtsyQPVyxghGa5X2A] > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Assignee: Aihua Xu >Priority: Major > Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on the Hadoop side where they worked before > in Hive1. > In many queries, we have to double the mapper memory settings (in our > particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which > makes it harder to upgrade to Hive 2. > Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' > in GenericUDAFSum and GenericUDAFCount, which was added to support window > functions.
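To give a rough sense of why millions of empty collections matter, here is a back-of-envelope estimate in Java. It is only a sketch: the per-object sizes are assumptions (roughly what a 64-bit HotSpot JVM with compressed references would use for an empty `java.util.HashSet` and the `HashMap` that backs it), not measurements from this heap dump, and the class name `EmptySetFootprintSketch` is hypothetical. The count of 3 million is the evaluator count reported elsewhere in this thread.

```java
// Back-of-envelope estimate only; the per-object sizes are assumptions,
// not values measured from the heap dump discussed in this thread.
final class EmptySetFootprintSketch {
    private EmptySetFootprintSketch() {}

    // Assumed shallow size of an empty java.util.HashSet, in bytes.
    static final long ASSUMED_HASHSET_BYTES = 16;
    // Assumed shallow size of the empty java.util.HashMap that backs it, in bytes.
    static final long ASSUMED_HASHMAP_BYTES = 48;

    // Total bytes held by one empty set per evaluator.
    static long estimateBytes(long evaluatorCount) {
        return evaluatorCount * (ASSUMED_HASHSET_BYTES + ASSUMED_HASHMAP_BYTES);
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(3_000_000L);
        // Integer division: reports whole mebibytes.
        System.out.println("~" + (bytes / (1024 * 1024)) + " MiB");
    }
}
```

Under these assumed sizes, the empty sets alone account for 192,000,000 bytes (about 183 MiB), before counting the evaluators themselves. A tool that measures object layout on the actual JVM would be needed to confirm the real figure.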
[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20153: - Description: While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on Hadoop side much faster than in Hive1. In many queries, we have to double the memory. Taking heap dump, we see one of the main culprit is the field 'uniqueObjects' in GeneraicUDAFSum and GenericUDAFCount, which was added to support Window functions. was:While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on Hadoop side much faster than in Hive1. Taking heap dump, we see one of the main culprit is the field 'uniqueObjects' in GeneraicUDAFSum and GenericUDAFCount, which was added to support Window functions. > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Priority: Major > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on Hadoop side much faster than in > Hive1. In many queries, we have to double the memory. > > Taking heap dump, we see one of the main culprit is the field 'uniqueObjects' > in GeneraicUDAFSum and GenericUDAFCount, which was added to support Window > functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20153: - Attachment: Screen Shot 2018-07-12 at 6.41.28 PM.png > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Priority: Major > Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on Hadoop side much faster than in > Hive1. In many queries, we have to double the memory. > > Taking heap dump, we see one of the main culprit is the field 'uniqueObjects' > in GeneraicUDAFSum and GenericUDAFCount, which was added to support Window > functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541929#comment-16541929 ] Szehon Ho commented on HIVE-20153: -- [~aihuaxu] do you think there is some way to improve this? (I haven't yet looked at this code closely enough to understand it deeply.) It seems to consume memory whether or not it's used in a window function. The query is something like (generalizing the table): select count(distinct), count(), count(), count(), min(), min(), max(), max(), min(), max() from table group by field; I also attach, for reference, the heap dump of a mapper that was killed with an OOM: there are 3 million GenericUDAFCountEvaluator instances, each with a hashmap. I don't know whether that is unusual. !Screen Shot 2018-07-12 at 6.41.28 PM.png! > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Priority: Major > Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on the Hadoop side much faster than in > Hive1. In many queries, we have to double the memory. > > Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' > in GenericUDAFSum and GenericUDAFCount, which was added to support window > functions.
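One possible way to avoid paying for the set when it is never used is to allocate it lazily. The sketch below illustrates that idea only; it is not Hive's code and not necessarily how this issue was eventually fixed. The class `LazyCountBufferSketch` and its methods are hypothetical stand-ins for a count aggregation buffer, and only the field name `uniqueObjects` is taken from the issue description.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a count aggregation buffer; not Hive's actual class.
final class LazyCountBufferSketch {
    private long count;

    // Left null until the first distinct-aware update, so buffers that only
    // serve a plain count() never allocate a HashSet at all.
    private Set<Object> uniqueObjects;

    // Plain count(): every row is counted, no set is needed.
    void add(Object value) {
        count++;
    }

    // Distinct-aware path: a row is counted only the first time its value is seen.
    void addDistinct(Object value) {
        if (uniqueObjects == null) {
            uniqueObjects = new HashSet<>();
        }
        if (uniqueObjects.add(value)) {
            count++;
        }
    }

    long count() {
        return count;
    }

    // Exposed so callers (and tests) can confirm whether the set was allocated.
    boolean hasUniqueSet() {
        return uniqueObjects != null;
    }
}
```

With this shape, a query consisting mostly of plain count() and sum() calls would keep one `long` per buffer, and only the distinct-aware aggregations would carry a set.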
[jira] [Comment Edited] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541929#comment-16541929 ] Szehon Ho edited comment on HIVE-20153 at 7/12/18 4:49 PM: --- [~aihuaxu] do you think there is some way to improve this? (I didn't yet take much look at this code to deeply understand). It seems to consume memory whether its used in the window function or not. The query is something like (generalizing the table): select count(distinct), count(), count(), count(), min(), min(), max(), max(), min(), max() from table group by field; Also I attach the heap dump of a mapper that was killed OOM for reference, there's 3 million GenericUDAFCountEvaluator, each with a hashset of uniqueObjects. !Screen Shot 2018-07-12 at 6.41.28 PM.png! was (Author: szehon): [~aihuaxu] do you think there is some way to improve this? (I didn't yet take much look at this code to deeply understand). It seems to consume memory even if its used in the window function or not. The query is something like (generalizing the table): select count(distinct), count(), count(), count(), min(), min(), max(), max(), min(), max() from table group by field; Also I attach the heap dump of a mapper that was killed OOM for reference, there's 3 million GenericUDAFCountEvaluator, each with a hashmap, I also don't know if that is weird or not. !Screen Shot 2018-07-12 at 6.41.28 PM.png! > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Priority: Major > Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on Hadoop side much faster than in > Hive1. In many queries, we have to double the memory. 
> > Taking heap dump, we see one of the main culprit is the field 'uniqueObjects' > in GeneraicUDAFSum and GenericUDAFCount, which was added to support Window > functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541929#comment-16541929 ] Szehon Ho edited comment on HIVE-20153 at 7/12/18 4:55 PM: --- [~aihuaxu] do you think there is some way to improve this? (I didn't yet take much look at this code to deeply understand). It seems to consume memory whether its used in the window function or not. The query is something like (generalizing the table): select count(distinct), count(), count(), count(), min(), min(), max(), max(), min(), max() from table group by field; Also I attach the heap dump of a mapper that was killed OOM for reference, there's 3 million GenericUDAFCountEvaluator, each with a 'uniqueObjects' hashSet (each hashSet in turn containing a hashMap). !Screen Shot 2018-07-12 at 6.41.28 PM.png! was (Author: szehon): [~aihuaxu] do you think there is some way to improve this? (I didn't yet take much look at this code to deeply understand). It seems to consume memory whether its used in the window function or not. The query is something like (generalizing the table): select count(distinct), count(), count(), count(), min(), min(), max(), max(), min(), max() from table group by field; Also I attach the heap dump of a mapper that was killed OOM for reference, there's 3 million GenericUDAFCountEvaluator, each with a hashset of uniqueObjects. !Screen Shot 2018-07-12 at 6.41.28 PM.png! > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Priority: Major > Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on Hadoop side much faster than in > Hive1. In many queries, we have to double the memory. 
> > Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' > in GenericUDAFSum and GenericUDAFCount, which was added to support Window > functions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
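Editor's sketch for HIVE-20153: the report says each evaluator carries a 'uniqueObjects' set even when the query does not need one. One way to picture a remedy is to allocate the set lazily, only when de-duplication is actually requested. The class below is a hypothetical stand-in written for illustration (`LazyCountBuffer` and its methods are assumptions, not Hive's GenericUDAFCount code and not the contents of any attached patch).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical per-group count buffer; illustrative only, not Hive's class. */
final class LazyCountBuffer {
    private final boolean distinct;     // true only when de-duplication is required
    private Set<Object> uniqueObjects;  // stays null unless distinct counting is used
    private long count;

    LazyCountBuffer(boolean distinct) {
        this.distinct = distinct;
    }

    void add(Object value) {
        if (value == null) {
            return; // count(col) ignores NULL values
        }
        if (distinct) {
            if (uniqueObjects == null) {
                uniqueObjects = new HashSet<>(); // allocated on first use only
            }
            if (!uniqueObjects.add(value)) {
                return; // value already counted
            }
        }
        count++;
    }

    long get() {
        return count;
    }

    boolean hasAllocatedSet() {
        return uniqueObjects != null;
    }
}
```

With millions of live buffers, as in the heap dump described above, skipping the set for plain count() and sum() would avoid one HashSet (and its backing HashMap) per buffer. Whether Hive can safely do this depends on how the window-function path uses the field, which this sketch does not model.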
[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20153: - Description: While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side much faster than in Hive1. In many queries, we have to double the memory (in our particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M). Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support Window functions. was: While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side much faster than in Hive1. In many queries, we have to double the memory. Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support Window functions. > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Priority: Major > Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on the Hadoop side much faster than in > Hive1. In many queries, we have to double the memory (in our particular case > mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M). > > Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' > in GenericUDAFSum and GenericUDAFCount, which was added to support Window > functions.
[jira] [Updated] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+
[ https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20153: - Description: While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side where they worked before in Hive1. In many queries, we have to double the Mapper Memory settings (in our particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which makes it harder to upgrade to Hive 2. Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support Window functions. was: While playing with Hive2, we noticed that queries with a lot of count() and sum() aggregations run out of memory on the Hadoop side much faster than in Hive1. In many queries, we have to double the memory (in our particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M). Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to support Window functions. > Count and Sum UDF consume more memory in Hive 2+ > > > Key: HIVE-20153 > URL: https://issues.apache.org/jira/browse/HIVE-20153 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 2.3.2 >Reporter: Szehon Ho >Priority: Major > Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png > > > While playing with Hive2, we noticed that queries with a lot of count() and > sum() aggregations run out of memory on the Hadoop side where they worked before > in Hive1. > In many queries, we have to double the Mapper Memory settings (in our > particular case mapreduce.map.java.opts from -Xmx2000M to -Xmx4000M), which > makes it harder to upgrade to Hive 2. > Taking a heap dump, we see that one of the main culprits is the field 'uniqueObjects' > in GenericUDAFSum and GenericUDAFCount, which was added to support Window > functions.
[jira] [Commented] (HIVE-19767) HiveServer2 should take hiveconf for non Hive properties
[ https://issues.apache.org/jira/browse/HIVE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536990#comment-16536990 ] Szehon Ho commented on HIVE-19767: -- [~thejas] any chance to take a look at this patch? > HiveServer2 should take hiveconf for non Hive properties > > > Key: HIVE-19767 > URL: https://issues.apache.org/jira/browse/HIVE-19767 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.2, 3.0.0, 2.3.2 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-19767.patch > > > The -hiveconf command line option works in HiveServer2 with properties in > HiveConf.java, but not so well with other properties (like mapred properties > or spark properties to control underlying execution engine, or custom > properties understood by custom listeners) > It is inconsistent with HiveCLI. > HiveCLI behavior: > {noformat} > ./bin/hive --hiveconf a=b > hive> set a; > a=b {noformat} > HiveServer2 behavior: > {noformat} > ./bin/hiveserver2 --hiveconf a=b > beeline> set a; > +-+ > | set | > +-+ > | a is undefined | > +-+{noformat} > Although it is possible to set up hive-site.xml or even mapred-site.xml to > fill in the relevant properties, it is more convenient when testing HS2 with > different configuration to be able to use --hiveconf to change on the fly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
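Editor's sketch for HIVE-19767: the behavior the issue asks for can be pictured as collecting every `--hiveconf key=value` pair from the command line, whether or not the key is one that HiveConf.java declares. The helper below is a minimal illustration under that assumption; `HiveconfArgs` and `parse` are hypothetical names, and this is not the parsing code used by HiveServer2 or by the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; illustrative only, not HiveServer2's option parser. */
final class HiveconfArgs {
    /** Collects every "--hiveconf key=value" pair, keeping keys unknown to Hive as well. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if (!"--hiveconf".equals(args[i])) {
                continue;
            }
            String pair = args[++i]; // the token right after --hiveconf
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed pairs such as "=b" or "a"
            }
            overrides.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return overrides;
    }
}
```

The point of the sketch is that no filtering by known property names takes place, so a pair like `a=b` from the example above would survive and could later be applied to the session configuration.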
[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328727#comment-16328727 ] Szehon Ho commented on HIVE-18347: -- Thanks for the response guys. What Vihang says makes sense, I am fine with not bundling the consul implementation in Hive and putting in our own open source repository. It would be great if HIVE-18449 can help this use case by offering a generic hook mechanism for pluggable load-balancing between HMS instances! > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-18347: - Attachment: HIVE-18347.4.patch > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18347) Allow dynamic lookup of Hive Metastores via Consul
[ https://issues.apache.org/jira/browse/HIVE-18347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334604#comment-16334604 ] Szehon Ho commented on HIVE-18347: -- Thanks for the message, [~thejas]! Included a review request for a new patch, that just has the hook. > Allow dynamic lookup of Hive Metastores via Consul > -- > > Key: HIVE-18347 > URL: https://issues.apache.org/jira/browse/HIVE-18347 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-18347.1.patch, HIVE-18347.2.patch, > HIVE-18347.3.patch, HIVE-18347.4.patch > > > In our organization, we have deployed HiveMetastore and HiveServer2 on Mesos > as dynamic services for scalability and flexibility. > In this architecture, we would like to allow HiveServer2 to dynamically load > balance between Metastores (which may be scaled up and down or to different > nodes) for different requests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
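Editor's sketch for HIVE-18347: the thread settles on a generic hook rather than a bundled Consul client. The shape of such a hook might look roughly like the following, where the lookup is re-run per request so that instances scaled up or down are picked up. All names here (`MetastoreLookupHook`, `RoundRobinMetastoreChooser`) are assumptions made for illustration; the interface in the actual patch may differ.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical hook: returns the metastore URIs that are currently registered. */
interface MetastoreLookupHook {
    List<URI> currentMetastores();
}

/** Hypothetical chooser that spreads requests across whatever the hook reports. */
final class RoundRobinMetastoreChooser {
    private final MetastoreLookupHook hook;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinMetastoreChooser(MetastoreLookupHook hook) {
        this.hook = hook;
    }

    URI choose() {
        // Re-resolve on every call so added or removed instances are seen.
        List<URI> live = new ArrayList<>(hook.currentMetastores());
        if (live.isEmpty()) {
            throw new IllegalStateException("no metastore instances available");
        }
        int index = Math.floorMod(next.getAndIncrement(), live.size());
        return live.get(index);
    }
}
```

A Consul-backed implementation would then live outside Hive and only need to implement the lookup method, which matches the split discussed in the comments.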
[jira] [Commented] (HIVE-17300) WebUI query plan graphs
[ https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16641598#comment-16641598 ] Szehon Ho commented on HIVE-17300: -- Sorry guys, I was on vacation :). Really glad to see this patch in Hive, it is a lot of effort and a great contribution > WebUI query plan graphs > --- > > Key: HIVE-17300 > URL: https://issues.apache.org/jira/browse/HIVE-17300 > Project: Hive > Issue Type: Sub-task > Components: Web UI >Affects Versions: 4.0.0 >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: beginner, features, patch > Fix For: 4.0.0 > > Attachments: HIVE-17300.10.patch, HIVE-17300.10.patch, > HIVE-17300.10.patch, HIVE-17300.3.patch, HIVE-17300.4.patch, > HIVE-17300.5.patch, HIVE-17300.6.patch, HIVE-17300.7.patch, > HIVE-17300.7.patch, HIVE-17300.8.patch, HIVE-17300.8.patch, > HIVE-17300.8.patch, HIVE-17300.8.patch, HIVE-17300.9.patch, HIVE-17300.patch, > complete_success.png, full_mapred_stats.png, graph_with_mapred_stats.png, > last_stage_error.png, last_stage_running.png, non_mapred_task_selected.png > > > Hi all, > I’m working on a feature of the Hive WebUI Query Plan tab that would provide > the option to display the query plan as a nice graph (scroll down for > screenshots). If you click on one of the graph’s stages, the plan for that > stage appears as text below. > Stages are color-coded if they have a status (Success, Error, Running), and > the rest are grayed out. Coloring is based on status already available in the > WebUI, under the Stages tab. > There is an additional option to display stats for MapReduce tasks. This > includes the job’s ID, tracking URL (where the logs are found), and mapper > and reducer numbers/progress, among other info. > The library I’m using for the graph is called vis.js (http://visjs.org/). It > has an Apache license, and the only necessary file to be included from this > library is about 700 KB. 
> I tried to keep server-side changes minimal, and graph generation is taken > care of by the client. Plans with more than a given number of stages > (default: 25) won't be displayed in order to preserve resources. > I’d love to hear any and all input from the community about this feature: do > you think it’s useful, and is there anything important I’m missing? > Thanks, > Karen Coppage > Review request: https://reviews.apache.org/r/61663/ > Any input is welcome! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20789: - Attachment: HIVE-20789.2.patch > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 > URL: https://issues.apache.org/jira/browse/HIVE-20789 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-20789.2.patch, HIVE-20789.patch > > > We have had a scenario that health checks sending 0 bytes to HiveServer2 > sockets would DDOS the HiveServer2, if for some reason they hang or otherwise > don't send TCP FIN, then all HiveServer2 thrift thread-pool threads will > block reading the socket. > This is the stack (we are running an older version of Hive here) > {noformat} > "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239 > java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <23781b74> (a java.io.BufferedInputStream) > at > org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) > at > org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) > at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) > at > org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41) > at 
org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) > at > org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) > at > org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748){noformat} > Eventually HiveServer2 has no more free threads left. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
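Editor's sketch for HIVE-20789: the stack above shows a handler thread parked in `SocketInputStream.read`. The effect of a read timeout can be reproduced with plain `java.net` sockets: with `setSoTimeout` set to a positive value, a read on a connection whose peer sends nothing fails with `SocketTimeoutException` instead of blocking indefinitely. The example uses a loopback connection and a hypothetical class name (`ReadTimeoutDemo`); it does not use Thrift and is not the patch itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical demo class; plain java.net sockets, no Thrift involved. */
final class ReadTimeoutDemo {
    /** Returns true if the server-side read gave up because the client stayed silent. */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(timeoutMillis); // a value of 0 would mean "wait forever"
            InputStream in = accepted.getInputStream();
            try {
                in.read(); // the client never writes and never closes
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // the reading thread is released after the timeout
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a thread-pool server, releasing the thread this way lets the worker close the idle connection and return to the pool, which is the outcome the issue is after.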
[jira] [Commented] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662256#comment-16662256 ] Szehon Ho commented on HIVE-20789: -- Thanks a lot for taking a look! Sorry I made a mistake as I was not used to the confs being split :). I left just the one on MetastoreConf, can you see if that is right? > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 > URL: https://issues.apache.org/jira/browse/HIVE-20789 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-20789.2.patch, HIVE-20789.patch > > > We have had a scenario that health checks sending 0 bytes to HiveServer2 > sockets would DDOS the HiveServer2, if for some reason they hang or otherwise > don't send TCP FIN, then all HiveServer2 thrift thread-pool threads will > block reading the socket. > This is the stack (we are running an older version of Hive here) > {noformat} > "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239 > java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <23781b74> (a java.io.BufferedInputStream) > at > org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) > at > org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) > at 
org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) > at > org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) > at > org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) > at > org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748){noformat} > Eventually HiveServer2 has no more free threads left. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20786: - Status: Patch Available (was: Open) > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Attachments: HIVE-20786.patch, hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . 
SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] Hive Metastore Tools ... SKIPPED > [INFO]
[jira] [Assigned] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho reassigned HIVE-20786: Assignee: Szehon Ho > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Attachments: HIVE-20786.patch, hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . 
SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] Hive Metastore Tools ... SKIPPED > [INFO] Hive
[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20786: - Attachment: HIVE-20786.patch > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Priority: Major > Labels: maven > Attachments: HIVE-20786.patch, hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . 
SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] Hive Metastore Tools ... SKIPPED > [INFO] Hive Metastore Tools common libraries
[jira] [Commented] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660999#comment-16660999 ] Szehon Ho commented on HIVE-20786: -- Yes I've hit this error for awhile now. Uploading a patch that seems to solve it for me. Though I wasn't following so not sure why it was set on purpose to gnu before. > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Priority: Major > Labels: maven > Attachments: hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. 
FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler . SKIPPED > [INFO] Hive Packaging
[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20789: - Status: Patch Available (was: Open) We noticed that the TSocket used by the thrift thread pools is purposely initialized with a clientTimeout of 0. It makes sense to make it configurable to prevent DDOS. > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 > URL: https://issues.apache.org/jira/browse/HIVE-20789 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-20789.patch > > > We have had a scenario that health checks sending 0 bytes to HiveServer2 > sockets would DDOS the HiveServer2, if for some reason they hang or otherwise > don't send TCP FIN, then all HiveServer2 thrift thread-pool threads will > block reading the socket. > This is the stack (we are running an older version of Hive here) > {noformat} > "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239 > java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <23781b74> (a java.io.BufferedInputStream) > at > org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) > at > org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) > at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
> at > org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) > at > org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) > at > org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748){noformat} > Eventually HiveServer2 has no more free threads left. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
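The blocking read described in the message above can be reproduced without Hive or Thrift. The following is only a minimal sketch using plain java.net sockets, on the assumption that the Thrift socket timeout ultimately ends up as the SO_TIMEOUT of the underlying java.net.Socket. The class name IdleClientTimeoutSketch and the method readTimesOut are hypothetical and are not taken from the HIVE-20789 patch. A client connects, sends nothing and never closes; with a non-zero SO_TIMEOUT the server-side read gives up instead of holding the thread forever.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class IdleClientTimeoutSketch {

    /**
     * Accepts one connection from a client that sends no bytes and does not
     * close, then attempts a blocking read with the given SO_TIMEOUT.
     * Returns true if the read was abandoned with a SocketTimeoutException.
     * Do not pass 0 here: 0 means "wait forever", which is the hang
     * shown in the stack trace.
     */
    public static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket idleClient = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Bound how long a single read may block on this connection.
            accepted.setSoTimeout(timeoutMillis);
            InputStream in = accepted.getInputStream();
            try {
                in.read(); // the idle client never sends anything
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // the handler thread is free to go back to the pool
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

In a real server the timeout value would presumably come from configuration rather than a literal, and the handler would close the connection after the exception so that the slot in the thread pool is released.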
[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20789: - Attachment: HIVE-20789.patch > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 > URL: https://issues.apache.org/jira/browse/HIVE-20789 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-20789.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho reassigned HIVE-20789: Assignee: Szehon Ho > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 > URL: https://issues.apache.org/jira/browse/HIVE-20789 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20789: - Description: We have had a scenario that health checks sending 0 bytes to HiveServer2 sockets would DDOS the HiveServer2, if for some reason they hang or otherwise don't send TCP FIN, then all HiveServer2 thrift thread-pool threads will block reading the socket. This is the stack (we are running an older version of Hive here) {noformat} "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) - locked <23781b74> (a java.io.BufferedInputStream) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748){noformat} Eventually HiveServer2 has no more free threads left. was: We have had a scenario that health checks sending 0 bytes to HiveServer2 sockets would DDOS the HiveServer2, if they dont send TCP FIN then they will continually cause all HiveServer2 thrift thread-pool threads to block at this stack (we are running an older version of Hive here, so ignore the lines) {noformat} "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239 java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) - locked <23781b74> (a java.io.BufferedInputStream) at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748){noformat} Eventually HiveServer2 has no more free threads left. > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 >
[jira] [Commented] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670502#comment-16670502 ] Szehon Ho commented on HIVE-20786: -- Hey Vihang, for some reason, when I make just that change in packaging/pom.xml, the build fails with this error: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.3:single (assemble) on project hive-packaging: Failed to create assembly: Error creating assembly archive bin: posix is not a legal value for this attribute -> [Help 1] I need to debug it further. > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Attachments: HIVE-20786.patch, hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... 
SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator
[jira] [Commented] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670912#comment-16670912 ] Szehon Ho commented on HIVE-20786: -- So the problem was that the top-level packaging module was using an old version of the Maven assembly plugin, from before posix was even supported. This latest patch builds, but it also upgrades the maven.assembly.plugin to the same version used in standalone-metastore. > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Attachments: HIVE-20786.patch, hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... 
SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils
[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20786: - Attachment: (was: HIVE-20789.2.patch) > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Attachments: HIVE-20786.patch, hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . 
SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] Hive Metastore Tools ... SKIPPED >
[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20786: - Attachment: HIVE-20786.2.patch > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Attachments: HIVE-20786.2.patch, HIVE-20786.patch, > hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . 
SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] Hive Metastore Tools ...
[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20786: - Attachment: HIVE-20789.2.patch > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Attachments: HIVE-20786.patch, HIVE-20789.2.patch, > hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . 
SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler . SKIPPED > [INFO] Hive Packaging . SKIPPED > [INFO] Hive Metastore Tools ...
[jira] [Updated] (HIVE-20786) Maven Build Failed with group id is too big
[ https://issues.apache.org/jira/browse/HIVE-20786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20786: - Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) Committed to master, thanks Vihang for the review > Maven Build Failed with group id is too big > > > Key: HIVE-20786 > URL: https://issues.apache.org/jira/browse/HIVE-20786 > Project: Hive > Issue Type: Bug > Components: Standalone Metastore > Environment: > OS: MacOS 10.13.6 > Java: > {code} > java version "1.8.0_192" > Java(TM) SE Runtime Environment (build 1.8.0_192-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode) > {code} > Maven: > {code} > Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; > 2018-06-18T02:33:14+08:00) > Maven home: /usr/local/Cellar/maven/3.5.4/libexec > Java version: 1.8.0_192, vendor: Oracle Corporation, runtime: > /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre > Default locale: en_CN, platform encoding: UTF-8 > OS name: "mac os x", version: "10.13.6", arch: "x86_64", family: "mac" > {code} > > >Reporter: PENG Zhengshuai >Assignee: Szehon Ho >Priority: Major > Labels: maven > Fix For: 4.0.0 > > Attachments: HIVE-20786.2.patch, HIVE-20786.patch, > hive_build_error.log > > > When executing > {code} > mvn clean install -DskipTests > {code} > Build Failed: > {code} > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Hive Storage API 2.7.0-SNAPSHOT SUCCESS [ 5.299 > s] > [INFO] Hive 4.0.0-SNAPSHOT SUCCESS [ 0.750 > s] > [INFO] Hive Classifications ... SUCCESS [ 1.057 > s] > [INFO] Hive Shims Common .. SUCCESS [ 3.882 > s] > [INFO] Hive Shims 0.23 SUCCESS [ 5.020 > s] > [INFO] Hive Shims Scheduler ... SUCCESS [ 2.587 > s] > [INFO] Hive Shims . SUCCESS [ 2.038 > s] > [INFO] Hive Common SUCCESS [ 6.921 > s] > [INFO] Hive Service RPC ... SUCCESS [ 3.503 > s] > [INFO] Hive Serde . SUCCESS [ 6.322 > s] > [INFO] Hive Standalone Metastore .. 
FAILURE [ 0.557 > s] > [INFO] Hive Standalone Metastore Common Code .. SKIPPED > [INFO] Hive Metastore . SKIPPED > [INFO] Hive Vector-Code-Gen Utilities . SKIPPED > [INFO] Hive Llap Common ... SKIPPED > [INFO] Hive Llap Client ... SKIPPED > [INFO] Hive Llap Tez .. SKIPPED > [INFO] Hive Spark Remote Client ... SKIPPED > [INFO] Hive Metastore Server .. SKIPPED > [INFO] Hive Query Language SKIPPED > [INFO] Hive Llap Server ... SKIPPED > [INFO] Hive Service ... SKIPPED > [INFO] Hive Accumulo Handler .. SKIPPED > [INFO] Hive JDBC .. SKIPPED > [INFO] Hive Beeline ... SKIPPED > [INFO] Hive CLI ... SKIPPED > [INFO] Hive Contrib ... SKIPPED > [INFO] Hive Druid Handler . SKIPPED > [INFO] Hive HBase Handler . SKIPPED > [INFO] Hive JDBC Handler .. SKIPPED > [INFO] Hive HCatalog .. SKIPPED > [INFO] Hive HCatalog Core . SKIPPED > [INFO] Hive HCatalog Pig Adapter .. SKIPPED > [INFO] Hive HCatalog Server Extensions SKIPPED > [INFO] Hive HCatalog Webhcat Java Client .. SKIPPED > [INFO] Hive HCatalog Webhcat .. SKIPPED > [INFO] Hive HCatalog Streaming SKIPPED > [INFO] Hive HPL/SQL ... SKIPPED > [INFO] Hive Streaming . SKIPPED > [INFO] Hive Llap External Client .. SKIPPED > [INFO] Hive Shims Aggregator .. SKIPPED > [INFO] Hive Kryo Registrator .. SKIPPED > [INFO] Hive TestUtils . SKIPPED > [INFO] Hive Kafka Storage Handler
[jira] [Commented] (HIVE-17300) WebUI query plan graphs
[ https://issues.apache.org/jira/browse/HIVE-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627772#comment-16627772 ] Szehon Ho commented on HIVE-17300: -- Hi Karen, sorry about the StringUtils comment; I missed on my end that it's already imported. As for the OperationLog, I saw it's accessible from a ThreadLocal; I wonder if that would work. > WebUI query plan graphs > --- > > Key: HIVE-17300 > URL: https://issues.apache.org/jira/browse/HIVE-17300 > Project: Hive > Issue Type: Sub-task > Components: Web UI >Affects Versions: 4.0.0 >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: beginner, features, patch > Attachments: HIVE-17300.3.patch, HIVE-17300.4.patch, > HIVE-17300.5.patch, HIVE-17300.6.patch, HIVE-17300.7.patch, > HIVE-17300.7.patch, HIVE-17300.8.patch, HIVE-17300.8.patch, > HIVE-17300.8.patch, HIVE-17300.8.patch, HIVE-17300.9.patch, HIVE-17300.patch, > complete_success.png, full_mapred_stats.png, graph_with_mapred_stats.png, > last_stage_error.png, last_stage_running.png, non_mapred_task_selected.png > > > Hi all, > I’m working on a feature of the Hive WebUI Query Plan tab that would provide > the option to display the query plan as a nice graph (scroll down for > screenshots). If you click on one of the graph’s stages, the plan for that > stage appears as text below. > Stages are color-coded if they have a status (Success, Error, Running), and > the rest are grayed out. Coloring is based on status already available in the > WebUI, under the Stages tab. > There is an additional option to display stats for MapReduce tasks. This > includes the job’s ID, tracking URL (where the logs are found), and mapper > and reducer numbers/progress, among other info. > The library I’m using for the graph is called vis.js (http://visjs.org/). It > has an Apache license, and the only necessary file to be included from this > library is about 700 KB. 
> I tried to keep server-side changes minimal, and graph generation is taken > care of by the client. Plans with more than a given number of stages > (default: 25) won't be displayed in order to preserve resources. > I’d love to hear any and all input from the community about this feature: do > you think it’s useful, and is there anything important I’m missing? > Thanks, > Karen Coppage > Review request: https://reviews.apache.org/r/61663/ > Any input is welcome! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-21033) Forgetting to close operation cuts off any more HiveServer2 output
[ https://issues.apache.org/jira/browse/HIVE-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-21033: - Attachment: HIVE-21033.5.patch > Forgetting to close operation cuts off any more HiveServer2 output > -- > > Key: HIVE-21033 > URL: https://issues.apache.org/jira/browse/HIVE-21033 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-21033.2.patch, HIVE-21033.3.patch, > HIVE-21033.4.patch, HIVE-21033.5.patch, HIVE-21033.patch > > > We had a custom client that did not close its operations until the end of > the session. This is a mistake in the client, but it reveals something of a > vulnerability in HiveServer2. > This happens if you have a session with (1) HiveCommandOperation and (2) > SQLOperation and don't close them right after. For example, a session that > runs (set a=b; select * from foobar;). > When SQLOperation runs, it sets SessionState.out and err to System.out and > System.err. Ref: > [SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139] > Then the client closes the session, or disconnects, which triggers > closeSession() on the Thrift side. In this case, closeSession closes all > the operations, starting with HiveCommandOperation. That operation closes > all of its streams, which are System.out and System.err as set by > SQLOperation earlier. Ref: > [HiveCommandOperation#tearDownSessionIO|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java#L101] > > After this, no more HiveServer2 output appears, because System.out and > System.err are closed. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
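The failure mode described in HIVE-21033 can be reproduced outside Hive with plain java.io classes. The sketch below is only an illustration and is not the attached patch: SharedStreamSketch, NonClosingOutputStream, TrackingStream and sharedStreamClosedAfterTeardown are hypothetical names, and TrackingStream merely stands in for a process-wide stream such as System.out. It shows that closing a PrintStream closes the stream it wraps, and that a wrapper whose close() only flushes is one possible way to keep a shared stream open.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class SharedStreamSketch {

    /** Hypothetical guard: forwards writes, but close() only flushes. */
    static final class NonClosingOutputStream extends FilterOutputStream {
        NonClosingOutputStream(OutputStream delegate) {
            super(delegate);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            // Pass the whole buffer through instead of FilterOutputStream's
            // byte-at-a-time default.
            out.write(b, off, len);
        }

        @Override
        public void close() throws IOException {
            // Leave the shared delegate open for other users.
            flush();
        }
    }

    /** Stand-in for a process-wide stream; records whether close() was called. */
    static final class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /**
     * Simulates an operation teardown that closes its session stream and
     * reports whether the shared stream underneath ended up closed.
     */
    static boolean sharedStreamClosedAfterTeardown(boolean guard) {
        TrackingStream processWide = new TrackingStream();
        OutputStream target = guard ? new NonClosingOutputStream(processWide) : processWide;
        PrintStream sessionOut = new PrintStream(target, true);
        sessionOut.println("query output");
        sessionOut.close(); // what a teardown step would do
        return processWide.closed;
    }

    public static void main(String[] args) {
        System.out.println("unguarded, shared stream closed: " + sharedStreamClosedAfterTeardown(false));
        System.out.println("guarded, shared stream closed: " + sharedStreamClosedAfterTeardown(true));
    }
}
```

Whether a guard like this, or a check that skips closing when the stream is System.out or System.err, is the right fix for HiveServer2 is a design question for the patch review; the sketch only demonstrates the underlying java.io behaviour.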
[jira] [Updated] (HIVE-21033) Sudden disconnect for a session with set and SQL operation cuts off any more HiveServer2 output
[ https://issues.apache.org/jira/browse/HIVE-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-21033: - Description: It's a bit tricky to reproduce, but we were able to do it (unfortunately) with our custom client that did not handle closing the operation or session in the error case. But it may also happen for any client that just disconnects in the middle of this operation. Basically you have a session with both HiveCommandOperation and SQLOperation. For example, a session that runs (set a=b; select * from foobar;). SQLOperation runs last and sets SessionState.out and err to System.out and System.err. Ref: [SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139] Then the client terminates without closing the session. (In our case, a SemanticException triggered it.) deleteContext is then called, which closes the session. Ref: [ThriftBinaryCLIService#deleteContext|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java#L141] The Session closes all the operations, starting with HiveCommandOperation. That operation closes all of its streams, which are System.out and System.err as set by SQLOperation earlier. Ref: [HiveCommandOperation#tearDownSessionIO|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java#L101] After this, no more HiveServer2 output appears, because System.out and System.err are closed. was: Its a bit tricky to reproduce, but we were able to do it (unfortunately) with our custom client that did not handle closing the session on the error case. But it may also happen for any client that just disconnects in the middle of this operation. 
Basically you have a session with both HiveCommandOperation and SQLOperation. For example a session that does the operations (set a=b; select * from foobar; ). The SQLOperation runs last and set SessionState.out and err to be System.out and System.err . Ref: [SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139] Then the client terminates without closing the session. (In our case, a SemanticException triggered it). The deleteContext is called, which closes the session: Ref [ThriftBinaryCLIService#deleteContext|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java#L141] The Session closes all the operations, starting with HiveCommandOperation. This one closes all the streams, which is System.out and System.err as set by SQLOperation earlier. Ref: [HiveCommandOperation#tearDownSessionIO|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java#L101] After this, no more HiveServer2 output appears as System.out and System.err are closed. > Sudden disconnect for a session with set and SQL operation cuts off any more > HiveServer2 output > --- > > Key: HIVE-21033 > URL: https://issues.apache.org/jira/browse/HIVE-21033 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Priority: Major > > It's a bit tricky to reproduce, but we were able to do it (unfortunately) with > our custom client that did not handle closing the operation or session in the > error case. But it may also happen for any client that just disconnects in > the middle of this operation. > Basically you have a session with both HiveCommandOperation and SQLOperation. > For example, a session that runs (set a=b; select * from > foobar;). 
> SQLOperation runs last and sets SessionState.out and err to System.out > and System.err. Ref: > [SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139] > Then the client terminates without closing the session. (In our case, a > SemanticException triggered it.) deleteContext is then called, which closes > the session. Ref: > [ThriftBinaryCLIService#deleteContext|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/thrift/ThriftBinaryCLIService.java#L141] > The Session closes all the operations, starting with HiveCommandOperation. > That operation closes all of its streams, which are System.out and System.err as set by > SQLOperation earlier. Ref: >