Re: How to obtain concurrent query executions
Thanks!

2016-09-28 12:55 GMT-03:00 Frank Luo <j...@merkleinc.com>:
> If you are using Hadoop 2.7 or newer, you can use
> mapreduce.job.running.map.limit and mapreduce.job.running.reduce.limit to
> restrict map and reduce tasks at each job level.
>
> Another way is to use the Scheduler to limit queue size.
>
> From: Jose Rozanec [mailto:jose.roza...@mercadolibre.com]
> Sent: Tuesday, September 27, 2016 5:54 PM
> To: user@hive.apache.org
> Subject: How to obtain concurrent query executions
>
> Hi,
>
> We have a Hive cluster. We notice that some queries consume all resources,
> which is not desirable to us, since we want to grant some degree of
> parallelism to incoming queries: any incoming query should be able to make
> at least some progress, not just wait for the big one to finish.
>
> Is there a way to do so? We use Hive 2.1.0 with the Tez engine.
>
> Thank you in advance,
>
> Joze.
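[Editor's note] Frank's first suggestion, written out as per-job settings from a Hive session. The property names come from the reply above; the limit values are illustrative, not recommendations:

```sql
-- Illustrative values only: cap a single job at 50 concurrently running
-- map tasks and 20 concurrently running reduce tasks (Hadoop 2.7+).
-- A value of 0 or a negative value means no limit.
SET mapreduce.job.running.map.limit=50;
SET mapreduce.job.running.reduce.limit=20;
```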
Hive queries rejected under heavy load
Hi,

We have a Hive cluster (Hive 2.1.0 + Tez 0.8.4) which works well for most queries. However, some heavy queries sometimes execute successfully and sometimes get rejected. We are not sure why they get rejected instead of being enqueued to wait until resources in the cluster become available again. We notice that the connection waits for a minute and, if it fails to get resources assigned, drops the query. Looking at the configuration parameters, it is not clear to us whether this behavior can be changed. Has anyone had a similar experience and can provide us some guidance?

Thank you in advance,

Joze.
How to obtain concurrent query executions
Hi,

We have a Hive cluster. We notice that some queries consume all resources, which is not desirable to us, since we want to grant some degree of parallelism to incoming queries: any incoming query should be able to make at least some progress, not just wait for the big one to finish.

Is there a way to do so? We use Hive 2.1.0 with the Tez engine.

Thank you in advance,

Joze.
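[Editor's note] One common approach to the situation described above is queue-level isolation, sketched here assuming YARN's CapacityScheduler (queue names and percentages are illustrative, not from the original thread): large queries go to a capped queue, so interactive queries always retain some share of the cluster.

```xml
<!-- capacity-scheduler.xml sketch: two queues, with "heavy" capped below 100%
     so the "interactive" queue can always make progress.
     Queue names and percentages are illustrative. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>interactive,heavy</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.interactive.capacity</name>
  <value>30</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.heavy.capacity</name>
  <value>70</value>
</property>
<property>
  <!-- Hard cap: even when idle capacity exists, "heavy" cannot grow past 80% -->
  <name>yarn.scheduler.capacity.root.heavy.maximum-capacity</name>
  <value>80</value>
</property>
```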
Query consuming all resources
Hi,

We have a Hive cluster. We notice that some queries consume all resources, which is not desirable to us, since we want to grant some degree of parallelism to incoming queries: any incoming query should be able to make at least some progress, not just wait for the big one to finish.

Is there a way to do so? We use Hive 2.1.0 with the Tez engine.

Thank you in advance,

Joze.
Upgrading Metastore schema 2.0.0->2.1.0
Hi all,

Upgrading the metastore DB schema from 2.0.0 to 2.1.0 is causing an error. Did anyone experience similar issues? Below we leave the command and stack trace.

Thanks,

./schematool -dbType mysql -upgradeSchemaFrom 2.0.0
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Error: Duplicate key name 'CONSTRAINTS_PARENT_TABLE_ID_INDEX'
Query is : CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE (state=42000,code=1061)
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
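[Editor's note] The "Duplicate key name" error suggests the index already exists, e.g. left behind by an earlier partial upgrade run. Under that assumption (unverified against this particular metastore), one workaround is to drop the index in MySQL and re-run schematool so the upgrade script can recreate it:

```sql
-- Hypothetical workaround: only if a previous upgrade attempt already created
-- this index. Back up the metastore DB and verify with SHOW INDEX FROM
-- KEY_CONSTRAINTS before applying.
DROP INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS;
```

Then re-run: ./schematool -dbType mysql -upgradeSchemaFrom 2.0.0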
LDAPS jdbc connection string
Hi,

We set up a Hive cluster with LDAP and we are able to authenticate and use it from beeline without issues:

beeline> !connect jdbc:hive2://localhost:1/default
Connecting to jdbc:hive2://localhost:1/default
Enter username for jdbc:hive2://localhost:1/default: uid=,ou=People,ou=Mexico,dc=ms,dc=com
Enter password for jdbc:hive2://localhost:1/default:

When trying to connect via JDBC, we get rejected ("Peer indicated failure: Error validating the login"). We are not sure where the issue lies, but we suspect we may not be passing the LDAP parameters as expected. Did anyone face a similar issue? Below is the Groovy snippet we use to connect to Hive:

@GrabConfig(systemClassLoader=true)
@Grab(group='org.apache.hive', module='hive-jdbc', version='2.0.0')
@Grab(group='org.apache.hadoop', module='hadoop-common', version='2.7.2')
@Grab(group='org.apache.commons', module='commons-csv', version='1.1')
import groovy.sql.Sql;

TimeZone.setDefault(TimeZone.getTimeZone('UTC'))

def master = "masterip"
def port = 1
def jdbcurl = String.format("jdbc:hive2://%s:%s/default", master, port)
def db = [url:jdbcurl, user:'uid=,ou=People,ou=Mexicodc=ms,dc=com', password:'ourpassword!', driver:'org.apache.hive.jdbc.HiveDriver']
def sql = Sql.newInstance(db.url, db.user, db.password, db.driver)

def query = "select * from ourtable limit 10"
sql.execute query
println "thanks! :)"
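[Editor's note] The DN in the Groovy snippet differs from the one that works in beeline: it reads "ou=Mexicodc=ms", missing the comma between ou=Mexico and dc=ms. If that is not just a transcription artifact of the archive, it would by itself make the DN invalid. A first thing to try, sketched below with the comma restored (otherwise identical to the original map):

```groovy
// Hypothetical fix: use the exact DN that worked from beeline,
// with the comma restored between ou=Mexico and dc=ms.
def db = [url: jdbcurl,
          user: 'uid=,ou=People,ou=Mexico,dc=ms,dc=com',
          password: 'ourpassword!',
          driver: 'org.apache.hive.jdbc.HiveDriver']
```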
Re: LDAPS (Secure LDAP) Hive configuration
Hi,

Yes, that is correct. We have LDAPS configured on 636, and the certificate is available only at that port. 443 is not enabled in our case, and that should not matter, since communication is performed only on 636.

2016-06-15 23:20 GMT-03:00 Anurag Tangri <tangri.anu...@gmail.com>:
> Hey Joze,
> LDAPS uses a different port, like 636. The default port does not work, as
> far as I remember.
>
> Could you check if it is something along these lines?
>
> Thanks,
> Anurag Tangri
>
> Sent from my iPhone
>
> On Jun 15, 2016, at 3:01 PM, Jose Rozanec <jose.roza...@mercadolibre.com> wrote:
>
> Hi,
>
> We upgraded to 2.1.0, but we still cannot get it working: we get "LDAP:
> error code 34 - invalid DN". We double-checked the DN configuration, and
> the LDAP team agrees it is OK.
> We then configured the SSL parameters as well (hive.server2.use.SSL,
> hive.server2.keystore.path, hive.server2.keystore.password), so that Hive
> would know where the truststore is located and its password, but in that
> case we get the following error: "SSLException: Unrecognized SSL message,
> plaintext connection". Our LDAP server does not expose the SSL certificate
> on the default port (443), but on the one where LDAPS is configured. May
> that cause some trouble?
>
> We would value any insight or guidance from those who already worked on this.
>
> Thanks!
>
> Joze.
>
> 2016-06-13 9:45 GMT-03:00 Jose Rozanec <jose.roza...@mercadolibre.com>:
>> Thank you for the quick response. Will try upgrading to version 2.1.0.
>>
>> Thanks!
>>
>> 2016-06-13 4:34 GMT-03:00 Oleksiy S <osayankin.superu...@gmail.com>:
>>> This issue is fixed here:
>>> https://issues.apache.org/jira/browse/HIVE-12885
>>>
>>> On Fri, Jun 10, 2016 at 10:41 PM, Jose Rozanec
>>> <jose.roza...@mercadolibre.com> wrote:
>>>> Hello,
>>>>
>>>> We are working on a Hive 2.0.0 cluster, to configure LDAPS
>>>> authentication, but we get some errors preventing a successful
>>>> authentication. Does anyone have some insight on how to solve this?
>>>>
>>>> *The problem*
>>>> The errors we get are (first is most frequent):
>>>> - sun.security.provider.certpath.SunCertPathBuilderException: unable to
>>>>   find valid certification path to requested target
>>>> - javax.naming.InvalidNameException: [LDAP: error code 34 - invalid DN]
>>>>
>>>> *Our config*
>>>> We configure the certificate by obtaining a jssecacerts file and
>>>> overriding Java's default at the master, as specified in this post:
>>>> <http://nodsw.com/blog/leeland/2006/12/06-no-more-unable-find-valid-certification-path-requested-target>
>>>>
>>>> *hive-site.xml* has the following properties:
>>>> hive.server2.authentication = LDAP
>>>> hive.server2.authentication.ldap.url = ldaps://ip:port
>>>> hive.server2.authentication.ldap.baseDN = dc=net,dc=com
>>>>
>>>> Thanks!
>>>>
>>>> Joze.
>>>
>>> --
>>> Oleksiy
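[Editor's note] The hive-site.xml properties quoted in this thread, written out in XML form for readability. The values are the placeholders from the original message, not working settings:

```xml
<!-- hive-site.xml fragment as quoted above; the LDAPS host/port and baseDN
     are placeholders from the original message. -->
<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>ldaps://ip:port</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>dc=net,dc=com</value>
</property>
```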
Re: LDAPS (Secure LDAP) Hive configuration
Hi,

We upgraded to 2.1.0, but we still cannot get it working: we get "LDAP: error code 34 - invalid DN". We double-checked the DN configuration, and the LDAP team agrees it is OK.
We then configured the SSL parameters as well (hive.server2.use.SSL, hive.server2.keystore.path, hive.server2.keystore.password), so that Hive would know where the truststore is located and its password, but in that case we get the following error: "SSLException: Unrecognized SSL message, plaintext connection". Our LDAP server does not expose the SSL certificate on the default port (443), but on the one where LDAPS is configured. May that cause some trouble?

We would value any insight or guidance from those who already worked on this.

Thanks!

Joze.

2016-06-13 9:45 GMT-03:00 Jose Rozanec <jose.roza...@mercadolibre.com>:
> Thank you for the quick response. Will try upgrading to version 2.1.0.
>
> Thanks!
>
> 2016-06-13 4:34 GMT-03:00 Oleksiy S <osayankin.superu...@gmail.com>:
>> This issue is fixed here:
>> https://issues.apache.org/jira/browse/HIVE-12885
>>
>> On Fri, Jun 10, 2016 at 10:41 PM, Jose Rozanec
>> <jose.roza...@mercadolibre.com> wrote:
>>> Hello,
>>>
>>> We are working on a Hive 2.0.0 cluster, to configure LDAPS
>>> authentication, but we get some errors preventing a successful
>>> authentication. Does anyone have some insight on how to solve this?
>>>
>>> *The problem*
>>> The errors we get are (first is most frequent):
>>> - sun.security.provider.certpath.SunCertPathBuilderException: unable to
>>>   find valid certification path to requested target
>>> - javax.naming.InvalidNameException: [LDAP: error code 34 - invalid DN]
>>>
>>> *Our config*
>>> We configure the certificate by obtaining a jssecacerts file and
>>> overriding Java's default at the master, as specified in this post:
>>> <http://nodsw.com/blog/leeland/2006/12/06-no-more-unable-find-valid-certification-path-requested-target>
>>>
>>> *hive-site.xml* has the following properties:
>>> hive.server2.authentication = LDAP
>>> hive.server2.authentication.ldap.url = ldaps://ip:port
>>> hive.server2.authentication.ldap.baseDN = dc=net,dc=com
>>>
>>> Thanks!
>>>
>>> Joze.
>>
>> --
>> Oleksiy
Re: Query fails if condition placed on Parquet struct field
Hi!

It is not due to memory allocation. I found that I am able to perform the query OK if I rewrite it as:

select a.user_agent from (SELECT device.user_agent as user_agent FROM sometable WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01') a where a.user_agent LIKE 'Mozilla%' LIMIT 1;

I see the number of mappers and the execution time are almost the same, but this way we are able to execute OK and get the results. Any ideas why this may happen?

2016-05-03 17:02 GMT-03:00 Haas, Nichole <nichole.h...@concur.com>:
> What are your memory allocations set to? When using something as expensive
> as LIKE and a date range together, I often have to increase my standard
> memory allocation.
>
> Try changing your memory allocation settings to:
> Key: mapreduce.map.memory.mb  Value: 2048
> Key: mapreduce.map.java.opts  Value: -Xmx1500m
>
> In HUE, this is the settings tab and you enter them manually. I'm unsure
> about the command line.
>
> From: Jose Rozanec <jose.roza...@mercadolibre.com>
> Date: Tuesday, May 3, 2016 at 12:45 PM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Query fails if condition placed on Parquet struct field
>
> Hello,
>
> We are running queries on Hive against Parquet files. In the schema
> definition, we have a Parquet struct called device with a string field
> user_agent.
>
> If we run the query from Example 1, it returns results as expected.
> If we run the query from Example 2, execution fails and exits with an error.
>
> Did anyone face a similar case?
>
> Thanks!
>
> Example 1:
> SELECT device.user_agent FROM sometable WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01' LIMIT 1;
>
> Example 2:
> SELECT device.user_agent FROM sometable WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01' AND device.user_agent LIKE 'Mozilla%' LIMIT 1;
>
> The error and trace we get is:
>
> Exception from container-launch.
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> Container exited with a non-zero exit code 1
>
> Stack trace: ExitCodeException exitCode=1:
> 	at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
> 	at org.apache.hadoop.util.Shell.run(Shell.java:456)
> 	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
> 	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
> 	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
> 	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
Query fails if condition placed on Parquet struct field
Hello,

We are running queries on Hive against Parquet files. In the schema definition, we have a Parquet struct called device with a string field user_agent.

If we run the query from Example 1, it returns results as expected. If we run the query from Example 2, execution fails and exits with an error.

Did anyone face a similar case?

Thanks!

Example 1:
SELECT device.user_agent FROM sometable WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01' LIMIT 1;

Example 2:
SELECT device.user_agent FROM sometable WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01' AND device.user_agent LIKE 'Mozilla%' LIMIT 1;

The error and trace we get is:

Exception from container-launch.
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Container exited with a non-zero exit code 1

Stack trace: ExitCodeException exitCode=1:
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
	at org.apache.hadoop.util.Shell.run(Shell.java:456)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
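[Editor's note] The fact that the query succeeds only when the struct-field predicate is moved out of the scan (the subquery rewrite reported later in this thread) is consistent with a problem in predicate pushdown over nested Parquet columns. As a diagnostic only (an assumption, not a confirmed fix for this report), one can disable pushdown for the session and re-run Example 2:

```sql
-- Diagnostic sketch (assumption): if Example 2 succeeds with pushdown
-- disabled, the failure is likely in predicate pushdown over the nested
-- Parquet column, not in the query itself. Expect slower scans.
SET hive.optimize.ppd=false;
SELECT device.user_agent
FROM sometable
WHERE ds >= '2016-03-30 00' AND ds <= '2016-03-30 01'
  AND device.user_agent LIKE 'Mozilla%'
LIMIT 1;
```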
Re: Hue 3.7.1 issue at EMR 4.3.0: fails to install Hive Editor
Hello,

Just for the record: we found out that a property in /etc/hue/conf.empty/hue.ini did not match the Hive configuration. In Hive we had authentication set to a value different from NONE, while in the Hue configuration we found security_enabled=false.
After changing the value to security_enabled=true and restarting the process, we got it running as expected.

2016-04-04 13:00 GMT-03:00 Jose Rozanec <jose.roza...@mercadolibre.com>:
> Hello,
>
> This morning we started getting an issue with our Hue instance when creating
> an EMR cluster with Hive/Hue. At this link we provide the trace we get when
> attempting to set up Hue's Hive Editor: http://pastebin.com/LTN15k7r
> Did anyone face this? A few days ago we had no issues creating a cluster
> with these same configurations.
>
> Thank you in advance,
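[Editor's note] The fix described above, sketched as a hue.ini fragment. The [beeswax] section placement is an assumption; check which section carries security_enabled in the hue.ini generated on your EMR cluster:

```ini
# /etc/hue/conf.empty/hue.ini (fragment; section placement assumed)
[beeswax]
  # Must match the authentication mode configured for HiveServer2:
  # with hive.server2.authentication set to anything other than NONE,
  # Hue's security_enabled=false leads to the Hive Editor install failure.
  security_enabled=true
```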
Hue 3.7.1 issue at EMR 4.3.0: fails to install Hive Editor
Hello,

This morning we started getting an issue with our Hue instance when creating an EMR cluster with Hive/Hue. At this link we provide the trace we get when attempting to set up Hue's Hive Editor: http://pastebin.com/LTN15k7r
Did anyone face this? A few days ago we had no issues creating a cluster with these same configurations.

Thank you in advance,