Re: Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe
Any input on this one? Has nobody else hit it?

Regards,
Loïc

Loïc CHANEL
System Big Data engineer
MS - Worldline Analytics Platform - Worldline (Villeurbanne, France)

2017-08-07 18:26 GMT+02:00 Loïc Chanel <loic.cha...@telecomnancy.net>:
> Hi,
>
> As I tried to run some queries with the JSON SerDe from the Spark SQL client, I
> hit this error:
>
> 17/08/07 18:20:40 ERROR SparkSQLDriver: Failed in [create external table
> client_project.test_ext(DocVersion string, DriverID string) row format
> serde 'org.apache.hive.hcatalog.data.JsonSerDe' WITH SERDEPROPERTIES
> ("ignore.malformed.json" = "true")
> location '/etl/client/ct/aftermarket/processing/proj/envt=M5']
> org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot
> validate serde: org.apache.hive.hcatalog.data.JsonSerDe
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:455)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:440)
>         at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$withHiveState$1.apply(ClientWrapper.scala:278)
>         at org.apache.spark.sql.hive.client.ClientWrapper.retryLocked(ClientWrapper.scala:233)
>         at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:270)
>         at org.apache.spark.sql.hive.client.ClientWrapper.runHive(ClientWrapper.scala:440)
>         at org.apache.spark.sql.hive.client.ClientWrapper.runSqlHive(ClientWrapper.scala:430)
>         at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:561)
>         at org.apache.spark.sql.hive.execution.HiveNativeCommand.run(HiveNativeCommand.scala:33)
>         at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
>         at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
>         at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:69)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:140)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:138)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>         at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:138)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:933)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:933)
>         at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:144)
>         at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:129)
>         at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
>         at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:725)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:308)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
>         at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:685)
>         at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> I double-checked: my classpath is clean, and it contains both
> /usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar and
> /usr/hdp/2.3.4.0-3485/hive/lib/hive-serde.jar.
>
> Does anyone know this problem? Any input on where it could come from?
>
> Thanks in advance for your help!
>
> Regards,
> Loïc
>
> Loïc CHANEL
> System Big Data engineer
> MS - Worldline Analytics Platform - Worldline (Villeurbanne, France)
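One common cause of this class of error is that Spark's embedded Hive client does not share the shell's classpath, so the HCatalog jar has to be handed to Spark explicitly. A hedged sketch, using the HDP path mentioned in the thread (adjust to your install; this is not confirmed as the fix for this specific report):

```sh
# Put the HCatalog jar on the classpath of Spark's embedded Hive client:
spark-sql --jars /usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar

# Or, inside an already-open session, before running the CREATE TABLE:
#   ADD JAR /usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar;
```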
Re: Fail to load table via Tez
Hi Rajesh,

Thanks for your quick answer. It seems we also have a problem with the logs, as even with log aggregation enabled we get the following:

/app-logs/yarn/logs/application_1499426430661_0113 does not exist.
Log aggregation has not completed or is not enabled.

Still, I tried what you suggested and got the same problem:

INFO  : Map 1: 255(+85,-31)/340
INFO  : Map 1: 256(+84,-31)/340
INFO  : Map 1: 257(+77,-33)/340
INFO  : Map 1: 257(+0,-33)/340
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1499426430661_0119_1_00, diagnostics=[Task failed, taskId=task_1499426430661_0119_1_00_000273, diagnostics=[TaskAttempt 0 failed, info=[Container container_e17_1499426430661_0119_01_000170 finished with diagnostics set to [Container failed, exitCode=-104. Container [pid=37464,containerID=container_e17_1499426430661_0119_01_000170] is running beyond physical memory limits. Current usage: 2.5 GB of 2.5 GB physical memory used; 4.4 GB of 5.3 GB virtual memory used. Killing container.

Do you have another idea?

Regards,
Loïc

Loïc CHANEL
System Big Data engineer
MS - Worldline Analytics Platform - Worldline (Villeurbanne, France)

2017-07-07 16:51 GMT+02:00 Rajesh Balamohan <rbalamo...@apache.org>:
> You can run "yarn logs -applicationId application_1499426430661_0113 >
> application_1499426430661_0113.log" to get the app logs.
>
> I would suggest trying "hive --hiveconf tez.grouping.max-size=134217728
> --hiveconf tez.grouping.min-size=134217728" for running your Hive query.
> You may want to adjust this parameter (to, say, 256 MB or so) in case too
> many mappers are created.
>
> ~Rajesh.B
>
> On Fri, Jul 7, 2017 at 8:02 PM, Loïc Chanel <loic.cha...@telecomnancy.net> wrote:
>
>> Hi guys,
>>
>> I'm having some trouble with Tez when I try to load data stored in
>> small JSON files on HDFS into a Hive table.
>>
>> At first I got some OutOfMemory exceptions, so I kept increasing the
>> amount of memory allocated to Tez, until the problem turned into a
>> "GC overhead limit exceeded" once 10 GB of RAM was allocated to Tez
>> containers.
>>
>> So I came to my senses, put the memory limits back to a normal level,
>> and now the problem I hit is the following:
>>
>> INFO  : Map 1: 276(+63,-84)/339
>> INFO  : Map 1: 276(+63,-85)/339
>> INFO  : Map 1: 276(+63,-85)/339
>> INFO  : Map 1: 276(+0,-86)/339
>> INFO  : Map 1: 276(+0,-86)/339
>> ERROR : Status: Failed
>> ERROR : Status: Failed
>> ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1499426430661_0113_1_00, diagnostics=[Task failed, taskId=task_1499426430661_0113_1_00_000241, diagnostics=[TaskAttempt 0 failed, info=[Container container_e17_1499426430661_0113_01_000170 finished with diagnostics set to [Container failed, exitCode=-104. Container [pid=59528,containerID=container_e17_1499426430661_0113_01_000170] is running beyond physical memory limits. Current usage: 2.7 GB of 2.5 GB physical memory used; 4.4 GB of 5.3 GB virtual memory used. Killing container.
>>
>> The problem is I can't see how the container could be allocated so much
>> memory, or why Tez can't split the job into smaller tasks when it fails
>> for memory reasons.
>>
>> FYI, in YARN the max container memory is 92160 MB; in MR2, Map can have
>> 4 GB and Reduce 5 GB; the Tez container size is set to 2560 MB and
>> tez.grouping.max-size is set to 1073741824.
>>
>> If you need more information, feel free to ask.
>>
>> I am currently running out of ideas on how to debug this, as I have
>> limited access to the Tez container logs, so any input will be highly
>> appreciated.
>>
>> Thanks!
>>
>> Loïc
>>
>> Loïc CHANEL
>> System Big Data engineer
>> MS - Worldline Analytics Platform - Worldline (Villeurbanne, France)
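For reference, the knobs discussed in this thread can be combined on one invocation. The values below are illustrative starting points taken from the thread, not recommendations; the container size is in MB, the grouping sizes are in bytes, and the Java heap should stay below the container size:

```sh
# Smaller split groups mean more, lighter mappers; the heap (-Xmx) must fit
# inside hive.tez.container.size or YARN will kill the container anyway.
hive --hiveconf tez.grouping.min-size=134217728 \
     --hiveconf tez.grouping.max-size=134217728 \
     --hiveconf hive.tez.container.size=2560 \
     --hiveconf hive.tez.java.opts="-Xmx2048m" \
     -f load_table.hql   # hypothetical script holding the LOAD/INSERT
```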
Re: User is not allowed to impersonate
Hello Andrey,

Can you check that the keytabs are properly generated and deployed on the Hive host? Maybe using klist with the -t option.

Regards,
Loïc

Loïc CHANEL
System Big Data engineer
MS - WASABI - Worldline (Villeurbanne, France)

2017-05-04 15:42 GMT+02:00 Markovich <amriv...@gmail.com>:
> I'm still unable to resolve this...
>
> INFO [Thread-17]: thrift.ThriftCLIService (ThriftHttpCLIService.java:run(152)) - Started ThriftHttpCLIService in http mode on port 10001 path=/cliservice/* with 5...500 worker threads
> 2017-05-04 13:40:14,195 INFO [HiveServer2-HttpHandler-Pool: Thread-60]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(145)) - Could not validate cookie sent, will try to generate a new cookie
> 2017-05-04 13:40:14,198 INFO [HiveServer2-HttpHandler-Pool: Thread-60]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(398)) - Failed to authenticate with http/_HOST kerberos principal, trying with hive/_HOST kerberos principal
> 2017-05-04 13:40:14,199 ERROR [HiveServer2-HttpHandler-Pool: Thread-60]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(406)) - Failed to authenticate with hive/_HOST kerberos principal
> 2017-05-04 13:40:14,199 ERROR [HiveServer2-HttpHandler-Pool: Thread-60]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(209)) - Error:
> org.apache.hive.service.auth.HttpAuthenticationException: java.lang.reflect.UndeclaredThrowableException
>         at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:407)
>         at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:159)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>         at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:479)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>         at org.eclipse.jetty.server.Server.handle(Server.java:349)
>         at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:449)
>         at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:925)
>         at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
>         at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
>         at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
>         at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
>         at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.reflect.UndeclaredThrowableException
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
>         at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:404)
>         ... 23 more
> Caused by: org.apache.hive.service.auth.HttpAuthenticationException: Authorization header received from the client is empty.
>         at org.apache.hive.service.cli.thrift.ThriftHttpServlet.getAuthHeader(ThriftHttpServlet.java:548)
>         at org.apache.hive.service.cli.thrift.ThriftHttpServlet.access$100(ThriftHttpServlet.java:74)
>         at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:449)
>         at org.apache.hive.service.cli.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:412)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.secur
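To act on the suggestion above, verifying the keytab on the HiveServer2 host could look like this. The keytab path is the usual HDP default and the realm is illustrative, both are assumptions:

```sh
# List the keys (with timestamps) stored in the service keytab:
klist -kt /etc/security/keytabs/hive.service.keytab

# Prove a ticket can actually be obtained from it for the service principal:
kinit -kt /etc/security/keytabs/hive.service.keytab hive/$(hostname -f)@EXAMPLE.COM
```

In HTTP mode, the server tries the http/_HOST principal before hive/_HOST (as the log above shows), so the keytab should contain entries for whichever principal the client negotiates.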
Re: Start HiveServer2 with Kerberos meet FATAL error
Hi,

What are you executing when you get this error? It looks like you're trying to log into HiveServer2 with your unix username instead of your Kerberos principal.

Regards,
Loïc

Loïc CHANEL
System Big Data engineer
MS - WASABI - Worldline (Villeurbanne, France)

2016-10-18 4:40 GMT+02:00 Micro dong <microle.d...@gmail.com>:
> I'm trying to configure HiveServer2 (hive-1.2.1) with Kerberos. Here is my
> Hive configuration:
>
>   hive.server2.authentication = KERBEROS
>   hive.server2.authentication.kerberos.principal = hive2/_h...@hadoop.com
>   hive.server2.authentication.kerberos.keytab = /home/work/software/hive/conf/hive.keytab
>
> The keytab file is in its location, and its owner is "work". But when I try
> to start HiveServer2, I see this message in the log:
>
> 2016-10-18 10:20:24,867 FATAL [Thread-9]: thrift.ThriftCLIService (ThriftBinaryCLIService.java:run(101)) - Error starting HiveServer2: could not start ThriftBinaryCLIService
> javax.security.auth.login.LoginException: Kerberos principal should have 3 parts: work
>         at org.apache.hive.service.auth.HiveAuthFactory.getAuthTransFactory(HiveAuthFactory.java:147)
>         at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:58)
>         at java.lang.Thread.run(Thread.java:722)
>
> Here "work" is my unix login name. Any help would be highly appreciated.
>
> --
> Best regards
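The error above ("Kerberos principal should have 3 parts: work") means HiveServer2 resolved the principal to the bare unix user, which usually indicates the configured value was never read at startup. A hedged checklist (the realm and _HOST pattern are the standard Hive convention, shown here as examples):

```sh
# A service principal must have three parts: service/host@REALM, e.g.
#   hive.server2.authentication.kerberos.principal = hive2/_HOST@HADOOP.COM
# If the value seen at startup is just "work", hive-site.xml was probably not
# on HiveServer2's classpath (or HIVE_CONF_DIR points somewhere else).

# The keytab entries must match the configured principal; inspect them with:
klist -kt /home/work/software/hive/conf/hive.keytab
```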
Re: Permission denied: user=root, access=WRITE, inode="/user/root":hdfs:hdfs:drwxr-xr-x
What are you trying to do when you get this trace? The problem seems to be that user root is trying to write under /user/root, which is owned by hdfs, and root has no right to do so.

Regards,
Loïc

Loïc CHANEL
System Big Data engineer
MS - WASABI - Worldline (Villeurbanne, France)

2016-10-05 10:26 GMT+02:00 Raj hadoop <raj.had...@gmail.com>:
> Hi All,
>
> Could someone help to solve this issue?
>
> Logging initialized using configuration in file:/etc/hive/2.4.2.0-258/0/hive-log4j.properties
> Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user/root":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1780)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1764)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1747)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3972)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1081)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:630)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2206)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2202)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2200)
>
>         at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:516)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:680)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user/root":hdfs:hdfs:drwxr-xr-x
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
>         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1780)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1764)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1747)
>         at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:71)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3972)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1081)
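Following the diagnosis above, the usual fix is to create the home directory as the HDFS superuser and hand it over to root (or, better, run Hive as a non-root user that already has a home directory). A sketch, assuming the superuser is the default `hdfs` account:

```sh
# As the hdfs superuser, create root's home directory and transfer ownership:
sudo -u hdfs hdfs dfs -mkdir -p /user/root
sudo -u hdfs hdfs dfs -chown root:root /user/root

# The new entry should now show root as the owner:
hdfs dfs -ls /user
```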
Re: Quota for rogue ad-hoc queries
On the topic of timeouts, if I may say: they are a dangerous way to deal with requests, as a "good" request may last longer than an "evil" one. Be sure timeouts won't kill any important job before putting them in place.

You can set these things in the component (Tez, MapReduce, ...) parameters, but not directly in YARN. At least that was the case when I tried this (one year ago).

Regards,

Loïc CHANEL
System & virtualization engineer
TO - XaaS Ind - Worldline (Villeurbanne, France)

2016-09-01 16:52 GMT+02:00 Stephen Sprague <sprag...@gmail.com>:
> > rogue queries
>
> So this really isn't limited to just Hive, is it? Any DBMS perhaps has to
> contend with this, even malicious rogue queries, as a matter of fact.
>
> Timeouts are the cheap way systems handle this, assuming time is related
> to resource. I'm sure beeline or whatever client you use has a timeout
> feature.
>
> Maybe one could write a separate service (say, a governor) that watches
> over YARN (or HDFS, or whatever resource is scarce) and terminates the
> process if it goes beyond a threshold. Think OOM killer.
>
> But, yeah, I admittedly don't know of something out there already that you
> can just tap into. YARN's Resource Manager seems to be the place I'd
> research for starters. Just look at its name. :)
>
> My unsolicited 2 cents.
>
> On Wed, Aug 31, 2016 at 10:24 PM, ravi teja <raviort...@gmail.com> wrote:
>
>> Thanks Mich,
>>
>> Unfortunately we have many insert queries.
>> Are there any other ways?
>>
>> Thanks,
>> Ravi
>>
>> On Wed, Aug 31, 2016 at 9:45 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Try this:
>>>
>>> hive.limit.optimize.fetch.max
>>>   - Default Value: 5
>>>   - Added In: Hive 0.8.0
>>>
>>> Maximum number of rows allowed for a smaller subset of data for simple
>>> LIMIT, if it is a fetch query. Insert queries are not restricted by this
>>> limit.
>>>
>>> HTH
>>>
>>> Dr Mich Talebzadeh
>>>
>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>> Disclaimer: Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>> On 31 August 2016 at 13:42, ravi teja <raviort...@gmail.com> wrote:
>>>
>>>> Hi Community,
>>>>
>>>> Many users run ad-hoc Hive queries on our platform.
>>>> Some rogue queries managed to fill up the HDFS space, causing
>>>> mainstream queries to fail.
>>>>
>>>> We wanted to limit the data generated by these ad-hoc queries.
>>>> We are aware of the strict param which limits the data being scanned,
>>>> but it is of little help here, as a huge number of user tables aren't
>>>> partitioned.
>>>>
>>>> Is there a way we can limit the data generated by Hive per query,
>>>> like a Hive parameter for setting HDFS quotas on the job-level scratch
>>>> directory, or any other approach?
>>>> What's the general approach to guardrail such multi-tenant cases?
>>>>
>>>> Thanks in advance,
>>>> Ravi
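On the HDFS side, quotas can at least be set per directory, so a guardrail on the shared scratch space is possible even without a per-query Hive parameter. A sketch; the path and size are illustrative, not a recommendation:

```sh
# Cap how much raw space ad-hoc jobs can write under the scratch directory:
hdfs dfsadmin -setSpaceQuota 500g /tmp/hive

# Inspect the quota and current usage:
hdfs dfs -count -q /tmp/hive

# Remove the cap again with:
#   hdfs dfsadmin -clrSpaceQuota /tmp/hive
```

Note that a space quota counts raw (post-replication) bytes, and a query that hits the quota fails rather than being throttled.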
Re: [ANNOUNCE] New Hive Committer - Wei Zheng
Congratulations!

Loïc CHANEL
System & virtualization engineer
TO - XaaS Ind - Worldline (Villeurbanne, France)

2016-03-10 11:44 GMT+01:00 Jeff Zhang <zjf...@gmail.com>:
> Congratulations, Wei!
>
> On Thu, Mar 10, 2016 at 3:27 PM, Lefty Leverenz <leftylever...@gmail.com> wrote:
>
>> Congratulations!
>>
>> -- Lefty
>>
>> On Wed, Mar 9, 2016 at 10:30 PM, Dmitry Tolpeko <dmtolp...@gmail.com> wrote:
>>
>>> Congratulations, Wei!
>>>
>>> On Thu, Mar 10, 2016 at 5:48 AM, Chao Sun <sunc...@apache.org> wrote:
>>>
>>>> Congratulations!
>>>>
>>>> On Wed, Mar 9, 2016 at 6:44 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
>>>>
>>>>> Congratulations Wei!
>>>>>
>>>>> On Mar 9, 2016, at 8:43 PM, Sergey Shelukhin <ser...@hortonworks.com> wrote:
>>>>>
>>>>> Congrats!
>>>>>
>>>>> From: Szehon Ho <sze...@cloudera.com>
>>>>> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
>>>>> Date: Wednesday, March 9, 2016 at 17:40
>>>>> To: "user@hive.apache.org" <user@hive.apache.org>
>>>>> Cc: "d...@hive.apache.org" <d...@hive.apache.org>, "w...@apache.org" <w...@apache.org>
>>>>> Subject: Re: [ANNOUNCE] New Hive Committer - Wei Zheng
>>>>>
>>>>> Congratulations Wei!
>>>>>
>>>>> On Wed, Mar 9, 2016 at 5:26 PM, Vikram Dixit K <vik...@apache.org> wrote:
>>>>> The Apache Hive PMC has voted to make Wei Zheng a committer on the
>>>>> Apache Hive Project. Please join me in congratulating Wei.
>>>>>
>>>>> Thanks
>>>>> Vikram.
>
> --
> Best Regards
>
> Jeff Zhang
Re: Hive Query Timeout in hive-jdbc
Actually, Hive itself doesn't support a timeout, but Tez and MapReduce do. Therefore, you can set a timeout in these tools to kill failed queries.

Hope this helps,

Loïc

Loïc CHANEL
System & virtualization engineer
TO - XaaS Ind - Worldline (Villeurbanne, France)

2016-02-02 11:10 GMT+01:00 董亚军 <ric.d...@liulishuo.com>:
> Hive does not support a timeout on the client side.
>
> And I think it is not recommended anyway: if the client exits with a
> timeout exception, the HiveServer side may still be running the job, which
> can result in an inconsistent state.
>
> On Tue, Feb 2, 2016 at 4:49 PM, Satya Harish Appana <satyaharish.app...@gmail.com> wrote:
>
>> Hi Team,
>>
>> I am trying to connect to HiveServer via hive-jdbc.
>> Can we configure a client-side timeout for each query executed inside a
>> JDBC connection? (When I looked at the HiveStatement.setQueryTimeout
>> method, it says the operation is unsupported.)
>> Is there any other way of timing out, cancelling the connection and
>> throwing an exception if it stays alive for over a period of 4 minutes or
>> so (configurable at client side)?
>>
>> PS: The queries I am executing over JDBC are simple DDL statements
>> (Hive external table create statements and drop table statements).
>>
>> Regards,
>> Satya Harish.
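For what it's worth, one concrete knob of this kind on the MapReduce side is the task-level liveness timeout; whether it helps depends on why a query hangs, so this is a hedged sketch rather than a full query timeout (host and query are hypothetical):

```sh
# mapreduce.task.timeout (milliseconds) kills a task attempt that stops
# reporting progress; it is not a wall-clock limit on the whole query.
beeline -u "jdbc:hive2://host:10000/default" \
        --hiveconf mapreduce.task.timeout=240000 \
        -e "SELECT ..."
```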
Re: Hive Query Timeout in hive-jdbc
Then indeed the Tez and MR timeouts won't be any help, sorry. I would be very interested in your solution, though.

Regards,
Loïc

Loïc CHANEL
System & virtualization engineer
TO - XaaS Ind - Worldline (Villeurbanne, France)

2016-02-02 11:27 GMT+01:00 Satya Harish Appana <satyaharish.app...@gmail.com>:
> The queries I am running over Hive JDBC are DDL statements. None of them
> are selects or inserts, which would launch an execution engine (Tez/MR)
> job: all of them are "create external table ...", "drop table ..." and
> "alter table ... add partition ...".
>
> On Tue, Feb 2, 2016 at 3:54 PM, Loïc Chanel <loic.cha...@telecomnancy.net> wrote:
>
>> Actually, Hive itself doesn't support a timeout, but Tez and MapReduce do.
>> Therefore, you can set a timeout in these tools to kill failed queries.
>> Hope this helps,
>>
>> Loïc
>
> --
> Regards,
> Satya Harish Appana,
> Software Development Engineer II,
> Flipkart, Bangalore,
> Ph: +91-9538797174.
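Since the statements here never launch an engine job, one coarse client-side guard, outside Hive entirely, is to run each statement under an OS-level watchdog. A sketch using GNU coreutils' `timeout`, with a hypothetical connection string:

```sh
# Kill the client (and drop its connection) if the DDL takes longer than 4 minutes:
timeout 240s beeline -u "jdbc:hive2://host:10000/default" \
        -e "ALTER TABLE t ADD PARTITION (dt='2016-02-02')"

# Note: this only stops the client; HiveServer2 may still complete the
# operation server-side, which is the inconsistency mentioned earlier
# in the thread.
```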
Re: beeline and kerberos
As I had the same problem a few months ago, I think your solution is in this thread:
http://mail-archives.apache.org/mod_mbox/hive-user/201509.mbox/%3CCAPsi++Zsgvro4JTJPRXNyjzCTtSv7=4zjsipml51kvbforo...@mail.gmail.com%3E

Regards,
Loïc

Loïc CHANEL
System & virtualization engineer
TO - XaaS Ind - Worldline (Villeurbanne, France)

2016-01-09 22:49 GMT+01:00 Margus Roo <mar...@roo.ee>:
> One more notice.
>
> When I do:
> [margusja@sandbox ~]$ hdfs dfs -ls /
>
> I see in the krb5kdc log:
> Jan 09 21:36:53 sandbox.hortonworks.com krb5kdc[8565](info): TGS_REQ (6 etypes {18 17 16 23 1 3}) 10.0.2.15: ISSUE: authtime 1452375310, etypes {rep=18 tkt=18 ses=18}, margu...@example.com for nn/sandbox.hortonworks@example.com
>
> But when I use beeline, no lines appear in the krb5kdc log.
>
> When I do
> [margusja@sandbox ~]$ kdestroy
> and then hdfs dfs -ls /, no lines appear in the krb5kdc log either.
>
> I am so confused. What is beeline expecting? I do kinit and I get a
> ticket before using beeline.
>
> Margus (margusja) Roo
> http://margus.roo.ee
> skype: margusja
> +372 51 48 780
>
> On 09/01/16 17:49, Margus Roo wrote:
>> Hi
>>
>> I am trying to use beeline with Hive + Kerberos (Hortonworks sandbox 2.3).
>> The problem is that I can use hdfs but not beeline, and I do not know
>> what is wrong.
>>
>> Console output:
>> [margusja@sandbox ~]$ kdestroy
>> [margusja@sandbox ~]$ hdfs dfs -ls /user/
>> 16/01/09 15:45:32 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>> ls: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "sandbox.hortonworks.com/10.0.2.15"; destination host is: "sandbox.hortonworks.com":8020;
>> [margusja@sandbox ~]$ kinit margusja
>> Password for margu...@example.com:
>> [margusja@sandbox ~]$ hdfs dfs -ls /user/
>> Found 11 items
>> drwxrwx---   - ambari-qa hdfs   0 2015-10-27 12:39 /user/ambari-qa
>> drwxr-xr-x   - guest     guest  0 2015-10-27 12:55 /user/guest
>> drwxr-xr-x   - hcat      hdfs   0 2015-10-27 12:43 /user/hcat
>> drwx------   - hdfs      hdfs   0 2015-10-27 13:22 /user/hdfs
>> drwx------   - hive      hdfs   0 2016-01-08 19:44 /user/hive
>> drwxrwxrwx   - hue       hdfs   0 2015-10-27 12:55 /user/hue
>> drwxrwxr-x   - oozie     hdfs   0 2015-10-27 12:44 /user/oozie
>> drwxr-xr-x   - solr      hdfs   0 2015-10-27 12:48 /user/solr
>> drwxrwxr-x   - spark     hdfs   0 2015-10-27 12:41 /user/spark
>> drwxr-xr-x   - unit      hdfs   0 2015-10-27 12:46 /user/unit
>>
>> So I think margusja's credentials are ok.
>>
>> Now I try to use beeline:
>> [margusja@sandbox ~]$ beeline -u "jdbc:hive2://127.0.0.1:1/default;principal=hive/sandbox.hortonworks@example.com"
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> WARNING: Use "yarn jar" to launch YARN applications.
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/usr/hdp/2.3.2.0-2950/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> Connecting to jdbc:hive2://127.0.0.1:1/default;principal=hive/sandbox.hortonworks@example.com
>> 16/01/09 15:46:59 [mai
Re: HiveServer with LDAP
I mean that I can't see any bind in the Hive logs, and that my identity is not verified. Anyone can log in as anyone; there is no password or identity check. Actually, I think that even if my account were blocked or there were a limit on the number of logins my LDAP server tolerates (which is not the case), I should not be able to connect to Hive, but I am. Do you see what I mean ? Maybe I'm not very clear. Regards, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-09-19 11:12 GMT+02:00 Jörn Franke <jornfra...@gmail.com>: > What do you mean by "it is not working"? > You may also check the logs of your LDAP server... > Maybe there is also a limit on the number of logins in your LDAP server... > Maybe the account is temporarily blocked because you entered the password > wrongly too many times... > > On Fri, Sep 18, 2015 at 10:34, Loïc Chanel <loic.cha...@telecomnancy.net> > wrote : >> Hi all, >> >> I'm trying to use LDAP authentication to allow users to connect to HiveServer, but >> even though I configured some properties, it seems like it doesn't work at >> all, and I can't find any logs explaining why. >> >> Does anyone know which property to set in which log4j file to see >> debug logs for that function ? >> Thanks in advance for your help, >> >> >> Loïc >> >> >> Loïc CHANEL >> Engineering student at TELECOM Nancy >> Trainee at Worldline - Villeurbanne >>
HiveServer with LDAP
Hi all, I'm trying to use LDAP authentication to allow users to connect to HiveServer, but even though I configured some properties, it seems like it doesn't work at all, and I can't find any logs explaining why. Does anyone know which property to set in which log4j file to see debug logs for that function ? Thanks in advance for your help, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne
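For reference, LDAP binds are performed by HiveServer2's `org.apache.hive.service.auth` package (`LdapAuthenticationProviderImpl`), so raising its log level is one way to see what the server does with submitted credentials. A sketch for `hive-log4j.properties` on the HiveServer2 host; the logger names are assumptions and may vary by Hive version:

```properties
# hive-log4j.properties (HiveServer2 side) - assumed logger names
log4j.logger.org.apache.hive.service.auth=DEBUG
log4j.logger.org.apache.hive.service.cli.thrift=DEBUG
```

If any password is accepted, it is also worth double-checking that `hive.server2.authentication` in hive-site.xml is actually set to `LDAP`: the default `NONE` accepts anonymous logins, which would match the symptom described in the reply above.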
Re: HiveServer2 & Kerberos
You were right ! Thanks a lot, I didn't checked this property as I thought Ambari set it to true when enabling Kerberos. Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-09-09 19:53 GMT+02:00 Takahiko Saito <tysa...@gmail.com>: > Hi Loic, > > One possible solution is if hive.server2.enable.doAs is set false in > hive-site.xml, you can change it to true and restart HiveServer2. And then > try to connect via beeline. > > Cheers, > > On Wed, Sep 9, 2015 at 8:02 AM, Loïc Chanel <loic.cha...@telecomnancy.net> > wrote: > >> Hi guys ! >> >> Sorry to interrupt but I need to go back to the first reason of this >> thread : I can't connect to hive anymore. >> I upgraded my cluster to HDP 2.3, and I saw that the way to connect to >> Hive via Beeline & Kerberos hasn't changed, but the exact command that >> worked before doesn't work anymore. >> Instead of connecting, Beeline returns me : >> Error: Failed to open new session: java.lang.RuntimeException: >> java.lang.RuntimeException: >> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): >> User: hive/hiveserverh...@example.com is not allowed to impersonate >> testUser (state=,code=0) >> >> The logs are not more explicit, as there is an exception with the same >> conclusion : Caused by: >> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): >> User: hive/hiveserverh...@example.com is not allowed to impersonate >> testUser >> >> Do any of you have an idea about where this could come from ? >> >> >> >> Loïc CHANEL >> Engineering student at TELECOM Nancy >> Trainee at Worldline - Villeurbanne >> >> 2015-08-31 13:51 GMT+02:00 Lars Francke <lars.fran...@gmail.com>: >> >>> That said, +1 to adding a check that we are using kerberos and skipping >>>> the prompt if we are. I think we probably don't even need to parse the URL >>>> to detect that. 
Just checking on the auth type property( >>>> hive.server2.authentication) is KERBEROS or not should do the trick. >>>> >>> >>> I have not looked into this at all but Beeline being a generic client >>> does it even use that property? I mean I could connect to any server, >>> right? Will try to take a look. >>> >>> >>>> [1] >>>> https://github.com/apache/hive/blob/3991dba30c5068cac296f32e24e97cf87efa266c/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java#L450-L455 >>>> >>>> On Wed, Aug 26, 2015 at 5:40 PM, Lars Francke <lars.fran...@gmail.com> >>>> wrote: >>>> >>>>> >>>>> On Wed, Aug 26, 2015 at 4:53 PM, kulkarni.swar...@gmail.com < >>>>> kulkarni.swar...@gmail.com> wrote: >>>>> >>>>>> > my understanding is that after using kerberos authentication, you >>>>>> probably don’t need the password. >>>>>> >>>>>> That is not an accurate statement. Beeline is a JDBC client as >>>>>> compared to Hive CLI which is a thrift client to talk to HIveServer2. So >>>>>> it >>>>>> would need the password to establish that JDBC connection. If you look at >>>>>> the beeline console code[1], it actually first tries to read the >>>>>> "javax.jdo.option.ConnectionUserName" and >>>>>> "javax.jdo.option.ConnectionPassword" property which is the same username >>>>>> and password that you have setup your backing metastore DB with. If it is >>>>>> MySWL, it would be the password you set MySQL with or empty if you >>>>>> haven't(or are using derby). Kerberos is merely a tool for you to >>>>>> authenticate yourself so that you cannot impersonate yourself as someone >>>>>> else. >>>>>> >>>>> >>>>> I don't think what you're saying is accurate. >>>>> >>>>> 1) Hive CLI does not talk to HiveServer2 >>>>> >>>>> 2) Beeline talks to HiveServer2 and needs some way to authenticate >>>>> itself depending on the configuration of HS2. 
>>>>> >>>>> HS2 can be configured to authenticate in one of these ways if I'm up >>>>> to date: >>>>> >>>>> * NOSASL: no password needed >>>>> * KERBEROS (SASL): no password needed >>>>>
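For reference, the settings involved in the fix above: `hive.server2.enable.doAs` lives in hive-site.xml, while the impersonation error itself (`hive/... is not allowed to impersonate testUser`) is governed by the Hadoop proxy-user rules in core-site.xml. A sketch with illustrative values (the wildcard host/group lists are placeholders and should be restricted in production):

```xml
<!-- hive-site.xml: run queries as the connecting user -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>

<!-- core-site.xml: allow the hive service principal to impersonate users -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
```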
Re: HiveServer2 & Kerberos
Hi guys ! Sorry to interrupt but I need to go back to the first reason of this thread : I can't connect to hive anymore. I upgraded my cluster to HDP 2.3, and I saw that the way to connect to Hive via Beeline & Kerberos hasn't changed, but the exact command that worked before doesn't work anymore. Instead of connecting, Beeline returns me : Error: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hive/hiveserverh...@example.com is not allowed to impersonate testUser (state=,code=0) The logs are not more explicit, as there is an exception with the same conclusion : Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hive/hiveserverh...@example.com is not allowed to impersonate testUser Do any of you have an idea about where this could come from ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-31 13:51 GMT+02:00 Lars Francke <lars.fran...@gmail.com>: > That said, +1 to adding a check that we are using kerberos and skipping >> the prompt if we are. I think we probably don't even need to parse the URL >> to detect that. Just checking on the auth type property( >> hive.server2.authentication) is KERBEROS or not should do the trick. >> > > I have not looked into this at all but Beeline being a generic client does > it even use that property? I mean I could connect to any server, right? > Will try to take a look. 
> > >> [1] >> https://github.com/apache/hive/blob/3991dba30c5068cac296f32e24e97cf87efa266c/jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java#L450-L455 >> >> On Wed, Aug 26, 2015 at 5:40 PM, Lars Francke <lars.fran...@gmail.com> >> wrote: >> >>> >>> On Wed, Aug 26, 2015 at 4:53 PM, kulkarni.swar...@gmail.com < >>> kulkarni.swar...@gmail.com> wrote: >>> >>>> > my understanding is that after using kerberos authentication, you >>>> probably don’t need the password. >>>> >>>> That is not an accurate statement. Beeline is a JDBC client as compared >>>> to Hive CLI which is a thrift client to talk to HIveServer2. So it would >>>> need the password to establish that JDBC connection. If you look at the >>>> beeline console code[1], it actually first tries to read the >>>> "javax.jdo.option.ConnectionUserName" and >>>> "javax.jdo.option.ConnectionPassword" property which is the same username >>>> and password that you have setup your backing metastore DB with. If it is >>>> MySWL, it would be the password you set MySQL with or empty if you >>>> haven't(or are using derby). Kerberos is merely a tool for you to >>>> authenticate yourself so that you cannot impersonate yourself as someone >>>> else. >>>> >>> >>> I don't think what you're saying is accurate. >>> >>> 1) Hive CLI does not talk to HiveServer2 >>> >>> 2) Beeline talks to HiveServer2 and needs some way to authenticate >>> itself depending on the configuration of HS2. 
>>> >>> HS2 can be configured to authenticate in one of these ways if I'm up to >>> date: >>> >>> * NOSASL: no password needed >>> * KERBEROS (SASL): no password needed >>> * NONE (SASL) using the AnonymousAuthenticationProviderImpl: no >>> password needed >>> * LDAP (SASL) using the LdapAuthenticationProviderImpl: username and >>> password required >>> * PAM (SASL) using the PamAuthenticationProviderImpl: username and >>> password required >>> * CUSTOM (SASL) using the CustomAuthenticationProviderImpl: username >>> and password required >>> >>> By far the most common configurations are NONE (default I think) and >>> KERBEROS. Both don't need a username and password provided so it does not >>> make sense to ask for one every time. >>> >>> The only good reason I can think of to ask for a password is so that it >>> doesn't appear in a shell/beeline history and/or on screen. I'm sure there >>> are others? >>> The username can be safely provided in the URL if needed so I don't >>> think asking for that every time is reasonable either. >>> >>> What would be a good way to deal with this? I'm tempted to just rip out >>> those prompts. The other option would be to parse the connection URL and >>> check whether it's the Kerberos mode. >>> >>>> >>>> [1] >>>
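The mode list above reduces to a small lookup, which is essentially the check a prompt-skipping patch would need. A sketch only, with the mode names as enumerated in the message:

```java
import java.util.Map;

public class Hs2AuthModes {
    // Which hive.server2.authentication modes need a password at connect
    // time, per the list in the message above.
    static final Map<String, Boolean> PASSWORD_REQUIRED = Map.of(
            "NOSASL", false,
            "KERBEROS", false,
            "NONE", false,
            "LDAP", true,
            "PAM", true,
            "CUSTOM", true);

    // Unknown modes fall back to prompting, the safe default.
    static boolean shouldPrompt(String authMode) {
        return PASSWORD_REQUIRED.getOrDefault(authMode.toUpperCase(), true);
    }

    public static void main(String[] args) {
        System.out.println(shouldPrompt("KERBEROS")); // false: skip the prompt
        System.out.println(shouldPrompt("LDAP"));     // true: ask for a password
    }
}
```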
Re: [ANNOUNCE] New Hive Committer - Lars Francke
Congrats Lars ! :) Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-09-07 10:54 GMT+02:00 Carl Steinbach <c...@apache.org>: > The Apache Hive PMC has voted to make Lars Francke a committer on the > Apache Hive Project. > > Please join me in congratulating Lars! > > Thanks. > > - Carl > >
Re: HiveServer2 Kerberos
I understand the behavior, but when Kerberos is enabled, isn't that a bit redundant ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-26 17:53 GMT+02:00 kulkarni.swar...@gmail.com kulkarni.swar...@gmail.com: my understanding is that after using kerberos authentication, you probably don’t need the password. That is not an accurate statement. Beeline is a JDBC client as compared to Hive CLI which is a thrift client to talk to HIveServer2. So it would need the password to establish that JDBC connection. If you look at the beeline console code[1], it actually first tries to read the javax.jdo.option.ConnectionUserName and javax.jdo.option.ConnectionPassword property which is the same username and password that you have setup your backing metastore DB with. If it is MySWL, it would be the password you set MySQL with or empty if you haven't(or are using derby). Kerberos is merely a tool for you to authenticate yourself so that you cannot impersonate yourself as someone else. [1] https://github.com/apache/hive/blob/3991dba30c5068cac296f32e24e97cf87efa266c/beeline/src/java/org/apache/hive/beeline/Commands.java#L1117-L1125 On Wed, Aug 26, 2015 at 10:13 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Here it is : https://issues.apache.org/jira/browse/HIVE-11653 Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-25 23:10 GMT+02:00 Sergey Shelukhin ser...@hortonworks.com: Sure! From: Loïc Chanel loic.cha...@telecomnancy.net Reply-To: user@hive.apache.org user@hive.apache.org Date: Tuesday, August 25, 2015 at 00:23 To: user@hive.apache.org user@hive.apache.org Subject: Re: HiveServer2 Kerberos It is the case. Would you like me to fill a JIRA about it ? 
Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-24 19:24 GMT+02:00 Sergey Shelukhin ser...@hortonworks.com: If that is the case it sounds like a bug… From: Jary Du jary...@gmail.com Reply-To: user@hive.apache.org user@hive.apache.org Date: Thursday, August 20, 2015 at 08:56 To: user@hive.apache.org user@hive.apache.org Subject: Re: HiveServer2 Kerberos My understanding is that it will always ask you user/password even though you don’t need them. It is just the way how hive is setup. On Aug 20, 2015, at 8:28 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: !connect jdbc:hive2:// 192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl org.apache.hive.jdbc.HiveDriver scan complete in 13ms Connecting to jdbc:hive2:// 192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl Enter password for jdbc:hive2:// 192.168.6.210:1/chaneldb;principal=hive/hiveh...@westeros.wl: And if I press enter everything works perfectly, because I am using Kerberos authentication, that's actually why I was asking what is Hive asking for, because in my case, it seems that I shouldn't be asked for a password when connecting. Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-20 17:06 GMT+02:00 Jary Du jary...@gmail.com: How does Beeline ask you? What happens if you just press enter? On Aug 20, 2015, at 12:15 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Indeed, I don't need the password, but why is Beeline asking me for one ? To what does it correspond ? Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:22 GMT+02:00 Jary Du jary...@gmail.com: Correct me if I am wrong, my understanding is that after using kerberos authentication, you probably don’t need the password. 
Hope it helps Thanks, Jary On Aug 19, 2015, at 9:09 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: By the way, thanks a lot for your help, because your solution works, but I'm still interested in knowing what is the password I did not enter. Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:07 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net : All right, but then, what is the password hive asks for ? Hive's one ? How do I know its value ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 17:51 GMT+02:00 Jary Du jary...@gmail.com: For Beeline connection string, it should be !connect jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2”. Please make sure it is the hive’s principal, not the user’s. And when you kinit, it should be kinit user’s keytab, not the hive’s keytab. On Aug 19, 2015, at 8:46 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching beeline, as I read somewhere that Beeline does not handle Kerberos connection. So, as I can make klist before launching beeline and having a good
Re: HiveServer2 Kerberos
Here it is : https://issues.apache.org/jira/browse/HIVE-11653 Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-25 23:10 GMT+02:00 Sergey Shelukhin ser...@hortonworks.com: Sure! From: Loïc Chanel loic.cha...@telecomnancy.net Reply-To: user@hive.apache.org user@hive.apache.org Date: Tuesday, August 25, 2015 at 00:23 To: user@hive.apache.org user@hive.apache.org Subject: Re: HiveServer2 Kerberos It is the case. Would you like me to fill a JIRA about it ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-24 19:24 GMT+02:00 Sergey Shelukhin ser...@hortonworks.com: If that is the case it sounds like a bug… From: Jary Du jary...@gmail.com Reply-To: user@hive.apache.org user@hive.apache.org Date: Thursday, August 20, 2015 at 08:56 To: user@hive.apache.org user@hive.apache.org Subject: Re: HiveServer2 Kerberos My understanding is that it will always ask you user/password even though you don’t need them. It is just the way how hive is setup. On Aug 20, 2015, at 8:28 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: !connect jdbc:hive2:// 192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl org.apache.hive.jdbc.HiveDriver scan complete in 13ms Connecting to jdbc:hive2:// 192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl Enter password for jdbc:hive2:// 192.168.6.210:1/chaneldb;principal=hive/hiveh...@westeros.wl: And if I press enter everything works perfectly, because I am using Kerberos authentication, that's actually why I was asking what is Hive asking for, because in my case, it seems that I shouldn't be asked for a password when connecting. Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-20 17:06 GMT+02:00 Jary Du jary...@gmail.com: How does Beeline ask you? What happens if you just press enter? 
On Aug 20, 2015, at 12:15 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Indeed, I don't need the password, but why is Beeline asking me for one ? To what does it correspond ? Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:22 GMT+02:00 Jary Du jary...@gmail.com: Correct me if I am wrong, my understanding is that after using kerberos authentication, you probably don’t need the password. Hope it helps Thanks, Jary On Aug 19, 2015, at 9:09 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: By the way, thanks a lot for your help, because your solution works, but I'm still interested in knowing what is the password I did not enter. Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:07 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net: All right, but then, what is the password hive asks for ? Hive's one ? How do I know its value ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 17:51 GMT+02:00 Jary Du jary...@gmail.com: For Beeline connection string, it should be !connect jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2”. Please make sure it is the hive’s principal, not the user’s. And when you kinit, it should be kinit user’s keytab, not the hive’s keytab. On Aug 19, 2015, at 8:46 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching beeline, as I read somewhere that Beeline does not handle Kerberos connection. So, as I can make klist before launching beeline and having a good result, the problem does not come from this. Thanks a lot for your response though. Do you have another idea ? 
Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 17:42 GMT+02:00 Jary Du jary...@gmail.com: The Beeline client must have a valid Kerberos ticket in the ticket cache before attempting to connect. ( http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_dataintegration/content/ch_using-hive-clients-examples.html ) So you need kinit first to have the valid Kerberos ticket int the ticket cache before using beeline to connect to HS2. Jary On Aug 19, 2015, at 8:36 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Hi again, As I searched another way to make some requests with Kerberos enabled for security on HiveServer, I found that this request should do the same : !connect jdbc:hive2:// 192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver But now I've got another error : Error: Could not open client transport with JDBC Uri: jdbc:hive2:// 192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0) As I saw that it was maybe a simple Kerberos ticket related problem, I tried to re
Re: HiveServer2 Kerberos
It is the case. Would you like me to fill a JIRA about it ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-24 19:24 GMT+02:00 Sergey Shelukhin ser...@hortonworks.com: If that is the case it sounds like a bug… From: Jary Du jary...@gmail.com Reply-To: user@hive.apache.org user@hive.apache.org Date: Thursday, August 20, 2015 at 08:56 To: user@hive.apache.org user@hive.apache.org Subject: Re: HiveServer2 Kerberos My understanding is that it will always ask you user/password even though you don’t need them. It is just the way how hive is setup. On Aug 20, 2015, at 8:28 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: !connect jdbc:hive2:// 192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl org.apache.hive.jdbc.HiveDriver scan complete in 13ms Connecting to jdbc:hive2:// 192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl Enter password for jdbc:hive2:// 192.168.6.210:1/chaneldb;principal=hive/hiveh...@westeros.wl: And if I press enter everything works perfectly, because I am using Kerberos authentication, that's actually why I was asking what is Hive asking for, because in my case, it seems that I shouldn't be asked for a password when connecting. Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-20 17:06 GMT+02:00 Jary Du jary...@gmail.com: How does Beeline ask you? What happens if you just press enter? On Aug 20, 2015, at 12:15 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Indeed, I don't need the password, but why is Beeline asking me for one ? To what does it correspond ? Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:22 GMT+02:00 Jary Du jary...@gmail.com: Correct me if I am wrong, my understanding is that after using kerberos authentication, you probably don’t need the password. 
Hope it helps Thanks, Jary On Aug 19, 2015, at 9:09 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: By the way, thanks a lot for your help, because your solution works, but I'm still interested in knowing what is the password I did not enter. Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:07 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net: All right, but then, what is the password hive asks for ? Hive's one ? How do I know its value ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 17:51 GMT+02:00 Jary Du jary...@gmail.com: For Beeline connection string, it should be !connect jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2”. Please make sure it is the hive’s principal, not the user’s. And when you kinit, it should be kinit user’s keytab, not the hive’s keytab. On Aug 19, 2015, at 8:46 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching beeline, as I read somewhere that Beeline does not handle Kerberos connection. So, as I can make klist before launching beeline and having a good result, the problem does not come from this. Thanks a lot for your response though. Do you have another idea ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 17:42 GMT+02:00 Jary Du jary...@gmail.com: The Beeline client must have a valid Kerberos ticket in the ticket cache before attempting to connect. ( http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_dataintegration/content/ch_using-hive-clients-examples.html ) So you need kinit first to have the valid Kerberos ticket int the ticket cache before using beeline to connect to HS2. 
Jary On Aug 19, 2015, at 8:36 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Hi again, As I searched another way to make some requests with Kerberos enabled for security on HiveServer, I found that this request should do the same : !connect jdbc:hive2:// 192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver But now I've got another error : Error: Could not open client transport with JDBC Uri: jdbc:hive2:// 192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0) As I saw that it was maybe a simple Kerberos ticket related problem, I tried to re-generate Kerberos keytabs, and to ensure that Hive has the path to access to its keytab, but nothing changed. Does anyone has an idea about how to solve this issue ? Thanks in advance for your help :) Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 12:01 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net : Hi all, I have a little issue with HiveServer2 since I have enabled Kerberos. I'm
Re: HiveServer2 Kerberos
Indeed, I don't need the password, but why is Beeline asking me for one ? To what does it correspond ? Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:22 GMT+02:00 Jary Du jary...@gmail.com: Correct me if I am wrong, my understanding is that after using kerberos authentication, you probably don’t need the password. Hope it helps Thanks, Jary On Aug 19, 2015, at 9:09 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: By the way, thanks a lot for your help, because your solution works, but I'm still interested in knowing what is the password I did not enter. Thanks again, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 18:07 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net: All right, but then, what is the password hive asks for ? Hive's one ? How do I know its value ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 17:51 GMT+02:00 Jary Du jary...@gmail.com: For Beeline connection string, it should be !connect jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2”. Please make sure it is the hive’s principal, not the user’s. And when you kinit, it should be kinit user’s keytab, not the hive’s keytab. On Aug 19, 2015, at 8:46 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching beeline, as I read somewhere that Beeline does not handle Kerberos connection. So, as I can make klist before launching beeline and having a good result, the problem does not come from this. Thanks a lot for your response though. Do you have another idea ? Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 17:42 GMT+02:00 Jary Du jary...@gmail.com: The Beeline client must have a valid Kerberos ticket in the ticket cache before attempting to connect. 
( http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_dataintegration/content/ch_using-hive-clients-examples.html ) So you need kinit first to have the valid Kerberos ticket int the ticket cache before using beeline to connect to HS2. Jary On Aug 19, 2015, at 8:36 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote: Hi again, As I searched another way to make some requests with Kerberos enabled for security on HiveServer, I found that this request should do the same : !connect jdbc:hive2:// 192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver But now I've got another error : Error: Could not open client transport with JDBC Uri: jdbc:hive2:// 192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0) As I saw that it was maybe a simple Kerberos ticket related problem, I tried to re-generate Kerberos keytabs, and to ensure that Hive has the path to access to its keytab, but nothing changed. Does anyone has an idea about how to solve this issue ? Thanks in advance for your help :) Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne 2015-08-19 12:01 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net: Hi all, I have a little issue with HiveServer2 since I have enabled Kerberos. I'm unable to connect to the service via Beeline. When doing !connect jdbc:hive2://192.168.6.210:1 hive hive org.apache.hive.jdbc.HiveDriver I keep receiving the same error : Error: Could not open client transport with JDBC Uri: jdbc:hive2:// 192.168.6.210:1: Peer indicated failure: Unsupported mechanism type PLAIN (state=08S01,code=0) Does anyone had the same problem ? Or know how to solve it ? Thanks in advance, Loïc Loïc CHANEL Engineering student at TELECOM Nancy Trainee at Worldline - Villeurbanne
Re: HiveServer2 Kerberos
!connect jdbc:hive2://192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl org.apache.hive.jdbc.HiveDriver
scan complete in 13ms
Connecting to jdbc:hive2://192.168.6.210:1/db;principal=hive/hiveh...@westeros.wl
Enter password for jdbc:hive2://192.168.6.210:1/chaneldb;principal=hive/hiveh...@westeros.wl:

And if I press enter everything works perfectly, because I am using Kerberos authentication. That's actually why I was asking what Hive is asking for: in my case, it seems that I shouldn't be asked for a password when connecting.

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-08-20 17:06 GMT+02:00 Jary Du jary...@gmail.com:

How does Beeline ask you? What happens if you just press enter?

On Aug 20, 2015, at 12:15 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Indeed, I don't need the password, but why is Beeline asking me for one? What does it correspond to?
Thanks again,
Loïc

2015-08-19 18:22 GMT+02:00 Jary Du jary...@gmail.com:

Correct me if I am wrong; my understanding is that after using Kerberos authentication, you probably don't need the password. Hope it helps.
Thanks,
Jary

On Aug 19, 2015, at 9:09 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

By the way, thanks a lot for your help, because your solution works, but I'm still interested in knowing what the password I did not enter was.
Thanks again,
Loïc

2015-08-19 18:07 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

All right, but then, what is the password Hive asks for? Hive's own? How do I know its value?

2015-08-19 17:51 GMT+02:00 Jary Du jary...@gmail.com:

For the Beeline connection string, it should be !connect jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2. Please make sure it is hive's principal, not the user's. And when you kinit, it should be with the user's keytab, not hive's keytab.

On Aug 19, 2015, at 8:46 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching Beeline, as I read somewhere that Beeline does not handle the Kerberos connection itself. So, since I can run klist before launching Beeline and get a good result, the problem does not come from this. Thanks a lot for your response though. Do you have another idea?

2015-08-19 17:42 GMT+02:00 Jary Du jary...@gmail.com:

The Beeline client must have a valid Kerberos ticket in the ticket cache before attempting to connect (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_dataintegration/content/ch_using-hive-clients-examples.html). So you need to kinit first to have a valid Kerberos ticket in the ticket cache before using Beeline to connect to HS2.
Jary

On Aug 19, 2015, at 8:36 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi again,
As I searched for another way to run some requests with Kerberos enabled for security on HiveServer2, I found that this request should do the same:
!connect jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver
But now I get another error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0)
As I saw that it was maybe a simple Kerberos-ticket-related problem, I tried to re-generate the Kerberos keytabs and to ensure that Hive has the path to access its keytab, but nothing changed. Does anyone have an idea about how to solve this issue?
Thanks in advance for your help :)
Loïc

2015-08-19 12:01 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Hi all,
I have a little issue with HiveServer2 since I enabled Kerberos: I'm unable to connect to the service via Beeline. When doing
!connect jdbc:hive2://192.168.6.210:1 hive hive org.apache.hive.jdbc.HiveDriver
I keep receiving the same error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1: Peer indicated failure: Unsupported mechanism type PLAIN (state=08S01,code=0)
Has anyone had the same problem? Or does anyone know how to solve it?
Thanks in advance,
Loïc
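Jary's recommendation above (user ticket in the cache, HiveServer2's own principal in the URL) can be sketched as a small shell snippet. The host, port, realm, and keytab path below are illustrative placeholders of my own, not values from this thread (the real port was truncated in the archive):

```shell
# Illustrative values only -- substitute your own host, port, and realm.
HS2_HOST=hs2.example.com
HS2_PORT=10000
REALM=EXAMPLE.COM

# The principal in the JDBC URL is HiveServer2's service principal
# (hive/<host>@<REALM>), not the connecting user's principal.
URL="jdbc:hive2://${HS2_HOST}:${HS2_PORT}/default;principal=hive/${HS2_HOST}@${REALM}"
echo "$URL"

# The ticket in the cache, however, belongs to the *user*, so before connecting:
#   kinit -kt /path/to/user.keytab user@EXAMPLE.COM
#   beeline -u "$URL"
# With a valid ticket, the password prompt can simply be answered with Enter.
```

With this split (user ticket via kinit, service principal in the URL), the PLAIN-mechanism and GSS errors from the thread should not appear.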
Re: HiveServer2 Kerberos
Hi again,
As I searched for another way to run some requests with Kerberos enabled for security on HiveServer2, I found that this request should do the same:
!connect jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver
But now I get another error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0)
As I saw that it was maybe a simple Kerberos-ticket-related problem, I tried to re-generate the Kerberos keytabs and to ensure that Hive has the path to access its keytab, but nothing changed. Does anyone have an idea about how to solve this issue?
Thanks in advance for your help :)
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-08-19 12:01 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Hi all,
I have a little issue with HiveServer2 since I enabled Kerberos: I'm unable to connect to the service via Beeline. When doing
!connect jdbc:hive2://192.168.6.210:1 hive hive org.apache.hive.jdbc.HiveDriver
I keep receiving the same error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1: Peer indicated failure: Unsupported mechanism type PLAIN (state=08S01,code=0)
Has anyone had the same problem? Or does anyone know how to solve it?
Thanks in advance,
Loïc
Re: HiveServer2 Kerberos
By the way, thanks a lot for your help, because your solution works, but I'm still interested in knowing what the password I did not enter was.
Thanks again,
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-08-19 18:07 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

All right, but then, what is the password Hive asks for? Hive's own? How do I know its value?

2015-08-19 17:51 GMT+02:00 Jary Du jary...@gmail.com:

For the Beeline connection string, it should be !connect jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2. Please make sure it is hive's principal, not the user's. And when you kinit, it should be with the user's keytab, not hive's keytab.

On Aug 19, 2015, at 8:46 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching Beeline, as I read somewhere that Beeline does not handle the Kerberos connection itself. So, since I can run klist before launching Beeline and get a good result, the problem does not come from this. Thanks a lot for your response though. Do you have another idea?

2015-08-19 17:42 GMT+02:00 Jary Du jary...@gmail.com:

The Beeline client must have a valid Kerberos ticket in the ticket cache before attempting to connect (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_dataintegration/content/ch_using-hive-clients-examples.html). So you need to kinit first to have a valid Kerberos ticket in the ticket cache before using Beeline to connect to HS2.
Jary

On Aug 19, 2015, at 8:36 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi again,
As I searched for another way to run some requests with Kerberos enabled for security on HiveServer2, I found that this request should do the same:
!connect jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver
But now I get another error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0)
As I saw that it was maybe a simple Kerberos-ticket-related problem, I tried to re-generate the Kerberos keytabs and to ensure that Hive has the path to access its keytab, but nothing changed. Does anyone have an idea about how to solve this issue?
Thanks in advance for your help :)
Loïc

2015-08-19 12:01 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Hi all,
I have a little issue with HiveServer2 since I enabled Kerberos: I'm unable to connect to the service via Beeline. When doing
!connect jdbc:hive2://192.168.6.210:1 hive hive org.apache.hive.jdbc.HiveDriver
I keep receiving the same error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1: Peer indicated failure: Unsupported mechanism type PLAIN (state=08S01,code=0)
Has anyone had the same problem? Or does anyone know how to solve it?
Thanks in advance,
Loïc
Re: HiveServer2 Kerberos
Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching Beeline, as I read somewhere that Beeline does not handle the Kerberos connection itself. So, since I can run klist before launching Beeline and get a good result, the problem does not come from this. Thanks a lot for your response though. Do you have another idea?

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-08-19 17:42 GMT+02:00 Jary Du jary...@gmail.com:

The Beeline client must have a valid Kerberos ticket in the ticket cache before attempting to connect (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_dataintegration/content/ch_using-hive-clients-examples.html). So you need to kinit first to have a valid Kerberos ticket in the ticket cache before using Beeline to connect to HS2.
Jary

On Aug 19, 2015, at 8:36 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi again,
As I searched for another way to run some requests with Kerberos enabled for security on HiveServer2, I found that this request should do the same:
!connect jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver
But now I get another error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0)
As I saw that it was maybe a simple Kerberos-ticket-related problem, I tried to re-generate the Kerberos keytabs and to ensure that Hive has the path to access its keytab, but nothing changed. Does anyone have an idea about how to solve this issue?
Thanks in advance for your help :)
Loïc

2015-08-19 12:01 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Hi all,
I have a little issue with HiveServer2 since I enabled Kerberos: I'm unable to connect to the service via Beeline. When doing
!connect jdbc:hive2://192.168.6.210:1 hive hive org.apache.hive.jdbc.HiveDriver
I keep receiving the same error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1: Peer indicated failure: Unsupported mechanism type PLAIN (state=08S01,code=0)
Has anyone had the same problem? Or does anyone know how to solve it?
Thanks in advance,
Loïc
Re: HiveServer2 Kerberos
All right, but then, what is the password Hive asks for? Hive's own? How do I know its value?

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-08-19 17:51 GMT+02:00 Jary Du jary...@gmail.com:

For the Beeline connection string, it should be !connect jdbc:hive2://host:port/db;principal=Server_Principal_of_HiveServer2. Please make sure it is hive's principal, not the user's. And when you kinit, it should be with the user's keytab, not hive's keytab.

On Aug 19, 2015, at 8:46 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Yeah, I forgot to mention it, but each time I did a kinit user/hive before launching Beeline, as I read somewhere that Beeline does not handle the Kerberos connection itself. So, since I can run klist before launching Beeline and get a good result, the problem does not come from this. Thanks a lot for your response though. Do you have another idea?

2015-08-19 17:42 GMT+02:00 Jary Du jary...@gmail.com:

The Beeline client must have a valid Kerberos ticket in the ticket cache before attempting to connect (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_dataintegration/content/ch_using-hive-clients-examples.html). So you need to kinit first to have a valid Kerberos ticket in the ticket cache before using Beeline to connect to HS2.
Jary

On Aug 19, 2015, at 8:36 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi again,
As I searched for another way to run some requests with Kerberos enabled for security on HiveServer2, I found that this request should do the same:
!connect jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl org.apache.hive.jdbc.HiveDriver
But now I get another error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1/default;principal=user/h...@westeros.wl: Peer indicated failure: GSS initiate failed (state=08S01,code=0)
As I saw that it was maybe a simple Kerberos-ticket-related problem, I tried to re-generate the Kerberos keytabs and to ensure that Hive has the path to access its keytab, but nothing changed. Does anyone have an idea about how to solve this issue?
Thanks in advance for your help :)
Loïc

2015-08-19 12:01 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Hi all,
I have a little issue with HiveServer2 since I enabled Kerberos: I'm unable to connect to the service via Beeline. When doing
!connect jdbc:hive2://192.168.6.210:1 hive hive org.apache.hive.jdbc.HiveDriver
I keep receiving the same error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1: Peer indicated failure: Unsupported mechanism type PLAIN (state=08S01,code=0)
Has anyone had the same problem? Or does anyone know how to solve it?
Thanks in advance,
Loïc
HiveServer2 Kerberos
Hi all,
I have a little issue with HiveServer2 since I enabled Kerberos: I'm unable to connect to the service via Beeline. When doing
!connect jdbc:hive2://192.168.6.210:1 hive hive org.apache.hive.jdbc.HiveDriver
I keep receiving the same error:
Error: Could not open client transport with JDBC Uri: jdbc:hive2://192.168.6.210:1: Peer indicated failure: Unsupported mechanism type PLAIN (state=08S01,code=0)
Has anyone had the same problem? Or does anyone know how to solve it?
Thanks in advance,
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne
Re: Getting error in creating a hive table via java
The type String doesn't exist in SQL. I think you want to use VARCHAR instead ;-)
Regards,
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-07-31 13:56 GMT+02:00 Sateesh Karuturi sateesh.karutu...@gmail.com:

I would like to create a table in Hive using Java. I am using the following way to do it:

public class HiveCreateTable {
    private static String driverName = "com.facebook.presto.jdbc.PrestoDriver";

    public static void main(String[] args) throws SQLException {
        // Register driver and create driver instance
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
        System.out.println("haii");
        Connection con = DriverManager.getConnection(
                "jdbc:presto://192.168.1.119:8082/default", "hadoop", "password");
        con.setCatalog("hive");
        con.setSchema("log");
        Statement stmt = con.createStatement();
        String tableName = "sample";
        ResultSet res = stmt.executeQuery(
                "create table access_log2 (cip string, csusername string, cscomputername string)");
        System.out.println("Table employee created.");
        con.close();
    }
}

But it fails with:

Exception in thread "main" java.sql.SQLException: Query failed (#20150731_101653_8_hv68j): Unknown type for column 'cip'
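For what it's worth, here is a sketch of the two dialects' DDL (my own illustration, not from the thread): Hive's own DDL does accept a `string` type, but the statement above goes through the Presto driver, and Presto has no `string` type, so it needs `varchar`:

```shell
# Hypothetical illustration of the same table in each engine's DDL dialect.
# Hive's native DDL accepts the `string` type:
HIVE_DDL='create table access_log2 (cip string, csusername string, cscomputername string)'
# Presto (which the JDBC driver above talks to) has no `string` type; use varchar:
PRESTO_DDL='create table access_log2 (cip varchar, csusername varchar, cscomputername varchar)'
echo "$PRESTO_DDL"
```

Also, since CREATE TABLE is DDL rather than a query, plain JDBC would normally send it with stmt.execute(...) or stmt.executeUpdate(...) instead of executeQuery(...).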
Re: Computation timeout
Indeed, I was checking this out on the exact same page, but I'm almost convinced that I saw in some documentation that the default value was 3000 for the check interval. As I can't find it again, let's say I was tired and my eyes betrayed me.
Thanks a lot,
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-07-30 9:46 GMT+02:00 Lefty Leverenz leftylever...@gmail.com:

You're right about the typos, but both parameters have defaults of 0 ms:
- hive.server2.session.check.interval https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.session.check.interval
- hive.server2.idle.operation.timeout https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.idle.operation.timeout
-- Lefty

On Thu, Jul 30, 2015 at 3:31 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Rats, I think I just figured it out. #2 is NEGATIVE 3000, right? I set it to positive yesterday. As for #1, I think it is the default value, so I am not sure I have to set it. Can you confirm that there is a typo in the names of your properties (a missing last letter) and that those are not the actual property names? I'll try again and keep you informed.

2015-07-29 20:15 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

This works for me. In hive-site.xml:
1. hive.server2.session.check.interva=3000;
2. hive.server2.idle.operation.timeou=-3;
Restart HiveServer2. At beeline, I run "analyze table X compute statistics for columns", which takes longer than 30s. It was aborted by HS2 because of the above settings. I guess it didn't work for you because you didn't have #1.
--Xuefu

On Wed, Jul 29, 2015 at 9:23 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

I don't think your solution works, as after more than 4 minutes I could still see logs from my job showing that it was running. Do you have a way to check that, even though the job was running, it was not being killed by Hive? Or another solution?
Thanks for your help,
Loïc

2015-07-29 16:26 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Yes, I set it to negative 60. It's not a problem if the session is killed; that's actually what I'm trying to do, because I can't allow a user to wait for the end of an infinite request. Therefore I'll try your solution :)
Thanks,
Loïc

2015-07-29 16:14 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Okay. To confirm, you set it to negative 60s? The next thing you can try is to set hive.server2.idle.session.timeou=6 (60sec) and hive.server2.idle.session.check.operation=false. I'm pretty sure this works, but the user's session will be killed though.
--Xuefu

On Wed, Jul 29, 2015 at 7:02 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

I confirm: I just tried hive.server2.idle.operation.timeout, setting it to -60 (seconds), but my very slow job has not been killed. The issue here is: what if another user comes and tries to submit a MapReduce job while the cluster is stuck in an infinite loop? Do you or anyone else have another idea?
Thanks,
Loïc

2015-07-29 15:34 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

No, because I thought the idea of an infinite operation was not very compatible with the word "idle" (as the operation will not stop running), but I'll try :-)
Thanks for the idea,
Loïc

2015-07-29 15:27 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Have you tried hive.server2.idle.operation.timeout?
--Xuefu

On Wed, Jul 29, 2015 at 5:52 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi all,
As I'm trying to build a secured, multi-tenant Hadoop cluster with Hive, I am desperately trying to set a timeout on Hive requests. My idea is that some users can make mistakes, such as a join on the wrong keys, and thereby start an infinite loop while believing they are just launching a very heavy job. Therefore, I'd like to set a limit on the time a request may take, in order to kill the job automatically if it exceeds that limit. As such a notion cannot be set directly in YARN, I saw that MapReduce2 provides its own native timeout property, and I would like to know whether Hive provides the same kind of property somewhere. Has anyone heard of such a thing?
Thanks in advance for your help,
Loïc
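Putting Lefty's untruncated property names into a hive-site.xml fragment might look like the following. The values are purely illustrative (mirroring the magnitudes discussed above, not restoring Xuefu's truncated ones), and the behaviour notes are paraphrased from the Hive configuration wiki:

```xml
<!-- Illustrative fragment only; tune values for your cluster and restart HiveServer2. -->
<property>
  <name>hive.server2.session.check.interval</name>
  <!-- How often HS2 checks sessions/operations, in ms; the default of 0 disables checking. -->
  <value>3000</value>
</property>
<property>
  <name>hive.server2.idle.operation.timeout</name>
  <!-- In ms. With a negative value, operations are timed out regardless of state
       (so a long-running query gets aborted); with a positive value, only
       operations already in a terminal state are cleaned up. -->
  <value>-30000</value>
</property>
```

This matches Xuefu's observation that the timeout has no visible effect unless the check interval is also set to a positive value.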
Re: Computation timeout
My bad, I think I just mixed up the properties. At the end of the day, everything seems to work as you described. Thanks a lot!
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-07-30 9:31 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Rats, I think I just figured it out. #2 is NEGATIVE 3000, right? I set it to positive yesterday. As for #1, I think it is the default value, so I am not sure I have to set it. Can you confirm that there is a typo in the names of your properties (a missing last letter) and that those are not the actual property names? I'll try again and keep you informed.

2015-07-29 20:15 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

This works for me. In hive-site.xml:
1. hive.server2.session.check.interva=3000;
2. hive.server2.idle.operation.timeou=-3;
Restart HiveServer2. At beeline, I run "analyze table X compute statistics for columns", which takes longer than 30s. It was aborted by HS2 because of the above settings. I guess it didn't work for you because you didn't have #1.
--Xuefu

On Wed, Jul 29, 2015 at 9:23 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

I don't think your solution works, as after more than 4 minutes I could still see logs from my job showing that it was running. Do you have a way to check that, even though the job was running, it was not being killed by Hive? Or another solution?
Thanks for your help,
Loïc

2015-07-29 16:26 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Yes, I set it to negative 60. It's not a problem if the session is killed; that's actually what I'm trying to do, because I can't allow a user to wait for the end of an infinite request. Therefore I'll try your solution :)
Thanks,
Loïc

2015-07-29 16:14 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Okay. To confirm, you set it to negative 60s? The next thing you can try is to set hive.server2.idle.session.timeou=6 (60sec) and hive.server2.idle.session.check.operation=false. I'm pretty sure this works, but the user's session will be killed though.
--Xuefu

On Wed, Jul 29, 2015 at 7:02 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

I confirm: I just tried hive.server2.idle.operation.timeout, setting it to -60 (seconds), but my very slow job has not been killed. The issue here is: what if another user comes and tries to submit a MapReduce job while the cluster is stuck in an infinite loop? Do you or anyone else have another idea?
Thanks,
Loïc

2015-07-29 15:34 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

No, because I thought the idea of an infinite operation was not very compatible with the word "idle" (as the operation will not stop running), but I'll try :-)
Thanks for the idea,
Loïc

2015-07-29 15:27 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Have you tried hive.server2.idle.operation.timeout?
--Xuefu

On Wed, Jul 29, 2015 at 5:52 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi all,
As I'm trying to build a secured, multi-tenant Hadoop cluster with Hive, I am desperately trying to set a timeout on Hive requests. My idea is that some users can make mistakes, such as a join on the wrong keys, and thereby start an infinite loop while believing they are just launching a very heavy job. Therefore, I'd like to set a limit on the time a request may take, in order to kill the job automatically if it exceeds that limit. As such a notion cannot be set directly in YARN, I saw that MapReduce2 provides its own native timeout property, and I would like to know whether Hive provides the same kind of property somewhere. Has anyone heard of such a thing?
Thanks in advance for your help,
Loïc
Re: Computation timeout
Rats, I think I just figured it out. #2 is NEGATIVE 3000, right? I set it to positive yesterday. As for #1, I think it is the default value, so I am not sure I have to set it. Can you confirm that there is a typo in the names of your properties (a missing last letter) and that those are not the actual property names? I'll try again and keep you informed.

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-07-29 20:15 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

This works for me. In hive-site.xml:
1. hive.server2.session.check.interva=3000;
2. hive.server2.idle.operation.timeou=-3;
Restart HiveServer2. At beeline, I run "analyze table X compute statistics for columns", which takes longer than 30s. It was aborted by HS2 because of the above settings. I guess it didn't work for you because you didn't have #1.
--Xuefu

On Wed, Jul 29, 2015 at 9:23 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

I don't think your solution works, as after more than 4 minutes I could still see logs from my job showing that it was running. Do you have a way to check that, even though the job was running, it was not being killed by Hive? Or another solution?
Thanks for your help,
Loïc

2015-07-29 16:26 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Yes, I set it to negative 60. It's not a problem if the session is killed; that's actually what I'm trying to do, because I can't allow a user to wait for the end of an infinite request. Therefore I'll try your solution :)
Thanks,
Loïc

2015-07-29 16:14 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Okay. To confirm, you set it to negative 60s? The next thing you can try is to set hive.server2.idle.session.timeou=6 (60sec) and hive.server2.idle.session.check.operation=false. I'm pretty sure this works, but the user's session will be killed though.
--Xuefu

On Wed, Jul 29, 2015 at 7:02 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

I confirm: I just tried hive.server2.idle.operation.timeout, setting it to -60 (seconds), but my very slow job has not been killed. The issue here is: what if another user comes and tries to submit a MapReduce job while the cluster is stuck in an infinite loop? Do you or anyone else have another idea?
Thanks,
Loïc

2015-07-29 15:34 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

No, because I thought the idea of an infinite operation was not very compatible with the word "idle" (as the operation will not stop running), but I'll try :-)
Thanks for the idea,
Loïc

2015-07-29 15:27 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Have you tried hive.server2.idle.operation.timeout?
--Xuefu

On Wed, Jul 29, 2015 at 5:52 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi all,
As I'm trying to build a secured, multi-tenant Hadoop cluster with Hive, I am desperately trying to set a timeout on Hive requests. My idea is that some users can make mistakes, such as a join on the wrong keys, and thereby start an infinite loop while believing they are just launching a very heavy job. Therefore, I'd like to set a limit on the time a request may take, in order to kill the job automatically if it exceeds that limit. As such a notion cannot be set directly in YARN, I saw that MapReduce2 provides its own native timeout property, and I would like to know whether Hive provides the same kind of property somewhere. Has anyone heard of such a thing?
Thanks in advance for your help,
Loïc
Re: Computation timeout
I don't think your solution works, as after more than 4 minutes I could still see logs from my job showing that it was running. Do you have a way to check that, even though the job was running, it was not being killed by Hive? Or another solution?
Thanks for your help,
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-07-29 16:26 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

Yes, I set it to negative 60. It's not a problem if the session is killed; that's actually what I'm trying to do, because I can't allow a user to wait for the end of an infinite request. Therefore I'll try your solution :)
Thanks,
Loïc

2015-07-29 16:14 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Okay. To confirm, you set it to negative 60s? The next thing you can try is to set hive.server2.idle.session.timeou=6 (60sec) and hive.server2.idle.session.check.operation=false. I'm pretty sure this works, but the user's session will be killed though.
--Xuefu

On Wed, Jul 29, 2015 at 7:02 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

I confirm: I just tried hive.server2.idle.operation.timeout, setting it to -60 (seconds), but my very slow job has not been killed. The issue here is: what if another user comes and tries to submit a MapReduce job while the cluster is stuck in an infinite loop? Do you or anyone else have another idea?
Thanks,
Loïc

2015-07-29 15:34 GMT+02:00 Loïc Chanel loic.cha...@telecomnancy.net:

No, because I thought the idea of an infinite operation was not very compatible with the word "idle" (as the operation will not stop running), but I'll try :-)
Thanks for the idea,
Loïc

2015-07-29 15:27 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Have you tried hive.server2.idle.operation.timeout?
--Xuefu

On Wed, Jul 29, 2015 at 5:52 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi all,
As I'm trying to build a secured, multi-tenant Hadoop cluster with Hive, I am desperately trying to set a timeout on Hive requests. My idea is that some users can make mistakes, such as a join on the wrong keys, and thereby start an infinite loop while believing they are just launching a very heavy job. Therefore, I'd like to set a limit on the time a request may take, in order to kill the job automatically if it exceeds that limit. As such a notion cannot be set directly in YARN, I saw that MapReduce2 provides its own native timeout property, and I would like to know whether Hive provides the same kind of property somewhere. Has anyone heard of such a thing?
Thanks in advance for your help,
Loïc
Re: Computation timeout
No, because I thought the idea of an infinite operation was not very compatible with the word "idle" (as the operation will not stop running), but I'll try :-)
Thanks for the idea,
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne

2015-07-29 15:27 GMT+02:00 Xuefu Zhang xzh...@cloudera.com:

Have you tried hive.server2.idle.operation.timeout?
--Xuefu

On Wed, Jul 29, 2015 at 5:52 AM, Loïc Chanel loic.cha...@telecomnancy.net wrote:

Hi all,
As I'm trying to build a secured, multi-tenant Hadoop cluster with Hive, I am desperately trying to set a timeout on Hive requests. My idea is that some users can make mistakes, such as a join on the wrong keys, and thereby start an infinite loop while believing they are just launching a very heavy job. Therefore, I'd like to set a limit on the time a request may take, in order to kill the job automatically if it exceeds that limit. As such a notion cannot be set directly in YARN, I saw that MapReduce2 provides its own native timeout property, and I would like to know whether Hive provides the same kind of property somewhere. Has anyone heard of such a thing?
Thanks in advance for your help,
Loïc
Computation timeout
Hi all,
As I'm trying to build a secured, multi-tenant Hadoop cluster with Hive, I am desperately trying to set a timeout on Hive requests. My idea is that some users can make mistakes, such as a join on the wrong keys, and thereby start an infinite loop while believing they are just launching a very heavy job. Therefore, I'd like to set a limit on the time a request may take, in order to kill the job automatically if it exceeds that limit. As such a notion cannot be set directly in YARN, I saw that MapReduce2 provides its own native timeout property, and I would like to know whether Hive provides the same kind of property somewhere. Has anyone heard of such a thing?
Thanks in advance for your help,
Loïc

Loïc CHANEL, Engineering student at TELECOM Nancy, Trainee at Worldline - Villeurbanne
Re: Computation timeout
Yes, I set it to negative 60. It's not a problem if the session is killed; that's actually what I am trying to achieve, because I can't expect a user to end an infinite request themselves. So I'll try your solution :)

Thanks,
Loïc

Loïc CHANEL
Engineering student at TELECOM Nancy
Trainee at Worldline - Villeurbanne

2015-07-29 16:14 GMT+02:00 Xuefu Zhang <xzh...@cloudera.com>:
> Okay. To confirm, you set it to negative 60s? The next thing you can try is to set
> hive.server2.idle.session.timeout=60000 (60 sec) and
> hive.server2.idle.session.check.operation=false. I'm pretty sure this works,
> but the user's session will be killed though.
>
> --Xuefu
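Xuefu's suggestion, sketched as hive-site.xml entries (assuming the timeout is in milliseconds for this Hive version; exact units and defaults should be checked against the release in use). With the check.operation flag set to false, a session counts as idle, and is thus eligible to be closed, even while an operation is still running:

```xml
<!-- Sketch for hive-site.xml, following the suggestion in the thread.
     Timeout value assumed to be in milliseconds. -->
<property>
  <name>hive.server2.idle.session.timeout</name>
  <!-- close sessions that have been idle for 60 seconds -->
  <value>60000</value>
</property>
<property>
  <name>hive.server2.idle.session.check.operation</name>
  <!-- false: count a session as idle even if an operation is still
       running, so the session (and its runaway query) is killed too -->
  <value>false</value>
</property>
```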
Re: Computation timeout
I confirm: I just tried hive.server2.idle.operation.timeout, setting it to -60 (seconds), but my very slow job has not been killed. The issue here is: what if another user comes and tries to submit a MapReduce job while the cluster is stuck in an infinite loop?

Do you or anyone else have another idea?

Thanks,
Loïc

Loïc CHANEL
Engineering student at TELECOM Nancy
Trainee at Worldline - Villeurbanne

2015-07-29 15:34 GMT+02:00 Loïc Chanel <loic.cha...@telecomnancy.net>:
> No, because I thought the idea of an infinite operation was not very compatible
> with the word "idle" (as the operation will not stop running), but I'll try :-)