[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821516#comment-17821516 ]

Goutam Ghosh commented on SPARK-21918:
--------------------------------------

Can the patch by [~angerszhuuu] be validated?

> HiveClient shouldn't share Hive object between different thread
> ---------------------------------------------------------------
>
>                 Key: SPARK-21918
>                 URL: https://issues.apache.org/jira/browse/SPARK-21918
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Hu Liu,
>            Priority: Major
>              Labels: bulk-closed
>
> I'm testing the Spark Thrift Server and found that all DDL statements are
> run as user hive even if hive.server2.enable.doAs=true.
> The root cause is that the Hive object is shared between different threads in
> HiveClientImpl:
> {code:java}
>   private def client: Hive = {
>     if (clientLoader.cachedHive != null) {
>       clientLoader.cachedHive.asInstanceOf[Hive]
>     } else {
>       val c = Hive.get(conf)
>       clientLoader.cachedHive = c
>       c
>     }
>   }
> {code}
> But in impersonation mode, we should share the Hive object only inside the
> thread, so that the metastore client in Hive is associated with the right
> user.
> We can fix it by passing the Hive object of the parent thread to the child
> thread when running the SQL.
> I already have an initial patch for review and I'm glad to work on it if
> anyone could assign it to me.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
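The root cause described in the issue above can be illustrated without Hive at all. The following is only a minimal sketch: `FakeClient`, `sharedClient`, `threadLocalClient`, and `effectiveUser` are hypothetical names invented for this illustration and are not part of Spark or Hive. It contrasts a process-wide cached client (the `cachedHive` pattern) with a per-thread client.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SharedClientDemo {
    /** Hypothetical stand-in for a metastore client bound to the user that created it. */
    static final class FakeClient {
        final String user;
        FakeClient(String user) { this.user = user; }
    }

    // Mirrors the cachedHive pattern: the first caller creates the client,
    // every later caller (on any thread) reuses it.
    private static volatile FakeClient cached;

    static FakeClient sharedClient(String user) {
        FakeClient c = cached;
        if (c == null) {
            c = new FakeClient(user);
            cached = c;
        }
        return c;
    }

    // Per-thread alternative: each thread lazily creates and keeps its own client.
    private static final ThreadLocal<FakeClient> PER_THREAD = new ThreadLocal<>();

    static FakeClient threadLocalClient(String user) {
        FakeClient c = PER_THREAD.get();
        if (c == null) {
            c = new FakeClient(user);
            PER_THREAD.set(c);
        }
        return c;
    }

    /** Runs on a fresh thread and returns the user attached to the client that thread obtained. */
    static String effectiveUser(boolean shared, String sessionUser) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(() ->
                (shared ? sharedClient(sessionUser) : threadLocalClient(sessionUser)).user
            ).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(effectiveUser(true, "hive"));   // first caller populates the cache
        System.out.println(effectiveUser(true, "alice"));  // reuses the cached client
        System.out.println(effectiveUser(false, "alice")); // per-thread client
        System.out.println(effectiveUser(false, "bob"));   // per-thread client
    }
}
```

With the shared cache, the second session gets the client created for the first caller, which is the analogue of every DDL statement running as the user who started the server. With the per-thread variant, each session thread sees a client created for its own user.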
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032776#comment-17032776 ]

Chenhao Wu commented on SPARK-21918:
------------------------------------

Is this still in progress?
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901847#comment-16901847 ]

angerszhu commented on SPARK-21918:
-----------------------------------

I have made a patch for this problem: [https://github.com/apache/spark/pull/25201]
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794382#comment-16794382 ]

Yuming Wang commented on SPARK-21918:
-------------------------------------

DDL should be supported after the metastore is upgraded to Hive 2.0.0 ([HIVE-11157|https://issues.apache.org/jira/browse/HIVE-11157]).
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773890#comment-16773890 ]

angerszhu commented on SPARK-21918:
-----------------------------------

Come back, boy :( [~huLiu]
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769755#comment-16769755 ]

t oo commented on SPARK-21918:
------------------------------

Please don't leave us, [~huLiu].
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519149#comment-16519149 ]

fengchaoge commented on SPARK-21918:
------------------------------------

Hello Hu Liu, can you share your patch? We are suffering from DDL and DML problems with STS. We would much appreciate it if you provided the patch; the Spark community would be honored.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497456#comment-16497456 ]

fengchaoge commented on SPARK-21918:
------------------------------------

Is Hu Liu gone?
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16381396#comment-16381396 ]

Thilak Raj Balasubramanian commented on SPARK-21918:
----------------------------------------------------

[~huLiu] This is a very important feature and we are waiting for it.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352415#comment-16352415 ]

Maciej Bryński commented on SPARK-21918:
----------------------------------------

Ping [~huLiu]
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270040#comment-16270040 ]

Reid Chan commented on SPARK-21918:
-----------------------------------

Expecting
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226457#comment-16226457 ]

junzhang commented on SPARK-21918:
----------------------------------

[~huLiu] How about the patch? We are suffering from DDL and DML problems with STS. It would be much appreciated if you could provide the patch as soon as possible.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155162#comment-16155162 ]

Marco Gaido commented on SPARK-21918:
-------------------------------------

Yes, I think this would be great, thanks.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155120#comment-16155120 ]

Hu Liu, commented on SPARK-21918:
---------------------------------

[~mgaido] I simply tested the commands. I connected to STS via beeline as a session user different from the user who started STS. After running the CREATE TABLE command, I checked the owner of the HDFS path, which was the session user. When I tried to drop a table owned by the user who started STS, I got a permission denied exception.
For DML, there is an open issue: SPARK-5159 (https://issues.apache.org/jira/browse/SPARK-5159). I can fix those issues together if necessary.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155047#comment-16155047 ]

Marco Gaido commented on SPARK-21918:
-------------------------------------

What I meant is that if we want to support doAs, we shouldn't support it just for DDL operations, but also for all DML & DQL. I am pretty sure your fix won't affect the DML & DQL behavior, i.e. with your change we would support doAs only for DDL operations. This means that there would be a hybrid situation: doAs would work for DDL but not for DML & DQL. This is not a desirable condition.

PS: Out of curiosity, may I ask how you tested that your DDL commands were run using the session user? Thanks.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155022#comment-16155022 ]

Hu Liu, commented on SPARK-21918:
---------------------------------

[~mgaido] Yes, all the jobs are executed using the same user, but the problem isn't in STS.
STS opens the session with impersonation when doAs is enabled:
{code:java}
if (cliService.getHiveConf().getBoolVar(ConfVars.HIVE_SERVER2_ENABLE_DOAS) &&
    (userName != null)) {
  String delegationTokenStr = getDelegationToken(userName);
  sessionHandle = cliService.openSessionWithImpersonation(protocol, userName,
      req.getPassword(), ipAddress, req.getConfiguration(), delegationTokenStr);
} else {
{code}
It then runs the SQL with the session UGI in HiveSessionProxy.
For DDL operations, Spark SQL uses the Hive object in HiveClientImpl to communicate with the metastore. Currently that Hive object is shared between different threads, which is why all jobs are executed using the same user:
{code:java}
  private def client: Hive = {
    if (clientLoader.cachedHive != null) {
      clientLoader.cachedHive.asInstanceOf[Hive]
    } else {
      val c = Hive.get(conf)
      clientLoader.cachedHive = c
      c
    }
  }
{code}
Actually, Hive keeps a separate instance for each thread, and the class HiveSessionImplwithUGI already creates a Hive object for the current user session:
{code:java}
    // create a new metastore connection for this particular user session
    Hive.set(null);
    try {
      sessionHive = Hive.get(getHiveConf());
    } catch (HiveException e) {
      throw new HiveSQLException("Failed to setup metastore connection", e);
    }
{code}
If we could pass the Hive object for the current user session to the worker thread, we could fix this problem.
I have already fixed it and can run DDL operations as the session user.
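The idea in the comment above, handing the session's Hive object to the worker thread, can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: `FakeHive` is a hypothetical stand-in that only imitates the per-thread `Hive.get()` / `Hive.set()` behavior, and `runOnWorker` with its `handOver` flag is invented for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PassSessionClientDemo {
    /** Hypothetical stand-in for Hive: one current instance per thread. */
    static final class FakeHive {
        private static final ThreadLocal<FakeHive> CURRENT = new ThreadLocal<>();
        final String owner;

        FakeHive(String owner) { this.owner = owner; }

        static FakeHive get() {
            FakeHive h = CURRENT.get();
            if (h == null) {
                // No client on this thread yet: fall back to the server's own user.
                h = new FakeHive("hive");
                CURRENT.set(h);
            }
            return h;
        }

        static void set(FakeHive h) { CURRENT.set(h); }
    }

    /** Runs a "statement" on a worker thread and reports which user's client it saw. */
    static String runOnWorker(String sessionUser, boolean handOver) {
        // Session (parent) thread: create the client for the impersonated user.
        FakeHive sessionHive = new FakeHive(sessionUser);
        FakeHive.set(sessionHive);
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            return worker.submit(() -> {
                if (handOver) {
                    // Install the parent's client on the worker thread before use.
                    FakeHive.set(sessionHive);
                }
                return FakeHive.get().owner;
            }).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            worker.shutdown();
            FakeHive.set(null); // clear the session thread's slot again
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnWorker("alice", false)); // worker falls back to "hive"
        System.out.println(runOnWorker("alice", true));  // worker sees "alice"
    }
}
```

Without the hand-over, the worker thread has no client of its own and ends up with one created for the server user. With it, the worker operates on the client that was created for the session user.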
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154033#comment-16154033 ]

Marco Gaido commented on SPARK-21918:
-------------------------------------

What do you mean by "works correctly"? Actually all the jobs are executed using the user who started STS.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153713#comment-16153713 ]

Hu Liu, commented on SPARK-21918:
---------------------------------

[~mgaido] It seems that doAs works correctly in the Hive Thrift server (HiveSessionImplwithUGI), which runs SQL via Spark.
[jira] [Commented] (SPARK-21918) HiveClient shouldn't share Hive object between different thread
[ https://issues.apache.org/jira/browse/SPARK-21918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16153672#comment-16153672 ]

Marco Gaido commented on SPARK-21918:
-------------------------------------

hive.server2.enable.doAs=true is currently not supported in STS.