[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Mollitor updated HIVE-20665:
----------------------------------
    Status: Open  (was: Patch Available)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0, 2.3.2, 4.0.0
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>         Attachments: HIVE-20665.1.patch, HIVE-20665.2.patch
>
>
> When parallel tasks are enabled in Hive, all of the resulting queries share
> the same Hive configuration.  This is problematic as each query will modify
> the same {{HiveConf}} object with things like query ID and query text.  This
> will overwrite each other and cause {{ConcurrentModificationException}}
> issues.
> {code:java|title=SQLOperation.java}
>   public Object run() throws HiveSQLException {
>     Hive.set(parentHive, false);
>     // TODO: can this result in cross-thread reuse of session state?
>     SessionState.setCurrentSessionState(parentSessionState);
>     PerfLogger.setPerfLogger(SessionState.getPerfLogger());
>     LogUtils.registerLoggingContext(queryState.getConf());
>     try {
>       if (asyncPrepare) {
>         prepare(queryState);
>       }
>       runQuery();
>     } catch (HiveSQLException e) {
>     // ...
> {code}
> [Code Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]
> From this code it can be seen that for every thread launched, it is all
> calling {{setCurrentSessionState}}.
> {code:java|title=SessionStates.java}
>   /**
>    * Sets the given session state in the thread local var for sessions.
>    */
>   public static void setCurrentSessionState(SessionState startSs) {
>     tss.get().attach(startSs);
>   }
>   // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
>   private static class SessionStates {
>     private SessionState state;
>     private HiveConf conf;
>     private void attach(SessionState state) {
>       this.state = state;
>       attach(state.getConf());
>     }
>     private void attach(HiveConf conf) {
>       // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
>       this.conf = conf;
>       ClassLoader classLoader = conf.getClassLoader();
>       if (classLoader != null) {
>         Thread.currentThread().setContextClassLoader(classLoader);
>       }
>     }
>   }
> {code}
> [Code Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]
> Ensure that all threads get their own copy of the {{HiveConf}} object to use
> and modify.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
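
The remedy requested in the description above (every thread gets its own copy of the configuration instead of a shared reference) might look roughly like the following. This is only a minimal, self-contained sketch and not the attached patch: `Conf` is a hypothetical stand-in for `HiveConf`, `attach` only mirrors the shape of `SessionStates.attach(HiveConf)`, and the keys `query.id` and `shared.setting` are made up for illustration. The copy constructor is modeled on the one Hadoop's `Configuration` class offers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadConfSketch {

    /** Hypothetical stand-in for HiveConf: a mutable bag of settings. */
    static final class Conf {
        private final Map<String, String> props = new HashMap<>();

        Conf() {}

        /** Copy constructor: the new object shares no mutable state with 'other'. */
        Conf(Conf other) {
            props.putAll(other.props);
        }

        void set(String key, String value) { props.put(key, value); }

        String get(String key) { return props.get(key); }
    }

    /** Per-thread slot, playing the role of the thread-local session holder. */
    static final ThreadLocal<Conf> threadConf = new ThreadLocal<>();

    /** Stores a private copy for the current thread instead of the shared reference. */
    static void attach(Conf parent) {
        threadConf.set(new Conf(parent));
    }

    /** Runs n simulated queries in parallel; each writes its (made-up) query ID. */
    static List<String> runQueries(Conf parent, int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final String queryId = "query-" + i;
                futures.add(pool.submit(() -> {
                    attach(parent);
                    try {
                        Conf conf = threadConf.get();
                        conf.set("query.id", queryId); // touches only this thread's copy
                        return conf.get("query.id");
                    } finally {
                        threadConf.remove(); // pooled threads are reused, so clean up
                    }
                }));
            }
            List<String> seen = new ArrayList<>();
            for (Future<String> f : futures) {
                seen.add(f.get());
            }
            return seen;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Conf parent = new Conf();
        parent.set("shared.setting", "value");
        System.out.println(runQueries(parent, 4));   // [query-0, query-1, query-2, query-3]
        System.out.println(parent.get("query.id"));  // null: the parent was never modified
    }
}
```

Because the parent object is only read after the tasks are submitted, copying it from several threads at once is safe in this sketch; a real fix would also have to consider writers that modify the parent while copies are being taken.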
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Mollitor updated HIVE-20665:
----------------------------------
    Status: Patch Available  (was: Open)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0, 2.3.2, 4.0.0
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>         Attachments: HIVE-20665.1.patch, HIVE-20665.2.patch
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Mollitor updated HIVE-20665:
----------------------------------
    Attachment: HIVE-20665.2.patch

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.2, 3.1.0, 4.0.0
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>         Attachments: HIVE-20665.1.patch, HIVE-20665.2.patch
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Mollitor updated HIVE-20665:
----------------------------------
    Attachment:     (was: HIVE-20665.1.patch)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.2, 3.1.0, 4.0.0
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Major
>         Attachments: HIVE-20665.1.patch
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-20665:
-------------------------------
    Attachment: HIVE-20665.1.patch

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.2, 3.1.0, 4.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>         Attachments: HIVE-20665.1.patch, HIVE-20665.1.patch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-20665:
-------------------------------
    Status: Open  (was: Patch Available)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0, 2.3.2, 4.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>         Attachments: HIVE-20665.1.patch, HIVE-20665.1.patch
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-20665:
-------------------------------
    Status: Patch Available  (was: Open)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0, 2.3.2, 4.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>         Attachments: HIVE-20665.1.patch, HIVE-20665.1.patch
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-20665:
-------------------------------
    Description: 
When parallel tasks are enabled in Hive, all of the resulting queries share the same Hive configuration.  This is problematic as each query will modify the same {{HiveConf}} object with things like query ID and query text.  This will overwrite each other and cause {{ConcurrentModificationException}} issues.

{code:java|title=SQLOperation.java}
  public Object run() throws HiveSQLException {
    Hive.set(parentHive, false);
    // TODO: can this result in cross-thread reuse of session state?
    SessionState.setCurrentSessionState(parentSessionState);
    PerfLogger.setPerfLogger(SessionState.getPerfLogger());
    LogUtils.registerLoggingContext(queryState.getConf());
    try {
      if (asyncPrepare) {
        prepare(queryState);
      }
      runQuery();
    } catch (HiveSQLException e) {
    // ...
{code}
[Code Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]

From this code it can be seen that for every thread launched, it is all calling {{setCurrentSessionState}}.

{code:java|title=SessionStates.java}
  /**
   * Sets the given session state in the thread local var for sessions.
   */
  public static void setCurrentSessionState(SessionState startSs) {
    tss.get().attach(startSs);
  }
  // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
  private static class SessionStates {
    private SessionState state;
    private HiveConf conf;
    private void attach(SessionState state) {
      this.state = state;
      attach(state.getConf());
    }
    private void attach(HiveConf conf) {
      // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
      this.conf = conf;
      ClassLoader classLoader = conf.getClassLoader();
      if (classLoader != null) {
        Thread.currentThread().setContextClassLoader(classLoader);
      }
    }
  }
{code}
[Code Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]

Ensure that all threads get their own copy of the {{HiveConf}} object to use and modify.

  was:
When parallel tasks are enabled in Hive, all of the resulting queries share the same Hive configuration.  This is problematic as each query will modify the same {{HiveConf}} object with things like query ID and query text.  This will overwrite each other and cause {{ConcurrentModificationException}} issues.

{code:java|title=SQLOperation.java}
  public Object run() throws HiveSQLException {
    Hive.set(parentHive, false);
    // TODO: can this result in cross-thread reuse of session state?
    SessionState.setCurrentSessionState(parentSessionState);
    PerfLogger.setPerfLogger(SessionState.getPerfLogger());
    LogUtils.registerLoggingContext(queryState.getConf());
    try {
      if (asyncPrepare) {
        prepare(queryState);
      }
      runQuery();
    } catch (HiveSQLException e) {
    // ...
{code}
[Code Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]

From this code we can see that for every thread launched, they are all calling {{setCurrentSessionState}}.

{code:java|title=SessionStates.java}
  /**
   * Sets the given session state in the thread local var for sessions.
   */
  public static void setCurrentSessionState(SessionState startSs) {
    tss.get().attach(startSs);
  }
  // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
  private static class SessionStates {
    private SessionState state;
    private HiveConf conf;
    private void attach(SessionState state) {
      this.state = state;
      attach(state.getConf());
    }
    private void attach(HiveConf conf) {
      // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
      this.conf = conf;
      ClassLoader classLoader = conf.getClassLoader();
      if (classLoader != null) {
        Thread.currentThread().setContextClassLoader(classLoader);
      }
    }
  }
{code}
[Code Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]

Ensure that all threads get their own copy of the {{HiveConf}} object to use and modify.


> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL:
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-20665:
-------------------------------
    Attachment: HIVE-20665.1.patch

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.2, 3.1.0, 4.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>         Attachments: HIVE-20665.1.patch
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-20665:
-------------------------------
    Status: Patch Available  (was: Open)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0, 2.3.2, 4.0.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>         Attachments: HIVE-20665.1.patch
[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
     [ https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-20665:
-------------------------------
    Description: 
When parallel tasks are enabled in Hive, all of the resulting queries share the same Hive configuration.  This is problematic as each query will modify the same {{HiveConf}} object with things like query ID and query text.  This will overwrite each other and cause {{ConcurrentModificationException}} issues.

{code:java|title=SQLOperation.java}
  public Object run() throws HiveSQLException {
    Hive.set(parentHive, false);
    // TODO: can this result in cross-thread reuse of session state?
    SessionState.setCurrentSessionState(parentSessionState);
    PerfLogger.setPerfLogger(SessionState.getPerfLogger());
    LogUtils.registerLoggingContext(queryState.getConf());
    try {
      if (asyncPrepare) {
        prepare(queryState);
      }
      runQuery();
    } catch (HiveSQLException e) {
    // ...
{code}
[Code Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]

From this code we can see that for every thread launched, they are all calling {{setCurrentSessionState}}.

{code:java|title=SessionStates.java}
  /**
   * Sets the given session state in the thread local var for sessions.
   */
  public static void setCurrentSessionState(SessionState startSs) {
    tss.get().attach(startSs);
  }
  // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
  private static class SessionStates {
    private SessionState state;
    private HiveConf conf;
    private void attach(SessionState state) {
      this.state = state;
      attach(state.getConf());
    }
    private void attach(HiveConf conf) {
      // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
      this.conf = conf;
      ClassLoader classLoader = conf.getClassLoader();
      if (classLoader != null) {
        Thread.currentThread().setContextClassLoader(classLoader);
      }
    }
  }
{code}
[Code Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]

Ensure that all threads get their own copy of the {{HiveConf}} object to use and modify.

  was:When parallel tasks are enabled in Hive, all of the resulting queries share the same Hive configuration.  This is problematic as each query will modify the same {{HiveConf}} object with things like query ID and query text.  This will overwrite each other and cause {{ConcurrentModificationException}} issues.


> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> ------------------------------------------------------------------------
>
>                 Key: HIVE-20665
>                 URL: https://issues.apache.org/jira/browse/HIVE-20665
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.2, 3.1.0, 4.0.0
>            Reporter: BELUGA BEHR
>            Priority: Major
>
> When parallel tasks are enabled in Hive, all of the resulting queries share
> the same Hive configuration.  This is problematic as each query will modify
> the same {{HiveConf}} object with things like query ID and query text.  This
> will overwrite each other and cause {{ConcurrentModificationException}}
> issues.
> {code:java|title=SQLOperation.java}
>   public Object run() throws HiveSQLException {
>     Hive.set(parentHive, false);
>     // TODO: can this result in cross-thread reuse of session state?
>     SessionState.setCurrentSessionState(parentSessionState);
>     PerfLogger.setPerfLogger(SessionState.getPerfLogger());
>     LogUtils.registerLoggingContext(queryState.getConf());
>     try {
>       if (asyncPrepare) {
>         prepare(queryState);
>       }
>       runQuery();
>     } catch (HiveSQLException e) {
>     // ...
> {code}
> [Code Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]
> From this code we can see that for every thread launched, they are all
> calling {{setCurrentSessionState}}.
> {code:java|title=SessionStates.java}
>   /**
>    * Sets the given session state in the thread local var for sessions.
>    */
>   public static void setCurrentSessionState(SessionState startSs) {
>     tss.get().attach(startSs);
>   }
>   // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
>   private static class SessionStates {
>     private SessionState state;
>     private HiveConf conf;
>     private void attach(SessionState state) {
>       this.state =