[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2019-10-31 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-20665:
--
Status: Open  (was: Patch Available)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 2.3.2, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-20665.1.patch, HIVE-20665.2.patch
>
>
> When parallel tasks are enabled in Hive, all of the resulting queries share 
> the same Hive configuration.  This is problematic because each query modifies 
> the same {{HiveConf}} object with values such as its query ID and query text.  
> The queries overwrite each other's values and can trigger 
> {{ConcurrentModificationException}} errors.
> {code:java|title=SQLOperation.java}
> public Object run() throws HiveSQLException {
>   Hive.set(parentHive, false);
>   // TODO: can this result in cross-thread reuse of session state?
>   SessionState.setCurrentSessionState(parentSessionState);
>   PerfLogger.setPerfLogger(SessionState.getPerfLogger());
>   LogUtils.registerLoggingContext(queryState.getConf());
>   try {
>     if (asyncPrepare) {
>       prepare(queryState);
>     }
>     runQuery();
>   } catch (HiveSQLException e) {
>     // ...
> {code}
> [Code 
> Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]
> From this code it can be seen that every thread launched calls 
> {{setCurrentSessionState}}.
> {code:java|title=SessionState.java}
>   /**
>    * Sets the given session state in the thread local var for sessions.
>    */
>   public static void setCurrentSessionState(SessionState startSs) {
>     tss.get().attach(startSs);
>   }
>   // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
>   private static class SessionStates {
>     private SessionState state;
>     private HiveConf conf;
>     private void attach(SessionState state) {
>       this.state = state;
>       attach(state.getConf());
>     }
>     private void attach(HiveConf conf) {
>       // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
>       this.conf = conf;
>       ClassLoader classLoader = conf.getClassLoader();
>       if (classLoader != null) {
>         Thread.currentThread().setContextClassLoader(classLoader);
>       }
>     }
>   }
> {code}
> [Code 
> Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]
> To fix this, ensure that each thread gets its own copy of the {{HiveConf}} 
> object to use and modify.
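
The idea of giving each thread its own copy can be sketched roughly as follows. This is not the attached patch and does not use Hive classes: {{Conf}}, {{attach}}, {{runTasks}} and the {{query.id}} key are hypothetical stand-ins, with {{Conf}} playing the role of {{HiveConf}} and the {{ThreadLocal}} loosely mirroring the per-thread holder quoted above. The only point it makes is that attaching a copy, rather than the parent's reference, keeps each task's writes private.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadConfSketch {

  /** Hypothetical stand-in for HiveConf: a mutable bag of properties. */
  static final class Conf {
    private final Map<String, String> props = new HashMap<>();

    Conf() {}

    /** Copy constructor: the new object owns an independent map. */
    Conf(Conf other) {
      this.props.putAll(other.props);
    }

    void set(String key, String value) { props.put(key, value); }

    String get(String key) { return props.get(key); }
  }

  /** Per-thread holder (assumed name), one slot per worker thread. */
  private static final ThreadLocal<Conf> CURRENT = new ThreadLocal<>();

  /** Attaches a private copy instead of the caller's reference. */
  static void attach(Conf parent) {
    CURRENT.set(new Conf(parent));
  }

  static Conf current() {
    return CURRENT.get();
  }

  /** Runs n tasks; each returns the query id it reads back from its own copy. */
  static List<String> runTasks(Conf parent, int n) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(n);
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (int i = 0; i < n; i++) {
        final String queryId = "query-" + i;
        futures.add(pool.submit(() -> {
          attach(parent);                       // copy, not alias
          current().set("query.id", queryId);   // write stays thread-private
          return current().get("query.id");
        }));
      }
      List<String> seen = new ArrayList<>();
      for (Future<String> f : futures) {
        seen.add(f.get());
      }
      return seen;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    Conf parent = new Conf();
    parent.set("query.id", "parent");
    System.out.println(runTasks(parent, 4));      // [query-0, query-1, query-2, query-3]
    System.out.println(parent.get("query.id"));   // parent
  }
}
```

In this sketch the parent is only read while the tasks run. If the parent could still be written during the copy, the copy itself would need to be guarded, which the sketch does not attempt.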



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2019-10-31 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-20665:
--
Status: Patch Available  (was: Open)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 2.3.2, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-20665.1.patch, HIVE-20665.2.patch
>
>


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2019-10-31 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-20665:
--
Attachment: HIVE-20665.2.patch

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.2, 3.1.0, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-20665.1.patch, HIVE-20665.2.patch
>
>


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2019-10-31 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-20665:
--
Attachment: (was: HIVE-20665.1.patch)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.2, 3.1.0, 4.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HIVE-20665.1.patch
>
>


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2018-11-08 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20665:
---
Attachment: HIVE-20665.1.patch

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.2, 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HIVE-20665.1.patch, HIVE-20665.1.patch
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2018-11-08 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20665:
---
Status: Open  (was: Patch Available)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 2.3.2, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HIVE-20665.1.patch, HIVE-20665.1.patch
>
>


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2018-11-08 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20665:
---
Status: Patch Available  (was: Open)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 2.3.2, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HIVE-20665.1.patch, HIVE-20665.1.patch
>
>


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2018-10-01 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20665:
---
Description: 
When parallel tasks are enabled in Hive, all of the resulting queries share the 
same Hive configuration.  This is problematic because each query modifies the 
same {{HiveConf}} object with values such as its query ID and query text.  The 
queries overwrite each other's values and can trigger 
{{ConcurrentModificationException}} errors.
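
For readers unfamiliar with the exception, the failure mode can be illustrated without Hive. The sketch below uses a plain {{java.util.HashMap}} as a stand-in for the shared configuration; the internals of {{HiveConf}} are not shown in this issue, so the class name, method name and keys are assumptions made for illustration only. The fail-fast iterator throws as soon as the map is structurally modified during iteration. Here the write happens on the same thread so the result is reproducible, whereas in the scenario described above the write would come from another query's thread and the failure would be intermittent.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class SharedConfCmeSketch {

  /**
   * Iterates over a shared map while a "second query" writes a new key into it.
   * Returns true if the fail-fast iterator detected the modification.
   */
  static boolean iterateWhileModifying(Map<String, String> sharedConf) {
    try {
      for (Map.Entry<String, String> entry : sharedConf.entrySet()) {
        // Stand-in for another query storing its own settings mid-iteration.
        sharedConf.put("query.string", "SELECT 1");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Map<String, String> sharedConf = new HashMap<>();
    sharedConf.put("query.id", "query-0");
    sharedConf.put("engine", "tez");
    System.out.println(iterateWhileModifying(sharedConf)); // prints "true"
  }
}
```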

{code:java|title=SQLOperation.java}
public Object run() throws HiveSQLException {
  Hive.set(parentHive, false);
  // TODO: can this result in cross-thread reuse of session state?
  SessionState.setCurrentSessionState(parentSessionState);
  PerfLogger.setPerfLogger(SessionState.getPerfLogger());
  LogUtils.registerLoggingContext(queryState.getConf());
  try {
    if (asyncPrepare) {
      prepare(queryState);
    }
    runQuery();
  } catch (HiveSQLException e) {
    // ...
{code}

[Code 
Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]

From this code it can be seen that every thread launched calls 
{{setCurrentSessionState}}.

{code:java|title=SessionState.java}
  /**
   * Sets the given session state in the thread local var for sessions.
   */
  public static void setCurrentSessionState(SessionState startSs) {
    tss.get().attach(startSs);
  }
  // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
  private static class SessionStates {
    private SessionState state;
    private HiveConf conf;
    private void attach(SessionState state) {
      this.state = state;
      attach(state.getConf());
    }
    private void attach(HiveConf conf) {
      // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
      this.conf = conf;

      ClassLoader classLoader = conf.getClassLoader();
      if (classLoader != null) {
        Thread.currentThread().setContextClassLoader(classLoader);
      }
    }
  }
{code}

[Code 
Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]

To fix this, ensure that each thread gets its own copy of the {{HiveConf}} 
object to use and modify.

  was:
When parallel tasks are enabled in Hive, all of the resulting queries share the 
same Hive configuration.  This is problematic because each query modifies the 
same {{HiveConf}} object with values such as its query ID and query text.  The 
queries overwrite each other's values and can trigger 
{{ConcurrentModificationException}} errors.

{code:java|title=SQLOperation.java}
public Object run() throws HiveSQLException {
  Hive.set(parentHive, false);
  // TODO: can this result in cross-thread reuse of session state?
  SessionState.setCurrentSessionState(parentSessionState);
  PerfLogger.setPerfLogger(SessionState.getPerfLogger());
  LogUtils.registerLoggingContext(queryState.getConf());
  try {
    if (asyncPrepare) {
      prepare(queryState);
    }
    runQuery();
  } catch (HiveSQLException e) {
    // ...
{code}

[Code 
Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]

From this code we can see that every thread launched calls 
{{setCurrentSessionState}}.

{code:java|title=SessionState.java}
  /**
   * Sets the given session state in the thread local var for sessions.
   */
  public static void setCurrentSessionState(SessionState startSs) {
    tss.get().attach(startSs);
  }
  // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
  private static class SessionStates {
    private SessionState state;
    private HiveConf conf;
    private void attach(SessionState state) {
      this.state = state;
      attach(state.getConf());
    }
    private void attach(HiveConf conf) {
      // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
      this.conf = conf;

      ClassLoader classLoader = conf.getClassLoader();
      if (classLoader != null) {
        Thread.currentThread().setContextClassLoader(classLoader);
      }
    }
  }
{code}

[Code 
Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]

To fix this, ensure that each thread gets its own copy of the {{HiveConf}} 
object to use and modify.


> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: 

[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2018-10-01 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20665:
---
Attachment: HIVE-20665.1.patch

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.2, 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HIVE-20665.1.patch
>
>


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2018-10-01 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20665:
---
Status: Patch Available  (was: Open)

> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0, 2.3.2, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HIVE-20665.1.patch
>
>


[jira] [Updated] (HIVE-20665) Hive Parallel Tasks - Hive Configuration ConcurrentModificationException

2018-10-01 Thread BELUGA BEHR (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-20665:
---
Description: 
When parallel tasks are enabled in Hive, all of the resulting queries share the 
same Hive configuration.  This is problematic because each query modifies the 
same {{HiveConf}} object with values such as its query ID and query text.  The 
queries overwrite each other's values and can trigger 
{{ConcurrentModificationException}} errors.

{code:java|title=SQLOperation.java}
public Object run() throws HiveSQLException {
  Hive.set(parentHive, false);
  // TODO: can this result in cross-thread reuse of session state?
  SessionState.setCurrentSessionState(parentSessionState);
  PerfLogger.setPerfLogger(SessionState.getPerfLogger());
  LogUtils.registerLoggingContext(queryState.getConf());
  try {
    if (asyncPrepare) {
      prepare(queryState);
    }
    runQuery();
  } catch (HiveSQLException e) {
    // ...
{code}

[Code 
Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]

From this code we can see that every thread launched calls 
{{setCurrentSessionState}}.

{code:java|title=SessionState.java}
  /**
   * Sets the given session state in the thread local var for sessions.
   */
  public static void setCurrentSessionState(SessionState startSs) {
    tss.get().attach(startSs);
  }
  // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
  private static class SessionStates {
    private SessionState state;
    private HiveConf conf;
    private void attach(SessionState state) {
      this.state = state;
      attach(state.getConf());
    }
    private void attach(HiveConf conf) {
      // -- SHALLOW COPY HERE, ALL THREADS SHARING SAME REFERENCE -- //
      this.conf = conf;

      ClassLoader classLoader = conf.getClassLoader();
      if (classLoader != null) {
        Thread.currentThread().setContextClassLoader(classLoader);
      }
    }
  }
{code}

[Code 
Here|https://github.com/apache/hive/blob/7795c0a7dc59941671f8845d78b16d9e5ddc9ea3/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L540-L556]

To fix this, ensure that each thread gets its own copy of the {{HiveConf}} 
object to use and modify.

  was:
When parallel tasks are enabled in Hive, all of the resulting queries share the 
same Hive configuration.  This is problematic because each query modifies the 
same {{HiveConf}} object with values such as its query ID and query text.  The 
queries overwrite each other's values and can trigger 
{{ConcurrentModificationException}} errors.




> Hive Parallel Tasks - Hive Configuration ConcurrentModificationException
> 
>
> Key: HIVE-20665
> URL: https://issues.apache.org/jira/browse/HIVE-20665
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.2, 3.1.0, 4.0.0
>Reporter: BELUGA BEHR
>Priority: Major
>
> When parallel tasks are enabled in Hive, all of the resulting queries share 
> the same Hive configuration.  This is problematic because each query modifies 
> the same {{HiveConf}} object with values such as its query ID and query text.  
> The queries overwrite each other's values and can trigger 
> {{ConcurrentModificationException}} errors.
> {code:java|title=SQLOperation.java}
> public Object run() throws HiveSQLException {
>   Hive.set(parentHive, false);
>   // TODO: can this result in cross-thread reuse of session state?
>   SessionState.setCurrentSessionState(parentSessionState);
>   PerfLogger.setPerfLogger(SessionState.getPerfLogger());
>   LogUtils.registerLoggingContext(queryState.getConf());
>   try {
>     if (asyncPrepare) {
>       prepare(queryState);
>     }
>     runQuery();
>   } catch (HiveSQLException e) {
>     // ...
> {code}
> [Code 
> Here|https://github.com/apache/hive/blob/6e27a5315a44c55ef3b178e7212c9068de322d01/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L308-L319]
> From this code we can see that every thread launched calls 
> {{setCurrentSessionState}}.
> {code:java|title=SessionState.java}
>   /**
>    * Sets the given session state in the thread local var for sessions.
>    */
>   public static void setCurrentSessionState(SessionState startSs) {
>     tss.get().attach(startSs);
>   }
>   // SessionState is not available in runtime and Hive.get().getConf() is not safe to call
>   private static class SessionStates {
>     private SessionState state;
>     private HiveConf conf;
>     private void attach(SessionState state) {
>   this.state =