[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627833#comment-16627833 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

Hi [~KaiXu], yes, your issue is the same one we are trying to improve here. Note, however, that even if/when this change is integrated, {{MapJoinMemoryExhaustionException}} is not guaranteed to go away in your use case. That is, your Spark executor may really not have enough memory to process the given table, in which case the exception is thrown correctly. But this change should reduce the chance of the exception being thrown in the wrong circumstances, i.e. when there is actually enough memory. In the meantime, a workaround is to increase the JVM heap for your Spark executors.

[~stakiar] it looks like the last patch finally passed tests, so can it be integrated?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -----------------------------------------------------
>
>                 Key: HIVE-17684
>                 URL: https://issues.apache.org/jira/browse/HIVE-17684
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Misha Dmitriev
>            Priority: Major
>         Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, HIVE-17684.09.patch, HIVE-17684.10.patch, HIVE-17684.11.patch
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect scenarios where the small table is taking too much space in memory, in which case a {{MapJoinMemoryExhaustionError}} is thrown.
>
> The configs that control this logic are:
>
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
>
> The handler uses the {{MemoryMXBean}} and the following logic to estimate how much memory the {{HashMap}} is consuming:
>
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / MemoryMXBean#getHeapMemoryUsage().getMax()}}
>
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be inaccurate. The value it returns includes all reachable and unreachable memory on the heap, so there may be a bunch of garbage data that the JVM just hasn't taken the time to reclaim yet. This can lead to intermittent failures of this check even though a simple GC would have reclaimed enough space for the process to continue working.
>
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. In Hive-on-MR this probably made sense, because every Hive task ran in a dedicated container, so a Hive task could assume it created most of the data on the heap. However, in Hive-on-Spark there can be multiple Hive tasks running in a single executor, each doing different things.
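As an aside for readers following along: the check described in the issue text boils down to comparing a heap-usage ratio against a threshold. Below is a minimal standalone sketch of that logic; it is a simplified illustration, not the actual MapJoinMemoryExhaustionHandler code, and the class name and threshold constant here are made up.

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapFractionCheck {
  // Hypothetical constant mirroring hive.mapjoin.localtask.max.memory.usage.
  private static final double MAX_MEMORY_USAGE = 0.90;

  public static void checkMemoryStatus() {
    MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = memoryMXBean.getHeapMemoryUsage();
    // getUsed() counts live objects *and* garbage not yet collected, which is
    // why this ratio can exceed the threshold even when most of the heap is
    // reclaimable; that over-count is the core problem this JIRA is about.
    double usedFraction = (double) heap.getUsed() / heap.getMax();
    if (usedFraction > MAX_MEMORY_USAGE) {
      // The real handler throws MapJoinMemoryExhaustionError here.
      throw new Error("Map local work exhausted memory: " + usedFraction);
    }
  }
}
{code}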
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622437#comment-16622437 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

[~stakiar] there is once again some strange compilation failure. Please take a look.
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Status: Patch Available  (was: In Progress)

Resubmitting the same patch; hopefully there will be no unrelated compilation error this time.
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Attachment: HIVE-17684.10.patch
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Attachment: HIVE-17684.10.patch
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Attachment: (was: HIVE-17684.10.patch)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Status: In Progress  (was: Patch Available)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616608#comment-16616608 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

[~stakiar] there seems to be some unrelated compilation failure. What do we do now?
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Status: Patch Available  (was: In Progress)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Status: In Progress  (was: Patch Available)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Attachment: HIVE-17684.09.patch
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615513#comment-16615513 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

[~stakiar] it looks like the test failures are due to some silly NumberFormatException somewhere in the new code. I'll see if I can fix that.
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615186#comment-16615186 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

Probably {{MapJoinMemoryExhaustionHandler}} just doesn't get triggered in HoMR... but OK, I agree that in such a difficult-to-test area it's safer to be conservative and avoid changing things unless they are definitely broken. Let's wait for the test results for your latest patch, then. Depending on the outcome, it may be submitted as is or may need some small finishing touches.
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614242#comment-16614242 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

[~stakiar] this looks generally good. I noticed one or two typos; I will take a deeper look at the code tomorrow. The main question is: did you actually see the old {{MapJoinMemoryExhaustionHandler}} working correctly, at least with Hive-on-MR? My impression so far is that it's just broken, because the JVM cannot report the exact amount of used memory (i.e. memory occupied by non-garbage objects), except maybe right after a full GC. If so, then keeping this code around will mean adding more "cruft" to Hive, which is already not the cleanest code base, and will make the job of maintainers, yourself included, harder. Please let me know what you think.
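To make the "used memory includes garbage" point concrete, here is a tiny standalone experiment (illustrative only, not part of the patch) showing how the {{getUsed()}} reading typically drops after an explicit GC:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class UsedHeapDemo {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    // Create a large amount of short-lived garbage.
    for (int i = 0; i < 1_000_000; i++) {
      byte[] junk = new byte[64];
    }
    long before = bean.getHeapMemoryUsage().getUsed();
    System.gc();  // a hint only, but it usually triggers a full GC on HotSpot
    long after = bean.getHeapMemoryUsage().getUsed();
    // "before" typically includes megabytes of unreachable objects that
    // "after" does not: the same over-count that trips the old handler.
    System.out.printf("used before GC: %d, after GC: %d%n", before, after);
  }
}
{code}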
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613115#comment-16613115 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

[~stakiar] it looks like with the latest patch, which uses the reliable GC time monitoring implementation, the remaining 5 failed tests are those that actually expect {{MapJoinMemoryExhaustionError}} to be thrown. To get this exception thrown reliably, we probably need to set the {{hive.mapjoin.max.gc.time.percentage}} configuration property to something very low, like 1. However, I am not sure how to set it just for the few selected tests; please advise.
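One hedged way to do the override, assuming the new property is read through HiveConf when the handler is created (how the patch actually plumbs it in is not confirmed here), would be per-test configuration along these lines:

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;

public class GcThresholdTestSetup {
  // Hedged sketch: make the GC-time check trip almost immediately for the
  // few tests that expect MapJoinMemoryExhaustionError. The property name is
  // taken from the comment above; whether a plain string key suffices depends
  // on how the patch registers it, so treat this as illustrative only.
  static HiveConf confForExhaustionTests() {
    HiveConf conf = new HiveConf();
    conf.set("hive.mapjoin.max.gc.time.percentage", "1");
    return conf;
  }
}
{code}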
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Attachment: HIVE-17684.07.patch
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17684:
----------------------------------
    Status: Open  (was: Patch Available)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609712#comment-16609712 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

I've made the above changes; that was easy. However, when I tried to run some tests locally, they failed. It turns out that the {{org.apache.hadoop.util.GcTimeMonitor}} constructor has a {{Preconditions.checkArgument(maxGcTimePercentage <= 100)}} check. It turns out that sanity checks sometimes have unwanted effects... In this situation, it looks like the fastest way to address the problem is to copy the GcTimeMonitor class into the Hive code base, and then modify it so that, instead of the {{GarbageCollectorMXBean}}, it uses the older, battle-tested mechanism from the existing {{JvmPauseMonitor}} class. Does that make sense?
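For context, the JvmPauseMonitor mechanism mentioned here is essentially a daemon thread that sleeps for a fixed interval and treats any extra elapsed time as JVM pause time (mostly GC). Below is a minimal sketch of that technique recast to report a pause-time percentage; it is an illustration of the idea, not the actual Hadoop or Hive class:

{code:java}
public class PauseTimePercentageMonitor extends Thread {
  private static final long SLEEP_INTERVAL_MS = 500;
  private volatile int pauseTimePercentage;  // over the life of the monitor

  public PauseTimePercentageMonitor() {
    setDaemon(true);
  }

  @Override
  public void run() {
    long totalPauseMs = 0;
    long startMs = System.currentTimeMillis();
    while (!isInterrupted()) {
      long beforeSleepMs = System.currentTimeMillis();
      try {
        Thread.sleep(SLEEP_INTERVAL_MS);
      } catch (InterruptedException e) {
        return;
      }
      // Any time beyond the requested sleep is assumed to be a JVM pause,
      // mostly GC. This is the same trick JvmPauseMonitor uses, and it does
      // not depend on GarbageCollectorMXBean's accounting at all.
      long extraMs = System.currentTimeMillis() - beforeSleepMs - SLEEP_INTERVAL_MS;
      if (extraMs > 0) {
        totalPauseMs += extraMs;
      }
      long elapsedMs = System.currentTimeMillis() - startMs;
      pauseTimePercentage = (int) (100 * totalPauseMs / Math.max(1, elapsedMs));
    }
  }

  public int getPauseTimePercentage() {
    return pauseTimePercentage;
  }
}
{code}

Usage would be {{new PauseTimePercentageMonitor().start()}} once per JVM, with the handler polling {{getPauseTimePercentage()}}; by construction this number can never exceed 100.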
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606717#comment-16606717 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

[~stakiar] I've also checked a number of logs, and it looks like all or most failed tests indeed fail because a GC time >= 100 percent is reported. I suspect this may be because the underlying {{org.apache.hadoop.util.GcTimeMonitor}} class uses the JMX API (class {{java.lang.management.GarbageCollectorMXBean}}) internally. I remember reading that this API may sometimes wrongly report e.g. the time spent in the concurrent object-marking GC phase (which doesn't pause the JVM) alongside the normal pause time. If so, these readings at or above 100% would be explained by incorrect accounting. This seems to happen only under heavy GC pressure; I didn't see it in benchmarks in normal mode or when running these tests locally.

I guess that in these circumstances the right solution would be to make {{criticalGcTimePercentage}} a configurable variable in HiveConf and get rid of the current {{CRITICAL_GC_TIME_PERCENTAGE_TEST/PROD}}. If {{criticalGcTimePercentage}} can be configured separately in normal tests (to something like 1000%, to be on the safe side), in tests that are expected to fail (to 0%), and in prod (probably to something like the current 50%, plus possibly some safety margin), then hopefully things will work as expected in all cases. But can it be configured that way?

Longer term, if we see evidence that the current implementation of GcTimeMonitor miscalculates the GC time considerably, we can switch to the existing {{org.apache.hadoop.hive.common.JvmPauseMonitor}}, which uses a different method of estimating the GC time. That class can just be extended to report the GC time as a percentage for our purposes.
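For reference, the MXBean-based accounting being questioned here boils down to sampling cumulative per-collector collection times. A simplified sketch of that approach follows (the real org.apache.hadoop.util.GcTimeMonitor keeps a sliding window of samples rather than a single running delta):

{code:java}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class GcTimeSampler {
  private long lastTotalGcMs;
  private long lastSampleMs = System.currentTimeMillis();

  /** Returns GC time as a percentage of wall-clock time since the last call. */
  public int sampleGcTimePercentage() {
    long totalGcMs = 0;
    List<GarbageCollectorMXBean> beans =
        ManagementFactory.getGarbageCollectorMXBeans();
    for (GarbageCollectorMXBean bean : beans) {
      // For some collectors this cumulative time may include concurrent
      // phases that don't actually pause the application: the suspected
      // source of the >= 100% readings discussed above.
      totalGcMs += bean.getCollectionTime();
    }
    long nowMs = System.currentTimeMillis();
    long gcDeltaMs = totalGcMs - lastTotalGcMs;
    long wallDeltaMs = Math.max(1, nowMs - lastSampleMs);
    lastTotalGcMs = totalGcMs;
    lastSampleMs = nowMs;
    return (int) (100 * gcDeltaMs / wallDeltaMs);
  }
}
{code}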
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606351#comment-16606351 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

[~stakiar] thank you for your patch; I've just applied it locally. After your updates, there are still 34 failed tests in the last run above. May I ask you to take a quick look at why they failed? (I mean the tests that don't expect the "exhausted memory" outcome.) In the meantime, I'll see how to make these tests behave the same with the new memory handler.
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580366#comment-16580366 ]

Misha Dmitriev commented on HIVE-17684:
---------------------------------------

I've resumed working on this, and finally found the reason for the mysterious failure described in my previous comment. It turns out that the following test outputs:

./clientpositive/auto_join25.q.out
./clientpositive/auto_join_without_localtask.q.out
./clientpositive/infer_bucket_sort_convert_join.q.out
./clientpositive/mapjoin_hook.q.out

actually _do_ contain the "Hive Runtime Error: Map local work exhausted memory" string, i.e. they expect the old memory exhaustion check to be triggered. With my new code, this check doesn't get triggered anymore (probably correctly), so these tests fail. I think a sufficient level of Hive expertise is needed to determine what the expected output should look like in the new situation, so I would prefer to leave that to you, [~stakiar].

But it looks like we still have a high number of failing CLI tests, which likely fail because the "in tests" flag is not set when they run, and thus my new high-GC-time check gets triggered. For these tests to pass, I guess we either need to make all CLI tests set the "in tests" flag somehow, or find another JVM property or setting that is set only when these tests run. Any suggestions?
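A sketch of what the "in tests" detection could look like, assuming the conventional {{hive.in.test}} flag is the mechanism the new code keys off (an assumption; the patch may use a different property, and whether CLI driver tests set it is exactly the open question above):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class TestModeCheck {
  // Hedged sketch: "hive.in.test" is the conventional HiveConf flag for test
  // runs, but it only helps here if every CLI driver test actually sets it.
  static boolean runningUnderTests(Configuration conf) {
    return conf.getBoolean("hive.in.test", false);
  }
}
{code}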
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564366#comment-16564366 ] Misha Dmitriev commented on HIVE-17684: --- [~stakiar] I tried to debug it, but I am not sure what's going on. The following command {{mvn test -Dtest=TestCliDriver -Dqfile=auto_join_without_localtask.q,acid_mapjoin.q,auto_join33.q}} fails for me locally with the message below, but when I try to add some extra logging to my code, rebuild Hive and rerun, somehow my changes don't seem to get applied. I am leaving for vacation in a few hours, so I won't be able to spend more time on this now. Please feel free to debug it yourself or wait until I am back in ~9 days.
{code:java}
Hive Runtime Error: Map local work exhausted memory
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Hive Runtime Error: Map local work exhausted memory
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.mr.MapRedTask
{code}
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562481#comment-16562481 ] Misha Dmitriev commented on HIVE-17684: --- Actually, it looks like some of these CLI tests fail for me locally even though I don't think there is much memory pressure on Hive. I need to understand what's going on.
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562445#comment-16562445 ] Misha Dmitriev commented on HIVE-17684: --- [~stakiar] a large number of CLI tests failed because of MapJoinMemoryExhaustionError. I suspect that in CLI tests the Hive server runs just as usual, with the "in tests" flag not set?
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Attachment: HIVE-17684.04.patch
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Status: In Progress (was: Patch Available)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556103#comment-16556103 ] Misha Dmitriev commented on HIVE-17684: --- I agree that any fixing/reconfiguring of tests should be done in a separate patch. I would also agree that tests exercising GcTimeMonitor should not be used as an indirect way of verifying that the test infrastructure works well - that would be too confusing. But perhaps it's worth creating a JIRA about better monitoring of the Hive test JVMs, or some such. So, for now we should either not run this code in tests, or run it with a rather high GC time threshold. But how does any code in Hive find out whether it's running in tests or in production? So far I've found {{HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TESTS)}}. But from looking into the source code I am not sure it always works as intended, and getting access to the Configuration instance needed by this call seems tricky in Operator.java - it is passed in quite late, in the initialize() method. I can probably work around this, but I would first like to confirm that HIVE_IN_TESTS is the right thing to use here.
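To make the intent concrete, the threshold selection could look roughly like the sketch below. It assumes the flag is backed by the {{hive.in.test}} property, and the two percentage values are purely illustrative, not values from the patch:
{code:java}
import org.apache.hadoop.conf.Configuration;

public final class GcThresholds {
  private GcThresholds() {}

  // Illustrative numbers only: alert at 60% GC time in production,
  // but only at 95% in tests, where heavy GC is tolerated.
  private static final int PROD_MAX_GC_TIME_PCT = 60;
  private static final int TEST_MAX_GC_TIME_PCT = 95;

  /** Relax the GC-time alert threshold when Hive runs inside tests. */
  public static int maxGcTimePercentage(Configuration conf) {
    // "hive.in.test" is assumed to be the property behind the "in tests" flag.
    boolean inTests = conf != null && conf.getBoolean("hive.in.test", false);
    return inTests ? TEST_MAX_GC_TIME_PCT : PROD_MAX_GC_TIME_PCT;
  }
}
{code}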
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555039#comment-16555039 ] Misha Dmitriev commented on HIVE-17684: --- But it looks like when these qtests run sequentially (as they probably do by default) on a single JVM with a 2GB heap, there is no GC overload and medium-to-low CPU utilization, but super-slow execution. So I suspect that on Jenkins the tests (within a single 'mvn test -Dtest=TestCliDriver') somehow run in parallel (and maybe the JVM is configured with a smaller heap as well). Is it possible to run qtests with parallelization? Can we check how much memory is given on Jenkins to the JVM running them? We can, of course, reduce the aggressiveness of the GC monitor in unit tests. But if it turns out that, unknown to everyone, they just waste a lot of time in GC, then a better solution would be to reduce parallelism or increase the heap. They would probably run faster as a result.
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554991#comment-16554991 ] Misha Dmitriev commented on HIVE-17684: --- [~stakiar] thank you for looking into this. When I ran the above test locally, it passed without issues. I also tried to run all tests via {{cd itest; mvn test -Dtest=TestCliDriver}}. This hasn't finished so far (after ~2 hours, I think), but when I monitor the JVM that runs the tests with jstat, I see no excessive GC activity at all. So could it be that in the Jenkins test environment, probably on a bigger machine with many CPU cores, multiple tests execute in parallel against the same HS2 instance? If so, and/or if its heap size is insufficient, I guess in principle GC pauses can become really long/frequent. But if they indeed take 60% of the time, that's bad. For one thing, it would mean that our tests run much slower than they should. Is it possible to get access to the machine that runs these tests on Jenkins and do some basic GC monitoring?
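Failing shell access to the Jenkins box, one cheap option is to have the test JVM report on itself. A minimal sketch using only standard JMX beans (nothing Hive-specific; the 5-second interval is arbitrary):
{code:java}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Periodically prints cumulative GC time as a share of JVM uptime,
// plus the configured max heap - enough to spot a GC-bound test JVM.
public class GcShareReporter {
  public static void main(String[] args) throws InterruptedException {
    System.out.printf("Max heap: %d MB%n",
        Runtime.getRuntime().maxMemory() / (1024 * 1024));
    while (true) {
      long gcMs = 0;
      for (GarbageCollectorMXBean gc
          : ManagementFactory.getGarbageCollectorMXBeans()) {
        long t = gc.getCollectionTime(); // -1 if this collector doesn't report
        if (t > 0) {
          gcMs += t;
        }
      }
      long upMs = ManagementFactory.getRuntimeMXBean().getUptime();
      System.out.printf("GC: %d ms of %d ms uptime (%.1f%%)%n",
          gcMs, upMs, 100.0 * gcMs / upMs);
      Thread.sleep(5000);
    }
  }
}
{code}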
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554729#comment-16554729 ] Misha Dmitriev commented on HIVE-17684: --- Thank you for looking into this, [~stakiar]. Can you please remind me where (in which source file) such serializers are kept? I remember doing this once or twice, but it was quite a long time ago.
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Attachment: HIVE-17684.03.patch
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553563#comment-16553563 ] Misha Dmitriev commented on HIVE-17684: --- Will get to this later today or tomorrow.
[jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553279#comment-16553279 ] Misha Dmitriev commented on HIVE-19937: --- The last patch looks good to me. The only slight concern I got from looking at it once again is the following: in one or two places you switched from passing around Strings to passing around Paths, and subsequently switched some HashMaps with String keys to HashMaps with Path keys. Note that a lookup in a map whose keys are complex objects is slower, because Path.equals(Path) is slower than String.equals(String) - it may involve comparing several component strings, etc. I haven't seen any reports of Hive CPU performance problems, and I hope this code is not on a critical path, and/or that the GC-related savings would offset the potential hashmap lookup slowdown... but anyway, I guess it's worth keeping in mind.
> Intern fields in MapWork on deserialization
> -------------------------------------------
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
> Issue Type: Improvement
> Components: Spark
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
> Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, HIVE-19937.3.patch, HIVE-19937.4.patch, HIVE-19937.5.patch, post-patch-report.html, report.html
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the {{JobConf}} object to prevent any {{ConcurrentModificationException}} from being thrown. However, setting this variable comes at a cost of storing a duplicate {{JobConf}} object for each Spark task. These objects can take up a significant amount of memory, so we should intern them so that Spark tasks running in the same JVM don't store duplicate copies.
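To illustrate the lookup-cost point, here is a naive micro-benchmark sketch - not code from the patch, and absolute numbers will vary; micro-benchmarks of this kind are only indicative:
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.fs.Path;

// Toy comparison: Path.equals()/hashCode() operate on the parsed URI,
// so each probe into a HashMap with Path keys does more work than a
// probe into a HashMap with the equivalent String keys.
public class KeyCost {
  private static final int N = 100_000;

  public static void main(String[] args) {
    Map<String, Integer> byString = new HashMap<>();
    Map<Path, Integer> byPath = new HashMap<>();
    for (int i = 0; i < N; i++) {
      String key = "hdfs://nn:8020/warehouse/db/tbl/part=" + i;
      byString.put(key, i);
      byPath.put(new Path(key), i);
    }

    long t0 = System.nanoTime();
    for (int i = 0; i < N; i++) {
      byString.get("hdfs://nn:8020/warehouse/db/tbl/part=" + i);
    }
    long t1 = System.nanoTime();
    for (int i = 0; i < N; i++) {
      byPath.get(new Path("hdfs://nn:8020/warehouse/db/tbl/part=" + i));
    }
    long t2 = System.nanoTime();

    System.out.printf("String keys: %d ms, Path keys: %d ms%n",
        (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
  }
}
{code}
(The Path loop also pays for constructing each Path, which is part of the practical cost of keying maps by Path at call sites.)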
[jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549768#comment-16549768 ] Misha Dmitriev commented on HIVE-19937: --- +1 One small comment: instead of {{this.baseFileName = baseFileName == null ? null : baseFileName.intern();}} you can just use {{this.baseFileName = StringUtils.internIfNotNull(baseFileName);}}
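For readers unfamiliar with the helper: it is just a null-safe wrapper around String.intern(), roughly as below. This is a sketch of the idiom; the actual Hive utility class and its exact location may differ.
{code:java}
public final class StringUtils {
  private StringUtils() {}

  // Null-safe intern: String.intern() would throw NullPointerException
  // on a null receiver, so guard for null and keep call sites compact.
  public static String internIfNotNull(String s) {
    return s == null ? null : s.intern();
  }
}
{code}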
[jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549667#comment-16549667 ] Misha Dmitriev commented on HIVE-19937: --- Thank you for publishing the updated jxray report, [~stakiar]. Looks like things work as expected. One small question/concern: I see that this test uses a pretty small heap, less than 1GB. Is it expected that all the overheads you are fixing will grow proportionally with a bigger heap?
[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545643#comment-16545643 ] Misha Dmitriev commented on HIVE-19668: --- [~vihangk1] yes, the previous patch passed all tests, but it looks like there were some checkstyle problems with it. Unfortunately, the checkstyle report is gone, so I've just fixed several lines which I think could be problematic, and resubmitted the patch. Are these checkstyle issues considered important?
> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
> ----------------------------------------------------------------------------------------------
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2
> Affects Versions: 3.0.0
> Reporter: Misha Dmitriev
> Assignee: Misha Dmitriev
> Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, HIVE-19668.03.patch, HIVE-19668.04.patch, HIVE-19668.05.patch, image-2018-05-22-17-41-39-572.png
>
> I've recently analyzed an HS2 heap dump, obtained when there was a huge memory spike during compilation of some big query. The analysis was done with jxray ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of the 20G heap was used by data structures associated with query parsing ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple opportunities for optimizations here. One of them is to stop the code from creating duplicate instances of the {{org.antlr.runtime.CommonToken}} class. See a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants that don't change once created. I see some code, e.g. in {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are apparently repeatedly created with e.g. {{new CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds are instead created once and reused, we will save more than 1/10th of the heap in this scenario. Plus, since these objects are small but very numerous, getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% of memory. Some of them come from CommonToken objects that have the same text (i.e. for multiple CommonToken objects the contents of their 'text' Strings are the same, but each has its own copy of that String). Other duplicate strings come from other sources, which are easy enough to fix by adding String.intern() calls.
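As an aside for readers, the reuse the description talks about amounts to replacing repeated {{new CommonToken(...)}} calls with shared instances, along these lines - a sketch, not the actual patch:
{code:java}
import org.antlr.runtime.CommonToken;
import org.apache.hadoop.hive.ql.parse.HiveParser;

// Sketch: create each constant-like token once and hand out the shared
// instance, instead of allocating an identical CommonToken per use.
// This is safe only as long as no caller mutates the shared tokens.
public final class TokenConstants {
  private TokenConstants() {}

  public static final CommonToken TOK_INSERT =
      new CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT");
  public static final CommonToken TOK_QUERY =
      new CommonToken(HiveParser.TOK_QUERY, "TOK_QUERY");
}
{code}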
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Attachment: HIVE-19668.05.patch
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: Patch Available (was: In Progress)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: Patch Available (was: In Progress) The previous patch may or may not have been applied, so I've just updated my local git repo clone and generated a new patch file.
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Attachment: HIVE-19668.04.patch
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: In Progress (was: Patch Available)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: Patch Available (was: In Progress) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Attachment: HIVE-19668.03.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: In Progress (was: Patch Available) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542265#comment-16542265 ] Misha Dmitriev commented on HIVE-19668: --- Thank you for checking, [~vihangk1], [~aihuaxu] and [~stakiar]. In the end, it turns out that at least some failures are reproducible locally, and my changes are responsible. Not all {{CommonToken}}s can be made {{ImmutableToken}}s, because for some of them the type may be rewritten in some special operators later. I've already found one such type in the past, and I am now eliminating the others. I will post the updated patch once I am done. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
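Since the comment above turns on which tokens can safely be frozen, here is a minimal sketch of the immutable-token idea, assuming a hypothetical {{ImmutableCommonToken}} subclass (the actual patch may differ):
{code:java}
import org.antlr.runtime.CommonToken;

// Hypothetical sketch: a CommonToken that rejects post-construction changes,
// so one instance per token kind can be shared. Tokens whose type gets
// rewritten by special operators later (the cause of the test failures
// mentioned above) must remain plain mutable CommonTokens.
public class ImmutableCommonToken extends CommonToken {

  public ImmutableCommonToken(int type, String text) {
    super(type, text);
  }

  @Override
  public void setType(int type) {
    throw new UnsupportedOperationException("shared immutable token");
  }

  @Override
  public void setText(String text) {
    throw new UnsupportedOperationException("shared immutable token");
  }

  // The remaining setters (setLine(), setCharPositionInLine(), etc.) would
  // need the same treatment for true immutability; omitted for brevity.
}
{code}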
[jira] [Commented] (HIVE-19937) Use BeanSerializer for MapWork to carry calls to String.intern
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539149#comment-16539149 ] Misha Dmitriev commented on HIVE-19937: --- A few small comments about the last patch: # In the new *Deserializer code, there are many lines looking like 'partitionDesc.setXXX(partitionDesc.getXXX())'. Such code looks quite non-obvious, so it would be good to add comments there explaining that this is done to intern possible duplicate strings. # Looks like there is a typo: a method that's really a getter has the set... name: + public Map<String, List<String>> setEventSourceColumnTypeMap() { + return eventSourceColumnTypeMap; + } # It would still be good to publish a jxray report for the same test before and after this change, to make sure that the problems are indeed fixed. > Use BeanSerializer for MapWork to carry calls to String.intern > -- > > Key: HIVE-19937 > URL: https://issues.apache.org/jira/browse/HIVE-19937 > Project: Hive > Issue Type: Improvement > Components: Spark > Reporter: Sahil Takiar > Assignee: Sahil Takiar > Priority: Major > Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, HIVE-19937.3.patch, report.html > > > When fixing HIVE-16395, we decided that each new Spark task should clone the {{JobConf}} object to prevent any {{ConcurrentModificationException}} from being thrown. However, setting this variable comes at a cost of storing a duplicate {{JobConf}} object for each Spark task. These objects can take up a significant amount of memory, so we should intern them so that Spark tasks running in the same JVM don't store duplicate copies. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
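On the first point above, the kind of clarifying comment being requested might look like the following hedged sketch; {{setBaseFileName}} is just an illustrative {{PartitionDesc}} field, and the setter is assumed to intern its argument (which is what makes the apparent no-op useful):
{code:java}
import org.apache.hadoop.hive.ql.plan.PartitionDesc;

public final class InternAfterDeserialize {

  // Re-setting a field to its own value looks like a no-op, but if the setter
  // runs its argument through String.intern() (as the patch's setters are
  // assumed to do), the freshly deserialized copy is replaced with the
  // canonical instance, deduplicating the string.
  static void internFields(PartitionDesc partitionDesc) {
    partitionDesc.setBaseFileName(partitionDesc.getBaseFileName());
  }
}
{code}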
[jira] [Commented] (HIVE-19937) Use BeanSerializer for MapWork to carry calls to String.intern
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537362#comment-16537362 ] Misha Dmitriev commented on HIVE-19937: --- The analysis above looks good. I remember seeing similar issues with Kryo (and in general, all other kinds of serialization) in the past, and dealing with them in a similar way. The new patch looks good to me. I assume that you analyzed a heap dump before and after this change and verified that the problems you wanted to address indeed went away. A few small comments/nits: * MapOperator.java, 'contexts = new LinkedHashMap, MapOpCtx>()' - if this code has any idea of the expected size of the LinkedHashMap, you may want to create it with the appropriate capacity. This is especially relevant when such maps are small - then the default capacity of 16 makes them waste a lot of memory. * MapWork.java, 'if (includedBuckets != null) { this.includedBuckets = ...' - you can make this code a bit shorter using the conditional operator, i.e. 'this.includedBuckets = (includedBuckets != null) ? includedBuckets : null'. The same applies in several other methods. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
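For the first nit above, a self-contained sketch of creating a map with an explicit capacity; the {{expectedSize / 0.75 + 1}} arithmetic compensates for the default load factor so that no resize ever occurs (names are illustrative):
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public final class SizedMaps {

  // The no-arg LinkedHashMap starts with capacity 16 and load factor 0.75;
  // for a map known to hold, say, 3 entries, most of the backing array is
  // wasted. Sizing it up front avoids both the waste and later rehashing.
  static <K, V> Map<K, V> newSizedLinkedHashMap(int expectedSize) {
    return new LinkedHashMap<>((int) (expectedSize / 0.75f) + 1);
  }
}
{code}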
[jira] [Commented] (HIVE-19937) Intern JobConf objects in Spark tasks
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532187#comment-16532187 ] Misha Dmitriev commented on HIVE-19937: --- Thank you for sharing the jxray report, [~stakiar]. If it reflects the situation in real-life applications accurately enough, then it looks like the sources of duplicate strings are not so much {{JobConf}} tables as various other things, which you can easily see if you expand the "Expensive fields" and "Full reference chains" in section 7: # Most of the duplicate strings (~9% out of 13.5% total) come from data fields of {{java.net.URI}}. All these URIs, in turn, come from {{org.apache.hadoop.fs.Path.uri}}. {{Path}}s come from more than one source, but the biggest one is this reference chain: {code:java} ↖java.net.URI.schemeSpecificPart ↖org.apache.hadoop.fs.Path.uri ↖{j.u.LinkedHashMap}.keys ↖org.apache.hadoop.hive.ql.plan.MapWork.pathToAliases{code} It turns out that in the past I have already taken care of interning strings in such URIs, see e.g. this method in MapWork.java: {code:java} public void setPathToAliases(final LinkedHashMap<Path, ArrayList<String>> pathToAliases) { for (Path p : pathToAliases.keySet()) { StringInternUtils.internUriStringsInPath(p); } this.pathToAliases = pathToAliases; }{code} but it turns out that there are also other methods that can add {{Path}}s to {{pathToAliases}}: two flavors of {{addPathToAlias()}} and maybe something else. I think we need to modify all these methods so that they also call {{StringInternUtils.internUriStringsInPath()}} for {{Path}}s that are passed to them. This will remove the said 9% of duplicate strings. # One other source of duplicate strings in URIs referenced by {{Path}}s is the map in {{ProjectionPusher.pathToPartitionInfo}}. I think this would be fixed if, in the following line in this class, {code:java} pathToPartitionInfo.put(Path.getPathWithoutSchemeAndAuthority(entry.getKey()), ...{code} you insert the {{StringInternUtils.internUriStringsInPath()}} call. # The very first line in the "Full reference chains" says that 2% of memory is wasted by duplicate strings that are values in {{CopyOnFirstWriteProperties}} tables, reachable via this reference chain: {code:java} org.apache.hadoop.hive.common.CopyOnFirstWriteProperties.table ↖org.apache.hadoop.hive.ql.plan.PartitionDesc.properties ↖{j.u.LinkedHashMap}.values ↖org.apache.hadoop.hive.ql.plan.MapWork.pathToPartitionInfo{code} This is a bit unexpected, given that, as you noticed before, we already take care of interning this table's values in {{PartitionDesc#internProperties}}. Some uninterned string values are probably added to this table later, likely by code that obtains the table by calling {{getProperties()}}. I hope with your knowledge of Hive code you will manage to determine the culprit here. One more clue is the contents of the duplicate strings coming from these tables, e.g. ||*Num strings* || *String value* || | 36|"hdfs://vc0501.halxg.cloudera.com:8020/user/systest/tpcds_1000_decimal_parquet/store_sales/ss_sold_date_sk=2452497"| | 36|"hdfs://vc0501.halxg.cloudera.com:8020/user/systest/tpcds_1000_decimal_parquet/store_sales/ss_sold_date_sk=2452422"| # There are several other sources of duplicate strings that jxray reports. They cause much less overhead, but some may still be worth fixing. Let me know if you need help with them. Interestingly, as far as I can see, strings coming from {{JobConf}} waste just about 0.2% of memory. Also, as far as I can see in section 2, {{java.util.Properties}} objects together consume 8.5% of memory, which is significant. But most of that comes from {{TableDesc#properties}}. {{JobConf#properties}} uses just 0.8% of memory, so it is probably not worth optimizing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
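To make point 1 above concrete, here is a minimal sketch of interning inside an {{addPathToAlias()}}-style method; the field shape mirrors the {{setPathToAliases()}} snippet quoted above, but the exact signatures in MapWork are assumptions:
{code:java}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.common.StringInternUtils;

public class MapWorkSketch {

  private final LinkedHashMap<Path, ArrayList<String>> pathToAliases = new LinkedHashMap<>();

  // Every entry point that stores a Path should canonicalize the strings
  // inside its URI, not just setPathToAliases(); otherwise Paths added here
  // keep their own copies of the scheme/authority/path strings.
  public void addPathToAlias(Path path, String newAlias) {
    StringInternUtils.internUriStringsInPath(path);
    pathToAliases.computeIfAbsent(path, k -> new ArrayList<>(1)).add(newAlias.intern());
  }
}
{code}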
[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528883#comment-16528883 ] Misha Dmitriev commented on HIVE-19668: --- [~aihuaxu] I've checked the logs of failed tests, but couldn't find anything obviously related to my changes. However, my experience with Hive is not deep at all. May I ask you to check these logs before they disappear? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19937) Intern JobConf objects in Spark tasks
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528876#comment-16528876 ] Misha Dmitriev commented on HIVE-19937: --- [~stakiar] regarding the behavior of {{CopyOnFirstWriteProperties}} - such fine-grained behavior would be easy to implement. It would require changing the implementation of this class so that it holds pointers to two hashtables: one for properties that are specific/unique to the given instance of {{COFWP}}, and another with properties that are common/default for all instances of {{COFWP}}. Each get() call should first check the first (specific) hashtable and then the second (default) hashtable, and each put() call should work only with the first hashtable. This would make sense in a situation where there is a sufficiently large number of common properties, but every (or almost every) table also has some specific properties. In contrast, the current {{CopyOnFirstWriteProperties}} works best when most tables are exactly the same and only a few are different. Well, after writing all this I realize that the proposed new implementation of {{COFWP}} would probably be better in all scenarios. But before deciding on anything, we should definitely measure where the memory goes in realistic scenarios. Regarding interning only values in {{PartitionDesc#internProperties}}: yes, I think this was intentional - I carefully analyzed heap dumps before making this change, so if it had been worth interning the keys, I would have done that too. Most probably, when these tables are created, the key Strings already come from a source where they are interned. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
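A rough sketch of the two-table design described in the comment above - an illustration of the proposed lookup order, not the existing {{CopyOnFirstWriteProperties}}:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: per-instance overrides live in 'specific'; properties
// shared by all instances live in the single shared 'common' table. Reads
// fall through to 'common'; writes only ever touch 'specific', so 'common'
// can be shared safely across instances.
public class LayeredProperties {

  private final Map<String, String> specific = new ConcurrentHashMap<>();
  private final Map<String, String> common; // shared, treated as read-only

  public LayeredProperties(Map<String, String> common) {
    this.common = common;
  }

  public String get(String key) {
    String value = specific.get(key);
    return value != null ? value : common.get(key);
  }

  public void put(String key, String value) {
    specific.put(key, value); // copy-on-write at the granularity of one entry
  }
}
{code}
Incidentally, {{java.util.Properties}} itself supports this kind of fall-through lookup via its 'defaults' constructor argument.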
[jira] [Comment Edited] (HIVE-19937) Intern JobConf objects in Spark tasks
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527045#comment-16527045 ] Misha Dmitriev edited comment on HIVE-19937 at 6/29/18 2:24 AM: I took a quick look, and I am not sure this is done correctly. The code below {code:java} jobConf.forEach(entry -> { StringInternUtils.internIfNotNull(entry.getKey()); StringInternUtils.internIfNotNull(entry.getValue()); }){code} goes over each table entry and just invokes intern() for each key and value. {{intern()}} returns an existing, "canonical" string for each duplicate string. But the code doesn't store the returned strings back into the table. To intern both keys and values in a hashtable, you typically need to create a new table and effectively "intern and transfer" the contents from the old table to the new one. Sometimes it may be possible to be more creative and create a table with interned contents right away. Here that could probably be done by adding some custom Kryo deserialization code for such tables, but maybe that's too big an effort. As always, it would be good to measure how much memory was wasted before this change and saved after it. This helps to prevent errors and to see how much was actually achieved. If {{jobConf}} is an instance of {{java.lang.Properties}}, and there are many duplicates of such tables, then memory is wasted both by the string contents of these tables and by the tables themselves (each table uses many extra Java objects internally). So you may consider checking the {{org.apache.hadoop.hive.common.CopyOnFirstWriteProperties}} class that I once added for a somewhat similar use case.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
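The "intern and transfer" idiom described above, as a minimal self-contained sketch (a generic map rebuild, not the actual patch code):
{code:java}
import java.util.HashMap;
import java.util.Map;

public final class InternAndTransfer {

  // intern() returns the canonical String, but iterating and calling it
  // without storing the result changes nothing. Rebuilding the table with
  // the returned references is what actually drops the duplicate copies.
  static Map<String, String> internAll(Map<String, String> source) {
    Map<String, String> result = new HashMap<>((int) (source.size() / 0.75f) + 1);
    for (Map.Entry<String, String> e : source.entrySet()) {
      String key = e.getKey() == null ? null : e.getKey().intern();
      String value = e.getValue() == null ? null : e.getValue().intern();
      result.put(key, value);
    }
    return result;
  }
}
{code}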
[jira] [Commented] (HIVE-19937) Intern JobConf objects in Spark tasks
[ https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527045#comment-16527045 ] Misha Dmitriev commented on HIVE-19937: --- I took a quick look, and I am not sure this is done correctly. The code below {code:java} jobConf.forEach(entry -> { StringInternUtils.internIfNotNull(entry.getKey()); StringInternUtils.internIfNotNull(entry.getValue()); }){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: Patch Available (was: In Progress) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Attachment: HIVE-19668.02.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: In Progress (was: Patch Available) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Attachment: HIVE-19668.01.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Status: Patch Available (was: In Progress) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-19668 started by Misha Dmitriev. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Summary: Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings (was: 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19668) 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-19668: -- Description: expanded with the closing paragraph about duplicate strings that collectively waste 26.1% of memory (was: the same text without that paragraph) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19668) 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's
[ https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev reassigned HIVE-19668: -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470954#comment-16470954 ] Misha Dmitriev commented on HIVE-19041: --- Thank you for looking into the details, [~vihangk1]. I've checked the code and I agree with you - these strings are really short-lived. Furthermore, the worst offenders, org.apache.hadoop.hive.metastore.model.MStorageDescriptor.inputFormat and outputFormat, are actually copies of the corresponding fields of {{StorageDescriptor}}. That is, they reference the same string instances. So as soon as you intern the strings in {{StorageDescriptor}}, {{MStorageDescriptor}} will stop referencing duplicate strings as well. Thus, this patch is good to go from my perspective. > Thrift deserialization of Partition objects should intern fields > -- > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore > Affects Versions: 3.0.0, 2.3.2 > Reporter: Vihang Karajgaonkar > Assignee: Vihang Karajgaonkar > Priority: Major > Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch, HIVE-19041.03.patch, HIVE-19041.04.patch > > > When a client is creating a large number of partitions, the thrift objects are deserialized into Partition objects. The read method of these objects does not intern the inputformat, location, or outputformat, which causes a large number of duplicate Strings in the HMS memory. We should intern these objects during deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
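A small standalone illustration of the reference-sharing argument above: interning where the string is first created also fixes every object that merely copies the field reference (the classes here are stand-ins, not the real metastore model):
{code:java}
public final class SharedFieldDemo {

  static class StorageDescriptor { String inputFormat; }
  static class MStorageDescriptor { String inputFormat; }

  public static void main(String[] args) {
    StorageDescriptor sd = new StorageDescriptor();
    // new String(...) simulates a freshly deserialized duplicate; intern()
    // swaps it for the canonical copy.
    sd.inputFormat = new String("org.apache.hadoop.mapred.TextInputFormat").intern();

    // The model object copies the reference, not the characters, so it
    // automatically points at the same canonical String.
    MStorageDescriptor msd = new MStorageDescriptor();
    msd.inputFormat = sd.inputFormat;

    System.out.println(msd.inputFormat == sd.inputFormat); // true
  }
}
{code}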
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469733#comment-16469733 ] Misha Dmitriev commented on HIVE-19041: --- Agree - in our internal heap dump analysis, the above {{*Request}} classes don't show up at all as sources of duplicate strings. However, several class.field combinations that are not in this patch do show up. Namely: {{org.apache.hadoop.hive.metastore.model.MStorageDescriptor.inputFormat,outputFormat}} {{org.apache.hadoop.hive.metastore.model.MSerDeInfo.serializationLib}} {{org.apache.hadoop.hive.metastore.model.MPartition.partitionName,values}} They contribute relatively little overhead (about 3% together), but it's probably still worth interning them to be on the safe side. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch, > HIVE-19041.03.patch, HIVE-19041.04.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463236#comment-16463236 ] Misha Dmitriev commented on HIVE-19041: --- [~gopalv] yes, since JDK 1.7 the built-in string interning is far superior to the one based on a WeakHashMap (which the Hadoop weak interner used). Check the above article on interning for further details. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
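For contrast, the two mechanisms mentioned above can be sketched as follows; Guava's weak interner stands in here for the map-based Hadoop one, which is an assumption made for illustration.

{code}
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

public class InternComparison {
  public static void main(String[] args) {
    // Map-based weak interner: a user-level structure with wrapper
    // objects and weak references per entry.
    Interner<String> weakInterner = Interners.newWeakInterner();
    String viaMap = weakInterner.intern("some.value");

    // JDK 7+ String.intern(): the canonical copy lives in the JVM's own
    // GC-managed string table, with no per-entry wrapper objects.
    String viaJvm = "some.value".intern();

    System.out.println(viaMap.equals(viaJvm));  // true: same value either way
  }
}
{code}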
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463235#comment-16463235 ] Misha Dmitriev commented on HIVE-19041: --- Yes, all interned strings are kept in the JVM-internal equivalent of a concurrent WeakHashMap. Since it's highly specialized, it's very fast, and has no extra overhead when more strings are added to it (because it's quite large and preallocated, so actually every running JVM already bears this memory overhead of a few MB). If you are really interested, check this article: [http://java-performance.info/string-intern-in-java-6-7-8/] Basically, the only thing that you may be concerned with when using String.intern() is the CPU overhead. But in my experience, unless interning is mistakenly used for strings that are very short-lived anyway, the impact of reduced GC outweighs the impact of the extra CPU cycles consumed by the intern() call. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
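A one-line illustration of the guarantee being relied on here: equal strings interned from different sources collapse to one canonical object, so both the duplicate char[] arrays and the extra String headers become garbage.

{code}
// Minimal sketch of intern() semantics.
String a = new String(new char[] {'h', 'i', 'v', 'e'}).intern();
String b = "hive";  // string literals are interned by the JVM already
System.out.println(a == b);  // true: reference equality, not just equals()
{code}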
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462886#comment-16462886 ] Misha Dmitriev commented on HIVE-19041: --- [~vihangk1] does the jxray report show that comments (or something else that you haven't interned yet) waste a noticeable amount of memory? My experience is that it's often difficult to guess, so any real measured evidence is valuable. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16879) Improve Cache Key
[ https://issues.apache.org/jira/browse/HIVE-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400630#comment-16400630 ] Misha Dmitriev commented on HIVE-16879: --- I agree about the negligible CPU performance impact of String.intern(), especially when compared with the reduced heap size and GC time. Again, I think this is a good change, assuming that it's applied in the right place. However, my experience is that guessing doesn't always work when you try to determine where _exactly_ memory is wasted. Do you have access to some running Hive instances where you would expect this to be a problem? Then, at a minimum, you can run 'jmap -histo:live' to get the number of Key instances and roughly estimate the memory used by the strings that Keys reference. And the best thing would be to take a heap dump (jmap -dump:live,format=b,...) and analyze it with a tool, e.g. [www.jxray.com|http://www.jxray.com], that immediately tells you the memory overhead of duplicate strings. You will see right away whether Keys cause noticeable overhead, and/or what other classes cause it. > Improve Cache Key > - > > Key: HIVE-16879 > URL: https://issues.apache.org/jira/browse/HIVE-16879 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16879.1.patch, HIVE-16879.2.patch > > > Improve cache key for cache implemented in > {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}. > # Cache some of the key components themselves (db name, table name) using > the {{String}} intern method to conserve memory for repeated keys, to improve > the {{equals}} method as references can now be used for equality, and hashcodes > will be cached as well, as per the {{String}} class hashCode method. > # Upgrade _debug_ logging to not generate text unless required > # Changed _equals_ method to check first for the item most likely to be > different, column name -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-16879) Improve Cache Key
[ https://issues.apache.org/jira/browse/HIVE-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399668#comment-16399668 ] Misha Dmitriev commented on HIVE-16879: --- This looks like nice optimization work, assuming that the right things are optimized. Did you measure that the duplicate strings referenced by the fields of Key indeed waste a noticeable amount of memory? If yes, what tool did you use and can you share your findings? Is it really the case that dbName and tblName cause enough duplication to benefit from interning, but colName does not? > Improve Cache Key > - > > Key: HIVE-16879 > URL: https://issues.apache.org/jira/browse/HIVE-16879 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Trivial > Attachments: HIVE-16879.1.patch, HIVE-16879.2.patch > > > Improve cache key for cache implemented in > {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}. > # Cache some of the key components themselves (db name, table name) using > the {{String}} intern method to conserve memory for repeated keys, to improve > the {{equals}} method as references can now be used for equality, and hashcodes > will be cached as well, as per the {{String}} class hashCode method. > # Upgrade _debug_ logging to not generate text unless required > # Changed _equals_ method to check first for the item most likely to be > different, column name -- This message was sent by Atlassian JIRA (v7.6.3#76005)
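To ground the discussion, here is a simplified sketch of the Key described in the issue (the real class lives in {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}; the exact field handling below is an illustration, not the committed patch).

{code}
final class Key {
  private final String dbName;   // interned: repeated values share one instance
  private final String tblName;  // interned
  private final String colName;

  Key(String dbName, String tblName, String colName) {
    this.dbName = dbName.intern();
    this.tblName = tblName.intern();
    this.colName = colName;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) return true;
    if (!(other instanceof Key)) return false;
    Key that = (Key) other;
    // Check colName first: it's the component most likely to differ.
    // The interned fields can then be compared by reference.
    return colName.equals(that.colName)
        && dbName == that.dbName
        && tblName == that.tblName;
  }

  @Override
  public int hashCode() {
    // String caches its hash code after the first computation, so
    // repeated lookups with the same components stay cheap.
    return (dbName.hashCode() * 31 + tblName.hashCode()) * 31 + colName.hashCode();
  }
}
{code}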
[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead
[ https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363134#comment-16363134 ] Misha Dmitriev commented on HIVE-6430: -- Thank you [~akolb]! This is nice work of the kind I wish I could do more of :) > MapJoin hash table has large memory overhead > > > Key: HIVE-6430 > URL: https://issues.apache.org/jira/browse/HIVE-6430 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 0.14.0 > > Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, > HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, > HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, > HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, > HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, > HIVE-6430.14.patch, HIVE-6430.patch > > > Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 > for row) can take several hundred bytes, which is ridiculous. I am reducing > the size of MJKey and MJRowContainer in other jiras, but in general we don't > need to have java hash table there. We can either use primitive-friendly > hashtable like the one from HPPC (Apache-licenced), or some variation, to map > primitive keys to single row storage structure without an object per row > (similar to vectorization). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319247#comment-16319247 ] Misha Dmitriev commented on HIVE-17684: --- Hi [~stakiar], just a reminder that I cannot make further progress with this fix. It will probably be best if Hive gets upgraded to Hadoop 3.0.0 first per https://issues.apache.org/jira/browse/HIVE-18319, and then my patch is reapplied. > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16489) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313767#comment-16313767 ] Misha Dmitriev commented on HIVE-16489: --- Hi [~szita] - apparently yes, just closed it. > HMS wastes 26.4% of memory due to dup strings in > metastore.api.Partition.parameters > --- > > Key: HIVE-16489 > URL: https://issues.apache.org/jira/browse/HIVE-16489 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > > I've just analyzed an HMS heap dump. It turns out that it contains a lot of > duplicate strings, that waste 26.4% of the heap. Most of them come from > HashMaps referenced by > org.apache.hadoop.hive.metastore.api.Partition.parameters. Below is the > relevant section of the jxray (www.jxray.com) report. Looking at > Partition.java, I see that in the past somebody has already added code to > intern keys and values in the parameters table when it's first set up. > However, looks like when more key-value pairs are added, they are not > interned, and that probably explains the reason for all these duplicate > strings. > {code} > 6. DUPLICATE STRINGS > Total strings: 3,273,557 Unique strings: 460,390 Duplicate values: 110,232 > Overhead: 3,220,458K (26.4%) > Top duplicate strings: > Ovhd Num char[]s Num objs Value > 46,088K (0.4%) 58715871 > "HBa4rRAAGx2MEmludGVyZXN0cmF0ZXNwcmVhZBgM/wD/AP8AXqEAERYBFQAXIEAWuK0QAA1s > ...[length 4000]" > 46,088K (0.4%) 58715871 > "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJBgcGBAcLBQYGCAgGCQYG > ...[length 4000]" > ... > === > 7. REFERENCE CHAINS FOR DUPLICATE STRINGS > 2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing > arrays: > 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", > 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of > "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length > 3560]" > ... and 419200 more strings, of which 36376 are unique > Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", > 28 of "2", 21 of "0" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.ArrayList} <-- > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success > <-- Java Local > (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result) > [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots > 463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing > arrays: > 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of > "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980" > ... 
and 84009 more strings, of which 34065 are unique > Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 > of "2", 3 of "0" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68] > 233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays: > 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 > of "10", 623 of > "CQUJBQcFCAcGBwUFCgUIDAgEBwgFBQcHBwgGBwYEBQoLCggFCAYHBgcIBwkIDgcG ...[length > 4000]", 623 of > "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJ ...[length > 4000]", 623 of > "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length > 3560]", 623 of > "AAMAAAEAAAEAAQABAAEHAwAKAgAEAwAAAgAEAAMD ...[length > 4000]" > ... and 44568 more strings, of which 27285 are unique > Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of > "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.ArrayList} <-- Java Local (j.u.ArrayList) > [@4f4cfbd10,@536122408,@726616778] > ... > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
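The gap described above, interning at construction time but not on later additions, could be closed along these lines; {{putToParameters}} is the Thrift-generated adder on {{Partition}}, and the interning shown is a sketch rather than the committed fix.

{code}
// Sketch: intern both key and value whenever a parameter is added after
// construction, mirroring what the initial setup code already does.
public void putToParameters(String key, String val) {
  if (this.parameters == null) {
    this.parameters = new java.util.HashMap<String, String>();
  }
  this.parameters.put(key.intern(), val.intern());
}
{code}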
[jira] [Resolved] (HIVE-16489) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev resolved HIVE-16489. --- Resolution: Duplicate > HMS wastes 26.4% of memory due to dup strings in > metastore.api.Partition.parameters > --- > > Key: HIVE-16489 > URL: https://issues.apache.org/jira/browse/HIVE-16489 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > > I've just analyzed an HMS heap dump. It turns out that it contains a lot of > duplicate strings, that waste 26.4% of the heap. Most of them come from > HashMaps referenced by > org.apache.hadoop.hive.metastore.api.Partition.parameters. Below is the > relevant section of the jxray (www.jxray.com) report. Looking at > Partition.java, I see that in the past somebody has already added code to > intern keys and values in the parameters table when it's first set up. > However, looks like when more key-value pairs are added, they are not > interned, and that probably explains the reason for all these duplicate > strings. > {code} > 6. DUPLICATE STRINGS > Total strings: 3,273,557 Unique strings: 460,390 Duplicate values: 110,232 > Overhead: 3,220,458K (26.4%) > Top duplicate strings: > Ovhd Num char[]s Num objs Value > 46,088K (0.4%) 58715871 > "HBa4rRAAGx2MEmludGVyZXN0cmF0ZXNwcmVhZBgM/wD/AP8AXqEAERYBFQAXIEAWuK0QAA1s > ...[length 4000]" > 46,088K (0.4%) 58715871 > "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJBgcGBAcLBQYGCAgGCQYG > ...[length 4000]" > ... > === > 7. REFERENCE CHAINS FOR DUPLICATE STRINGS > 2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing > arrays: > 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", > 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of > "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length > 3560]" > ... and 419200 more strings, of which 36376 are unique > Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", > 28 of "2", 21 of "0" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.ArrayList} <-- > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success > <-- Java Local > (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result) > [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots > 463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing > arrays: > 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of > "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980" > ... 
and 84009 more strings, of which 34065 are unique > Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 > of "2", 3 of "0" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68] > 233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays: > 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 > of "10", 623 of > "CQUJBQcFCAcGBwUFCgUIDAgEBwgFBQcHBwgGBwYEBQoLCggFCAYHBgcIBwkIDgcG ...[length > 4000]", 623 of > "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJ ...[length > 4000]", 623 of > "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length > 3560]", 623 of > "AAMAAAEAAAEAAQABAAEHAwAKAgAEAwAAAgAEAAMD ...[length > 4000]" > ... and 44568 more strings, of which 27285 are unique > Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of > "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.ArrayList} <-- Java Local (j.u.ArrayList) > [@4f4cfbd10,@536122408,@726616778] > ... > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299373#comment-16299373 ] Misha Dmitriev commented on HIVE-17684: --- [~stakiar] How do I run these {{TestSparkCliDriver}} tests? Note the "Spark" thing - looks like they are different from TestCliDriver tests that I know how to run. Also in the test report all these test names look the same. In the mean time, I've just tried the first two of the failed {{TestCliDriver}} tests locally. The first one passed for me. The second one keeps failing, with the error below. This looks confusing. Note that the error message mentions "exhausted memory". However, I cannot find any exception stack traces in the hive.log file. Please advise. {code} [INFO] --- [INFO] T E S T S [INFO] --- [INFO] Running org.apache.hadoop.hive.cli.TestCliDriver [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 102.789 s <<< FAILURE! - in org.apache.hadoop.hive.cli.TestCliDriver [ERROR] testCliDriver[auto_join_without_localtask](org.apache.hadoop.hive.cli.TestCliDriver) Time elapsed: 21.46 s <<< FAILURE! java.lang.AssertionError: Client Execution succeeded but contained differences (error code = 1) after executing auto_join_without_localtask.q 1047a1048,1053 > Hive Runtime Error: Map local work exhausted memory > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask > ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.mr.MapRedTask > Hive Runtime Error: Map local work exhausted memory > FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask > ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.mr.MapRedTask 1054c1060 < RUN: Stage-9:MAPRED --- > RUN: Stage-1:MAPRED 1057c1063 < RUN: Stage-6:MAPRED --- > RUN: Stage-2:MAPRED at org.junit.Assert.fail(Assert.java:88) at org.apache.hadoop.hive.ql.QTestUtil.failedDiff(QTestUtil.java:2244) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:183) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at 
org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.junit.runners.Suite.runChild(Suite.java:127) at org.junit.runners.Suite.runChild(Suite.java:26) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297728#comment-16297728 ] Misha Dmitriev commented on HIVE-17684: --- I've fixed some checkstyle warnings (several others, e.g. about indentation, seem strange, so I'd rather ignore them). I've looked at the test failures and I am not sure how to debug this. Note that 3 previous runs of the same jenkins build have 14..16 test failures each, so I suspect something is already broken here. In my case there are a lot more test failures, but I suspect that most of them are irrelevant given that my change is quite small. I wonder if my update of the hadoop dependency could have such an effect? > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Status: Patch Available (was: In Progress) > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Attachment: HIVE-17684.02.patch > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Status: In Progress (was: Patch Available) > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296040#comment-16296040 ] Misha Dmitriev commented on HIVE-17684: --- Thank you for taking a look, [~stakiar]. Yes, naturally this code builds for me locally: {code} $ mvn clean install -DskipTests ... [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 03:58 min [INFO] Finished at: 2017-12-18T13:12:43-08:00 [INFO] Final Memory: 369M/2219M [INFO] {code} The error in this build looks somewhat strange in that it mentions datanucleus. Another strange thing that I see in the console log is a few lines above: {code} error: a/pom.xml: does not exist in index error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java: does not exist in index error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java: does not exist in index Going to apply patch with: git apply -p1 {code} I had a suspicion that maybe my local code base is too far behind, so I've just run 'git fetch; git rebase' - this reapplied my change without problems. So I am not sure what's going on here. > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Attachment: HIVE-17684.01.patch > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17684: -- Status: Patch Available (was: In Progress) > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > Attachments: HIVE-17684.01.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev reassigned HIVE-17684: - Assignee: Misha Dmitriev (was: Sahil Takiar) > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Work started] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-17684 started by Misha Dmitriev. - > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Misha Dmitriev > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler
[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254068#comment-16254068 ] Misha Dmitriev commented on HIVE-17684: --- The problem with {{MapJoinMemoryExhaustionHandler}} is a large percentage of false alarms about memory exhaustion. Without it, however, Hive may go into a "GC death spiral", where the JVM runs back-to-back full GCs yet doesn't actually fail with an OOM for a long time. Because user threads are unable to run most of the time, the executor stops responding, and the Spark driver eventually drops it. This results in hard-to-debug failures, because from the logs it's not clear why the executor stopped responding. I recently added the new https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GcTimeMonitor.java class to Hadoop, which allows the user to accurately monitor the percentage of time that the JVM spends in GC. When this percentage grows above ~50% over a 1-minute period, it's almost always a signal that the JVM is in the above "GC death spiral". Even if it is not, extremely long GC pauses are very bad for performance, and it makes sense to treat them in the same way as an OOM, i.e. fail the task and ask the user to increase their executors' heap size. I ran some experiments where I replaced MapJoinMemoryExhaustionHandler with a check of the GC time percentage reported by GcTimeMonitor, and it worked well. GcTimeMonitor will become available for other projects when Hadoop 3.0.0-GA is released (which, according to Hadoop developers, should happen in a few weeks). Currently Hive depends on Hadoop 3.0.0-beta1, so to use GcTimeMonitor in Hive, we will need to change this dependency to Hadoop GA. Are there any objections against: (a) the dependency change from Hadoop 3.0.0-beta1 to 3.0.0-GA (b) replacing MapJoinMemoryExhaustionHandler with GcTimeMonitor ? > HoS memory issues with MapJoinMemoryExhaustionHandler > - > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. 
However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
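As a sketch of how the proposed replacement could be wired up: the constructor below follows {{org.apache.hadoop.util.GcTimeMonitor}} in Hadoop 3.0 (observation window, sampling interval, alert threshold, alert handler), while the watchdog class, the 50% threshold, and the error message are illustrative assumptions rather than the committed patch.

{code}
import org.apache.hadoop.util.GcTimeMonitor;

public class GcOverheadWatchdog {
  private volatile boolean gcOverloaded = false;

  public void start() {
    GcTimeMonitor monitor = new GcTimeMonitor(
        60_000,  // observation window: 1 minute
        1_000,   // sampling interval: 1 second
        50,      // alert when >50% of the window is spent in GC
        gcData -> gcOverloaded = true);  // invoked on the monitor thread
    monitor.start();  // GcTimeMonitor is a Thread
  }

  // Called periodically from the hash table build loop; fails fast with a
  // clear message instead of letting the executor hang in back-to-back GCs.
  public void check() {
    if (gcOverloaded) {
      throw new RuntimeException(
          "JVM spent >50% of the last minute in GC; increase the executor heap size");
    }
  }
}
{code}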
[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated HIVE-17237: -- Status: Patch Available (was: In Progress) Just rebased the change and submitted the new patch. > HMS wastes 26.4% of memory due to dup strings in > metastore.api.Partition.parameters > --- > > Key: HIVE-17237 > URL: https://issues.apache.org/jira/browse/HIVE-17237 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: HIVE-17237.01.patch, HIVE-17237.02.patch > > > I've analyzed a heap dump from a production Hive installation using jxray > (www.jxray.com) It turns out that there are a lot of duplicate strings in > memory, that waste 26.4% of the heap. Most of them come from HashMaps > referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. > Below is the relevant section of the jxray report. > Looking at Partition.java, I see that in the past somebody has already added > code to intern keys and values in the parameters table when it's first set > up. However, when more key-value pairs are added, they are not interned, and > that probably explains the reason for all these duplicate strings. Also when > a Partition instance is deserialized, no interning of parameters is currently > done. > {code} > 6. DUPLICATE STRINGS > Total strings: 3,273,557 Unique strings: 460,390 Duplicate values: 110,232 > Overhead: 3,220,458K (26.4%) > > === > 7. REFERENCE CHAINS FOR DUPLICATE STRINGS > 2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing > arrays: > 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", > 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of > "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length > 3560]" > ... and 419200 more strings, of which 36376 are unique > Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", > 28 of "2", 21 of "0" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.ArrayList} <-- > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success > <-- Java Local > (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result) > [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots > 463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing > arrays: > 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of > "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980" > ... and 84009 more strings, of which 34065 are unique > Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 > of "2", 3 of "0" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68] > 233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays: > 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 > of "10" ... and 44568 more strings, of which 27285 are unique > Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of > "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3" > <-- {j.u.HashMap}.values <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.ArrayList} <-- Java Local (j.u.ArrayList) > [@4f4cfbd10,@536122408,@726616778] > ... 
> 52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays: > <-- {j.u.HashMap}.keys <-- > org.apache.hadoop.hive.metastore.api.Partition.parameters <-- > {j.u.ArrayList} <-- > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success > <-- Java Local > (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result) > [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17237:
----------------------------------
    Attachment: HIVE-17237.02.patch

> HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
> ------------------------------------------------------------------------------------
>
>          Key: HIVE-17237
>          URL: https://issues.apache.org/jira/browse/HIVE-17237
>      Project: Hive
>   Issue Type: Improvement
>   Components: HiveServer2
>     Reporter: Misha Dmitriev
>     Assignee: Misha Dmitriev
>  Attachments: HIVE-17237.01.patch, HIVE-17237.02.patch
>
> I've analyzed a heap dump from a production Hive installation using jxray (www.jxray.com). It turns out that there are a lot of duplicate strings in memory, wasting 26.4% of the heap. Most of them come from HashMaps referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. The relevant section of the jxray report is below.
>
> Looking at Partition.java, I see that somebody has already added code to intern the keys and values in the parameters table when it is first set up. However, key-value pairs added later are not interned, which probably explains all these duplicate strings. Also, when a Partition instance is deserialized, no interning of parameters is currently done.
>
> {code}
> 6. DUPLICATE STRINGS
>
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  Overhead: 3,220,458K (26.4%)
>
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>
> 2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 28 of "2", 21 of "0"
>  <-- {j.u.HashMap}.values <-- org.apache.hadoop.hive.metastore.api.Partition.parameters <-- {j.u.ArrayList} <-- org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success <-- Java Local (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result) [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>
> 463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 of "2", 3 of "0"
>  <-- {j.u.HashMap}.values <-- org.apache.hadoop.hive.metastore.api.Partition.parameters <-- {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>
> 233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <-- {j.u.HashMap}.values <-- org.apache.hadoop.hive.metastore.api.Partition.parameters <-- {j.u.ArrayList} <-- Java Local (j.u.ArrayList) [@4f4cfbd10,@536122408,@726616778]
> ...
>
> 52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <-- {j.u.HashMap}.keys <-- org.apache.hadoop.hive.metastore.api.Partition.parameters <-- {j.u.ArrayList} <-- org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success <-- Java Local (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result) [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
> {code}
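To make the proposed fix concrete, here is a minimal sketch of interning keys and values on every insertion into the parameters map, not only when it is first set up. This is an illustration, not the actual patch; the class and method names are hypothetical:

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: intern both key and value each time a pair enters
// Partition.parameters, not only when the map is first populated.
public class PartitionParameters {
  private final Map<String, String> parameters = new HashMap<>();

  // Called for every key-value pair, including pairs added after construction
  // and pairs read back when a Partition instance is deserialized.
  public void addParameter(String key, String value) {
    parameters.put(key == null ? null : key.intern(),
        value == null ? null : value.intern());
  }
}
{code}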
[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17237:
----------------------------------
    Status: In Progress  (was: Patch Available)
[jira] [Commented] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116963#comment-16116963 ]

Misha Dmitriev commented on HIVE-17237:
---------------------------------------
This is to save memory and improve performance. String.intern() has always been the "official" solution to the string duplication problem. However, until JDK 7 it was not very scalable, which forced people to write their own interners based on WeakHashMap or ConcurrentHashMap. But, as we know, these data structures are not at all economical in terms of memory: there is an overhead of 32 bytes or more per interned string. Starting with JDK 7, Sun/Oracle finally paid attention and made several improvements to String.intern() that greatly improved its performance. The internal hashtable used by String.intern() is also much more economical in terms of memory, and it is preallocated. So since JDK 7 it has become counterproductive to use custom string interners.
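For comparison, a minimal sketch of the kind of custom interner the comment refers to, next to the plain String.intern() call that supersedes it on JDK 7+ (the class name is illustrative):

{code}
import java.util.concurrent.ConcurrentHashMap;

// The pre-JDK-7 workaround: a user-level interner. Every entry pays for a
// ConcurrentHashMap node plus a table slot, i.e. 32+ bytes per interned string.
public class CustomInterner {
  private final ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();

  public String intern(String s) {
    String existing = map.putIfAbsent(s, s);
    return existing != null ? existing : s;
  }
}

// On JDK 7+, the JVM's own intern table is preallocated and far more compact,
// so the plain call is preferable:
//   String canonical = s.intern();
{code}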
[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17237:
----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-17237:
----------------------------------
    Attachment: HIVE-17237.01.patch
[jira] [Assigned] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters
[ https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev reassigned HIVE-17237:
-------------------------------------
[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987241#comment-15987241 ]

Misha Dmitriev commented on HIVE-16079:
---------------------------------------
Thank you [~spena]. Yes, this patch contains changes that are not backward compatible with JDK versions earlier than 8, so I guess it can only be committed to master.

> HS2: high memory pressure due to duplicate Properties objects
> -------------------------------------------------------------
>
>          Key: HIVE-16079
>          URL: https://issues.apache.org/jira/browse/HIVE-16079
>      Project: Hive
>   Issue Type: Improvement
>   Components: HiveServer2
>     Reporter: Misha Dmitriev
>     Assignee: Misha Dmitriev
>      Fix For: 3.0.0
>
>  Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
> I've created a Hive table with 2000 partitions, each backed by two files, with one row in each file. When I execute a number of concurrent queries against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in an HS2 server with -Xmx200m, and with 50 queries - in one with -Xmx500m.
>
> I am attaching the results of a jxray (www.jxray.com) analysis of a heap dump generated in the 50-queries/500m-heap scenario. It suggests that there are several opportunities to reduce memory pressure with not very invasive changes to the code. One (duplicate strings) has been addressed in https://issues.apache.org/jira/browse/HIVE-15882. In this ticket, I am going to address the fact that almost 20% of memory is used by instances of java.util.Properties. These objects are highly duplicated, since for each partition each concurrently running query creates its own copy of Partition, PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) Properties objects in memory. By interning/deduplicating these objects we may be able to save perhaps 15% of memory.
>
> Note, however, that if there are queries that mutate partitions, the corresponding Properties would be mutated as well. Thus we cannot simply use a single "canonicalized" Properties object at all times for all Partition objects representing the same DB partition. Instead, I am going to introduce a special CopyOnFirstWriteProperties class. Such an object initially references a canonicalized Properties object internally, and keeps doing so while only read methods are called. However, once any mutating method is called, the given CopyOnFirstWriteProperties copies the data from the canonicalized table into its own table, and uses it from then on.
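A minimal sketch of the copy-on-first-write idea described above (this is an illustration, not the actual Hive patch; a real implementation must override every read and write method of Properties, only a few are shown here):

{code}
import java.util.Properties;

public class CopyOnFirstWriteProperties extends Properties {
  // Shared, canonicalized instance; set to null once this object is mutated.
  private Properties interned;

  public CopyOnFirstWriteProperties(Properties canonical) {
    this.interned = canonical;
  }

  @Override
  public String getProperty(String key) {
    Properties p = interned;
    return p != null ? p.getProperty(key) : super.getProperty(key);
  }

  @Override
  public synchronized Object put(Object key, Object value) {
    copyOnFirstWrite();
    return super.put(key, value);
  }

  // On the first mutating call, copy the shared data into this object's own
  // table and detach from the canonical instance.
  private synchronized void copyOnFirstWrite() {
    if (interned != null) {
      super.putAll(interned);
      interned = null;
    }
  }
}
{code}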
[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981588#comment-15981588 ]

Misha Dmitriev commented on HIVE-16079:
---------------------------------------
The vector_if_expr test fails in pretty much every Hive build. accumulo_index fails in every second or third build, so it looks flaky as well.
[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979322#comment-15979322 ]

Misha Dmitriev commented on HIVE-16079:
---------------------------------------
Uploaded the third patch, which addresses Sergio's comments.
[jira] [Updated] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects
[ https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HIVE-16079:
----------------------------------
    Attachment: HIVE-16079.03.patch