[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-25 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627833#comment-16627833
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Hi [~KaiXu], yes, your issue is the same one that we are trying to improve here.

Note, however, that even if/when this change is integrated, 
{{MapJoinMemoryExhaustionException}} is not guaranteed to go away in your use 
case. That is, your Spark executor may really not have enough memory to 
process the given table, in which case the exception is thrown correctly. But this 
change should reduce the chance of the exception being thrown in the wrong 
circumstances, when there is actually enough memory.

In the meantime, a workaround is to increase the JVM heap for your Spark 
executors.
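For example, in Hive on Spark the executor heap is controlled by the usual Spark 
property, which can be set from the Hive session (the value below is just a 
placeholder - pick one that fits your cluster):
{code}
set spark.executor.memory=8g;
{code}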

[~stakiar] it looks like the last patch finally passed tests, so it can be 
integrated?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch, HIVE-17684.10.patch, HIVE-17684.11.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
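(For illustration, the check described above boils down to roughly the sketch below. 
This is a simplified illustration, not the actual Hive code; the class and parameter 
names are placeholders, and the threshold stands in for the 
hive.mapjoin.*.max.memory.usage settings.)
{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Simplified sketch of the heap-usage check described above.
public class MemoryExhaustionCheckSketch {
  private static final MemoryMXBean MEMORY_BEAN = ManagementFactory.getMemoryMXBean();

  // maxMemoryUsage stands in for hive.mapjoin.localtask.max.memory.usage (e.g. 0.90)
  static void checkMemoryStatus(double maxMemoryUsage, long numRows) {
    long used = MEMORY_BEAN.getHeapMemoryUsage().getUsed(); // includes garbage not yet collected
    long max = MEMORY_BEAN.getHeapMemoryUsage().getMax();
    double percentage = (double) used / max;
    if (percentage > maxMemoryUsage) {
      // this is the point where Hive would throw MapJoinMemoryExhaustionError
      throw new Error("Map local work exhausted memory: used " + percentage
          + " of heap (limit " + maxMemoryUsage + ") after " + numRows + " rows");
    }
  }
}
{code}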



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-20 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622437#comment-16622437
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] there is once again a strange compilation failure. Please take a 
look.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch, HIVE-17684.10.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-19 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: In Progress)

Resubmitting the same patch; I hope there will be no unrelated compilation error 
this time.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch, HIVE-17684.10.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-19 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.10.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch, HIVE-17684.10.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-19 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.10.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-19 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: (was: HIVE-17684.10.patch)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-19 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: In Progress  (was: Patch Available)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-16 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16616608#comment-16616608
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] there seems to be an unrelated compilation failure. What do we do 
now?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-14 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: In Progress)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-14 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: In Progress  (was: Patch Available)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-14 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.09.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, 
> HIVE-17684.09.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-14 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615513#comment-16615513
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] it looks like the test failures are due to a silly 
NumberFormatException somewhere in the new code. I'll see if I can fix that.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-14 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615186#comment-16615186
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Probably {{MapJoinMemoryExhaustionHandler}} just doesn't get triggered in 
HoMR... but OK, I agree that in such a difficult-to-test area it's safer to be 
conservative and avoid changing things unless they are definitely broken.

Let's wait for the test results for your latest patch, then. I guess, depending 
on the outcome, it may be submitted as-is or may need some small finishing 
touches.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-13 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614242#comment-16614242
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] this looks generally good. I noticed one or two typos; will take a 
deeper look into the code tomorrow.

The main question is: did you actually see the old 
{{MapJoinMemoryExhaustionHandler}} working correctly, at least with Hive-on-MR? 
My impression so far is that it's simply broken, because the JVM cannot report 
the exact amount of used memory (i.e. memory occupied by non-garbage objects), 
except maybe right after a full GC. If so, then keeping this code around will 
mean adding more "cruft" to Hive, which is already not the cleanest code base, 
and making the job of maintainers, yourself included, harder. Please let me know 
what you think.
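(To illustrate the point about used memory: in a toy program like the one below, the 
value reported right after heavy allocation is typically much higher than the value 
reported after a full GC, even though the amount of live data is the same. This is 
just an illustration, not Hive code, and {{System.gc()}} is only a hint to the JVM.)
{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Toy demonstration: getUsed() counts garbage that has not been collected yet.
public class UsedHeapDemo {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    for (int i = 0; i < 1_000_000; i++) {
      byte[] garbage = new byte[1024]; // allocated and immediately dropped
    }
    long usedBeforeGc = bean.getHeapMemoryUsage().getUsed();
    System.gc(); // only a hint, but usually triggers a full collection here
    long usedAfterGc = bean.getHeapMemoryUsage().getUsed();
    // usedBeforeGc is typically much larger than usedAfterGc, although the
    // amount of live (non-garbage) data is the same at both points.
    System.out.printf("before GC: %d MB, after GC: %d MB%n",
        usedBeforeGc >> 20, usedAfterGc >> 20);
  }
}
{code}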

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-13 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613115#comment-16613115
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] it looks like with the latest patch, which uses the reliable GC time 
monitoring implementation, the remaining 5 failed tests are those that actually 
expect MapJoinMemoryExhaustionError to be thrown. To get this error thrown 
reliably, we probably need to set the {{hive.mapjoin.max.gc.time.percentage}} 
configuration property to something like 1. However, I am not sure how to set 
it just for the few selected tests - please advise.
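If per-test config overrides work the usual way here, maybe something like the 
following at the top of the affected .q files would do it (just a guess on my 
side; the value 1 follows the suggestion above):
{code}
set hive.mapjoin.max.gc.time.percentage=1;
{code}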

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-11 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: Open)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-11 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.07.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch, HIVE-17684.07.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-11 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Open  (was: Patch Available)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-10 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609712#comment-16609712
 ] 

Misha Dmitriev commented on HIVE-17684:
---

I've made the above changes - that was easy. However, when I tried to run some 
tests locally, they failed. It turns out that the 
{{org.apache.hadoop.util.GcTimeMonitor}} constructor has a 
{{Preconditions.checkArgument(maxGcTimePercentage <= 100)}} check. It turns out 
that sanity checks sometimes have unwanted effects...

In this situation, it looks like the fastest way to address the problem is to 
copy the above GcTimeMonitor class into the Hive code base, and then modify it 
so that instead of the {{GarbageCollectorMXBean}} it uses the older, 
battle-tested mechanism that is the same as in the existing {{JvmPauseMonitor}} 
class. Does that make sense?
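(For reference, that mechanism boils down to measuring how much longer a short 
sleep takes than requested - the excess is time the JVM was paused, mostly by GC. 
A bare-bones sketch, not the actual JvmPauseMonitor/GcTimeMonitor code:)
{code:java}
// Bare-bones sketch of sleep-based pause measurement; the real JvmPauseMonitor
// keeps more bookkeeping (warn/info thresholds, sliding windows, etc.).
public class PauseTimeSketch implements Runnable {
  private static final long SLEEP_MS = 500;
  private final long startMs = System.currentTimeMillis();
  private volatile long totalPauseMs = 0;

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      long before = System.currentTimeMillis();
      try {
        Thread.sleep(SLEEP_MS);
      } catch (InterruptedException e) {
        return;
      }
      long extraMs = System.currentTimeMillis() - before - SLEEP_MS;
      if (extraMs > 0) {
        totalPauseMs += extraMs; // time the JVM was stopped (mostly GC) during the sleep
      }
    }
  }

  /** Percentage of wall-clock time spent paused since this monitor started. */
  public int pauseTimePercentage() {
    long elapsedMs = System.currentTimeMillis() - startMs;
    return elapsedMs <= 0 ? 0 : (int) (100 * totalPauseMs / elapsedMs);
  }
}
{code}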

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-06 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606717#comment-16606717
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] I've also checked a number of logs, and it looks like all or most 
failed tests indeed fail because a GC time at or above 100 percent is reported.

I suspect that this may be due to the fact that the underlying 
{{org.apache.hadoop.util.GcTimeMonitor}} class uses the JMX API (class 
{{java.lang.management.GarbageCollectorMXBean}}) internally. I remember seeing 
or reading that this API may sometimes wrongly report e.g. the time spent in the 
concurrent object-marking GC phase (which doesn't pause the JVM) alongside the 
normal pause time. If so, these percentages at or above 100 would be 
explained by incorrect accounting. This seems to happen only under heavy GC; 
I didn't see it in benchmarks in normal mode or when running these tests locally.

I guess that in these circumstances the right solution would be to make 
{{criticalGcTimePercentage}} a configurable variable in HiveConf, and get rid 
of the current {{CRITICAL_GC_TIME_PERCENTAGE_TEST/PROD}} constants. If 
{{criticalGcTimePercentage}} can be configured separately in normal tests (to 
something like 1000% to be on the safe side), in tests that are expected to 
fail (to 0%), and in prod (probably to something like the current 50% plus 
possibly some "safety margin"), then hopefully things will work as expected 
in all cases. But can it be configured in that way?

Longer term, if we see evidence that the current implementation of 
GcTimeMonitor miscalculates the GC time considerably, we can switch to the 
existing {{org.apache.hadoop.hive.common.JvmPauseMonitor}}, which uses a 
different method of estimating the GC time. That class can simply be extended 
to report the GC time as a percentage for our purposes.
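(For reference, the JMX-based accounting in question boils down to roughly the 
sketch below - summing {{GarbageCollectorMXBean#getCollectionTime()}} across 
collectors and relating the delta to wall-clock time. Whether concurrent phases 
end up in that counter depends on the particular collector, which is exactly the 
suspected problem. This is a simplified illustration, not the actual GcTimeMonitor 
code, which keeps a sliding window of samples.)
{code:java}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Simplified sketch of GC-time-percentage accounting via JMX.
public class GcTimePercentageSketch {
  private long lastGcTimeMs = totalGcTimeMs();
  private long lastSampleMs = System.currentTimeMillis();

  private static long totalGcTimeMs() {
    long total = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long t = gc.getCollectionTime(); // may or may not include concurrent GC phases
      if (t > 0) {
        total += t;
      }
    }
    return total;
  }

  /** GC time as a percentage of wall-clock time since the previous call. */
  public int sampleGcTimePercentage() {
    long nowMs = System.currentTimeMillis();
    long gcTimeMs = totalGcTimeMs();
    long elapsedMs = nowMs - lastSampleMs;
    int percentage = elapsedMs <= 0 ? 0 : (int) (100 * (gcTimeMs - lastGcTimeMs) / elapsedMs);
    lastSampleMs = nowMs;
    lastGcTimeMs = gcTimeMs;
    return percentage;
  }
}
{code}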

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, 
> HIVE-17684.06.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-09-06 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606351#comment-16606351
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] thank you for your patch; I've just applied it locally.

After your updates, there are still 34 failed tests in the last run above. May 
I ask you to take a quick look at why they failed? (I mean those tests that 
don't expect the "exhausted memory" outcome.)

In the meantime, I'll see how to make these tests behave the same way with the 
new memory handler.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch
>
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following ratio to estimate 
> how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value it returns includes all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-08-14 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580366#comment-16580366
 ] 

Misha Dmitriev commented on HIVE-17684:
---

I've resumed working on this, and finally found the reason for the mysterious 
failure described in my previous comment. It turns out that the following test 
outputs:

./clientpositive/auto_join25.q.out
./clientpositive/auto_join_without_localtask.q.out
./clientpositive/infer_bucket_sort_convert_join.q.out
./clientpositive/mapjoin_hook.q.out

actually _do_ contain the "Hive Runtime Error: Map local work exhausted memory" 
string, i.e. they expect the old memory exhaustion check to be triggered. With 
my new code, this check doesn't get triggered anymore (probably correctly), so 
these tests fail. I think a sufficient level of Hive expertise is needed to 
determine what the expected output should look like in the new situation, so I 
would prefer to leave it to you [~stakiar].

But it looks like we still have a high number of failing CLI tests, which likely 
fail because the "in tests" flag is not set when they run, and thus my new high 
GC time check gets triggered. For these tests to pass, I guess we either need to 
make all CLI tests set the "in tests" flag somehow, or find another JVM property 
or something that is set only when these tests run. Any suggestions?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-31 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16564366#comment-16564366
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] I tried to debug it, but I'm not sure what's going on. The following 
command

{{mvn test -Dtest=TestCliDriver 
-Dqfile=auto_join_without_localtask.q,acid_mapjoin.q,auto_join33.q}}

fails for me locally with the message below, but when I try to add some extra 
logging to my code, rebuild Hive and rerun, somehow it doesn't look like my 
changes get applied. I am going to leave for vacation in a few hours, so won't 
be able to spend more time on this now. Please feel free to debug it yourself 
or wait until I am back in ~9 days.
{code:java}
Hive Runtime Error: Map local work exhausted memory
> FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> Hive Runtime Error: Map local work exhausted memory
> FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> ATTEMPT: Execute BackupTask: 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask{code}

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-30 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562481#comment-16562481
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Actually, it looks like some of these CLI tests fail for me locally, even though I 
don't think there is much memory pressure on Hive. I need to understand what's going on.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-30 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562445#comment-16562445
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] a large number of CLI tests failed because of 
MapJoinMemoryExhaustionError. I suspect that in CLI tests the Hive server runs 
just as usual, with the "in tests" flag not set?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-27 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: In Progress)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-27 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.04.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-27 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: In Progress  (was: Patch Available)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch, HIVE-17684.04.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-25 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556103#comment-16556103
 ] 

Misha Dmitriev commented on HIVE-17684:
---

I agree that any fixing/reconfiguring of tests should be done in a separate 
patch. I would also agree that tests that exercise GcTimeMonitor should not be 
used as an indirect way to verify that the test infrastructure is working well 
- it would be too confusing. But perhaps it's worth creating a Jira about 
better monitoring of the Hive test JVMs or some such.

So, for now we should either not run this code in tests, or run it with a 
rather high GC time threshold. But how does any code in Hive find out whether 
it's running in tests or in production? So far I've found 
{{HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TESTS)}}. But from looking 
into the source code I am not sure it always works as intended, and access to a 
Configuration instance needed by the call above seems to be tricky in 
Operator.java - this thing is passed in quite late, in the initialize() method. 
Probably I can work around this, but I would like to confirm first that this 
{{HIVE_IN_TESTS}} is the right thing to use here.
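
For reference, a minimal sketch of what such a check could look like once the 
Configuration becomes available in Operator.initialize(). This is not the actual 
patch; the constant is written below as {{HIVE_IN_TEST}}, which may differ from the 
{{HIVE_IN_TESTS}} name mentioned above, so whichever constant exists in the HiveConf 
version at hand should be used.

{code:java}
// Sketch only, assuming the "in tests" flag discussed above is read via
// HiveConf.getBoolVar(Configuration, ConfVars). The exact ConfVars constant name
// should be double-checked against the HiveConf version in use.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;

public class GcCheckGate {
  private volatile boolean inTest = false;

  // Called from a place like Operator.initialize(Configuration conf), where
  // the Configuration instance is first passed in.
  public void initialize(Configuration conf) {
    inTest = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST);
  }

  // In tests, either skip the aggressive GC-time check or use a much higher threshold.
  public boolean shouldRunAggressiveGcCheck() {
    return !inTest;
  }
}
{code}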

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-24 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555039#comment-16555039
 ] 

Misha Dmitriev commented on HIVE-17684:
---

But it looks like when these qtests run sequentially (as they probably do by 
default), on a single JVM with a 2GB heap, there is no GC overload and only 
medium-to-low CPU utilization, but super-slow execution speed. So I suspect 
that on Jenkins the tests (within a single 'mvn test -Dtest=TestCliDriver') 
somehow run in parallel (and maybe they are configured to use a smaller heap as well). 
Is it possible to run qtests in parallel? Can we check how much memory 
is given on Jenkins to the JVM running them?

We can, of course, reduce the aggressiveness of the GC monitor in unit tests. 
But if it turns out that, unknown to everyone, these tests just waste a lot of time in 
GC, then a better solution would be to reduce parallelism or increase the heap size. 
They would probably run faster as a result.
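
As an illustration of what the GC monitor measures, here is a small sketch of 
sampling the fraction of wall-clock time spent in GC via the standard JMX beans. 
This is not the GcTimeMonitor used in the patch, just the general idea behind a 
"too much time spent in GC" check.

{code:java}
// Illustration only: compute the fraction of wall-clock time spent in GC since
// the previous sample, using the standard GarbageCollectorMXBean counters.
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcFractionSampler {
  private long lastGcMillis = totalGcMillis();
  private long lastWallMillis = System.currentTimeMillis();

  private static long totalGcMillis() {
    long total = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long t = gc.getCollectionTime(); // -1 if undefined for this collector
      if (t > 0) {
        total += t;
      }
    }
    return total;
  }

  /** Returns the GC time fraction (0..1) observed since the previous call. */
  public synchronized double sample() {
    long nowGc = totalGcMillis();
    long nowWall = System.currentTimeMillis();
    double fraction = (double) (nowGc - lastGcMillis) / Math.max(1, nowWall - lastWallMillis);
    lastGcMillis = nowGc;
    lastWallMillis = nowWall;
    return Math.min(1.0, Math.max(0.0, fraction));
  }
}
{code}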

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-24 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554991#comment-16554991
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] thank you for looking into this. When I ran the above test locally, 
it passed without issues. I also tried to run all tests via {{cd itest; mvn 
test -Dtest=TestCliDriver}}. This hasn't finished so far (after ~2 hours I 
think), but when I monitor the JVM that runs tests with jstat, I see no 
excessive GC activity at all.

So can it happen that in the Jenkins test environment, probably on a bigger 
machine with many CPU cores, multiple tests execute in parallel 
against the same HS2 instance? If so, and/or if its heap size is insufficient, 
I guess in principle GC pauses can become really long/frequent. 
But if they indeed take 60% of the time, that's bad. For one thing, it would 
mean that our tests run much slower than they should.

Is it possible to get access to the machine that runs these tests on Jenkins 
and do some basic GC monitoring?

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-24 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554729#comment-16554729
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Thank you for looking into this, [~stakiar]. Can you please remind me where (in 
which source file) such serializers are kept? I remember doing this once or 
twice, but it was quite a long time ago.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: In Progress)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: In Progress  (was: Patch Available)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.03.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, 
> HIVE-17684.03.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-07-23 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553563#comment-16553563
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Will get to this later today or tomorrow.



> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization

2018-07-23 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553279#comment-16553279
 ] 

Misha Dmitriev commented on HIVE-19937:
---

The last patch looks good to me.

The only slight concern that I got from looking at it once again is the 
following: in one or two places you switched from passing around Strings to 
passing around Paths, and subsequently switched some HashMaps keyed by String to 
HashMaps keyed by Path. Note that a lookup in a map whose keys 
are complex objects is slower, because Path.equals(Path) is slower than 
String.equals(String) - it may involve comparison of many strings, etc. I 
haven't seen any reports of Hive CPU performance problems, and I hope this code 
is not on a critical path, and/or that the GC-related savings would offset the 
potential hashmap lookup slowdown... but anyway, I guess it's worth keeping this 
in mind.
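
A small sketch of the concern, assuming the keys are org.apache.hadoop.fs.Path: a 
lookup in a Path-keyed map goes through Path.hashCode()/equals(), which delegate to 
the underlying URI and may compare several component strings, while a String key 
needs a single hash plus one character-by-character comparison.

{code:java}
// Illustration only: the same logical lookup with a String key vs. a Path key.
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.fs.Path;

public class PathKeyLookup {
  public static void main(String[] args) {
    Map<String, Long> byString = new HashMap<>();
    Map<Path, Long> byPath = new HashMap<>();

    String name = "hdfs://nn:8020/warehouse/db/table/part=1/file_0";
    byString.put(name, 1L);
    byPath.put(new Path(name), 1L);

    // Cheap: String.hashCode() plus one equals() on the character contents.
    Long a = byString.get(name);
    // More work: constructs a Path (URI parse) and compares URI components in equals().
    Long b = byPath.get(new Path(name));

    System.out.println(a + " " + b);
  }
}
{code}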

> Intern fields in MapWork on deserialization
> ---
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, 
> HIVE-19937.3.patch, HIVE-19937.4.patch, HIVE-19937.5.patch, 
> post-patch-report.html, report.html
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization

2018-07-19 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549768#comment-16549768
 ] 

Misha Dmitriev commented on HIVE-19937:
---

+1

One small comment: instead of
 {{this.baseFileName = baseFileName == null ? null : baseFileName.intern();}}
you can just use
 {{this.baseFileName = StringUtils.internIfNotNull(baseFileName);}}

> Intern fields in MapWork on deserialization
> ---
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, 
> HIVE-19937.3.patch, HIVE-19937.4.patch, post-patch-report.html, report.html
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Intern fields in MapWork on deserialization

2018-07-19 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549667#comment-16549667
 ] 

Misha Dmitriev commented on HIVE-19937:
---

Thank you for publishing the updated jxray report, [~stakiar]. It looks like things 
work as expected. One small question/concern: I see that this test uses a 
pretty small heap, less than 1GB. Is it expected that all the overheads that 
you are fixing will grow proportionally with a bigger heap?

> Intern fields in MapWork on deserialization
> ---
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, 
> HIVE-19937.3.patch, post-patch-report.html, report.html
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-16 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545643#comment-16545643
 ] 

Misha Dmitriev commented on HIVE-19668:
---

[~vihangk1] yes, the previous patch passed all tests, but it looks like there were 
some checkstyle problems with it. Unfortunately, the checkstyle report is gone, 
so I've just fixed several lines which I think could be problematic, and 
resubmitted the patch. Are these checkstyle issues considered important?

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, HIVE-19668.04.patch, HIVE-19668.05.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.
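
A minimal sketch of the token reuse described above (not the actual HIVE-19668 
patch): keep one immutable {{CommonToken}} per token kind and reuse it, instead of 
constructing a new instance at every use site. This assumes those constant tokens 
are never mutated after creation.

{code:java}
// Sketch only: a shared constant token instead of "new CommonToken(...)" at each call site.
import org.antlr.runtime.CommonToken;
import org.apache.hadoop.hive.ql.parse.HiveParser;

public final class ConstantTokens {
  // TOK_INSERT is just one example of the ~33 token kinds mentioned above.
  public static final CommonToken TOK_INSERT =
      new CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT");

  private ConstantTokens() {}
}
{code}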



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-16 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Attachment: HIVE-19668.05.patch

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, HIVE-19668.04.patch, HIVE-19668.05.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-16 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: Patch Available  (was: In Progress)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, HIVE-19668.04.patch, HIVE-19668.05.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-16 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: In Progress  (was: Patch Available)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, HIVE-19668.04.patch, HIVE-19668.05.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-13 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: Patch Available  (was: In Progress)

The previous patch may or may not have been applied, thus I've just updated my 
local git repo clone and generated a new patch file.

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, HIVE-19668.04.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a gread deal of pressure from the GC.
> Another source of waste are duplicate strings, that collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-13 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Attachment: HIVE-19668.04.patch

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, HIVE-19668.04.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com).|http://www.jxray.com)./] It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-13 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: In Progress  (was: Patch Available)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, HIVE-19668.04.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: Patch Available  (was: In Progress)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Attachment: HIVE-19668.03.patch

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> HIVE-19668.03.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: In Progress  (was: Patch Available)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-07-12 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542265#comment-16542265
 ] 

Misha Dmitriev commented on HIVE-19668:
---

Thank you for checking, [~vihangk1] [~aihuaxu] and [~stakiar]. In the end, it 
turns out that at least some failures are reproducible locally, and my changes 
are responsible. Not all {{CommonToken}}s can be made {{ImmutableToken}}s, 
because for some of them the type may be rewritten in some special operators 
later. I've already found one such type in the past, and am now eliminating the 
others. I will post the updated patch once I am done.
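
For illustration only (this is a guess at the shape of the change, not the 
actual patch): an immutable token simply refuses mutation, which is also why 
the token kinds whose type gets rewritten by special operators cannot use it.
{code:java}
import org.antlr.runtime.CommonToken;

public class ImmutableCommonToken extends CommonToken {
  public ImmutableCommonToken(int type, String text) {
    super(type, text);
  }

  @Override
  public void setType(int type) {
    // Tokens rewritten later cannot be shared; sharing one would fail loudly here.
    throw new UnsupportedOperationException("shared token; type must not change");
  }

  @Override
  public void setText(String text) {
    throw new UnsupportedOperationException("shared token; text must not change");
  }
}
{code}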

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Use BeanSerializer for MapWork to carry calls to String.intern

2018-07-10 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539149#comment-16539149
 ] 

Misha Dmitriev commented on HIVE-19937:
---

A few small comments about the last patch:
 # In the new *Deserializer code, there are many lines that look like 
'partitionDesc.setXXX(partitionDesc.getXXX())'. Such code looks quite 
non-obvious, so it would be good to add comments there explaining that this is 
done to intern possible duplicate strings (a minimal illustration follows this 
list).
 # Looks like there is a typo: a method that's really a getter has the set... 
name:
+ public Map> setEventSourceColumnTypeMap() {
 +   return eventSourceColumnTypeMap; 
+ }
 # Still would be good to publish a jxray report for the same test before and 
after this change, to make sure that the problems are indeed fixed.
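
A self-contained illustration of the re-setting pattern from point 1; the class 
and field names are made up here, not the actual *Deserializer code:
{code:java}
class PartitionDescLike {
  private String baseFileName;   // hypothetical field standing in for the real ones

  String getBaseFileName() {
    return baseFileName;
  }

  void setBaseFileName(String name) {
    // The setter interns, so feeding a value back through it collapses duplicate
    // strings produced by deserialization -- which is why the seemingly redundant
    // desc.setBaseFileName(desc.getBaseFileName()) calls exist.
    this.baseFileName = (name == null) ? null : name.intern();
  }
}
{code}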

> Use BeanSerializer for MapWork to carry calls to String.intern
> --
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, 
> HIVE-19937.3.patch, report.html
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Use BeanSerializer for MapWork to carry calls to String.intern

2018-07-09 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537362#comment-16537362
 ] 

Misha Dmitriev commented on HIVE-19937:
---

The analysis above looks good. I remember seeing similar issues with Kryo (and 
in general, all other kinds of serialization) in the past, and dealing with 
them in a similar way.

The new patch looks good to me. I assume that you analyzed a heap dump before 
and after this change and verified that the problems that you wanted to address 
indeed went away. A few small comments/nits:
 * MapOperator.java, 'contexts = new LinkedHashMap, MapOpCtx>()' - 
if this code has any idea of the expected size of LinkedHashMap, you may want 
to create it with the appropriate capacity. This is especially relevant when 
such maps are small - then the default capacity of 16 makes them waste a lot of 
memory.
 * MapWork.java, 'if (includedBuckets != null) { this.includedBuckets = ...' - 
you can make this code a bit shorter using the conditional operator, i.e. 
'this.includedBuckets = (includedBuckets != null) ? includedBuckets : null'. 
The same applies in several other methods (see the sketch after this list).
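
Two tiny sketches of the nits above, with placeholder names and types since the 
generics were stripped from the quoted code:
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

class ReviewNitSketches {
  // Nit 1: size the LinkedHashMap for the expected number of entries, so small
  // maps don't carry the default 16-bucket table.
  static <K, V> Map<K, V> newMapFor(int expectedSize) {
    return new LinkedHashMap<>(Math.max(2, (int) (expectedSize / 0.75f) + 1));
  }

  // Nit 2: the null-guarded assignment collapses into a conditional expression
  // (shown with String.intern() as a placeholder transformation).
  private String includedBuckets;

  void setIncludedBuckets(String includedBuckets) {
    this.includedBuckets = (includedBuckets != null) ? includedBuckets.intern() : null;
  }
}
{code}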

> Use BeanSerializer for MapWork to carry calls to String.intern
> --
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch, HIVE-19937.2.patch, report.html
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Intern JobConf objects in Spark tasks

2018-07-03 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16532187#comment-16532187
 ] 

Misha Dmitriev commented on HIVE-19937:
---

Thank you for sharing the jxray report, [~stakiar].

If it reflects the situation in real-life applications accurately enough, then 
it looks like the sources of duplicate strings are not so much {{JobConf}} 
tables as various other things, which you can easily see if you expand the 
"Expensive fields" and "Full reference chains" in section 7:
 # Most of the duplicate strings (~9% out of 13.5% total) come from data fields 
of {{java.net.URI}}. All these URIs, in turn, come from 
{{org.apache.hadoop.fs.Path.uri}}. {{Path}}s come from more than one sources, 
but the biggest one is this reference chain: 

{code:java}
↖java.net.URI.schemeSpecificPart
↖org.apache.hadoop.fs.Path.uri
↖{j.u.LinkedHashMap}.keys
↖org.apache.hadoop.hive.ql.plan.MapWork.pathToAliases{code}
It turns out that in the past I have already taken care of interning strings in 
such URIs, see e.g. this method in MapWork.java:

{code:java}
public void setPathToAliases(final LinkedHashMap> 
pathToAliases) {
  for (Path p : pathToAliases.keySet()) {
StringInternUtils.internUriStringsInPath(p);
  }
  this.pathToAliases = pathToAliases;
}{code}
but it turns out that there are also other methods that can add {{Path}}s to 
{{pathToAliases}}: two flavors of {{addPathToAlias()}} and maybe something 
else. I think we need to modify all these methods so that they also call 
{{StringInternUtils.internUriStringsInPath()}} for {{Path}}s that are passed to 
them. This will remove the aforementioned 9% of duplicate strings (a rough 
sketch appears at the end of this comment).

 # One other source of duplicate strings in URIs referenced by {{Path}}s is the 
map in {{ProjectionPusher.pathToPartitionInfo}}. I think this would be fixed if 
in the following line in this class

{code:java}
pathToPartitionInfo.put(Path.getPathWithoutSchemeAndAuthority(entry.getKey()), 
...{code}
you insert the {{StringInternUtils.internUriStringsInPath()}} call.
 # The very first line in the "Full reference chains" says that 2% of memory is 
wasted by duplicate strings that are values in {{CopyOnFirstWriteProperties}} 
tables, which are reachable via this reference chain

{code:java}
org.apache.hadoop.hive.common.CopyOnFirstWriteProperties.table
↖org.apache.hadoop.hive.ql.plan.PartitionDesc.properties
↖{j.u.LinkedHashMap}.values
↖org.apache.hadoop.hive.ql.plan.MapWork.pathToPartitionInfo{code}
This is a bit unexpected, given that, as you noticed before, we already take 
care of interning this table's values in {{PartitionDesc#internProperties}}. 
Probably some uninterned string values are later added to this table, most 
likely by the code that obtains this table by calling {{getProperties()}}. I 
hope with your knowledge of Hive code you will manage to determine the culprit 
here. One more clue is the contents of the duplicate strings coming from these 
tables, e.g.

||*Num strings*||*String value*||
|36|"hdfs://vc0501.halxg.cloudera.com:8020/user/systest/tpcds_1000_decimal_parquet/store_sales/ss_sold_date_sk=2452497"|
|36|"hdfs://vc0501.halxg.cloudera.com:8020/user/systest/tpcds_1000_decimal_parquet/store_sales/ss_sold_date_sk=2452422"|

 # There are several other sources of duplicate strings that jxray reports. 
They cause much less overhead, but some may still be worth fixing. Let me know 
if you need help with them. Interestingly, as far as I can see, strings coming 
from {{JobConf}} waste just about 0.2% of memory.

Also, as far as I can see in section 2, {{java.util.Properties}} objects 
together consume 8.5% of memory, which is significant. But most of that comes 
from {{TableDesc#properties}}. {{JobConf#properties}} uses just 0.8% of memory, 
so probably not worth optimizing.
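
As a rough sketch of point 1 (the method shape follows this comment, not the 
real MapWork API), every path entering pathToAliases would be interned the same 
way setPathToAliases() already does; the ProjectionPusher line in point 2 would 
get the same treatment:
{code:java}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.common.StringInternUtils;

class MapWorkSketch {
  private final LinkedHashMap<Path, ArrayList<String>> pathToAliases = new LinkedHashMap<>();

  void addPathToAlias(Path path, String newAlias) {
    // Intern the URI strings inside the Path before it becomes a map key,
    // mirroring what setPathToAliases() already does.
    StringInternUtils.internUriStringsInPath(path);
    pathToAliases.computeIfAbsent(path, p -> new ArrayList<>()).add(newAlias.intern());
  }
}
{code}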

> Intern JobConf objects in Spark tasks
> -
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch, report.html
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-30 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528883#comment-16528883
 ] 

Misha Dmitriev commented on HIVE-19668:
---

[~aihuaxu] I've checked the logs of failed tests, but couldn't find anything 
obviously related to my changes. However, my experience with Hive is not deep 
at all. May I ask you to check these logs before they disappear?

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Intern JobConf objects in Spark tasks

2018-06-30 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528876#comment-16528876
 ] 

Misha Dmitriev commented on HIVE-19937:
---

[~stakiar] regarding the behavior of {{CopyOnFirstWriteProperties}} - such 
fine-grained behavior would be easy to implement. It will require changing the 
implementation of this class so that it has pointers to two hashtables: one for 
properties that are specific/unique for the given instance of {{COFWP}} and 
another table with properties that are common/default for all instances of 
{{COFWP}}. Each get() call should first check the first (specific) hashtable 
and then the second (default) hashtable, and each put() call should work only 
with the first hashtable. This would make sense in a situation when there is a 
sufficiently big number of common properties, but every/almost every table also 
has some specific properties. In contrast, the current 
{{CopyOnFirstWriteProperties}} works best when most tables are exactly the same 
and only a few are different. Well, after writing all this I realize that the 
proposed alternative implementation of {{COFWP}} would probably be better in all 
scenarios. But before deciding on anything, we definitely should measure where 
the memory goes in realistic scenarios.
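
A hypothetical sketch of that two-table layout (java.util.Properties' own 
defaults mechanism already provides essentially this layering, so a real change 
might simply reuse it):
{code:java}
import java.util.Properties;

class LayeredProperties {
  private final Properties shared;                       // common/default entries, shared across instances
  private final Properties specific = new Properties();  // entries unique to this instance

  LayeredProperties(Properties shared) {
    this.shared = shared;
  }

  String get(String key) {
    // Reads check the per-instance table first, then fall back to the shared one.
    String value = specific.getProperty(key);
    return (value != null) ? value : shared.getProperty(key);
  }

  void put(String key, String value) {
    // Writes never touch the shared table, so it stays safe to share.
    specific.setProperty(key, value);
  }
}
{code}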

Regarding interning only values in {{PartitionDesc#internProperties}} : yes, I 
think this was intentional - I carefully analyzed heap dumps before making this 
change, so if it was worth interning the keys, I would have done that too. Most 
probably when these tables are created, the Strings for keys already come from 
some source where they are already interned.

> Intern JobConf objects in Spark tasks
> -
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19937) Intern JobConf objects in Spark tasks

2018-06-28 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527045#comment-16527045
 ] 

Misha Dmitriev edited comment on HIVE-19937 at 6/29/18 2:24 AM:


I took a quick look, and I am not sure this is done correctly. The code below
{code:java}
jobConf.forEach(entry -> {
  StringInternUtils.internIfNotNull(entry.getKey());
  StringInternUtils.internIfNotNull(entry.getValue());
}){code}
goes over each table entry and just invokes intern() for each key and value. 
{{intern()}} returns an existing, "canonical" string for each string that is 
duplicate. But the code doesn't store the returned strings back into the table. 
To intern both keys and values in a hashtable, you typically need to create a 
new table and effectively "intern and transfer" the contents from the old table 
to the new table. Sometimes it may be possible to be more creative and actually 
create a table with interned contents right away. Here it probably could be 
done if you added some custom kryo deserialization code for such tables. But 
maybe that's too big an effort.

As always, it would be good to measure how much memory was wasted before this 
change and saved after it. This helps to prevent errors and to see how much was 
actually achieved.

If {{jobConf}} is an instance of {{java.lang.Properties}}, and there are many 
duplicates of such tables, then memory is wasted by both string contents of 
these tables and by tables themselves (each table uses many extra Java objects 
internally). So you may consider checking the 
{{org.apache.hadoop.hive.common.CopyOnFirstWriteProperties}} class that I once 
added for a somewhat similar use case.
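
A minimal sketch of the "intern and transfer" step described above (written 
against a generic Map, since the concrete table type is not shown here):
{code:java}
import java.util.HashMap;
import java.util.Map;

final class InternUtilSketch {
  // intern() only returns the canonical string; the result must be stored,
  // so the contents are copied into a fresh map with interned keys and values.
  static Map<String, String> internKeysAndValues(Map<String, String> src) {
    Map<String, String> dst = new HashMap<>(Math.max(2, (int) (src.size() / 0.75f) + 1));
    for (Map.Entry<String, String> e : src.entrySet()) {
      String key = e.getKey().intern();
      String value = (e.getValue() == null) ? null : e.getValue().intern();
      dst.put(key, value);
    }
    return dst;
  }

  private InternUtilSketch() {}
}
{code}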


was (Author: mi...@cloudera.com):
I took a quick look, and I am not sure this is done correctly. The code below
{code:java}
jobConf.forEach(entry -> {
  StringInternUtils.internIfNotNull(entry.getKey());
  StringInternUtils.internIfNotNull(entry.getValue());
}){code}
goes over each table entry and just invokes intern() for each key and value. 
{{intern()}} returns an existing, "canonical" string for each string that is 
duplicate. But the code doesn't store the returned strings back into the table. 
To intern both keys and values in a hashtable, you typically need to create a 
new table and effectively "intern and transfer" the contents from the old table 
to the new table. Sometimes it may be possible to be more creative and actually 
create a table with interned contents right away. Here it probably could be 
done if you added some custom kryo deserialization code for such tables. But 
maybe that's too big an effort.

As always, it would be good to see how much memory was wasted before this 
change and saved after it. This helps to prevent errors and to see how much was 
actually achieved.

If {{jobConf}} is an instance of {{java.lang.Properties}}, and there are many 
duplicates of such tables, then memory is wasted by both string contents of 
these tables and by tables themselves (each table uses many extra Java objects 
internally). So you may consider checking the 
{{org.apache.hadoop.hive.common.CopyOnFirstWriteProperties}} class that I once 
added for a somewhat similar use case.

> Intern JobConf objects in Spark tasks
> -
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19937) Intern JobConf objects in Spark tasks

2018-06-28 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527045#comment-16527045
 ] 

Misha Dmitriev edited comment on HIVE-19937 at 6/29/18 2:23 AM:


I took a quick look, and I am not sure this is done correctly. The code below
{code:java}
jobConf.forEach(entry -> {
  StringInternUtils.internIfNotNull(entry.getKey());
  StringInternUtils.internIfNotNull(entry.getValue());
}){code}
goes over each table entry and just invokes intern() for each key and value. 
{{intern()}} returns an existing, "canonical" string for each string that is 
duplicate. But the code doesn't store the returned strings back into the table. 
To intern both keys and values in a hashtable, you typically need to create a 
new table and effectively "intern and transfer" the contents from the old table 
to the new table. Sometimes it may be possible to be more creative and actually 
create a table with interned contents right away. Here it probably could be 
done if you added some custom kryo deserialization code for such tables. But 
maybe that's too big an effort.

As always, it would be good to see how much memory was wasted before this 
change and saved after it. This helps to prevent errors and to see how much was 
actually achieved.

If {{jobConf}} is an instance of {{java.lang.Properties}}, and there are many 
duplicates of such tables, then memory is wasted by both string contents of 
these tables and by tables themselves (each table uses many extra Java objects 
internally). So you may consider checking the 
{{org.apache.hadoop.hive.common.CopyOnFirstWriteProperties}} class that I once 
added for a somewhat similar use case.


was (Author: mi...@cloudera.com):
I took a quick look, and I am not sure this is done correctly. The code below
{code:java}
jobConf.forEach(entry -> {
  StringInternUtils.internIfNotNull(entry.getKey());
  StringInternUtils.internIfNotNull(entry.getValue());
}){code}

> Intern JobConf objects in Spark tasks
> -
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19937) Intern JobConf objects in Spark tasks

2018-06-28 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527045#comment-16527045
 ] 

Misha Dmitriev commented on HIVE-19937:
---

I took a quick look, and I am not sure this is done correctly. The code below
{code:java}
jobConf.forEach(entry -> {
  StringInternUtils.internIfNotNull(entry.getKey());
  StringInternUtils.internIfNotNull(entry.getValue());
}){code}

> Intern JobConf objects in Spark tasks
> -
>
> Key: HIVE-19937
> URL: https://issues.apache.org/jira/browse/HIVE-19937
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-19937.1.patch
>
>
> When fixing HIVE-16395, we decided that each new Spark task should clone the 
> {{JobConf}} object to prevent any {{ConcurrentModificationException}} from 
> being thrown. However, setting this variable comes at a cost of storing a 
> duplicate {{JobConf}} object for each Spark task. These objects can take up a 
> significant amount of memory, we should intern them so that Spark tasks 
> running in the same JVM don't store duplicate copies.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-27 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: Patch Available  (was: In Progress)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-27 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Attachment: HIVE-19668.02.patch

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, HIVE-19668.02.patch, 
> image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-27 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: In Progress  (was: Patch Available)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-02 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Attachment: HIVE-19668.01.patch

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-02 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Status: Patch Available  (was: In Progress)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HIVE-19668.01.patch, image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work started] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-02 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-19668 started by Misha Dmitriev.
-
> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and duplicate strings

2018-06-01 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Summary: Over 30% of the heap wasted by duplicate 
org.antlr.runtime.CommonToken's and duplicate strings  (was: 11.8% of the heap 
wasted due to duplicate org.antlr.runtime.CommonToken's)

> Over 30% of the heap wasted by duplicate org.antlr.runtime.CommonToken's and 
> duplicate strings
> --
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> ([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants, that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently repeatedly created with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String). Other duplicate 
> strings come from other sources, that are easy enough to fix by adding 
> String.intern() calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19668) 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's

2018-06-01 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-19668:
--
Description: 
I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
spike during compilation of some big query. The analysis was done with jxray 
([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
the 20G heap was used by data structures associated with query parsing 
({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
opportunities for optimizations here. One of them is to stop the code from 
creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See a 
sample of these objects in the attached image:

!image-2018-05-22-17-41-39-572.png|width=879,height=399!

Looks like these particular {{CommonToken}} objects are constants, that don't 
change once created. I see some code, e.g. in 
{{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
apparently repeatedly created with e.g. {{new 
CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}} If these 33 token kinds are 
instead created once and reused, we will save more than 1/10th of the heap in 
this scenario. Plus, since these objects are small but very numerous, getting 
rid of them will remove a great deal of pressure from the GC.

Another source of waste is duplicate strings, which collectively waste 26.1% of 
memory. Some of them come from CommonToken objects that have the same text 
(i.e. for multiple CommonToken objects the contents of their 'text' Strings are 
the same, but each has its own copy of that String). Other duplicate strings 
come from other sources, that are easy enough to fix by adding String.intern() 
calls.

  was:
I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
spike during compilation of some big query. The analysis was done with jxray 
([www.jxray.com|http://www.jxray.com]). It turns out that more than 90% of 
the 20G heap was used by data structures associated with query parsing 
({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
opportunities for optimizations here. One of them is to stop the code from 
creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See a 
sample of these objects in the attached image:

!image-2018-05-22-17-41-39-572.png|width=879,height=399!

Looks like these particular {{CommonToken}} objects are constants that don't 
change once created. I see some code, e.g. in 
{{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
apparently created repeatedly with e.g. {{new 
CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds are 
instead created once and reused, we will save more than 1/10th of the heap in 
this scenario. Plus, since these objects are small but very numerous, getting 
rid of them will remove a great deal of pressure from the GC.


> 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's
> -
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> (www.jxray.com). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently created repeatedly with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.
> Another source of waste is duplicate strings, which collectively waste 26.1% 
> of memory. Some of them come from CommonToken objects that have the same text 
> (i.e. for multiple CommonToken objects the contents of their 'text' Strings 
> are the same, but each has its own copy of that String).

[jira] [Assigned] (HIVE-19668) 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's

2018-05-22 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev reassigned HIVE-19668:
-


> 11.8% of the heap wasted due to duplicate org.antlr.runtime.CommonToken's
> -
>
> Key: HIVE-19668
> URL: https://issues.apache.org/jira/browse/HIVE-19668
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: image-2018-05-22-17-41-39-572.png
>
>
> I've recently analyzed a HS2 heap dump, obtained when there was a huge memory 
> spike during compilation of some big query. The analysis was done with jxray 
> (www.jxray.com). It turns out that more than 90% of 
> the 20G heap was used by data structures associated with query parsing 
> ({{org.apache.hadoop.hive.ql.parse.QBExpr}}). There are probably multiple 
> opportunities for optimizations here. One of them is to stop the code from 
> creating duplicate instances of {{org.antlr.runtime.CommonToken}} class. See 
> a sample of these objects in the attached image:
> !image-2018-05-22-17-41-39-572.png|width=879,height=399!
> Looks like these particular {{CommonToken}} objects are constants that don't 
> change once created. I see some code, e.g. in 
> {{org.apache.hadoop.hive.ql.parse.CalcitePlanner}}, where such objects are 
> apparently created repeatedly with e.g. {{new 
> CommonToken(HiveParser.TOK_INSERT, "TOK_INSERT")}}. If these 33 token kinds 
> are instead created once and reused, we will save more than 1/10th of the 
> heap in this scenario. Plus, since these objects are small but very numerous, 
> getting rid of them will remove a great deal of pressure from the GC.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-10 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470954#comment-16470954
 ] 

Misha Dmitriev commented on HIVE-19041:
---

Thank you for looking into the details, [~vihangk1] I've checked the code and I 
agree with you - these strings are really short-lived. Furthermore, the worst 
offenders, 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor.inputFormat,outputFormat,
 are actually copies of the corresponding fields of {{StorageDescriptor}}. That 
is, they reference the same string instances. So as soon as you intern the 
strings in {{StorageDescriptor}}, {{MStorageDescriptor}} will stop referencing 
duplicate strings as well.

 

Thus, this patch is good to go from my perspective.

 

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch, 
> HIVE-19041.03.patch, HIVE-19041.04.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat, which causes a 
> large number of duplicate Strings in the HMS memory. We should intern these 
> objects during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-09 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469733#comment-16469733
 ] 

Misha Dmitriev commented on HIVE-19041:
---

Agree - in our internal heap dump analysis, the above {{*Request}} classes 
don't show up at all as sources of duplicate strings.

However, several class.field combinations that are not in this patch do show 
up. Namely:

{{org.apache.hadoop.hive.metastore.model.MStorageDescriptor.inputFormat,outputFormat}}

{{org.apache.hadoop.hive.metastore.model.MSerDeInfo.serializationLib}}

{{org.apache.hadoop.hive.metastore.model.MPartition.partitionName,values}}

They contribute relatively little overhead (about 3% together), but probably 
it's still worth interning them to be on the safe side.

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch, 
> HIVE-19041.03.patch, HIVE-19041.04.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat, which causes a 
> large number of duplicate Strings in the HMS memory. We should intern these 
> objects during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463236#comment-16463236
 ] 

Misha Dmitriev commented on HIVE-19041:
---

[~gopalv] yes, since JDK 1.7 the built-in string interning is far superior to 
the one based on a WeakHashMap (which was used by the Hadoop weak interner). 
Check the above article on interning for further details.

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat, which causes a 
> large number of duplicate Strings in the HMS memory. We should intern these 
> objects during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463235#comment-16463235
 ] 

Misha Dmitriev commented on HIVE-19041:
---

Yes, all interned strings are kept in the JVM's internal equivalent of a 
concurrent WeakHashMap. Since it's highly specialized, it's very fast, and it 
has no extra overhead when more strings are added to it (it's quite large and 
preallocated, so every running JVM already bears this memory overhead of a few 
MB). If you are really interested, check this article: 
[http://java-performance.info/string-intern-in-java-6-7-8/] 

Basically, the only thing you may be concerned with when using String.intern() 
is the CPU overhead. But in my experience, unless interning is mistakenly used 
for strings that are very short-lived anyway, the benefit of reduced GC 
outweighs the cost of the extra CPU cycles consumed by the intern() 
call.
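
To make the tradeoff concrete, here is a tiny self-contained example of what interning buys (illustrative only; the class name and the sample string are made up): logically equal strings collapse to one canonical instance, so duplicate char[] arrays become garbage and equality checks can become reference comparisons.

{code}
public class InternDemo {
  public static void main(String[] args) {
    // Two distinct String objects with identical contents, as would happen
    // when the same value is deserialized for many Partition objects.
    String a = new String("org.apache.hadoop.mapred.TextInputFormat");
    String b = new String("org.apache.hadoop.mapred.TextInputFormat");

    System.out.println(a == b);                   // false: two copies on the heap
    System.out.println(a.intern() == b.intern()); // true: one canonical copy
  }
}
{code}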

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat, which causes a 
> large number of duplicate Strings in the HMS memory. We should intern these 
> objects during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462886#comment-16462886
 ] 

Misha Dmitriev commented on HIVE-19041:
---

[~vihangk1] does the jxray report show that comments (or something else that 
you haven't interned yet) waste a noticeable amount of memory? My experience is 
that it's often difficult to guess, so any real measured evidence is valuable.

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat, which causes a 
> large number of duplicate Strings in the HMS memory. We should intern these 
> objects during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16879) Improve Cache Key

2018-03-15 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400630#comment-16400630
 ] 

Misha Dmitriev commented on HIVE-16879:
---

I agree about the negligible CPU performance impact of String.intern(), 
especially when compared with reduced heap size and GC time. Again, I think 
this is a good change, assuming that it's applied in the right place.

However, my experience is that guessing doesn't always work when you try to 
determine where _exactly_ memory is wasted. Do you have access to some running 
Hive instances where you would expect this to be a problem? Then, at a minimum, 
you can run 'jmap -histo:live' to get the number of Key instances and roughly 
estimate memory used by the strings that Keys reference. And the best thing 
would be to take a heap dump (jmap -dump:live,format=b,...) and analyze it with 
a tool, e.g. www.jxray.com, that immediately tells you 
the memory overhead of duplicate strings. You will immediately see whether Keys 
cause noticeable overhead, and/or what other classes cause it.

> Improve Cache Key
> -
>
> Key: HIVE-16879
> URL: https://issues.apache.org/jira/browse/HIVE-16879
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16879.1.patch, HIVE-16879.2.patch
>
>
> Improve cache key for cache implemented in 
> {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}.
> # Cache some of the key components themselves (db name, table name) using 
> the {{String}} intern method to conserve memory for repeated keys, to improve 
> the {{equals}} method as references can now be used for equality, and 
> hashcodes will be cached as well, as per the {{String}} class hashCode method.
> # Upgrade _debug_ logging to not generate text unless required
> # Changed _equals_ method to check first for the item most likely to be 
> different, column name
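
For illustration, a rough sketch of the kind of key the description above talks about (class and field names are made up here and do not match the actual {{AggregateStatsCache}} code): db and table names are interned so repeated keys share storage, the hash code is computed once, and equals() compares the column name first because it is the field most likely to differ.

{code}
// Illustrative sketch only; not the actual AggregateStatsCache.Key.
public final class StatsCacheKey {
  private final String dbName;
  private final String tblName;
  private final String colName;
  private final int hashCode;

  public StatsCacheKey(String dbName, String tblName, String colName) {
    // Interning lets repeated keys share one copy of the db/table names.
    this.dbName = dbName.intern();
    this.tblName = tblName.intern();
    this.colName = colName;
    // Cache the hash code once, since the key is immutable.
    this.hashCode = 31 * (31 * this.dbName.hashCode() + this.tblName.hashCode())
        + this.colName.hashCode();
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof StatsCacheKey)) {
      return false;
    }
    StatsCacheKey that = (StatsCacheKey) other;
    // Check the column name first: it is the component most likely to differ.
    return colName.equals(that.colName)
        && tblName.equals(that.tblName)
        && dbName.equals(that.dbName);
  }

  @Override
  public int hashCode() {
    return hashCode;
  }
}
{code}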



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16879) Improve Cache Key

2018-03-14 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16399668#comment-16399668
 ] 

Misha Dmitriev commented on HIVE-16879:
---

This looks like nice optimization work, assuming that the right things are 
optimized.

Did you measure that the duplicate strings referenced by the fields of Key 
indeed waste a noticeable amount of memory? If yes, what tool did you use and 
can you share your findings? Is it really the case that dbName and tblName 
cause enough duplication to benefit from interning, but colName does not?

> Improve Cache Key
> -
>
> Key: HIVE-16879
> URL: https://issues.apache.org/jira/browse/HIVE-16879
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-16879.1.patch, HIVE-16879.2.patch
>
>
> Improve cache key for cache implemented in 
> {{org.apache.hadoop.hive.metastore.AggregateStatsCache}}.
> # Cache some of the key components themselves (db name, table name) using 
> the {{String}} intern method to conserve memory for repeated keys, to improve 
> the {{equals}} method as references can now be used for equality, and 
> hashcodes will be cached as well, as per the {{String}} class hashCode method.
> # Upgrade _debug_ logging to not generate text unless required
> # Changed _equals_ method to check first for the item most likely to be 
> different, column name



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2018-02-13 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363134#comment-16363134
 ] 

Misha Dmitriev commented on HIVE-6430:
--

Thank you [~akolb]! This is nice work of the kind I wish I could do more of :)

> MapJoin hash table has large memory overhead
> 
>
> Key: HIVE-6430
> URL: https://issues.apache.org/jira/browse/HIVE-6430
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 0.14.0
>
> Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
> HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
> HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, 
> HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, 
> HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, 
> HIVE-6430.14.patch, HIVE-6430.patch
>
>
> Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
> for row) can take several hundred bytes, which is ridiculous. I am reducing 
> the size of MJKey and MJRowContainer in other jiras, but in general we don't 
> need to have java hash table there.  We can either use primitive-friendly 
> hashtable like the one from HPPC (Apache-licenced), or some variation, to map 
> primitive keys to single row storage structure without an object per row 
> (similar to vectorization).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2018-01-09 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319247#comment-16319247
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Hi [~stakiar], just a reminder that I cannot make further progress with this 
fix.

It will probably be best if Hive gets upgraded to Hadoop 3.0.0 first per 
https://issues.apache.org/jira/browse/HIVE-18319, and then my patch is 
reapplied.

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.
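
In code form, the check described in the quoted description above boils down to roughly the following (a stripped-down sketch, not the exact {{MapJoinMemoryExhaustionHandler}} source; the class name, constant and error message are simplified, and {{getMax()}} is assumed to be defined). The core problem is that {{getUsed()}} also counts garbage that a GC would reclaim.

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Simplified sketch of the existing heap-usage check; not the exact Hive code.
public class HeapUsageCheckSketch {
  // Corresponds to hive.mapjoin.localtask.max.memory.usage (default 0.90).
  private static final double MAX_MEMORY_USAGE = 0.90;

  public static void checkMemoryStatus() {
    MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = memoryMXBean.getHeapMemoryUsage();
    // getUsed() includes garbage that has not been collected yet, so this
    // ratio can exceed the threshold even when a GC would free plenty of memory.
    double usedFraction = (double) heap.getUsed() / heap.getMax();
    if (usedFraction > MAX_MEMORY_USAGE) {
      throw new RuntimeException(String.format(
          "Hash table loading exceeded the configured memory limit: %.2f of max heap used",
          usedFraction));
    }
  }
}
{code}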



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16489) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2018-01-05 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313767#comment-16313767
 ] 

Misha Dmitriev commented on HIVE-16489:
---

Hi [~szita] - apparently yes, just closed it.

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-16489
> URL: https://issues.apache.org/jira/browse/HIVE-16489
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>
> I've just analyzed an HMS heap dump. It turns out that it contains a lot of 
> duplicate strings that waste 26.4% of the heap. Most of them come from 
> HashMaps referenced by 
> org.apache.hadoop.hive.metastore.api.Partition.parameters. Below is the 
> relevant section of the jxray (www.jxray.com) report. Looking at 
> Partition.java, I see that in the past somebody has already added code to 
> intern keys and values in the parameters table when it's first set up. 
> However, it looks like when more key-value pairs are added, they are not 
> interned, which probably explains all these duplicate 
> strings.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> Top duplicate strings:
> Ovhd Num char[]s   Num objs   Value
>  46,088K (0.4%) 58715871  
> "HBa4rRAAGx2MEmludGVyZXN0cmF0ZXNwcmVhZBgM/wD/AP8AXqEAERYBFQAXIEAWuK0QAA1s
>  ...[length 4000]"
>  46,088K (0.4%) 58715871  
> "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJBgcGBAcLBQYGCAgGCQYG
>  ...[length 4000]"
> ...
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10", 623 of 
> "CQUJBQcFCAcGBwUFCgUIDAgEBwgFBQcHBwgGBwYEBQoLCggFCAYHBgcIBwkIDgcG ...[length 
> 4000]", 623 of 
> "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJ ...[length 
> 4000]", 623 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]", 623 of 
> "AAMAAAEAAAEAAQABAAEHAwAKAgAEAwAAAgAEAAMD ...[length 
> 4000]"
> ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
> {code}
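
For illustration, the fix direction described above might look roughly like this (a hypothetical sketch, not the actual Partition.java code; the method shape only mimics the Thrift-generated putToParameters style): intern keys and values on every insertion into the parameters map, not just when the map is first populated.

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the actual metastore Partition.java code.
public class PartitionParametersSketch {
  private Map<String, String> parameters = new HashMap<>();

  public void putToParameters(String key, String val) {
    // Interning on every put keeps e.g. thousands of identical "true"/"-1"
    // values as a single shared String instance each.
    this.parameters.put(key == null ? null : key.intern(),
                        val == null ? null : val.intern());
  }
}
{code}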



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-16489) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2018-01-05 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev resolved HIVE-16489.
---
Resolution: Duplicate

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-16489
> URL: https://issues.apache.org/jira/browse/HIVE-16489
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>
> I've just analyzed an HMS heap dump. It turns out that it contains a lot of 
> duplicate strings that waste 26.4% of the heap. Most of them come from 
> HashMaps referenced by 
> org.apache.hadoop.hive.metastore.api.Partition.parameters. Below is the 
> relevant section of the jxray (www.jxray.com) report. Looking at 
> Partition.java, I see that in the past somebody has already added code to 
> intern keys and values in the parameters table when it's first set up. 
> However, it looks like when more key-value pairs are added, they are not 
> interned, which probably explains all these duplicate 
> strings.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> Top duplicate strings:
> Ovhd Num char[]s   Num objs   Value
>  46,088K (0.4%) 58715871  
> "HBa4rRAAGx2MEmludGVyZXN0cmF0ZXNwcmVhZBgM/wD/AP8AXqEAERYBFQAXIEAWuK0QAA1s
>  ...[length 4000]"
>  46,088K (0.4%) 58715871  
> "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJBgcGBAcLBQYGCAgGCQYG
>  ...[length 4000]"
> ...
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10", 623 of 
> "CQUJBQcFCAcGBwUFCgUIDAgEBwgFBQcHBwgGBwYEBQoLCggFCAYHBgcIBwkIDgcG ...[length 
> 4000]", 623 of 
> "BQcHBQUGBQgGBQcHCAUGCAkECQcFBQwGBgoJBQYHBQUFBQYKBQgIBgUJEgYFDAYJ ...[length 
> 4000]", 623 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]", 623 of 
> "AAMAAAEAAAEAAQABAAEHAwAKAgAEAwAAAgAEAAMD ...[length 
> 4000]"
> ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-20 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299373#comment-16299373
 ] 

Misha Dmitriev commented on HIVE-17684:
---

[~stakiar] How do I run these {{TestSparkCliDriver}} tests? Note the "Spark" 
thing - looks like they are different from TestCliDriver tests that I know how 
to run. Also in the test report all these test names look the same.

In the meantime, I've just tried the first two of the failed {{TestCliDriver}} 
tests locally. The first one passed for me. The second one keeps failing, with 
the error below. This looks confusing. Note that the error message mentions 
"exhausted memory". However, I cannot find any exception stack traces in the 
hive.log file. Please advise.

{code}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hive.cli.TestCliDriver
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 102.789 
s <<< FAILURE! - in org.apache.hadoop.hive.cli.TestCliDriver
[ERROR] 
testCliDriver[auto_join_without_localtask](org.apache.hadoop.hive.cli.TestCliDriver)
  Time elapsed: 21.46 s  <<< FAILURE!
java.lang.AssertionError: 
Client Execution succeeded but contained differences (error code = 1) after 
executing auto_join_without_localtask.q 
1047a1048,1053
> Hive Runtime Error: Map local work exhausted memory
> FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> Hive Runtime Error: Map local work exhausted memory
> FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> ATTEMPT: Execute BackupTask: org.apache.hadoop.hive.ql.exec.mr.MapRedTask
1054c1060
< RUN: Stage-9:MAPRED
---
> RUN: Stage-1:MAPRED
1057c1063
< RUN: Stage-6:MAPRED
---
> RUN: Stage-2:MAPRED

at org.junit.Assert.fail(Assert.java:88)
at org.apache.hadoop.hive.ql.QTestUtil.failedDiff(QTestUtil.java:2244)
at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:183)
at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
at 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runners.Suite.runChild(Suite.java:127)
at org.junit.runners.Suite.runChild(Suite.java:26)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at 
org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:369)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:275)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:239)
  

[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-19 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297728#comment-16297728
 ] 

Misha Dmitriev commented on HIVE-17684:
---

I've fixed some checkstyle warnings (several others, e.g. about indentation, 
seem strange, so I'd rather ignore them).

I've looked at the test failures and I am not sure how to debug this. Note that 
3 previous runs of the same jenkins build have 14..16 test failures each, so I 
suspect something is already broken here. In my case there are a lot more test 
failures, but I suspect that most of them are irrelevant given that my change 
is quite small. I wonder if my update of the hadoop dependency could have such 
an effect?


> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-19 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: In Progress)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-19 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.02.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-19 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: In Progress  (was: Patch Available)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-18 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296040#comment-16296040
 ] 

Misha Dmitriev commented on HIVE-17684:
---

Thank you for taking a look, [~stakiar]. Yes, naturally this code builds for me 
locally:

{code}
$ mvn clean install -DskipTests
...
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 03:58 min
[INFO] Finished at: 2017-12-18T13:12:43-08:00
[INFO] Final Memory: 369M/2219M
[INFO] 
{code}

The error in this build looks somewhat strange in that it mentions datanucleus. 
Another strange thing that I see in the console log is a few lines above:

{code}
error: a/pom.xml: does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java: 
does not exist in index
error: a/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java: does not 
exist in index
Going to apply patch with: git apply -p1
{code}

I had a suspicion that maybe my local code base is too far behind, so I've just 
run 'git fetch; git rebase' - this reapplied my change without problems. So I 
am not sure what's going on here.


> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-18 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Attachment: HIVE-17684.01.patch

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-18 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17684:
--
Status: Patch Available  (was: In Progress)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
> Attachments: HIVE-17684.01.patch
>
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-18 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev reassigned HIVE-17684:
-

Assignee: Misha Dmitriev  (was: Sahil Takiar)

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-12-18 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17684 started by Misha Dmitriev.
-
> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Misha Dmitriev
>
> We have seen a number of memory issues due the {{HashSinkOperator}} use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler works by using the {{MemoryMXBean}} and uses the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> inaccurate. The value returned by this method returns all reachable and 
> unreachable memory on the heap, so there may be a bunch of garbage data, and 
> the JVM just hasn't taken the time to reclaim it all. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17684) HoS memory issues with MapJoinMemoryExhaustionHandler

2017-11-15 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254068#comment-16254068
 ] 

Misha Dmitriev commented on HIVE-17684:
---

The problem with {{MapJoinMemoryExhaustionHandler}} is a large percentage of 
false alarms about memory exhaustion. Without it, however, Hive may go into a 
"GC death spiral", where the JVM runs back-to-back full GCs but doesn't 
actually fail with an OutOfMemoryError for a long time. Because user threads 
are unable to run most of the time, the executor stops responding, and the 
Spark driver eventually drops it. This results in hard-to-debug failures, 
because from the logs it's not clear why the executor stopped responding.

I recently added the new 
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GcTimeMonitor.java
 class to hadoop, which allows the user to accurately monitor the percentage of 
time that the JVM spends in GC. When this percentage grows above ~50% over a 
1-minute period, it's almost always a signal that the JVM is in the above "GC 
death spiral". Even if it is not, extremely long GC pauses are very bad for 
performance, and it makes sense to treat them in the same way as OOM, i.e. fail 
the task and ask the user to increase their executors' heap size.
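
For illustration, here is a rough, self-contained sketch of that idea using only the standard {{GarbageCollectorMXBean}} API (this is deliberately not the GcTimeMonitor API; the class name, window size and threshold are made up for the example): sample the cumulative GC time and compute the fraction of wall-clock time spent in GC over an observation window, alerting when it crosses the threshold.

{code}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Standalone illustration of GC-time-percentage monitoring; not GcTimeMonitor itself.
public class GcTimeFractionSketch {

  private static long totalGcTimeMillis() {
    long total = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long t = gc.getCollectionTime(); // cumulative collection time, -1 if undefined
      if (t > 0) {
        total += t;
      }
    }
    return total;
  }

  public static void main(String[] args) throws InterruptedException {
    final long windowMillis = 60_000;  // 1-minute observation window
    final int alertThresholdPct = 50;  // treat >50% time in GC as a likely "GC death spiral"

    long prevGcTime = totalGcTimeMillis();
    long prevWallTime = System.currentTimeMillis();
    while (true) {
      Thread.sleep(windowMillis);
      long gcTime = totalGcTimeMillis();
      long wallTime = System.currentTimeMillis();
      int gcPct = (int) (100 * (gcTime - prevGcTime) / (wallTime - prevWallTime));
      if (gcPct > alertThresholdPct) {
        // In Hive this is where the task would be failed with a message asking
        // the user to increase the executors' heap size.
        System.err.println("JVM spent " + gcPct + "% of the last minute in GC");
      }
      prevGcTime = gcTime;
      prevWallTime = wallTime;
    }
  }
}
{code}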

I ran some experiments where I replaced MapJoinMemoryExhaustionHandler with a 
check of the GC time percentage reported by GcTimeMonitor, and it works well. 
GcTimeMonitor will become available for other projects when Hadoop 3.0.0-GA is 
released (which, according to Hadoop developers, should happen in a few weeks). 
Currently Hive depends on Hadoop 3.0.0-beta1, so to use GcTimeMonitor in Hive, 
we will need to change this dependency to Hadoop GA. Are there any objections 
against:
(a) dependency change from Hadoop 3.0.0-beta1 to 3.0.0-GA
(b) replacing MapJoinMemoryExhaustionHandler with GcTimeMonitor ?
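To illustrate the idea, here is a sketch of the underlying measurement using the 
standard {{GarbageCollectorMXBean}} API. This is not the GcTimeMonitor API 
itself; the Hadoop class does this on a background thread over a sliding window 
and invokes an alert handler when a configurable threshold is crossed.

{code}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: the share of wall-clock time spent in GC since the observation window
// started, similar in spirit to what GcTimeMonitor reports.
public class GcTimePercentageSketch {
  private final long windowStartMs = System.currentTimeMillis();
  private final long gcTimeAtWindowStartMs = totalGcTimeMs();

  private static long totalGcTimeMs() {
    long total = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long collectionTime = gc.getCollectionTime(); // -1 if undefined for this collector
      if (collectionTime > 0) {
        total += collectionTime;
      }
    }
    return total;
  }

  /** Percentage of elapsed time spent in GC since this object was created. */
  public int gcTimePercentage() {
    long elapsedMs = Math.max(1, System.currentTimeMillis() - windowStartMs);
    long gcMs = totalGcTimeMs() - gcTimeAtWindowStartMs;
    return (int) (100 * gcMs / elapsedMs);
  }
}
{code}

If this percentage stays above ~50 over a minute or so, the JVM is almost 
certainly in the "GC death spiral" described above, and failing fast with a 
clear message is preferable to an unresponsive executor.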

> HoS memory issues with MapJoinMemoryExhaustionHandler
> -
>
> Key: HIVE-17684
> URL: https://issues.apache.org/jira/browse/HIVE-17684
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of 
> the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect 
> scenarios where the small table is taking too much space in memory, in which 
> case a {{MapJoinMemoryExhaustionError}} is thrown.
> The configs to control this logic are:
> {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
> {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
> The handler uses the {{MemoryMXBean}} and the following logic 
> to estimate how much memory the {{HashMap}} is consuming: 
> {{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
> MemoryMXBean#getHeapMemoryUsage().getMax()}}
> The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
> misleading: the value it returns includes all reachable and unreachable 
> memory on the heap, so it may count a lot of garbage that the JVM simply 
> hasn't taken the time to reclaim yet. This can lead to 
> intermittent failures of this check even though a simple GC would have 
> reclaimed enough space for the process to continue working.
> We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. 
> In Hive-on-MR this probably made sense to use because every Hive task was run 
> in a dedicated container, so a Hive Task could assume it created most of the 
> data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks 
> running in a single executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-08-21 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17237:
--
Status: Patch Available  (was: In Progress)

Just rebased the change and submitted the new patch.

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-17237
> URL: https://issues.apache.org/jira/browse/HIVE-17237
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-17237.01.patch, HIVE-17237.02.patch
>
>
> I've analyzed a heap dump from a production Hive installation using jxray 
> (www.jxray.com). It turns out that there are a lot of duplicate strings in 
> memory, which waste 26.4% of the heap. Most of them come from HashMaps 
> referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. 
> Below is the relevant section of the jxray report.
> Looking at Partition.java, I see that in the past somebody has already added 
> code to intern keys and values in the parameters table when it's first set 
> up. However, when more key-value pairs are added, they are not interned, which 
> probably explains all these duplicate strings. Also when 
> a Partition instance is deserialized, no interning of parameters is currently 
> done.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> 
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
>   52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <--  {j.u.HashMap}.keys <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-08-21 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17237:
--
Attachment: HIVE-17237.02.patch

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-17237
> URL: https://issues.apache.org/jira/browse/HIVE-17237
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-17237.01.patch, HIVE-17237.02.patch
>
>
> I've analyzed a heap dump from a production Hive installation using jxray 
> (www.jxray.com). It turns out that there are a lot of duplicate strings in 
> memory, which waste 26.4% of the heap. Most of them come from HashMaps 
> referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. 
> Below is the relevant section of the jxray report.
> Looking at Partition.java, I see that in the past somebody has already added 
> code to intern keys and values in the parameters table when it's first set 
> up. However, when more key-value pairs are added, they are not interned, which 
> probably explains all these duplicate strings. Also when 
> a Partition instance is deserialized, no interning of parameters is currently 
> done.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> 
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
>   52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <--  {j.u.HashMap}.keys <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-08-21 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17237:
--
Status: In Progress  (was: Patch Available)

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-17237
> URL: https://issues.apache.org/jira/browse/HIVE-17237
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-17237.01.patch, HIVE-17237.02.patch
>
>
> I've analyzed a heap dump from a production Hive installation using jxray 
> (www.jxray.com). It turns out that there are a lot of duplicate strings in 
> memory, which waste 26.4% of the heap. Most of them come from HashMaps 
> referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. 
> Below is the relevant section of the jxray report.
> Looking at Partition.java, I see that in the past somebody has already added 
> code to intern keys and values in the parameters table when it's first set 
> up. However, when more key-value pairs are added, they are not interned, which 
> probably explains all these duplicate strings. Also when 
> a Partition instance is deserialized, no interning of parameters is currently 
> done.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> 
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
>   52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <--  {j.u.HashMap}.keys <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-08-07 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116963#comment-16116963
 ] 

Misha Dmitriev commented on HIVE-17237:
---

This is to save memory and improve performance. String.intern() has always been 
the "official" solution to the string duplication problem. However, until JDK 7 
it was not very scalable, which forced people to use their own interners based 
on WeakHashMap or ConcurrentHashMap. As we know, these data structures are not 
economical in terms of memory - there is an overhead of 32 bytes or more per 
interned string. Starting from JDK 7, Sun/Oracle finally paid attention and 
made several improvements to String.intern() that greatly improved its 
performance. The internal hashtable used by String.intern() is also much more 
economical in terms of memory, and preallocated. So since JDK 7, it has been 
counterproductive to use custom string interners.
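
As a sketch of what the patch does conceptually (the actual change belongs 
inside the generated Partition class, where entries are added and where 
instances are deserialized; the standalone helper below is only illustrative):

{code}
import java.util.HashMap;
import java.util.Map;

// Intern both keys and values whenever an entry is added to a parameters map,
// so that repeated values such as "true", "-1" or "8" share a single String
// instance across all Partition objects.
public final class ParametersInterningSketch {
  private ParametersInterningSketch() {}

  public static void putInterned(Map<String, String> parameters, String key, String value) {
    parameters.put(key == null ? null : key.intern(),
                   value == null ? null : value.intern());
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    // new String(...) simulates values freshly created by a deserializer
    putInterned(params, "EXTERNAL", new String("true"));
    putInterned(params, "numRows", new String("-1"));
    System.out.println(params);
  }
}
{code}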

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-17237
> URL: https://issues.apache.org/jira/browse/HIVE-17237
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-17237.01.patch
>
>
> I've analyzed a heap dump from a production Hive installation using jxray 
> (www.jxray.com). It turns out that there are a lot of duplicate strings in 
> memory, which waste 26.4% of the heap. Most of them come from HashMaps 
> referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. 
> Below is the relevant section of the jxray report.
> Looking at Partition.java, I see that in the past somebody has already added 
> code to intern keys and values in the parameters table when it's first set 
> up. However, when more key-value pairs are added, they are not interned, which 
> probably explains all these duplicate strings. Also when 
> a Partition instance is deserialized, no interning of parameters is currently 
> done.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> 
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
>   52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <--  {j.u.HashMap}.keys <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 

[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-08-02 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17237:
--
Status: Patch Available  (was: Open)

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-17237
> URL: https://issues.apache.org/jira/browse/HIVE-17237
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-17237.01.patch
>
>
> I've analyzed a heap dump from a production Hive installation using jxray 
> (www.jxray.com). It turns out that there are a lot of duplicate strings in 
> memory, which waste 26.4% of the heap. Most of them come from HashMaps 
> referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. 
> Below is the relevant section of the jxray report.
> Looking at Partition.java, I see that in the past somebody has already added 
> code to intern keys and values in the parameters table when it's first set 
> up. However, when more key-value pairs are added, they are not interned, which 
> probably explains all these duplicate strings. Also when 
> a Partition instance is deserialized, no interning of parameters is currently 
> done.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> 
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
>   52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <--  {j.u.HashMap}.keys <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-08-02 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-17237:
--
Attachment: HIVE-17237.01.patch

> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-17237
> URL: https://issues.apache.org/jira/browse/HIVE-17237
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-17237.01.patch
>
>
> I've analyzed a heap dump from a production Hive installation using jxray 
> (www.jxray.com). It turns out that there are a lot of duplicate strings in 
> memory, which waste 26.4% of the heap. Most of them come from HashMaps 
> referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. 
> Below is the relevant section of the jxray report.
> Looking at Partition.java, I see that in the past somebody has already added 
> code to intern keys and values in the parameters table when it's first set 
> up. However, when more key-value pairs are added, they are not interned, which 
> probably explains all these duplicate strings. Also when 
> a Partition instance is deserialized, no interning of parameters is currently 
> done.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> 
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
>   52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <--  {j.u.HashMap}.keys <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17237) HMS wastes 26.4% of memory due to dup strings in metastore.api.Partition.parameters

2017-08-02 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev reassigned HIVE-17237:
-


> HMS wastes 26.4% of memory due to dup strings in 
> metastore.api.Partition.parameters
> ---
>
> Key: HIVE-17237
> URL: https://issues.apache.org/jira/browse/HIVE-17237
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>
> I've analyzed a heap dump from a production Hive installation using jxray 
> (www.jxray.com). It turns out that there are a lot of duplicate strings in 
> memory, which waste 26.4% of the heap. Most of them come from HashMaps 
> referenced by org.apache.hadoop.hive.metastore.api.Partition.parameters. 
> Below is the relevant section of the jxray report.
> Looking at Partition.java, I see that in the past somebody has already added 
> code to intern keys and values in the parameters table when it's first set 
> up. However, when more key-value pairs are added, they are not interned, which 
> probably explains all these duplicate strings. Also when 
> a Partition instance is deserialized, no interning of parameters is currently 
> done.
> {code}
> 6. DUPLICATE STRINGS
> Total strings: 3,273,557  Unique strings: 460,390  Duplicate values: 110,232  
> Overhead: 3,220,458K (26.4%)
> 
> ===
> 7. REFERENCE CHAINS FOR DUPLICATE STRINGS
>   2,326,150K (19.1%), 597058 dup strings (36386 unique), 597058 dup backing 
> arrays:
> 39949 of "-1", 39088 of "true", 28959 of "8", 20987 of "1", 18437 of "10", 
> 9583 of "9", 5908 of "269664", 5691 of "174528", 4598 of "133980", 4598 of 
> "BgUGBQgFCAYFCgYIBgUEBgQHBgUGCwYGBwYHBgkKBwYGBggIBwUHBgYGCgUJCQUG ...[length 
> 3560]"
> ... and 419200 more strings, of which 36376 are unique
> Also contains one-char strings: 217 of "6", 147 of "7", 91 of "4", 28 of "5", 
> 28 of "2", 21 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
>   463,060K (3.8%), 119644 dup strings (34075 unique), 119644 dup backing 
> arrays:
> 7914 of "true", 7912 of "-1", 6578 of "8", 5606 of "1", 2302 of "10", 1626 of 
> "174528", 1223 of "9", 970 of "171680", 837 of "269664", 657 of "133980"
> ... and 84009 more strings, of which 34065 are unique
> Also contains one-char strings: 42 of "7", 31 of "6", 20 of "4", 8 of "5", 5 
> of "2", 3 of "0"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.TreeMap}.values <-- Java Local (j.u.TreeMap) [@6f084afa0,@73aac9e68]
>   233,384K (1.9%), 64601 dup strings (27295 unique), 64601 dup backing arrays:
> 4472 of "true", 4173 of "-1", 3798 of "1", 3591 of "8", 813 of "174528", 684 
> of "10" ... and 44568 more strings, of which 27285 are unique
> Also contains one-char strings: 305 of "7", 301 of "0", 277 of "4", 146 of 
> "6", 29 of "2", 23 of "5", 19 of "9", 2 of "3"
>  <--  {j.u.HashMap}.values <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- Java Local (j.u.ArrayList) 
> [@4f4cfbd10,@536122408,@726616778]
> ...
>   52,916K (0.4%), 597058 dup strings (16 unique), 597058 dup backing arrays:
>  <--  {j.u.HashMap}.keys <-- 
> org.apache.hadoop.hive.metastore.api.Partition.parameters <--  
> {j.u.ArrayList} <-- 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.success
>  <-- Java Local 
> (org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result)
>  [@6e33618d8,@6eedb9a80,@6eedbad68,@6eedbc788] ... and 3 more GC roots
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-27 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15987241#comment-15987241
 ] 

Misha Dmitriev commented on HIVE-16079:
---

Thank you [~spena]. Yes, this patch contains changes that are not backward 
compatible with JDK versions earlier than 8. So I guess it can only be 
committed to master.

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 3.0.0
>
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicate, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.
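
A minimal sketch of that copy-on-first-write idea follows; it is not the actual 
class from the patch, which overrides many more mutators and plugs into Hive's 
PartitionDesc handling.

{code}
import java.util.Properties;

// Reads are served from a shared, canonicalized Properties instance; the first
// mutation copies its contents into this object's own table, after which the
// shared instance is no longer referenced. Only a few methods are overridden
// here to keep the sketch short.
public class CopyOnFirstWritePropertiesSketch extends Properties {
  private Properties interned;  // shared canonical copy; null once we hold our own data

  public CopyOnFirstWritePropertiesSketch(Properties canonical) {
    this.interned = canonical;
  }

  @Override
  public String getProperty(String key) {
    return interned != null ? interned.getProperty(key) : super.getProperty(key);
  }

  @Override
  public synchronized Object get(Object key) {
    return interned != null ? interned.get(key) : super.get(key);
  }

  @Override
  public synchronized Object put(Object key, Object value) {
    copyOnFirstWrite();
    return super.put(key, value);
  }

  @Override
  public synchronized Object remove(Object key) {
    copyOnFirstWrite();
    return super.remove(key);
  }

  private void copyOnFirstWrite() {
    if (interned != null) {
      Properties canonical = interned;
      interned = null;  // from now on this object's own table holds the data
      super.putAll(canonical);
    }
  }
}
{code}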



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-24 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15981588#comment-15981588
 ] 

Misha Dmitriev commented on HIVE-16079:
---

The vector_if_expr test fails in pretty much every Hive build. accumulo_index 
fails in every 2nd or 3rd build, so it looks flaky as well.

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicate, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-21 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979322#comment-15979322
 ] 

Misha Dmitriev commented on HIVE-16079:
---

Uploaded the 3rd patch, which addresses Sergio's comments.

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicate, since for each 
> partition each concurrently running query creates its own copy of Partition, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16079) HS2: high memory pressure due to duplicate Properties objects

2017-04-21 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HIVE-16079:
--
Attachment: HIVE-16079.03.patch

> HS2: high memory pressure due to duplicate Properties objects
> -
>
> Key: HIVE-16079
> URL: https://issues.apache.org/jira/browse/HIVE-16079
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-16079.01.patch, HIVE-16079.02.patch, 
> HIVE-16079.03.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code. One (duplicate strings) has been addressed in 
> https://issues.apache.org/jira/browse/HIVE-15882 In this ticket, I am going 
> to address the fact that almost 20% of memory is used by instances of 
> java.util.Properties. These objects are highly duplicate, since for each 
> partition each concurrently running query creates its own copy of Partion, 
> PartitionDesc and Properties. Thus we have nearly 100,000 (50 queries * 2,000 
> partitions) Properties in memory. By interning/deduplicating these objects we 
> may be able to save perhaps 15% of memory.
> Note, however, that if there are queries that mutate partitions, the 
> corresponding Properties would be mutated as well. Thus we cannot simply use 
> a single "canonicalized" Properties object at all times for all Partition 
> objects representing the same DB partition. Instead, I am going to introduce 
> a special CopyOnFirstWriteProperties class. Such an object initially 
> internally references a canonicalized Properties object, and keeps doing so 
> while only read methods are called. However, once any mutating method is 
> called, the given CopyOnFirstWriteProperties copies the data into its own 
> table from the canonicalized table, and uses it ever after.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

