[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546281#comment-14546281 ]

Gunther Hagleitner commented on HIVE-10233:
---

[~vikram.dixit] could you create a rb entry for this?

Hive on LLAP: Memory manager

Key: HIVE-10233
URL: https://issues.apache.org/jira/browse/HIVE-10233
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch

We need a memory manager in llap/tez to manage the usage of memory across threads.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546301#comment-14546301 ]

Gunther Hagleitner commented on HIVE-10233:
---

I think the same is true for top n hashes.

Also, getMemoryNeeded() is a misnomer. That's the max, right? Maybe getMaxMemory() or getAllocatedMemory()?
[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518152#comment-14518152 ]

Gunther Hagleitner commented on HIVE-10233:
---

I'm still reviewing, but there are some changes in this patch that I think are unnecessary.

I think you've renamed the llap memory manager to MemoryManagerInterface to make room for another MemoryManager (ql/exec/MemoryManager). But that one isn't used; what you actually use is the ExecMemoryManager. So you could roll back the changes to the llap cache, remove the old memory manager, and just use the exec one. That simplifies the patch.

I also think you don't need a memory manager class at all. All it does is remember a field per operator. It seems cleaner to add memInfo to the operator base class, with some facilities to track memory (or to introduce a class between operator and gby/join/rs).

Hive on LLAP: Memory manager

Key: HIVE-10233
URL: https://issues.apache.org/jira/browse/HIVE-10233
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch

We need a memory manager in llap/tez to manage the usage of memory across threads.
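
The review suggestion above (keep the memory information on the operator itself, or on a class between the operator and gby/join/rs) could look roughly like the following Java sketch. Every name here (`MemoryAwareOperator`, `SketchGroupByOperator`, `tryReserve`, `release`) is hypothetical and chosen only for illustration; none of them is taken from the Hive code base or from the attached patches. The accessor is called `getMaxMemory()` to match the naming raised in the review, since the value is an upper bound rather than a requirement.

```java
/**
 * Hypothetical layer between the operator base class and the
 * memory-hungry operators (group-by, join, reduce sink).
 * Not Hive code; a sketch of per-operator memory bookkeeping.
 */
abstract class MemoryAwareOperator {
    private long maxMemoryBytes;   // upper bound assigned to this operator
    private long usedMemoryBytes;  // bytes the operator has reserved so far

    void setMaxMemory(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("negative limit: " + bytes);
        }
        this.maxMemoryBytes = bytes;
    }

    long getMaxMemory() {
        return maxMemoryBytes;
    }

    long getUsedMemory() {
        return usedMemoryBytes;
    }

    /** Reserves bytes if they fit under the limit; returns false otherwise. */
    boolean tryReserve(long bytes) {
        if (bytes < 0 || bytes > maxMemoryBytes - usedMemoryBytes) {
            return false;
        }
        usedMemoryBytes += bytes;
        return true;
    }

    /** Returns previously reserved bytes, never dropping below zero. */
    void release(long bytes) {
        usedMemoryBytes = Math.max(0L, usedMemoryBytes - bytes);
    }
}

/** Hypothetical operator that inherits the bookkeeping unchanged. */
class SketchGroupByOperator extends MemoryAwareOperator {
    public static void main(String[] args) {
        SketchGroupByOperator op = new SketchGroupByOperator();
        op.setMaxMemory(100);
        System.out.println(op.tryReserve(60)); // fits under the limit
        System.out.println(op.tryReserve(50)); // would exceed the limit
    }
}
```

The sketch is deliberately not synchronized: it assumes each operator instance is driven by a single thread, which may not hold in every execution mode.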
[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497034#comment-14497034 ]

Siddharth Seth commented on HIVE-10233:
---

Looked at just the Tez Configuration changes.

- Since Hive will be setting the memory explicitly, disabling the Tez scaling makes sense. That's done by setting tez.task.scale.memory.enabled = false (TezConfiguration.TEZ_TASK_SCALE_MEMORY_ENABLED). This needs to be set before creating the AM, and applies to all DAGs running in the AM.
- TezRuntimeConfiguration.TEZ_RUNTIME_IO_SORT_MB, TezRuntimeConfiguration.TEZ_RUNTIME_UNORDERED_OUTPUT_BUFFER_SIZE_MB - the memory needs to be converted from bytes to MB before setting these properties.
- edgeProp.getInputMemoryNeededPercent - this needs to be a fraction (0-1) rather than an actual percentage (0-100). Not sure what the method gives back right now.
- I missed mentioning this in the offline discussions about the properties involved: one more needs to be set for the Ordered case (TEZ_RUNTIME_INPUT_POST_MERGE_BUFFER_PERCENT). This is a measure of how much memory will be used after the merge is complete to avoid spilling to disk. This defaults to 0, but is typically a lower value than the MergeMemory. Given that this memory is always reserved for the Input, it can just be set to the Input merge memory.

There are explicit APIs which can be used to configure these properties.

{code}
.setValueSerializationClass(TezBytesWritableSerialization.class.getName(), null)
.configureOutput().setSortBufferSize([OUT_SIZE]).done()
.configureInput().setShuffleBufferFraction(IN_FRACTION).setPostMergeBufferFraction(IN_FRACTION).done()
{code}

Similarly for the Unordered case.
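
The two unit points in the list above (bytes to MB for the buffer sizes, and a percentage turned into a 0-1 fraction for the input memory) can be illustrated with a small Java sketch. The class `TezMemorySettings` and its methods are hypothetical and are not part of Hive or Tez. The map keys are the constant names quoted in the comment, used here as placeholders; real code would use the string values behind the TezRuntimeConfiguration constants, or the builder APIs shown above. Rounding down to whole megabytes is also an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for unit conversion; not part of Hive or Tez. */
class TezMemorySettings {
    private static final long BYTES_PER_MB = 1024L * 1024L;

    /** Converts a byte count to whole megabytes, rounding down (assumed policy). */
    static int bytesToMb(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("negative size: " + bytes);
        }
        return Math.toIntExact(bytes / BYTES_PER_MB);
    }

    /** Converts a percentage in [0, 100] to a fraction in [0, 1]. */
    static float percentToFraction(double percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("not a percentage: " + percent);
        }
        return (float) (percent / 100.0);
    }

    /**
     * Builds placeholder settings for an ordered edge. The keys are the
     * constant names from the comment, not the real property strings.
     */
    static Map<String, String> orderedEdgeSettings(long sortBufferBytes,
                                                   double inputPercent) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("TEZ_RUNTIME_IO_SORT_MB",
                Integer.toString(bytesToMb(sortBufferBytes)));
        // Post-merge buffer reuses the input fraction, as the comment suggests.
        settings.put("TEZ_RUNTIME_INPUT_POST_MERGE_BUFFER_PERCENT",
                Float.toString(percentToFraction(inputPercent)));
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(orderedEdgeSettings(256L * 1024 * 1024, 40.0));
    }
}
```

tez.task.scale.memory.enabled is left out of the sketch on purpose: per the comment it applies to the whole AM and has to be set before the AM is created, not per edge.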

Hive on LLAP: Memory manager

Key: HIVE-10233
URL: https://issues.apache.org/jira/browse/HIVE-10233
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP.patch

We need a memory manager in llap/tez to manage the usage of memory across threads.