[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager

2015-05-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546281#comment-14546281
 ] 

Gunther Hagleitner commented on HIVE-10233:
---

[~vikram.dixit] Could you create a Review Board (RB) entry for this?

 Hive on LLAP: Memory manager
 

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch, HIVE-10233-WIP-6.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager

2015-05-15 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546301#comment-14546301
 ] 

Gunther Hagleitner commented on HIVE-10233:
---

I think the same is true for top-N hashes. Also, getMemoryNeeded() is a 
misnomer. That's the maximum, right? How about getMaxMemory or getAllocatedMemory?



[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager

2015-04-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518152#comment-14518152
 ] 

Gunther Hagleitner commented on HIVE-10233:
---

I'm still reviewing, but there are some changes in this patch that I think are 
unnecessary. I think you've renamed the llap memory manager to 
MemoryManagerInterface to make room for another MemoryManager 
(ql/exec/MemoryManager). But that one isn't used; what you actually use is the 
ExecMemoryManager. 

So - you could roll back the changes to the llap cache, remove the old memory 
manager and just use the exec one. That simplifies the patch.

I also think you don't need a memory manager class at all. All it does is 
remember a field per operator. It seems cleaner to add memInfo to the operator 
base class with some facilities to track memory. (or introduce a class between 
operator and gby/join/rs).
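
A minimal sketch of the base-class idea, assuming nothing about Hive's actual Operator class. MemoryTrackingOperator, SketchGroupByOperator, tryReserve and release are hypothetical names invented for illustration, and the per-operator budget is assumed to be handed in at construction time:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an operator base class; this is NOT Hive's Operator.
abstract class MemoryTrackingOperator {
    private final long maxMemoryBytes;                     // budget assigned to this operator
    private final AtomicLong usedBytes = new AtomicLong(); // bytes currently reserved

    MemoryTrackingOperator(long maxMemoryBytes) {
        if (maxMemoryBytes < 0) {
            throw new IllegalArgumentException("maxMemoryBytes must be >= 0");
        }
        this.maxMemoryBytes = maxMemoryBytes;
    }

    long getMaxMemory() { return maxMemoryBytes; }

    long getUsedMemory() { return usedBytes.get(); }

    // Reserves 'bytes' if it fits in the remaining budget; otherwise reserves nothing.
    boolean tryReserve(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be >= 0");
        }
        while (true) {
            long current = usedBytes.get();
            if (bytes > maxMemoryBytes - current) {
                return false; // would exceed the budget
            }
            if (usedBytes.compareAndSet(current, current + bytes)) {
                return true;
            }
        }
    }

    // Returns 'bytes' to the budget, never dropping below zero.
    void release(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be >= 0");
        }
        usedBytes.updateAndGet(current -> Math.max(0L, current - bytes));
    }
}

// Hypothetical operator (think gby/join/rs) that inherits the tracking.
class SketchGroupByOperator extends MemoryTrackingOperator {
    SketchGroupByOperator(long maxMemoryBytes) { super(maxMemoryBytes); }
}

class MemInfoDemo {
    public static void main(String[] args) {
        SketchGroupByOperator gby = new SketchGroupByOperator(1024);
        boolean first = gby.tryReserve(600);
        boolean second = gby.tryReserve(600);
        System.out.println(first + " " + second + " used=" + gby.getUsedMemory());
    }
}
```

With the budget and the usage counter living on the operator itself, no separate manager object has to map operators to their memory info. Whether the counter needs to be atomic depends on how operators are shared across threads, which this sketch does not try to settle.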

 Hive on LLAP: Memory manager
 

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 





[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager

2015-04-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497034#comment-14497034
 ] 

Siddharth Seth commented on HIVE-10233:
---

Looked at just the Tez Configuration changes.
- Since Hive will be setting the memory explicitly, disabling the Tez scaling 
makes sense. That's done by setting
tez.task.scale.memory.enabled = false 
(TezConfiguration.TEZ_TASK_SCALE_MEMORY_ENABLED).
This needs to be set before creating the AM, and applies to all DAGs running in 
the AM.

- TezRuntimeConfiguration.TEZ_RUNTIME_IO_SORT_MB, 
TezRuntimeConfiguration.TEZ_RUNTIME_UNORDERED_OUTPUT_BUFFER_SIZE_MB - need to 
convert the memory from bytes to MB before setting these properties
- edgeProp.getInputMemoryNeededPercent - this needs to be a fraction (0-1) 
rather than an actual percentage (0-100). Not sure what the method gives back 
right now.
- I missed mentioning this in the offline discussions about the properties 
involved: one more needs to be set for the Ordered case 
(TEZ_RUNTIME_INPUT_POST_MERGE_BUFFER_PERCENT). This is a measure of how much 
memory will be used after the merge is complete to avoid spilling to disk. This 
defaults to 0, but is typically a lower value than the MergeMemory.
Given that this memory is always reserved for the Input, it can just be set to 
the Input merge memory.

There are explicit APIs that can be used to configure these properties.
{code}
// Fragment of an edge-config builder chain.
// OUT_SIZE: sort buffer size in MB; IN_FRACTION: a fraction between 0 and 1.
.setValueSerializationClass(TezBytesWritableSerialization.class.getName(), null)
.configureOutput().setSortBufferSize(OUT_SIZE).done()
.configureInput().setShuffleBufferFraction(IN_FRACTION).setPostMergeBufferFraction(IN_FRACTION).done()
{code}

Similarly for the UnorderedCase.
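
The two unit conversions from the bullets above can be sketched as follows. This is only an illustration, written against a plain Map so it runs without Tez on the classpath. The class and method names are hypothetical, the "tez.runtime.io.sort.mb" key string is assumed to be the value behind TezRuntimeConfiguration.TEZ_RUNTIME_IO_SORT_MB and should be checked against that constant, and the two "sketch.*" keys are placeholders rather than real Tez properties:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it only illustrates the bytes-to-MB and percent-to-fraction steps.
class TezMemorySettingsSketch {
    static final long BYTES_PER_MB = 1024L * 1024L;

    // Converts a byte count to whole megabytes, rounding down so the budget is not exceeded.
    static int bytesToMb(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be >= 0");
        }
        return Math.toIntExact(bytes / BYTES_PER_MB);
    }

    // Converts a percentage (0-100) to the fraction (0-1) that the input settings expect.
    static float percentToFraction(float percent) {
        if (percent < 0f || percent > 100f) {
            throw new IllegalArgumentException("percent must be in [0, 100]");
        }
        return percent / 100f;
    }

    static Map<String, String> buildSettings(long sortBufferBytes, float inputMemoryPercent) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Hive sets memory explicitly, so Tez scaling is turned off (key taken from the comment above).
        conf.put("tez.task.scale.memory.enabled", "false");
        // Assumed key string for TEZ_RUNTIME_IO_SORT_MB; the value must be in MB, not bytes.
        conf.put("tez.runtime.io.sort.mb", Integer.toString(bytesToMb(sortBufferBytes)));
        // Placeholder keys (not real Tez properties): the post-merge buffer reuses the input fraction.
        float fraction = percentToFraction(inputMemoryPercent);
        conf.put("sketch.input.shuffle.fraction", Float.toString(fraction));
        conf.put("sketch.input.post.merge.fraction", Float.toString(fraction));
        return conf;
    }

    public static void main(String[] args) {
        buildSettings(256L * 1024L * 1024L, 40f)
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Rounding down in bytesToMb is a choice made for this sketch; a caller that must never end up with a 0 MB buffer would need a minimum on top of it.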





 Hive on LLAP: Memory manager
 

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 


