[
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116961#comment-13116961
]
Hyunsik Choi commented on GIRAPH-12:
------------------------------------
Avery,
Thank you for your review. You are right. Runtime's totalMemory() and
freeMemory() methods don't measure stack sizes. I confirmed this by testing
the code below.
https://gist.github.com/1249761
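For readers who cannot open the gist, a test of this kind might look like the sketch below. It is not the code from the gist. The class name ThreadHeapProbe and its methods are hypothetical and chosen only for illustration. It starts a number of idle threads and prints the heap usage reported by Runtime before and after.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical probe: compares Runtime's heap figures before and after starting threads.
public class ThreadHeapProbe {
    /** Heap currently in use as reported by Runtime; thread stacks are not part of this figure. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Starts n idle daemon threads that block until the given latch is released. */
    public static List<Thread> startIdleThreads(int n, CountDownLatch release) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                try {
                    release.await(); // stay alive so the stack remains allocated
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        return threads;
    }

    public static void main(String[] args) throws InterruptedException {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        long before = usedHeapBytes();
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = startIdleThreads(n, release);
        long after = usedHeapBytes();
        System.out.printf("threads=%d heapBefore=%dKB heapAfter=%dKB%n",
                n, before / 1024, after / 1024);
        release.countDown(); // let all threads finish
        for (Thread t : threads) {
            t.join();
        }
    }
}
```

Running it with and without '-Xss' (for example `java -Xss4096k ThreadHeapProbe 1000`) lets the heap figures be compared against what 'top' reports for the same process.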
I have looked for a way to measure the stack size of a Java application, but I
could not find one. So I am still not sure how to show that the thread pool
approach reduces thread stack memory. For now, your way seems to be the only
method to prove it.
However, I'm curious to know how much memory overhead threads actually add.
Before trying your approach, I conducted some simple experiments.
I used the above source code to investigate the memory usage of threads. It
was executed on a machine with an Intel i3, Ubuntu 11.10 (64-bit), and 8 GB of
memory. I measured memory with 'top'. 'top' shows several columns, including
VIRT, RES, and SHR. We only need to focus on RES, the resident memory. RES
includes all resident memory usage, such as heap and stack. I learned this from
this page (http://goo.gl/JE7fD).
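As a possible alternative to reading 'top' by hand, the same two figures can be read from inside the process on Linux. The sketch below assumes that the VmSize and VmRSS fields of /proc/self/status correspond to the VIRT and RES columns of 'top'. The class name ProcStatus and the helper fieldKb are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: reads memory fields from /proc/self/status (Linux only).
public class ProcStatus {
    /** Returns the value in kB of a field such as "VmRSS" or "VmSize", or -1 if it is absent. */
    public static long fieldKb(List<String> statusLines, String field) {
        for (String line : statusLines) {
            if (line.startsWith(field + ":")) {
                // Lines look like "VmRSS:     30720 kB"
                String[] parts = line.substring(field.length() + 1).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/self/status"));
        System.out.println("VmSize (VIRT) kB: " + fieldKb(lines, "VmSize"));
        System.out.println("VmRSS  (RES)  kB: " + fieldKb(lines, "VmRSS"));
    }
}
```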
First, I executed the above code with 1000 threads and without the JVM option
'-Xss'. According to this page (http://goo.gl/sz2qM), the default stack size
('-Xss') is 1024k on a 64-bit Linux JVM. After all threads were created, I
ran 'top' to print the memory usage as follows:
1k threads with default thread stack size.
{noformat}
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 9163 hyunsik  20  0 3366m  30m 8296 S   18  0.4 0:01.52 java
{noformat}
2k threads with default thread stack size.
{noformat}
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
11223 hyunsik  20  0 4434m  46m 8340 S   40  0.6 0:04.11 java
{noformat}
With 1k and 2k threads, the program consumes only 30 and 46 megabytes of
resident memory, respectively. The memory usage of the threads is smaller than I
expected. I wonder whether thread stack size is the main cause of the memory
problem that we have faced. Besides, the default stack size is 1024k, so 1k
threads would need about 1 GB if every stack were fully resident. The thread
stack size therefore does not seem to affect RES. I ran more tests with '-Xss'
to investigate the thread stack size further.
1k threads with '-Xss4096k'.
{noformat}
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
28301 hyunsik  20  0 6380m  30m 8292 S   17  0.4 0:05.25 java
{noformat}
2k threads with '-Xss4096k'
{noformat}
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
29326 hyunsik  20  0 10.1g  46m 8300 S   38  0.6 0:03.42 java
{noformat}
VIRT is clearly affected by '-Xss', but RES is not. '-Xss' seems to set only the
maximum stack size of each thread, because it does not affect RES.
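To make the comparison concrete, the per-thread growth can be computed from the 'top' figures above by dividing the difference between the 2k-thread and 1k-thread runs by 1000. The sketch below does only that arithmetic. The class name StackArithmetic is hypothetical, and 10.1g is taken as 10.1 * 1024 MB, which is an assumption about how 'top' rounds.

```java
// Hypothetical helper: per-thread memory growth derived from the 'top' outputs above.
public class StackArithmetic {
    /** Per-thread growth in MB between two runs that differ by deltaThreads threads. */
    public static double perThreadMb(double mbWithFewer, double mbWithMore, int deltaThreads) {
        return (mbWithMore - mbWithFewer) / deltaThreads;
    }

    public static void main(String[] args) {
        // VIRT grows by roughly the configured stack size per thread.
        double virtDefault = perThreadMb(3366, 4434, 1000);        // about 1.07 MB with the 1024k default
        double virtXss4m   = perThreadMb(6380, 10.1 * 1024, 1000); // about 3.96 MB with -Xss4096k

        // RES grows by the same small amount in both configurations.
        double resPerThread = perThreadMb(30, 46, 1000);           // 0.016 MB, i.e. about 16 KB

        System.out.printf("VIRT per thread (default):   %.3f MB%n", virtDefault);
        System.out.printf("VIRT per thread (-Xss4096k): %.3f MB%n", virtXss4m);
        System.out.printf("RES per thread (both runs):  %.3f MB%n", resPerThread);
    }
}
```

Under these figures, VIRT grows by about one configured stack size per thread, while RES grows by only about 16 KB per thread regardless of '-Xss'.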
What do you think about that?
> Investigate communication improvements
> --------------------------------------
>
> Key: GIRAPH-12
> URL: https://issues.apache.org/jira/browse/GIRAPH-12
> Project: Giraph
> Issue Type: Improvement
> Components: bsp
> Reporter: Avery Ching
> Assignee: Hyunsik Choi
> Priority: Minor
> Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker will start up a thread to communicate with every other
> worker. Hadoop RPC is used for communication. For instance if there are
> 400 workers, each worker will create 400 threads. This ends up using a lot
> of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using frameworks like Netty or rolling our
> own to improve this situation. By moving away from Hadoop RPC, we would
> also make compatibility with different Hadoop versions easier.