[
https://issues.apache.org/jira/browse/HAMA-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261431#comment-13261431
]
Edward J. Yoon commented on HAMA-521:
-------------------------------------
{code}
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: TASK mapping from
zookeeper: 0 : slave.udanax.org:61003 at index 0
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: TASK mapping from
zookeeper: 1 : slave.udanax.org:61002 at index 1
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: TASK mapping from
zookeeper: 2 : slave.udanax.org:61001 at index 2
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: TASK mapping from
zookeeper: 3 : slave2.udanax.org:61003 at index 3
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: TASK mapping from
zookeeper: 4 : slave2.udanax.org:61001 at index 4
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: TASK mapping from
zookeeper: 5 : slave2.udanax.org:61002 at index 5
12/04/25 20:00:44 DEBUG bsp.Counters: Creating group
org.apache.hama.bsp.BSPPeerImpl$PeerCounter with nothing
12/04/25 20:00:44 DEBUG bsp.Counters: Adding TOTAL_MESSAGES_SENT
12/04/25 20:00:44 DEBUG message.AbstractMessageManager: Send message (3.1588)
to slave2.udanax.org:61003
12/04/25 20:00:44 DEBUG bsp.Counters: Adding SUPERSTEP_SUM
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: [slave2.udanax.org:61002]
enter the enterbarrier: 0
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: ===> at superstep :0
current znode size: 6 current znodes:[attempt_201204252000_0001_000002_0,
attempt_201204252000_0001_000000_0, attempt_201204252000_0001_000004_0,
attempt_201204252000_0001_000001_0, attempt_201204252000_0001_000005_0,
attempt_201204252000_0001_000003_0]
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: enterBarrier() znode size
within /bsp/job_201204252000_0001/0 is 6. Znodes include
[attempt_201204252000_0001_000002_0, attempt_201204252000_0001_000000_0,
attempt_201204252000_0001_000004_0, attempt_201204252000_0001_000001_0,
attempt_201204252000_0001_000005_0, attempt_201204252000_0001_000003_0]
12/04/25 20:00:44 DEBUG sync.ZooKeeperSyncClientImpl: ---> at superstep: 0 task
that is creating /ready znode:attempt_201204252000_0001_000005_0
12/04/25 20:00:45 DEBUG bsp.BSPPeerImpl: Enabled = false checkPointInterval = 1
lastCheckPointStep = 0 getSuperstepCount() = 0
12/04/25 20:00:45 DEBUG bsp.Counters: Adding COMPRESSED_BYTES_SENT
12/04/25 20:00:45 INFO ipc.NettyTransceiver: Connecting to
slave2.udanax.org/192.168.123.138:61003
12/04/25 20:00:45 INFO ipc.NettyTransceiver: [id: 0x25786286] OPEN
12/04/25 20:00:45 INFO ipc.NettyTransceiver: [id: 0x25786286,
/192.168.123.138:33600 => slave2.udanax.org/192.168.123.138:61003] BOUND:
/192.168.123.138:33600
12/04/25 20:00:45 INFO ipc.NettyTransceiver: [id: 0x25786286,
/192.168.123.138:33600 => slave2.udanax.org/192.168.123.138:61003] CONNECTED:
slave2.udanax.org/192.168.123.138:61003
12/04/25 20:00:45 DEBUG sync.ZooKeeperSyncClientImpl: leaveBarrier() !!!
checking znodes contnains /ready node or not: at superstep:0
znode:[attempt_201204252000_0001_000004_0, attempt_201204252000_0001_000003_0,
attempt_201204252000_0001_000000_0, attempt_201204252000_0001_000005_0, ready]
12/04/25 20:00:45 DEBUG sync.ZooKeeperSyncClientImpl: leaveBarrier() at
superstep:0 znode size: (4) znodes:[attempt_201204252000_0001_000004_0,
attempt_201204252000_0001_000003_0, attempt_201204252000_0001_000000_0,
attempt_201204252000_0001_000005_0]
12/04/25 20:00:45 DEBUG sync.ZooKeeperSyncClientImpl: leaveBarrier():
superstep:0 taskid:attempt_201204252000_0001_000005_0 wait for lowest notify.
12/04/25 20:00:45 DEBUG sync.ZooKeeperSyncClientImpl: leaveBarrier() at
superstep: 0 taskid:attempt_201204252000_0001_000005_0 lowest notify other
nodes.
12/04/25 20:00:45 DEBUG sync.ZooKeeperSyncClientImpl: leaveBarrier() !!!
checking znodes contnains /ready node or not: at superstep:0 znode:[ready]
12/04/25 20:00:45 DEBUG sync.ZooKeeperSyncClientImpl: leaveBarrier() at
superstep:0 znode size: (0) znodes:[]
12/04/25 20:00:45 DEBUG bsp.Counters: Adding TIME_IN_SYNC_MS
12/04/25 20:00:47 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351647316,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=0,SECOND=47,MILLISECOND=316,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:00:49 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351649817,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=0,SECOND=49,MILLISECOND=817,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:00:52 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351652318,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=0,SECOND=52,MILLISECOND=318,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:00:54 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351654819,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=0,SECOND=54,MILLISECOND=819,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:00:57 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351657320,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=0,SECOND=57,MILLISECOND=320,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:00:59 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351659821,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=0,SECOND=59,MILLISECOND=821,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:01:02 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351662322,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=1,SECOND=2,MILLISECOND=322,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:01:04 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351664823,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=1,SECOND=4,MILLISECOND=823,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:01:07 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351667324,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=1,SECOND=7,MILLISECOND=324,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:01:09 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351669825,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=1,SECOND=9,MILLISECOND=825,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:01:12 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351672326,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=1,SECOND=12,MILLISECOND=326,ZONE_OFFSET=32400000,DST_OFFSET=0]
12/04/25 20:01:14 DEBUG bsp.BSPTask: Pinging at time
java.util.GregorianCalendar[time=1335351674827,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="Asia/Seoul",offset=32400000,dstSavings=0,useDaylight=false,transitions=14,lastRule=null],firstDayOfWeek=1,minimalDaysInFirstWeek=1,ERA=1,YEAR=2012,MONTH=3,WEEK_OF_YEAR=17,WEEK_OF_MONTH=4,DAY_OF_MONTH=25,DAY_OF_YEAR=116,DAY_OF_WEEK=4,DAY_OF_WEEK_IN_MONTH=4,AM_PM=1,HOUR=8,HOUR_OF_DAY=20,MINUTE=1,SECOND=14,MILLISECOND=827,ZONE_OFFSET=32400000,DST_OFFSET=0]
{code}
> Improve message buffering to save memory
> ----------------------------------------
>
> Key: HAMA-521
> URL: https://issues.apache.org/jira/browse/HAMA-521
> Project: Hama
> Issue Type: Sub-task
> Reporter: Thomas Jungblut
> Assignee: Thomas Jungblut
> Attachments: HAMA-521.patch, HAMA-521_1.patch, HAMA-521_2.patch,
> HAMA-521_3.patch
>
>
> Suraj and I had a bit of discussion about incoming and outgoing message
> buffering and scalability.
> Currently everything lives on the heap, causing heavy GC activity and wasted
> memory. We can do better.
> Therefore we need to extract an abstract Messenger class that sits directly
> below the interface but above the compressor class.
> It should abstract the use of the queues in the back (currently a lot of
> duplicated code) and it should be backed by a SequenceFile on local disk.
> Once sync() starts, it should return a message iterator for combining; the
> combined messages are then put into a message bundle, which is sent over RPC.
> On the receiving side we get a bundle and loop over it, putting everything
> onto the heap and making it much larger than it needs to be. Here we can also
> flush to disk, because we only expose a queue-like method to the user side.
> Plus points:
> In case we have enough heap (see our new metric system), we can also
> implement a buffering strategy that does not flush everything to disk.
> Open questions:
> I don't know how much slower the whole system gets, but it would save a lot
> of memory. Maybe we should first evaluate whether it is really needed.
> In any case, refactoring the duplicated code in the messengers is needed.
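
The buffering idea in the description above (keep messages on the heap while
there is room, spill the rest to local disk, and hand them back through an
iterator-like call at sync time) can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the code in the attached
patches: the class name SpillingMessageBuffer, the maxInMemory threshold, and
the drainTo() method are hypothetical, messages are plain strings, and a plain
java.io temp file stands in for the Hadoop SequenceFile the issue proposes.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.function.Consumer;

/**
 * Hypothetical sketch: holds up to maxInMemory messages on the heap and
 * writes any further messages to a local temp file (a stand-in for the
 * SequenceFile mentioned in the issue).
 */
class SpillingMessageBuffer implements AutoCloseable {
  private final int maxInMemory;                 // assumed heap budget, in messages
  private final ArrayDeque<String> heap = new ArrayDeque<>();
  private final File spillFile;
  private DataOutputStream spillOut;             // opened lazily on first spill
  private int spilled = 0;

  SpillingMessageBuffer(int maxInMemory) throws IOException {
    this.maxInMemory = maxInMemory;
    this.spillFile = File.createTempFile("msg-spill", ".bin");
    this.spillFile.deleteOnExit();
  }

  /** Buffers one message, spilling to disk once the heap budget is used up. */
  void add(String message) throws IOException {
    if (heap.size() < maxInMemory) {
      heap.add(message);
      return;
    }
    if (spillOut == null) {
      // Opening without append truncates leftovers from a previous round.
      spillOut = new DataOutputStream(
          new BufferedOutputStream(new FileOutputStream(spillFile)));
    }
    spillOut.writeUTF(message);
    spilled++;
  }

  int spilledCount() {
    return spilled;
  }

  /**
   * Streams all buffered messages to the sink in insertion order: first the
   * in-memory ones, then the spilled ones read back one at a time, so the
   * spilled part is never fully materialized on the heap.
   */
  void drainTo(Consumer<String> sink) throws IOException {
    while (!heap.isEmpty()) {
      sink.accept(heap.poll());
    }
    if (spillOut != null) {
      spillOut.close();
      spillOut = null;
      try (DataInputStream in = new DataInputStream(
          new BufferedInputStream(new FileInputStream(spillFile)))) {
        for (int i = 0; i < spilled; i++) {
          sink.accept(in.readUTF());
        }
      }
      spilled = 0;
    }
  }

  @Override
  public void close() throws IOException {
    if (spillOut != null) {
      spillOut.close();
      spillOut = null;
    }
    spillFile.delete();
  }

  public static void main(String[] args) throws IOException {
    try (SpillingMessageBuffer buffer = new SpillingMessageBuffer(2)) {
      for (int i = 0; i < 5; i++) {
        buffer.add("msg-" + i);
      }
      System.out.println("spilled to disk: " + buffer.spilledCount());
      buffer.drainTo(System.out::println);
    }
  }
}
```

With a large enough maxInMemory nothing touches the disk, which corresponds to
the "enough heap" case under "Plus points"; how much slower the spilling path
is in practice remains the open question raised in the description.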