Please use the latest version.

On Sun, Jul 14, 2013 at 4:28 PM, Kostas Xirog <[email protected]> wrote:
> Thanks for your reply,
>
> I don't know what I can actually show you that will be of any help (except
> for my code, which is about 1000 lines), but I'll try to give you guys the
> basic idea.
> Of course I'm using Hama's graph (implementation of Pregel) for this.
>
> My program creates a graph with nodes and edges that both have big sets of
> data (such as recordIds and edge values in each record) as values. The
> basic idea is that I'm running a query on this graph in the form of a path
> (or subgraph), and the program returns the records that contain this path,
> as well as the values of each of the records that contain this path.
>
> The compute function executes and only the nodes that are part of the query
> wake up at first; all others halt. As this happens, I collect the recordIds
> from the node values and the edge values from the edges, and when the end
> nodes have been reached, the program terminates, I collect the result from
> the end nodes and write it to the result file...
>
> Is there some way I can access a memory mapping or something?... After
> execution with 400.000 records, the log is:
>
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: The total number of supersteps: 48
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: Counters: 12
>>13/07/14 10:17:57 INFO bsp.BSPJobClient:
> org.apache.hama.graph.GraphJobRunner$GraphJobCounter
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: ITERATIONS=42
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: MULTISTEP_PARTITIONING=4
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: INPUT_VERTICES=1001
>>13/07/14 10:17:57 INFO bsp.BSPJobClient:
> org.apache.hama.bsp.JobInProgress$JobCounter
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: SUPERSTEPS=48
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: LAUNCHED_TASKS=6
>>13/07/14 10:17:57 INFO bsp.BSPJobClient:
> org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: SUPERSTEP_SUM=294
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: IO_BYTES_READ=344290795
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: TIME_IN_SYNC_MS=411231
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: TOTAL_MESSAGES_SENT=1592
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: TASK_INPUT_RECORDS=1001
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: TOTAL_MESSAGES_RECEIVED=1580
>>13/07/14 10:17:57 INFO bsp.BSPJobClient: TASK_OUTPUT_RECORDS=1001
> Job 1 Finished in 3559.706 seconds
>
>
> Any ideas?
> Thanks in advance,
> Kostas X.
>
>
> On Sun, Jul 14, 2013 at 10:08 AM, Chia-Hung Lin <[email protected]> wrote:
>
>> Any chance to show how the code, logic, log, etc. is executed? Others
>> might be able to help spot the issue in underlying infrastructure or
>> somewhere else.
>>
>> On 14 July 2013 15:00, Kostas Xirog <[email protected]> wrote:
>> > Hello,
>> >
>> > I'm running my program with 400.000 records as data and the execution
>> > takes 50 minutes, whereas the execution of the same query on 200.000
>> > records takes 70 seconds. Any idea why that might be? I've been
>> > monitoring my system with the 'top' command, and I see that for these
>> > 50 minutes the memory usage is 75.5% and the CPU is at 100 almost
>> > constantly...
>> >
>> > I'm running Hama in local mode on one machine with 8GB of RAM and 8 CPUs.
>> > Any idea why that might be? Any ideas of how I can fix it?
>> >
>> > Thanks in advance,
>> > Kostas X.
>>
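
To put the numbers quoted above side by side, here is a small back-of-the-envelope sketch in Python. It only uses figures from this thread (70 seconds for 200.000 records, 3559.706 seconds for 400.000 records, and the TIME_IN_SYNC_MS counter). The function name `scaling_summary` is made up for illustration, the power-law exponent is a two-point estimate and not a measurement, and the thread does not say whether TIME_IN_SYNC_MS is per task or summed over tasks, so the sync share is treated as an upper bound.

```python
import math

# Figures quoted in the thread (the 70 s value is approximate).
SMALL_RECORDS, SMALL_SECONDS = 200_000, 70.0
LARGE_RECORDS, LARGE_SECONDS = 400_000, 3559.706
TIME_IN_SYNC_MS = 411_231  # counter from the job log


def scaling_summary(n_small, t_small, n_large, t_large, sync_ms):
    """Summarise how runtime grew between the two runs (hypothetical helper)."""
    data_ratio = n_large / n_small   # how much more input
    time_ratio = t_large / t_small   # how much more runtime
    # If runtime ~ n**k, two points give k = log(time_ratio) / log(data_ratio).
    # With only two data points this is illustrative, not a fit.
    exponent = math.log(time_ratio) / math.log(data_ratio)
    # Assumption: treat the sync counter as at most this share of wall time.
    sync_share_upper = (sync_ms / 1000.0) / t_large
    return data_ratio, time_ratio, exponent, sync_share_upper


data_ratio, time_ratio, exponent, sync_share = scaling_summary(
    SMALL_RECORDS, SMALL_SECONDS, LARGE_RECORDS, LARGE_SECONDS, TIME_IN_SYNC_MS
)
print(f"input grew {data_ratio:.0f}x, runtime grew {time_ratio:.1f}x")
print(f"implied exponent if runtime ~ n^k: {exponent:.2f}")
print(f"sync time is at most {sync_share:.1%} of wall-clock time")
```

Doubling the input made the job roughly 51 times slower, and barrier sync accounts for at most about 12% of the run, so most of the time seems to be spent inside the tasks themselves. That would be consistent with (but does not prove) memory pressure from the large vertex and edge values.
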
--
Best Regards, Edward J. Yoon
@eddieyoon
