Btw, does the re-partitioned data really need to be sorted?

On Thu, Feb 28, 2013 at 7:48 PM, Thomas Jungblut wrote:
> Now I get how the partitioning works, obviously if you merge n sorted files
> by just appending to each other, this will result in totally unsorted data
> ;-)
> Why didn't you solve this via messaging?
>
> 2013/2/28 Thomas Jungblut
>
>> Seems that they are not correctly sorted:
>>
>> vertexID: 50
>> vertexID: 52
>> vertexID: 54
>> vertexID: 56
>> vertexID: 58
>> vertexID: 61
>> ...
>> vertexID: 78
>> vertexID: 81
>> vertexID: 83
>> vertexID: 85
>> ...
>> vertexID: 94
>> vertexID: 96
>> vertexID: 98
>> vertexID: 1
>> vertexID: 10
>> vertexID: 12
>> vertexID: 14
>> vertexID: 16
>> vertexID: 18
>> vertexID: 21
>> vertexID: 23
>> vertexID: 25
>> vertexID: 27
>> vertexID: 29
>> vertexID: 3
>>
>> So this won't work correctly then...
>>
>>
>> 2013/2/28 Thomas Jungblut
>>
>>> sure, have fun on your holidays.
>>>
>>>
>>> 2013/2/28 Edward J. Yoon
>>>
>>>> Sure, but if you can fix it quickly, please do. March 1 is a holiday[1] so
>>>> I'll be back next week.
>>>>
>>>> 1. http://en.wikipedia.org/wiki/Public_holidays_in_South_Korea
>>>>
>>>> On Thu, Feb 28, 2013 at 6:36 PM, Thomas Jungblut wrote:
>>>> > Maybe 50 is missing from the file, didn't observe if all items were added.
>>>> > As far as I remember, I copy/pasted the logic of the ID into the fastgen,
>>>> > want to have a look into it?
>>>> >
>>>> > 2013/2/28 Edward J. Yoon
>>>> >
>>>> >> I guess it's a bug in fastgen, when it generates the adjacency matrix into
>>>> >> multiple files.
>>>> >>
>>>> >> On Thu, Feb 28, 2013 at 6:29 PM, Thomas Jungblut wrote:
>>>> >> > You have two files, are they partitioned correctly?
>>>> >> >
>>>> >> > 2013/2/28 Edward J. Yoon
>>>> >> >
>>>> >> >> It looks like a bug.
>>>> >> >>
>>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/
>>>> >> >> total 44
>>>> >> >> drwxrwxr-x 3 edward edward 4096 Feb 28 18:03 .
>>>> >> >> drwxrwxrwt 19 root root 20480 Feb 28 18:04 ..
>>>> >> >> -rwxrwxrwx 1 edward edward 2243 Feb 28 18:01 part-00000
>>>> >> >> -rw-rw-r-- 1 edward edward 28 Feb 28 18:01 .part-00000.crc
>>>> >> >> -rwxrwxrwx 1 edward edward 2251 Feb 28 18:01 part-00001
>>>> >> >> -rw-rw-r-- 1 edward edward 28 Feb 28 18:01 .part-00001.crc
>>>> >> >> drwxrwxr-x 2 edward edward 4096 Feb 28 18:03 partitions
>>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/partitions/
>>>> >> >> total 24
>>>> >> >> drwxrwxr-x 2 edward edward 4096 Feb 28 18:03 .
>>>> >> >> drwxrwxr-x 3 edward edward 4096 Feb 28 18:03 ..
>>>> >> >> -rwxrwxrwx 1 edward edward 2932 Feb 28 18:03 part-00000
>>>> >> >> -rw-rw-r-- 1 edward edward 32 Feb 28 18:03 .part-00000.crc
>>>> >> >> -rwxrwxrwx 1 edward edward 2955 Feb 28 18:03 part-00001
>>>> >> >> -rw-rw-r-- 1 edward edward 32 Feb 28 18:03 .part-00001.crc
>>>> >> >> edward@udanax:~/workspace/hama-trunk$
>>>> >> >>
>>>> >> >>
>>>> >> >> On Thu, Feb 28, 2013 at 5:27 PM, Edward wrote:
>>>> >> >> > yes i'll check again
>>>> >> >> >
>>>> >> >> > Sent from my iPhone
>>>> >> >> >
>>>> >> >> > On Feb 28, 2013, at 5:18 PM, Thomas Jungblut wrote:
>>>> >> >> >
>>>> >> >> >> Can you verify an observation for me please?
>>>> >> >> >>
>>>> >> >> >> 2 files are created from fastgen, part-00000 and part-00001, both ~2.2kb
>>>> >> >> >> sized.
>>>> >> >> >> In the below partition directory, there is only a single 5.56kb file.
>>>> >> >> >>
>>>> >> >> >> Is it intended for the partitioner to write a single file if you configured
>>>> >> >> >> two?
>>>> >> >> >> It even reads it as two files, strange huh?
>>>> >> >> >>
>>>> >> >> >> 2013/2/28 Thomas Jungblut
>>>> >> >> >>
>>>> >> >> >>> Will have a look into it.
>>>> >> >> >>>
>>>> >> >> >>> gen fastgen 100 10 /tmp/randomgraph 1
>>>> >> >> >>> pagerank /tmp/randomgraph /tmp/pageout
>>>> >> >> >>>
>>>> >> >> >>> did work for me the last time I profiled, maybe the partitioning doesn't
>>>> >> >> >>> partition correctly with the input or something else.
>>>> >> >> >>>
>>>> >> >> >>>
>>>> >> >> >>> 2013/2/28 Edward J. Yoon
>>>> >> >> >>>
>>>> >> >> >>>> Fastgen input does not seem to work for the graph examples.
>>>> >> >> >>>>
>>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar gen fastgen 100 10 /tmp/randomgraph 2
>>>> >> >> >>>> 13/02/28 10:32:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Current supersteps number: 0
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: The total number of supersteps: 0
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Counters: 3
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: org.apache.hama.bsp.JobInProgress$JobCounter
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: SUPERSTEPS=0
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: LAUNCHED_TASKS=2
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: TASK_OUTPUT_RECORDS=100
>>>> >> >> >>>> Job Finished in 3.212 seconds
>>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT
>>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT-javadoc.jar
>>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT.jar
>>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar pagerank /tmp/randomgraph /tmp/pageour
>>>> >> >> >>>> 13/02/28 10:32:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
>>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
>>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Current supersteps number: 1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: The total number of supersteps: 1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Counters: 6
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: org.apache.hama.bsp.JobInProgress$JobCounter
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: SUPERSTEPS=1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: LAUNCHED_TASKS=2
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: org.apache.hama.bsp.BSPPeerImpl$PeerCounter
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: SUPERSTEP_SUM=4
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: IO_BYTES_READ=4332
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: TIME_IN_SYNC_MS=14
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: TASK_INPUT_RECORDS=100
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.FileInputFormat: Total input paths to process : 2
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
>>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
>>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:1
>>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:0
>>>> >> >> >>>> 13/02/28 10:32:33 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
>>>> >> >> >>>> java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 1 vs. 50
>>>> >> >> >>>> at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:279)
>>>> >> >> >>>> at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:225)
>>>> >> >> >>>> at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:129)
>>>> >> >> >>>> at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:256)
>>>> >> >> >>>> at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
>>>> >> >> >>>> at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
>>>> >> >> >>>> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>> >> >> >>>> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>> >> >> >>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>> >> >> >>>> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>> >> >> >>>> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>> >> >> >>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>> >> >> >>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>> >> >> >>>> at java.lang.Thread.run(Thread.java:722)
>>>> >> >> >>>>
>>>> >> >> >>>> --
>>>> >> >> >>>> Best Regards, Edward J. Yoon
>>>> >> >> >>>> @eddieyoon

--
Best Regards, Edward J. Yoon
@eddieyoon
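
P.S. For reference, a minimal sketch of the point Thomas makes in the quoted thread: appending n sorted files back to back does not give sorted output, while a k-way merge does. This is plain Python, not Hama code; the function names and the sample IDs are made up for illustration, and the IDs are compared as text because the quoted listing (29 before 3) looks like text ordering.

```python
import heapq


def append_runs(runs):
    """Concatenate the runs one after another (plain appending)."""
    out = []
    for run in runs:
        out.extend(run)
    return out


def merge_runs(runs):
    """K-way merge: always emit the smallest head element among all runs."""
    return list(heapq.merge(*runs))


def is_sorted(ids):
    """True if every element is <= its successor."""
    return all(a <= b for a, b in zip(ids, ids[1:]))


# Hypothetical vertex IDs; each run is individually sorted as text.
run_a = ["50", "52", "54", "98"]
run_b = ["1", "10", "12", "3"]

appended = append_runs([run_a, run_b])
merged = merge_runs([run_a, run_b])

print(appended, is_sorted(appended))
print(merged, is_sorted(merged))
```

Whether the partitioner should merge like this or hand the records over via messaging is a separate question; the sketch only shows why appending alone breaks the ordering that the "Messages must never be behind the vertex in ID" check relies on.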
