Yes. Once Suraj has added merging of sorted files, we can add this to the
partitioner pretty easily.
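
For reference, here is a minimal sketch of what merging sorted files could
look like. It is plain Java, not Hama's actual partitioner code. It assumes
each part file is already sorted and that vertex IDs compare as text, which
the order 1, 10, 12, ..., 29, 3 in the listing quoted below suggests. The
class and method names are made up for illustration, and in-memory lists
stand in for the part files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: k-way merge of inputs that are each already sorted.
public class SortedMergeSketch {

  /** One cursor per input: its current head plus the remainder of that input. */
  private static final class Cursor implements Comparable<Cursor> {
    final String head;
    final Iterator<String> rest;

    Cursor(String head, Iterator<String> rest) {
      this.head = head;
      this.rest = rest;
    }

    @Override
    public int compareTo(Cursor other) {
      // Assumption: IDs are ordered as text, not as numbers.
      return head.compareTo(other.head);
    }
  }

  /** Merges the sorted inputs into a single sorted list. */
  public static List<String> merge(List<List<String>> sortedInputs) {
    PriorityQueue<Cursor> queue = new PriorityQueue<>();
    for (List<String> input : sortedInputs) {
      Iterator<String> it = input.iterator();
      if (it.hasNext()) {
        queue.add(new Cursor(it.next(), it));
      }
    }
    List<String> merged = new ArrayList<>();
    while (!queue.isEmpty()) {
      // Always emit the smallest head among all inputs, then advance that input.
      Cursor smallest = queue.poll();
      merged.add(smallest.head);
      if (smallest.rest.hasNext()) {
        queue.add(new Cursor(smallest.rest.next(), smallest.rest));
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    List<String> first = Arrays.asList("50", "52", "54");
    List<String> second = Arrays.asList("1", "10", "12", "3");
    // Appending first + second would give 50, 52, 54, 1, 10, 12, 3 (unsorted).
    System.out.println(merge(Arrays.asList(first, second)));
    // prints [1, 10, 12, 3, 50, 52, 54]
  }
}
```

The priority queue holds at most one entry per input, so nothing has to be
re-sorted: each step only compares the current heads of the inputs.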

2013/2/28 Edward J. Yoon <[email protected]>

> Eh... btw, does the re-partitioned data really need to be sorted?
>
> On Thu, Feb 28, 2013 at 7:48 PM, Thomas Jungblut
> <[email protected]> wrote:
> > Now I get how the partitioning works, obviously if you merge n sorted files
> > by just appending to each other, this will result in totally unsorted data
> > ;-)
> > Why didn't you solve this via messaging?
> >
> > 2013/2/28 Thomas Jungblut <[email protected]>
> >
> >> Seems that they are not correctly sorted:
> >>
> >> vertexID: 50
> >> vertexID: 52
> >> vertexID: 54
> >> vertexID: 56
> >> vertexID: 58
> >> vertexID: 61
> >> ...
> >> vertexID: 78
> >> vertexID: 81
> >> vertexID: 83
> >> vertexID: 85
> >> ...
> >> vertexID: 94
> >> vertexID: 96
> >> vertexID: 98
> >> vertexID: 1
> >> vertexID: 10
> >> vertexID: 12
> >> vertexID: 14
> >> vertexID: 16
> >> vertexID: 18
> >> vertexID: 21
> >> vertexID: 23
> >> vertexID: 25
> >> vertexID: 27
> >> vertexID: 29
> >> vertexID: 3
> >>
> >> So this won't work correctly then...
> >>
> >>
> >> 2013/2/28 Thomas Jungblut <[email protected]>
> >>
> >>> sure, have fun on your holidays.
> >>>
> >>>
> >>> 2013/2/28 Edward J. Yoon <[email protected]>
> >>>
> >>>> Sure, but if you can fix quickly, please do. March 1 is holiday[1] so
> >>>> I'll appear next week.
> >>>>
> >>>> 1. http://en.wikipedia.org/wiki/Public_holidays_in_South_Korea
> >>>>
> >>>> On Thu, Feb 28, 2013 at 6:36 PM, Thomas Jungblut
> >>>> <[email protected]> wrote:
> >>>> > Maybe 50 is missing from the file, didn't observe if all items were added.
> >>>> > As far as I remember, I copy/pasted the logic of the ID into the fastgen,
> >>>> > want to have a look into it?
> >>>> >
> >>>> > 2013/2/28 Edward J. Yoon <[email protected]>
> >>>> >
> >>>> >> I guess it's a bug in fastgen when it generates the adjacency matrix
> >>>> >> into multiple files.
> >>>> >>
> >>>> >> On Thu, Feb 28, 2013 at 6:29 PM, Thomas Jungblut
> >>>> >> <[email protected]> wrote:
> >>>> >> > You have two files, are they partitioned correctly?
> >>>> >> >
> >>>> >> > 2013/2/28 Edward J. Yoon <[email protected]>
> >>>> >> >
> >>>> >> >> It looks like a bug.
> >>>> >> >>
> >>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/
> >>>> >> >> total 44
> >>>> >> >> drwxrwxr-x  3 edward edward  4096 Feb 28 18:03 .
> >>>> >> >> drwxrwxrwt 19 root   root   20480 Feb 28 18:04 ..
> >>>> >> >> -rwxrwxrwx  1 edward edward  2243 Feb 28 18:01 part-00000
> >>>> >> >> -rw-rw-r--  1 edward edward    28 Feb 28 18:01 .part-00000.crc
> >>>> >> >> -rwxrwxrwx  1 edward edward  2251 Feb 28 18:01 part-00001
> >>>> >> >> -rw-rw-r--  1 edward edward    28 Feb 28 18:01 .part-00001.crc
> >>>> >> >> drwxrwxr-x  2 edward edward  4096 Feb 28 18:03 partitions
> >>>> >> >> edward@udanax:~/workspace/hama-trunk$ ls -al /tmp/randomgraph/partitions/
> >>>> >> >> total 24
> >>>> >> >> drwxrwxr-x 2 edward edward 4096 Feb 28 18:03 .
> >>>> >> >> drwxrwxr-x 3 edward edward 4096 Feb 28 18:03 ..
> >>>> >> >> -rwxrwxrwx 1 edward edward 2932 Feb 28 18:03 part-00000
> >>>> >> >> -rw-rw-r-- 1 edward edward   32 Feb 28 18:03 .part-00000.crc
> >>>> >> >> -rwxrwxrwx 1 edward edward 2955 Feb 28 18:03 part-00001
> >>>> >> >> -rw-rw-r-- 1 edward edward   32 Feb 28 18:03 .part-00001.crc
> >>>> >> >> edward@udanax:~/workspace/hama-trunk$
> >>>> >> >>
> >>>> >> >>
> >>>> >> >> On Thu, Feb 28, 2013 at 5:27 PM, Edward <[email protected]> wrote:
> >>>> >> >> > yes i'll check again
> >>>> >> >> >
> >>>> >> >> > Sent from my iPhone
> >>>> >> >> >
> >>>> >> >> > On Feb 28, 2013, at 5:18 PM, Thomas Jungblut <[email protected]> wrote:
> >>>> >> >> >
> >>>> >> >> >> Can you verify an observation for me please?
> >>>> >> >> >>
> >>>> >> >> >> 2 files are created from fastgen, part-00000 and part-00001, both
> >>>> >> >> >> ~2.2kb sized.
> >>>> >> >> >> In the below partition directory, there is only a single 5.56kb file.
> >>>> >> >> >>
> >>>> >> >> >> Is it intended for the partitioner to write a single file if you
> >>>> >> >> >> configured two?
> >>>> >> >> >> It even reads it as two files, strange huh?
> >>>> >> >> >>
> >>>> >> >> >> 2013/2/28 Thomas Jungblut <[email protected]>
> >>>> >> >> >>
> >>>> >> >> >>> Will have a look into it.
> >>>> >> >> >>>
> >>>> >> >> >>> gen fastgen 100 10 /tmp/randomgraph 1
> >>>> >> >> >>> pagerank /tmp/randomgraph /tmp/pageout
> >>>> >> >> >>>
> >>>> >> >> >>> did work for me the last time I profiled, maybe the partitioning
> >>>> >> >> >>> doesn't partition correctly with the input or something else.
> >>>> >> >> >>>
> >>>> >> >> >>>
> >>>> >> >> >>> 2013/2/28 Edward J. Yoon <[email protected]>
> >>>> >> >> >>>
> >>>> >> >> >>>> Fastgen input doesn't seem to work for graph examples.
> >>>> >> >> >>>>
> >>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar gen fastgen 100 10 /tmp/randomgraph 2
> >>>> >> >> >>>> 13/02/28 10:32:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
> >>>> >> >> >>>> 13/02/28 10:32:03 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Current supersteps number: 0
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: The total number of supersteps: 0
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient: Counters: 3
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     SUPERSTEPS=0
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=2
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
> >>>> >> >> >>>> 13/02/28 10:32:06 INFO bsp.BSPJobClient:     TASK_OUTPUT_RECORDS=100
> >>>> >> >> >>>> Job Finished in 3.212 seconds
> >>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT
> >>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT-javadoc.jar
> >>>> >> >> >>>> hama-examples-0.7.0-SNAPSHOT.jar
> >>>> >> >> >>>> edward@edward-virtualBox:~/workspace/hama-trunk$ bin/hama jar examples/target/hama-examples-0.7.0-SNAPSHOT.jar pagerank /tmp/randomgraph /tmp/pageour
> >>>> >> >> >>>> 13/02/28 10:32:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
> >>>> >> >> >>>> 13/02/28 10:32:29 INFO bsp.FileInputFormat: Total input paths to process : 2
> >>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
> >>>> >> >> >>>> 13/02/28 10:32:30 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Current supersteps number: 1
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: The total number of supersteps: 1
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Counters: 6
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     SUPERSTEPS=1
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=2
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=4
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     IO_BYTES_READ=4332
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=14
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient:     TASK_INPUT_RECORDS=100
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.FileInputFormat: Total input paths to process : 2
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.BSPJobClient: Running job: job_localrunner_0001
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO bsp.LocalBSPRunner: Setting up a new barrier for 2 tasks!
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:1
> >>>> >> >> >>>> 13/02/28 10:32:33 INFO graph.GraphJobRunner: 50 vertices are loaded into local:0
> >>>> >> >> >>>> 13/02/28 10:32:33 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
> >>>> >> >> >>>> java.lang.IllegalArgumentException: Messages must never be behind the vertex in ID! Current Message ID: 1 vs. 50
> >>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.iterate(GraphJobRunner.java:279)
> >>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.doSuperstep(GraphJobRunner.java:225)
> >>>> >> >> >>>>        at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:129)
> >>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:256)
> >>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
> >>>> >> >> >>>>        at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:211)
> >>>> >> >> >>>>        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> >>>> >> >> >>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> >>>> >> >> >>>>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>> >> >> >>>>        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> >>>> >> >> >>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> >>>> >> >> >>>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> >>>> >> >> >>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> >>>> >> >> >>>>        at java.lang.Thread.run(Thread.java:722)
> >>>> >> >> >>>>
> >>>> >> >> >>>>
> >>>> >> >> >>>> --
> >>>> >> >> >>>> Best Regards, Edward J. Yoon
> >>>> >> >> >>>> @eddieyoon
