Huh, it might be a bug in the code. Could it be that Pattern.compile has to take "[\\t ]" (note the double backslash) to properly match tabs? If so, that bug is in all the input formats...
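A minimal sketch for checking that guess, assuming the separator in the input formats is compiled as `Pattern.compile("[\t ]")` (not confirmed against the source here). The class name `SeparatorCheck` and the sample lines are made up for illustration. As far as `java.util.regex` goes, the single-backslash and double-backslash forms should both match a tab, so the escape may not be the culprit. A line with a stray space next to a tab, on the other hand, would produce an empty token.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SeparatorCheck {
    // Single backslash: the Java compiler turns \t into a literal tab character
    // before the regex engine ever sees the string.
    public static final Pattern LITERAL_TAB = Pattern.compile("[\t ]");

    // Double backslash: the regex engine receives the two characters '\' and 't'
    // and interprets them itself as the tab escape.
    public static final Pattern ESCAPED_TAB = Pattern.compile("[\\t ]");

    // Splits one input line on the given separator pattern.
    public static String[] split(Pattern separator, String line) {
        return separator.split(line);
    }

    public static void main(String[] args) {
        String tabsOnly = "2\t1\t3\t4";
        // Hypothetical line with a stray space right after the tab.
        String tabThenSpace = "1\t 2";

        System.out.println(Arrays.toString(split(LITERAL_TAB, tabsOnly)));
        System.out.println(Arrays.toString(split(ESCAPED_TAB, tabsOnly)));
        System.out.println(Arrays.toString(split(LITERAL_TAB, tabThenSpace)));
    }
}
```

The first two calls should give the same four tokens. The third should contain an empty string between `1` and `2`, because the tab and the space each count as a separate delimiter. If the reader then parses every token as a number, that empty token would be a plausible reason why removing the spaces made the job work.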
Happy to help :)

Young


On Mon, Mar 31, 2014 at 4:07 PM, ghufran malik <[email protected]> wrote:

> Hi,
>
> I removed the spaces and it worked! I don't understand though. I'm sure
> the separator pattern means that it splits it by tab spaces?.
>
> Thanks for all your help though some what relieved now!
>
> Kind regards,
>
> Ghufran
>
>
> On Mon, Mar 31, 2014 at 8:15 PM, Young Han <[email protected]> wrote:
>
>> Hi,
>>
>> That looks like an error with the algorithm... What do the Hadoop
>> userlogs say?
>>
>> And just to rule out weirdness, what happens if you use spaces instead of
>> tabs (for your input graph)?
>>
>> Young
>>
>>
>> On Mon, Mar 31, 2014 at 2:04 PM, ghufran malik
>> <[email protected]> wrote:
>>
>>> Hey,
>>>
>>> No even after I added the .txt it gets to map 100% then drops back down
>>> to 50 and gives me the error:
>>>
>>> 14/03/31 18:22:56 INFO utils.ConfigurationUtils: No edge input format
>>> specified. Ensure your InputFormat does not require one.
>>> 14/03/31 18:22:56 WARN job.GiraphConfigurationValidator: Output format
>>> vertex index type is not known
>>> 14/03/31 18:22:56 WARN job.GiraphConfigurationValidator: Output format
>>> vertex value type is not known
>>> 14/03/31 18:22:56 WARN job.GiraphConfigurationValidator: Output format
>>> edge value type is not known
>>> 14/03/31 18:22:56 INFO job.GiraphJob: run: Since checkpointing is
>>> disabled (default), do not allow any task retries (setting
>>> mapred.map.max.attempts = 0, old value = 4)
>>> 14/03/31 18:22:57 INFO mapred.JobClient: Running job:
>>> job_201403311622_0004
>>> 14/03/31 18:22:58 INFO mapred.JobClient: map 0% reduce 0%
>>> 14/03/31 18:23:16 INFO mapred.JobClient: map 50% reduce 0%
>>> 14/03/31 18:23:19 INFO mapred.JobClient: map 100% reduce 0%
>>> 14/03/31 18:33:25 INFO mapred.JobClient: map 50% reduce 0%
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Job complete:
>>> job_201403311622_0004
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Counters: 6
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Job Counters
>>> 14/03/31 18:33:30 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=1238858
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Total time spent by all
>>> reduces waiting after reserving slots (ms)=0
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Total time spent by all
>>> maps waiting after reserving slots (ms)=0
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Launched map tasks=2
>>> 14/03/31 18:33:30 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
>>> 14/03/31 18:33:30 INFO mapred.JobClient: Failed map tasks=1
>>>
>>>
>>> I did a check to make sure the graph was being stored correctly by
>>> doing:
>>>
>>> ghufran@ghufran:~/Downloads/hadoop-0.20.203.0/bin$ hadoop dfs -cat
>>> input/*
>>> 1 2
>>> 2 1 3 4
>>> 3 2
>>> 4 2
>>>
>>
>>
>
