Yeah... as it stands you are having to modify scripts, which is not the best
solution. Using the distributed cache is much more flexible, since you can put
your jars wherever you want on HDFS, and you don't have to change any
environment settings. Please let me know if you have any other questions.

From: yavuz gokirmak
Date: Mon, 20 Feb 2012 00:40:41 -0600
Subject: Re: how to use SimplePageRankVertex

Thank you,

I will try the distributed cache.

When I use the distributed cache, will the patches I have written be
unnecessary?

On 20 February 2012 03:44, David Garcia wrote:
So, if that's the case, it's possible that the TaskTracker process doesn't have
the job on its classpath. Although you have added the jar to "a" classpath,
I'm not certain that the TaskTracker will have it. There are several ways to
address this.

1.) You could bring Hadoop down, and then adjust [...] to export the
HADOOP_CLASSPATH environment variable so that it includes your jar. This
variable is commented out by default. If you are running in distributed mode,
this means that you will have to copy this jar to every single machine, and
probably change this script on every single machine too. Unless you are using
something like condor (or puppet if you're hard-core serious), this is a
serious pain, and for MR jobs that keep changing it is total overkill.

My personal preference is to use the distributed cache, and copy your jar to a
location in

hope this helps.
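
For the first option above, a minimal sketch of what the export could look
like in the script that sets HADOOP_CLASSPATH. The jar location is
hypothetical, and append_jar is a helper made up for this illustration, not
part of Hadoop. It only shows how to add a jar without clobbering whatever is
already in the variable.

```shell
# Hypothetical helper: append a jar to an existing classpath string.
# When the existing value is empty, no leading ':' is produced, since an
# empty classpath entry is generally treated as the current directory.
append_jar() {
    existing="$1"
    jar="$2"
    printf '%s\n' "${existing:+$existing:}$jar"
}

# Hypothetical jar location; adjust to wherever the jar was copied.
user_jar="/opt/jobs/INTPageRankVertex.jar"

HADOOP_CLASSPATH="$(append_jar "${HADOOP_CLASSPATH:-}" "$user_jar")"
export HADOOP_CLASSPATH
echo "HADOOP_CLASSPATH=$HADOOP_CLASSPATH"
```

As noted above, on a fully distributed cluster this change and the jar itself
would have to be present on every machine, which is what makes the distributed
cache the more convenient route.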
From: yavuz gokirmak
Sent: Sunday, February 19, 2012 2:19 AM
Subject: Re: how to use SimplePageRankVertex

I am using a pseudo-distributed cluster.

On 19 February 2012 02:00, David Garcia wrote:
Are you submitting this job to a pseudo-distributed cluster or a fully
distributed cluster?

Sent from my HTC Inspire™ 4G on AT&T

----- Reply message -----
From: "yavuz gokirmak" 
Subject: how to use SimplePageRankVertex
Date: Sat, Feb 18, 2012 2:04 pm

Thank you for the advice,

I have a few more questions.

I have created a class named INTPageRankVertex which is similar to 
SimplePageRankVertex and generated a jar holding only

Then I try to run it with the giraph command as below, but I get classpath
errors:

giraph INTPageRankVertex.jar org.test.INTPageRankVertex \
-ip /user/hdfs/pagerankinput/graph.input \
-op /user/hdfs/pagerankoutput/ \
-w 1  \
-if org.test.INTPageRankVertex.INTPageRankVertexInputFormat \
-of org.test.INTPageRankVertex.INTPageRankVertexOutputFormat \

First I get,
Exception in thread "main" java.lang.ClassNotFoundException: 

In bin/giraph, the user jar is added to the classpath on line 58, but
CLASSPATH is overwritten on line 87:
87.         CLASSPATH=`mvn dependency:build-classpath | grep -v "[INFO]"`

Changing line 87 as below solves my first problem. Is this patch valid?
87.         CLASSPATH=$CLASSPATH:`mvn dependency:build-classpath | grep -v 
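
A minimal sketch of the difference between overwriting and appending, which is
what the change to line 87 comes down to. It does not call Maven:
build_dependency_classpath and the /deps/*.jar paths are stand-ins invented
for this illustration, so that the snippet runs on its own.

```shell
# Hypothetical stand-in for the `mvn dependency:build-classpath` pipeline,
# so the sketch runs without Maven installed.
build_dependency_classpath() {
    printf '%s\n' "/deps/a.jar:/deps/b.jar"
}

# Earlier in the script, the user jar is put on the classpath.
user_jar="INTPageRankVertex.jar"
CLASSPATH="$user_jar"

# Overwriting: the user jar set earlier is lost.
overwritten="$(build_dependency_classpath)"

# Appending: the user jar stays in front of the dependency jars.
appended="$CLASSPATH:$(build_dependency_classpath)"

echo "overwritten: $overwritten"
echo "appended:    $appended"
```

With plain assignment, only the dependency jars remain on the classpath, which
would be consistent with the ClassNotFoundException reported above.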

After changing line 87 I get a different classpath error:
Exception in thread "main" java.lang.NoClassDefFoundError: 

I solved this problem by adding the line below:

Are these patches necessary, or am I doing something wrong while running my 

best regards..

On 18 February 2012 18:37, Avery Ching wrote:
IntIntNullIntTextInputFormat in the examples package (extending
TextVertexInputFormat as David suggests) is, I think, very similar to what you
need, although the types might be different for your application. You could
perhaps start with that.


On 2/18/12 7:48 AM, David Garcia wrote:
The easiest thing to do is to extend text vertex and/or TextVertexInputFormat
and/or the record reader. The record reader will give you the vertices you
want. Look at the record reader for TextVertexInputFormat. It's an inner class
of this format class.

----- Reply message -----
From: "yavuz gokirmak" 
Subject: how to use SimplePageRankVertex
Date: Sat, Feb 18, 2012 9:08 am


I am planning to use Giraph for network analysis. First I am trying to fully
understand the SimplePageRankVertex implementation and modify it in order to
serve my 

I have a question about the example.
What is the expected input format for SimplePageRankVertex? I couldn't
understand the input format, although the SimplePageRankVertexReader class has
a few 

My input file consists of rows such as:
usera, userb
usera, userc
userc, usera
userb, userc
userc, userb
Each row represents a relation between two users:
"usera, userb" means that usera has clicked userb's profile.
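
The rows above are an edge list, one directed edge per line. A minimal sketch
of one possible preprocessing step, assuming a custom text input format that
reads one vertex per line followed by the vertices it points to. That layout
is an assumption made for illustration, not the format SimplePageRankVertex
expects, and edges_to_adjacency is a made-up helper name.

```shell
# Group "source, target" rows by source vertex, keeping first-seen order.
# Output: one line per source vertex, followed by the vertices it points to.
edges_to_adjacency() {
    awk -F', *' '
        NF == 2 {
            if (!($1 in adj)) order[++n] = $1   # remember first appearance
            adj[$1] = adj[$1] " " $2            # append the target vertex
        }
        END { for (i = 1; i <= n; i++) print order[i] adj[order[i]] }
    '
}

# Usage with the sample rows from this message.
edges_to_adjacency <<'EOF'
usera, userb
usera, userc
userc, usera
userb, userc
userc, userb
EOF
```

A vertex that only ever appears as a target gets no line of its own in this
sketch, so a real conversion would have to decide how to represent such
vertices.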

Is it possible to do social network analysis over this kind of data using 
I would be glad if you could give some advice.

thanks in advance
best regards
