Re: running job with giraph dependency anomaly

2012-02-07 Thread David Garcia
Yeah.  I haven't changed anything in the standard Giraph code.  I just
made my own vertex and VertexInputFormat.  We are in a 64-bit
environment... is it possible that building a jar with 32-bit tools would
be a problem?  I wouldn't think so, since addressing native-dependency
issues was sort of the *point* of Java... but this seems really odd to
me.  Are there some dependency restrictions that I should know about?  We
have to use Jackson 1.6 (because we use the Cloudera distribution of
Hadoop), and there are other libraries we use as well.  Thanks again for
the feedback.
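
For reference, the wiring is basically the pattern from the Giraph
example jobs, roughly like this (MyVertex and MyVertexInputFormat stand
in for my own classes, the worker counts are made up, and the GiraphJob
method names are taken from the examples, so treat this as a sketch):

// Sketch of my job setup, modeled on the Giraph example jobs.
// conf is my Hadoop Configuration; MyVertex / MyVertexInputFormat are
// placeholders for my own classes.
GiraphJob job = new GiraphJob(conf, "my-giraph-job");
job.setVertexClass(MyVertex.class);
job.setVertexInputFormatClass(MyVertexInputFormat.class);
// min workers, max workers, percent of workers needed to start
job.setWorkerConfiguration(4, 4, 100.0f);
// GiraphJob should set mapreduce.map.class to GraphMapper at this point;
// in my failing runs that property never shows up in the job config.
boolean ok = job.run(true);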

-David

On 2/7/12 8:08 PM, "Avery Ching"  wrote:

>If you're using GiraphJob, the mapper class should be set for you.
>That's weird.
>
>Avery
>
>On 2/7/12 5:58 PM, David Garcia wrote:
>> That's interesting.  Yes, I don't need native libraries.  The problem
>> I'm having is that after I run job.waitForCompletion(..), the job runs
>> a mapper that is something other than GraphMapper.  It doesn't complain
>> that a Mapper isn't defined or anything; it runs something else.  As I
>> mentioned below, the map class doesn't appear to be defined.
>>
>>
>> On 2/7/12 7:50 PM, "Jakob Homan"  wrote:
>>
>>> That's not necessarily a bad thing.  Hadoop (not Giraph) has a native
>>> code library it can use for improved performance.  You'll see this
>>> message when running on a cluster that's not been deployed to use the
>>> native libraries.  If I follow what you wrote, most likely your work
>>> project cluster is so configured.  Unless you actively expect to have
>>> the native libraries loaded, I wouldn't be concerned.
>>>
>>>
>>> On Tue, Feb 7, 2012 at 5:46 PM, David Garcia
>>> wrote:
 I am running into a weird error that I haven't seen yet (I suppose I've
 been lucky).  I see the following in the logging:

 org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
 library for your platform... using builtin-java classes where applicable


 In the job definition, the property "mapreduce.map.class" is not even
 defined.  For Giraph, this is usually set to
 "mapreduce.map.class=org.apache.giraph.graph.GraphMapper"

 I'm building my project with hadoop 0.20.204.

 When I build the Giraph project myself (and run my own tests with the
 project's dependencies), I have no problems.  The main difference is
 that I'm using a Giraph dependency in my work project.  All input is
 welcome.  Thx!!

 -David

>



Re: running job with giraph dependency anomaly

2012-02-07 Thread Avery Ching
If you're using GiraphJob, the mapper class should be set for you.  
That's weird.


Avery

On 2/7/12 5:58 PM, David Garcia wrote:

That's interesting.  Yes, I don't need native libraries.  The problem I'm
having is that after I run job.waitForCompletion(..), the job runs a
mapper that is something other than GraphMapper.  It doesn't complain
that a Mapper isn't defined or anything; it runs something else.  As I
mentioned below, the map class doesn't appear to be defined.


On 2/7/12 7:50 PM, "Jakob Homan"  wrote:


That's not necessarily a bad thing.  Hadoop (not Giraph) has a native
code library it can use for improved performance.  You'll see this
message when running on a cluster that's not been deployed to use the
native libraries.  If I follow what you wrote, most likely your work
project cluster is so configured.  Unless you actively expect to have
the native libraries loaded, I wouldn't be concerned.


On Tue, Feb 7, 2012 at 5:46 PM, David Garcia
wrote:

I am running into a weird error that I haven't seen yet (I suppose I've
been lucky).  I see the following in the logging:

org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable


In the job definition, the property "mapreduce.map.class" is not even
defined.  For Giraph, this is usually set to
"mapreduce.map.class=org.apache.giraph.graph.GraphMapper"

I'm building my project with hadoop 0.20.204.

When I build the Giraph project myself (and run my own tests with the
project's dependencies), I have no problems.  The main difference is that
I'm using a Giraph dependency in my work project.  All input is welcome.
Thx!!

-David





Re: running job with giraph dependency anomaly

2012-02-07 Thread David Garcia
That's interesting.  Yes, I don't need native libraries.  The problem I'm
having is that after I run job.waitForCompletion(..), the job runs a
mapper that is something other than GraphMapper.  It doesn't complain
that a Mapper isn't defined or anything; it runs something else.  As I
mentioned below, the map class doesn't appear to be defined.


On 2/7/12 7:50 PM, "Jakob Homan"  wrote:

>That's not necessarily a bad thing.  Hadoop (not Giraph) has a native
>code library it can use for improved performance.  You'll see this
>message when running on a cluster that's not been deployed to use the
>native libraries.  If I follow what you wrote, most likely your work
>project cluster is so configured.  Unless you actively expect to have
>the native libraries loaded, I wouldn't be concerned.
>
>
>On Tue, Feb 7, 2012 at 5:46 PM, David Garcia 
>wrote:
>> I am running into a weird error that I haven't seen yet (I suppose I've
>> been lucky).  I see the following in the logging:
>>
>> org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>>
>>
>> In the job definition, the property "mapreduce.map.class" is not even
>> defined.  For Giraph, this is usually set to
>> "mapreduce.map.class=org.apache.giraph.graph.GraphMapper"
>>
>> I'm building my project with hadoop 0.20.204.
>>
>> When I build the Giraph project myself (and run my own tests with the
>> project's dependencies), I have no problems.  The main difference is that
>> I'm using a Giraph dependency in my work project.  All input is welcome.
>> Thx!!
>>
>> -David
>>



Re: running job with giraph dependency anomaly

2012-02-07 Thread Jakob Homan
That's not necessarily a bad thing.  Hadoop (not Giraph) has a native
code library it can use for improved performance.  You'll see this
message when running on a cluster that's not been deployed to use the
native libraries.  If I follow what you wrote, most likely your work
project cluster is so configured.  Unless you actively expect to have
the native libraries loaded, I wouldn't be concerned.


On Tue, Feb 7, 2012 at 5:46 PM, David Garcia  wrote:
> I am running into a weird error that I haven't seen yet (I suppose I've
> been lucky).  I see the following in the logging:
>
> org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
>
> In the job definition, the property "mapreduce.map.class" is not even
> defined.  For Giraph, this is usually set to
> "mapreduce.map.class=org.apache.giraph.graph.GraphMapper"
>
> I'm building my project with hadoop 0.20.204.
>
> When I build the Giraph project myself (and run my own tests with the
> project's dependencies), I have no problems.  The main difference is that
> I'm using a Giraph dependency in my work project.  All input is welcome.
> Thx!!
>
> -David
>


running job with giraph dependency anomaly

2012-02-07 Thread David Garcia
I am running into a weird error that I haven't seen yet (I suppose I've
been lucky).  I see the following in the logging:

org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable


In the job definition, the property "mapreduce.map.class" is not even
defined.  For Giraph, this is usually set to
"mapreduce.map.class=org.apache.giraph.graph.GraphMapper"

I'm building my project with hadoop 0.20.204.

When I build the Giraph project myself (and run my own tests with the
project's dependencies), I have no problems.  The main difference is that
I'm using a Giraph dependency in my work project.  All input is welcome.
Thx!!

-David



Re: the slides for my talk @ FOSDEM

2012-02-07 Thread Claudio Martella
Hi Sebastian,

thanks for your feedback on the slides.

As a matter of fact, I'm aware of the Pegasus matrix-based optimization
as well as of the schimmy technique by Jimmy Lin. I went with the naive
algorithm because that kind of approach is general enough for all
iterative graph algorithms, not just PageRank, and mostly because it
made the presentation easier to explain. Messaging the adjacent vertices
from the Mapper by iterating over them and emitting (otherVertex,
myPartialPR) maps easily to our messaging paradigm. I'll try to make
this clearer in the next presentation.
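
To make the analogy concrete, the naive map step I had in mind looks
roughly like this (the input format and all names are made up for the
sketch: one text line per vertex carrying its current rank and its
adjacency list):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Naive PageRank map step: emit a partial rank to every neighbour and
// re-emit the adjacency list so the reducer can rebuild the graph.
// Input value (made-up format): "vertexId<TAB>rank<TAB>n1,n2,n3"
public class NaivePageRankMapper extends Mapper<LongWritable, Text, Text, Text> {
  @Override
  protected void map(LongWritable key, Text value, Context ctx)
      throws IOException, InterruptedException {
    String[] parts = value.toString().split("\t");
    String vertexId = parts[0];
    double rank = Double.parseDouble(parts[1]);
    String[] neighbours = parts.length > 2 ? parts[2].split(",") : new String[0];
    // "message" each adjacent vertex with my partial PR contribution
    for (String otherVertex : neighbours) {
      ctx.write(new Text(otherVertex),
                new Text(Double.toString(rank / neighbours.length)));
    }
    // pass the graph structure along so it survives the iteration
    if (parts.length > 2) {
      ctx.write(new Text(vertexId), new Text("#" + parts[2]));
    }
  }
}

In Giraph the same thing becomes a message sent along each out-edge
inside compute(), with no need to ship the adjacency lists around.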

Thanks!


On Mon, Feb 6, 2012 at 2:54 PM, Sebastian Schelter  wrote:
> Hi Claudio,
>
> nice job with the slides! I have only one small point to criticize:
>
> When PageRank is implemented with MapReduce, it's not necessary to pass
> the graph through the network in each iteration. Mahout, for example,
> uses power iterations where the adjacency matrix is multiplied by the
> PageRank vector, and only that vector has to be sent over the network.
> Pegasus uses a similar approach.
>
> /s
>
>
>
> On 06.02.2012 12:24, Claudio Martella wrote:
>> Hello guys,
>>
>> for those interested, here are the "slides" for my talk at FOSDEM.
>>
>> http://prezi.com/9ake_klzwrga/apache-giraph-distributed-graph-processing-in-the-cloud/
>>
>> The event was very nice: a tight community and great interest in
>> Giraph. Isabel Drost, one of the organizers of Berlin Buzzwords,
>> invited me to give the talk there. Jakob, are you still planning to
>> talk there? Maybe we can split the Kafka and Giraph talks?
>>
>> Best,
>> Claudio
>>
>
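
For completeness, the matrix-based variant Sebastian describes boils
down to repeating a matrix-vector product in which only the rank vector
changes between iterations; a toy, dense, in-memory sketch (the damping
factor and the handling of dangling nodes are my own choices here):

// Toy power iteration: r_{t+1} = (1-d)/n + d * A^T r_t, dense and in-memory.
// adj[i][j] == 1 means there is an edge i -> j; d is the damping factor.
static double[] powerIteration(int[][] adj, int iterations, double d) {
  int n = adj.length;
  double[] rank = new double[n];
  java.util.Arrays.fill(rank, 1.0 / n);
  for (int it = 0; it < iterations; it++) {
    double[] next = new double[n];
    java.util.Arrays.fill(next, (1.0 - d) / n);
    for (int i = 0; i < n; i++) {
      int outDegree = 0;
      for (int j = 0; j < n; j++) outDegree += adj[i][j];
      if (outDegree == 0) continue;            // simply skip dangling nodes
      for (int j = 0; j < n; j++) {
        if (adj[i][j] == 1) next[j] += d * rank[i] / outDegree;
      }
    }
    rank = next;                               // only this vector is carried over
  }
  return rank;
}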



-- 
   Claudio Martella
   claudio.marte...@gmail.com