Re: stand alone implementation

2011-12-28 Thread Avery Ching

Gavan,

My comments are inlined.

Avery

On 12/28/11 11:48 PM, Gavan Hood wrote:


Thanks Avery,

I am asking questions up front ahead of jumping into the code.

I am looking at scalability from embedded up to cloud.

The map slot approach hints that performance would be good on
multi-core machines compared to alternative graph approaches. Is that a
reasonable assumption?


This type of approach will utilize multiple cores, but there is
probably some overhead from the TaskTracker and JobTracker that could
be avoided with some optimizations.


Do you have any idea of the performance trade-off on a single-core
machine / laptop?


A single machine avoids the network I/O.  This is a good thing.  But
it's limited to the speed/memory of the single machine rather than
utilizing lots of machines.


Is the single machine support just for debugging, or could you build an
application upon it?


You could do this, but remember that we have not optimized for this 
case.  That being said, there is no reason we can't tweak a couple of 
things to improve this.


Could you consider the above question for embedded systems (Android
devices, iPhone, etc.)?


Is it PC-and-up technology, or can it be configured for
reasonable support on these devices?


I realise this applies to Hadoop as much as Giraph.


Yes, a lot of what I said would apply to Hadoop as well.


Perhaps the answer is in your response about not requiring Hadoop to run.
Does this mean there is an alternative or generic persistence model?


Giraph is a graph processing framework, not a persistent storage
system.  You can store your data any way you like (e.g., hard drive or
flash drive).


If the embedded implementation is a problem, what is required to
generate a back end for this size of device? Has there been any
thought on this side?


I haven't thought much about using Giraph on embedded devices.  I 
certainly wouldn't want to run graph processing applications on my 
phone.  Think about what that would do to my battery life =).



Regards

Gavan

*From:*Avery Ching [mailto:ach...@apache.org]
*Sent:* Thursday, 29 December 2011 1:07 AM
*To:* giraph-user@incubator.apache.org
*Subject:* Re: stand alone implementation

Hi Gavan,

Giraph can run on a single machine as well as multiple machines, just
like Hadoop.  For example, our test suite can be run with or without a
running Hadoop instance.


If you want to take advantage of multiple cores though, you might want 
to try running Hadoop with multiple map slots on the single node and 
then using the appropriate number of workers.


Hope that helps,

Avery

On 12/28/11 2:41 PM, Gavan Hood wrote:

Hi all,

I know the focus of Giraph is multiple machines, etc.

What if I want to scale down to a single PC with multiple CPUs, or even
down to embedded systems?


Are this project and Hadoop able to scale down as well as up?

Regards

Gavan





RE: stand alone implementation

2011-12-28 Thread Gavan Hood
Thanks Avery,

 

I am asking questions up front ahead of jumping into the code.

I am looking at scalability from embedded up to cloud.

 

The map slot approach hints that performance would be good on multi-core
machines compared to alternative graph approaches. Is that a reasonable
assumption?

 

Do you have any idea of the performance trade-off on a single-core machine /
laptop?

 

Is the single machine support just for debugging, or could you build an
application upon it?

 

Could you consider the above question for embedded systems (Android devices,
iPhone, etc.)?

Is it PC-and-up technology, or can it be configured for reasonable
support on these devices?

 

I realise this applies to Hadoop as much as Giraph.

Perhaps the answer is in your response about not requiring Hadoop to run.
Does this mean there is an alternative or generic persistence model?

If the embedded implementation is a problem, what is required to generate a
back end for this size of device? Has there been any thought on this side?

 

Regards

Gavan 

 

 

From: Avery Ching [mailto:ach...@apache.org] 
Sent: Thursday, 29 December 2011 1:07 AM
To: giraph-user@incubator.apache.org
Subject: Re: stand alone implementation

 

Hi Gavan,

Giraph can run on a single machine as well as multiple machines, just like
Hadoop.  For example, our test suite can be run with or without a running
Hadoop instance.

If you want to take advantage of multiple cores though, you might want to
try running Hadoop with multiple map slots on the single node and then using
the appropriate number of workers.

Hope that helps,

Avery

On 12/28/11 2:41 PM, Gavan Hood wrote: 

Hi all,

I know the focus of Giraph is multiple machines, etc.

What if I want to scale down to a single PC with multiple CPUs, or even down
to embedded systems?

Are this project and Hadoop able to scale down as well as up?

 

Regards

Gavan

 



Re: stand alone implementation

2011-12-28 Thread Avery Ching

Hi Gavan,

Giraph can run on a single machine as well as multiple machines, just
like Hadoop.  For example, our test suite can be run with or without a
running Hadoop instance.


If you want to take advantage of multiple cores though, you might want 
to try running Hadoop with multiple map slots on the single node and 
then using the appropriate number of workers.
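
As a rough illustration of what "the appropriate number of workers" might
mean on one node, here is a minimal Java sketch of the sizing arithmetic.
It is not Giraph or Hadoop code. It assumes one map slot per core, and it
assumes the Giraph master occupies a map slot of its own so that the workers
get the remaining slots. The class and method names (SingleNodeSizing,
mapSlots, workers) are hypothetical.

```java
// Hypothetical helper: sizing arithmetic only, no Giraph or Hadoop calls.
final class SingleNodeSizing {

    // Assumption: configure one map slot per core, but never fewer than
    // two, so that a master task and one worker task can run together.
    static int mapSlots(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        return Math.max(2, cores);
    }

    // Assumption: the master takes one map slot for itself, leaving the
    // remaining slots for workers.
    static int workers(int mapSlots) {
        if (mapSlots < 2) {
            throw new IllegalArgumentException("need at least 2 map slots");
        }
        return mapSlots - 1;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int slots = mapSlots(cores);
        System.out.printf("cores=%d mapSlots=%d workers=%d%n",
                cores, slots, workers(slots));
    }
}
```

Under these assumptions a four-core machine would be set up with four map
slots and three workers. If the master and a worker can share a task in your
setup, the worker count could equal the slot count instead.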


Hope that helps,

Avery

On 12/28/11 2:41 PM, Gavan Hood wrote:


Hi all,

I know the focus of Giraph is multiple machines, etc.

What if I want to scale down to a single PC with multiple CPUs, or even
down to embedded systems?


Are this project and Hadoop able to scale down as well as up?

Regards

Gavan





stand alone implementation

2011-12-28 Thread Gavan Hood
Hi all,

I know the focus of Giraph is multiple machines, etc.

What if I want to scale down to a single PC with multiple CPUs, or even down
to embedded systems?

Are this project and Hadoop able to scale down as well as up?

 

Regards

Gavan