Re: Pregel

2009-06-26 Thread Owen O'Malley


On Jun 25, 2009, at 9:42 PM, Mark Kerzner wrote:

my guess, as good as anybody's, is that Pregel is to large graphs what Hadoop is to large datasets.


I think it is much more likely a language that allows you to easily define fixed point algorithms. I would imagine a distributed version of something similar to Michal Young's GenSet: http://portal.acm.org/citation.cfm?doid=586094.586108


I've been trying to figure out how to justify working on a project like that for a couple of years, but haven't yet. (I have a background in program static analysis, so I've implemented similar stuff.)
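For readers unfamiliar with the term, a minimal single-machine sketch (in Python, not from the thread) of what a fixed-point algorithm looks like: a monotone step function is applied until the result stops changing. Graph reachability is used here as the example, since it is the kind of iteration common in static analysis; all names are illustrative.

```python
# Sketch of a fixed-point computation: apply a monotone step function
# until the value no longer changes, then return it.

def fixed_point(step, initial):
    """Iterate `step` starting from `initial` until a fixed point is reached."""
    current = initial
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt

def reachability_step(edges):
    """Build a step function that extends every known path by one edge."""
    def step(pairs):
        new = set(pairs)
        for (a, b) in pairs:
            for (c, d) in edges:
                if b == c:
                    new.add((a, d))
        return frozenset(new)
    return step

edges = frozenset({(1, 2), (2, 3), (3, 4)})
closure = fixed_point(reachability_step(edges), edges)
# closure is the transitive closure: it also contains (1, 3), (2, 4), (1, 4)
```

A distributed version would partition the pairs across machines and iterate until no partition produces new facts, which is where a Pregel-like system would come in.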



In other words, Pregel is the next natural step
for massively scalable computations after Hadoop.


I wonder if it uses map/reduce as a base or not. It would be easier to use map/reduce, but a direct implementation would be more performant. In either case, it is a new hammer. From what I see, it likely won't replace map/reduce, pig, or hive; but rather support a different class of applications much more directly than you can under map/reduce.
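To make the cost of the map/reduce route concrete, here is a hypothetical in-memory sketch (not from the thread) of one BFS level expressed as a map/reduce pass: every round touches every record, and a graph of depth d needs d chained jobs plus a driver checking for convergence. All names are illustrative.

```python
# Sketch of why iterative graph algorithms are awkward on map/reduce:
# one BFS level costs one full pass over every vertex record.
from collections import defaultdict

INF = float("inf")

def bfs_one_round(records):
    """One map/reduce pass. records: {vertex: (distance, neighbors)},
    with distance == INF for vertices not yet reached."""
    # Map phase: each vertex re-emits itself, and reached vertices emit
    # tentative distances to their neighbors.
    emitted = defaultdict(list)
    for v, (d, nbrs) in records.items():
        emitted[v].append((d, nbrs))
        if d < INF:
            for w in nbrs:
                emitted[w].append((d + 1, None))
    # Reduce phase: per vertex, keep the minimum distance and the
    # adjacency list.
    out = {}
    for v, vals in emitted.items():
        dist = min(d for d, _ in vals)
        nbrs = next(n for _, n in vals if n is not None)
        out[v] = (dist, nbrs)
    return out

records = {1: (0, [2, 3]), 2: (INF, [4]), 3: (INF, [4]), 4: (INF, [])}
records = bfs_one_round(records)   # job 1: level 1 reached
records = bfs_one_round(records)   # job 2: level 2 reached
```

A direct (non-MR) implementation could keep the graph resident in memory across rounds instead of rereading and rewriting it per job, which is the performance point above.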


-- Owen



Re: Pregel

2009-06-26 Thread Edward J. Yoon
In my understanding, Pregel is in the same layer as MR; it is not an MR-based language processor.

I think the 'collective communication' of BSP is the core of the problem. For example, this BFS problem (http://blog.udanax.org/2009/02/breadth-first-search-mapreduce.html) can be solved in a single job, without MR iterations.
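As an illustration of that point, a hypothetical in-memory sketch (not from the thread, and not the actual Pregel API) of BFS in a BSP style: vertices run a compute step each superstep and exchange messages across a barrier, and the whole traversal is one job that ends when no messages remain in flight.

```python
# Sketch of BFS in a BSP/Pregel-like style: per-superstep vertex compute
# plus message passing, instead of one map/reduce job per frontier level.

INF = float("inf")

def bsp_bfs(adjacency, source):
    dist = {v: INF for v in adjacency}
    inbox = {source: [0]}          # superstep 0: the source receives distance 0
    while inbox:                   # run supersteps until no messages remain
        outbox = {}
        for v, msgs in inbox.items():
            best = min(msgs)
            if best < dist[v]:     # the vertex improves its distance...
                dist[v] = best
                for w in adjacency[v]:
                    # ...and notifies its neighbors for the next superstep
                    outbox.setdefault(w, []).append(best + 1)
        inbox = outbox             # barrier: messages are delivered next round
    return dist

graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
distances = bsp_bfs(graph, 1)
```

The barrier between supersteps plays the role of BSP's collective communication step; nothing is written back to a distributed filesystem between levels.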

On Fri, Jun 26, 2009 at 3:17 PM, Owen O'Malley <omal...@apache.org> wrote:






-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardy...@apache.org
http://blog.udanax.org


Re: Pregel

2009-06-26 Thread Saptarshi Guha
Hello,
I don't have a background in CS, but does MS's Dryad (http://research.microsoft.com/en-us/projects/Dryad/) fit in anywhere here?
Regards
Saptarshi


On Fri, Jun 26, 2009 at 5:19 AM, Edward J. Yoon <edwardy...@apache.org> wrote: