Re: Pregel

2009-06-26 Thread Owen O'Malley


On Jun 25, 2009, at 9:42 PM, Mark Kerzner wrote:

my guess, as good as anybody's, is that Pregel is to large graphs what
Hadoop is to large datasets.


I think it is much more likely a language that lets you easily define
fixed-point algorithms. I would imagine a distributed version of
something similar to Michal Young's GenSet:
http://portal.acm.org/citation.cfm?doid=586094.586108
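
For readers unfamiliar with the term, here is a minimal sketch of what a fixed-point algorithm looks like. It is not GenSet and not Pregel; the names fixed_point and reachable, and the example graph, are hypothetical and chosen only for illustration. A step function is applied repeatedly until its output stops changing, here to compute the set of nodes reachable from a source.

```python
from typing import Callable, Dict, FrozenSet, Set, TypeVar

T = TypeVar("T")


def fixed_point(step: Callable[[T], T], start: T) -> T:
    """Apply `step` repeatedly until the value stops changing."""
    current = start
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt


def reachable(graph: Dict[str, Set[str]], source: str) -> Set[str]:
    """Nodes reachable from `source`, found as the fixed point of a grow step."""

    def grow(seen: FrozenSet[str]) -> FrozenSet[str]:
        # Add every direct successor of the nodes seen so far.
        out = set(seen)
        for node in seen:
            out |= graph.get(node, set())
        return frozenset(out)

    return set(fixed_point(grow, frozenset({source})))


# Hypothetical example graph: a -> b -> c -> a, and d -> a.
graph = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}
print(sorted(reachable(graph, "a")))  # ['a', 'b', 'c']
```

The grow step only ever adds nodes and the graph is finite, so the loop is guaranteed to terminate. Static-analysis dataflow problems have the same shape, with a richer value in place of the node set.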


I've been trying to figure out how to justify working on a project  
like that for a couple of years, but haven't yet. (I have a background  
in program static analysis, so I've implemented similar stuff.)



In other words, Pregel is the next natural step
for massively scalable computations after Hadoop.


I wonder if it uses map/reduce as a base or not. It would be easier to
use map/reduce, but a direct implementation would be more performant.
In either case, it is a new hammer. From what I see, it likely won't
replace map/reduce, Pig, or Hive; rather, it would support a different
class of applications much more directly than map/reduce can.
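
To make the "map/reduce as a base" option concrete, here is a rough sketch under stated assumptions. Nothing here reflects how Pregel is actually built; map_phase, reduce_phase, and bfs_by_rounds are hypothetical names, and the phases are plain in-process functions rather than Hadoop jobs. It assumes every vertex appears as a key in the graph. A driver loop runs one map/reduce round per iteration until the distances stop changing.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

INF = float("inf")
Graph = Dict[str, List[str]]


def map_phase(graph: Graph, dist: Dict[str, float]) -> Iterator[Tuple[str, float]]:
    """Emit each vertex's own distance plus a candidate for each neighbor."""
    for vertex, neighbors in graph.items():
        yield vertex, dist[vertex]
        if dist[vertex] != INF:
            for neighbor in neighbors:
                yield neighbor, dist[vertex] + 1


def reduce_phase(pairs: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Group candidates by vertex and keep the smallest distance."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for vertex, candidate in pairs:
        grouped[vertex].append(candidate)
    return {vertex: min(cands) for vertex, cands in grouped.items()}


def bfs_by_rounds(graph: Graph, source: str) -> Tuple[Dict[str, float], int]:
    """Driver: one map/reduce round per iteration until nothing changes."""
    dist: Dict[str, float] = {vertex: INF for vertex in graph}
    dist[source] = 0
    rounds = 0
    while True:
        new_dist = reduce_phase(map_phase(graph, dist))
        rounds += 1
        if new_dist == dist:
            return dist, rounds
        dist = new_dist


# Hypothetical example graph; every vertex must appear as a key.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
distances, rounds = bfs_by_rounds(graph, "a")
print(distances, rounds)
```

In a real cluster each pass through the loop would be a separate job that rereads and rewrites the whole graph, which is the overhead a direct implementation could avoid.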


-- Owen



Re: Pregel

2009-06-26 Thread Edward J. Yoon
As I understand it, Pregel sits at the same layer as MR, rather than
being an MR-based language processor.

I think the 'Collective Communication' of BSP is the core of the
problem. For example, this BFS problem
(http://blog.udanax.org/2009/02/breadth-first-search-mapreduce.html)
could be solved in a single job, without iterating over MR jobs.
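
One way to read the BSP point above is sketched below, as a single-process simulation. This is an assumption about the general BSP style (supersteps separated by a barrier, with vertices exchanging messages), not a description of Pregel's API, which has not been published; bsp_bfs and the example graph are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

Graph = Dict[str, List[str]]


def bsp_bfs(graph: Graph, source: str) -> Dict[str, int]:
    """Simulate superstep-style BFS: vertices act only on incoming messages."""
    dist: Dict[str, int] = {}
    inbox: Dict[str, List[int]] = {source: [0]}
    while inbox:  # halt once no messages are in flight
        outbox: Dict[str, List[int]] = defaultdict(list)
        for vertex, messages in inbox.items():
            if vertex in dist:
                continue  # already settled in an earlier superstep
            dist[vertex] = min(messages)
            for neighbor in graph.get(vertex, []):
                outbox[neighbor].append(dist[vertex] + 1)
        inbox = outbox  # stands in for the barrier between supersteps
    return dist


# Hypothetical example graph.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bsp_bfs(graph, "a"))  # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

The traversal still advances level by level, but only messages move between supersteps; the graph stays in place inside one long-running job instead of being rewritten by a chain of MR jobs.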

-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardy...@apache.org
http://blog.udanax.org


Re: Pregel

2009-06-26 Thread Saptarshi Guha
Hello,
I don't have a background in CS, but does MS's Dryad
(http://research.microsoft.com/en-us/projects/Dryad/) fit in anywhere
here?
Regards
Saptarshi




Pregel

2009-06-25 Thread Mark Kerzner
Hi all,
my guess, as good as anybody's, is that Pregel is to large graphs what
Hadoop is to large datasets. In other words, Pregel is the next natural step
for massively scalable computations after Hadoop. And, as with MapReduce,
Google will talk about the technology but not give out the code
implementation. Is this then the next task for Doug Cutting and his team?

Of course, hardly anything can be done before the August talk, and
Euler's theorem is no spoiler at all. But August will be here soon enough,
and besides, why do they pre-announce the talk? Maybe they plan to leak
something.

Does anybody think differently?

Sincerely,
Mark