[jira] [Commented] (GIRAPH-12) Investigate communication improvements

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104270#comment-13104270
 ] 

Avery Ching commented on GIRAPH-12:
---

Sounds great, hope you had a nice vacation. Perhaps if you have some extra time, 
could you draft up a message passing benchmark that could be useful to compare 
your final implementation against the original?

> Investigate communication improvements
> --
>
> Key: GIRAPH-12
> URL: https://issues.apache.org/jira/browse/GIRAPH-12
> Project: Giraph
>  Issue Type: Improvement
>  Components: bsp
>Reporter: Avery Ching
>Assignee: Hyunsik Choi
>Priority: Minor
> Attachments: GIRAPH-12_1.patch
>
>
> Currently every worker will start up a thread to communicate with every other 
> worker.  Hadoop RPC is used for communication.  For instance, if there are 
> 400 workers, each worker will create 400 threads.  This ends up using a lot 
> of memory, even with the option  
> -Dmapred.child.java.opts="-Xss64k".  
> It would be good to investigate using frameworks like Netty or rolling 
> our own to improve this situation.  By moving away from Hadoop RPC, we would 
> also make compatibility with different Hadoop versions easier.
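
To put rough numbers on the issue description above: with one thread per peer and a 64 KB stack (-Xss64k), 400 peers reserve about 25 MB of thread stacks per worker, and larger stack sizes scale that up proportionally. The sketch below is only an illustration of the alternative, a small fixed pool shared by all peers. It is not Giraph or Netty code; the class name SharedPoolSketch, its methods, and the pool size of 8 are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration only; none of these names come from Giraph. */
class SharedPoolSketch {

    /** Stack space reserved by one worker that keeps one thread per peer. */
    static long stackBytesPerWorker(int peers, long stackBytesPerThread) {
        return peers * stackBytesPerThread;
    }

    /** Hands one simulated send per peer to a bounded pool; returns how many completed. */
    static int sendToAll(int peers, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int peer = 0; peer < peers; peer++) {
                final int id = peer;
                // A real sender would write to a socket here.
                pending.add(pool.submit(() -> id));
            }
            int completed = 0;
            for (Future<Integer> f : pending) {
                f.get(); // wait for this simulated send to finish
                completed++;
            }
            return completed;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long bytes = stackBytesPerWorker(400, 64 * 1024);
        System.out.println(bytes / (1024 * 1024) + " MB of thread stacks per worker");
        System.out.println(sendToAll(400, 8) + " sends completed on 8 threads");
    }
}
```

The point of the sketch is only that the number of threads no longer grows with the number of workers; whether Netty or a hand-rolled layer is the better fit is left open, as in the issue.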

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Avery Ching (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avery Ching resolved GIRAPH-31.
---

Resolution: Fixed

Thanks Jake!

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff, GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.
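
As a rough picture of what "replace with appropriate accessor methods" could look like, here is a minimal sketch in which the sorted map stays private and callers only iterate, look up, and remove edges. The class HiddenMapVertexSketch and every method name in it are hypothetical; the actual accessors are whatever the attached GIRAPH-31.diff defines, which is not reproduced in this thread.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical vertex; method names are illustrative, not the GIRAPH-31 API. */
class HiddenMapVertexSketch implements Iterable<Long> {
    // Implementation detail: callers never see this map.
    private final SortedMap<Long, Double> edges = new TreeMap<>();

    void addEdge(long targetId, double value) { edges.put(targetId, value); }

    boolean hasEdge(long targetId) { return edges.containsKey(targetId); }

    /** Returns the edge value, or null if there is no such edge. */
    Double getEdgeValue(long targetId) { return edges.get(targetId); }

    /** Returns true if an edge was actually removed. */
    boolean removeEdge(long targetId) { return edges.remove(targetId) != null; }

    int getNumOutEdges() { return edges.size(); }

    /** Iterates over target ids; the view is read-only. */
    @Override
    public Iterator<Long> iterator() {
        return Collections.unmodifiableSet(edges.keySet()).iterator();
    }

    public static void main(String[] args) {
        HiddenMapVertexSketch v = new HiddenMapVertexSketch();
        v.addEdge(3L, 0.5);
        v.addEdge(1L, 1.5);
        for (Long target : v) {
            System.out.println(target + " -> " + v.getEdgeValue(target));
        }
    }
}
```

Because nothing in the public surface mentions SortedMap, a primitive-optimized implementation could back the same methods with arrays instead.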





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104234#comment-13104234
 ] 

Avery Ching commented on GIRAPH-31:
---

Time's up (it is 9:10 PM) and there were no comments.  If there are any 
additional interface changes, we can always address them later.  I made some 
minor changes to fit the code conventions and verified that the unit tests passed.

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff, GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-12) Investigate communication improvements

2011-09-13 Thread Hyunsik Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104230#comment-13104230
 ] 

Hyunsik Choi commented on GIRAPH-12:


Sorry for the late response. Actually, I was on vacation September 12-13.

Thank you for testing. As you pointed out, the current patch incurs 
hotspots on the receiving side. I will add code to randomize flushes to 
mitigate the skew problem, plus some tweaks to improve performance.
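
One possible reading of "randomize flushes" is that each sender shuffles the order in which it flushes to its peers, so that all senders do not hit the same receiver at the same moment. The comment does not say how the patch will do it, so the sketch below is an assumption; RandomizedFlushSketch and flushOrder are made-up names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration; not taken from the GIRAPH-12 patch. */
class RandomizedFlushSketch {

    /** Returns peer ids 0..peers-1 in an order shuffled with the given seed. */
    static List<Integer> flushOrder(int peers, long seed) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < peers; i++) {
            order.add(i);
        }
        // Each sender would use its own seed so senders visit receivers in different orders.
        Collections.shuffle(order, new Random(seed));
        return order;
    }

    public static void main(String[] args) {
        // Two senders with different seeds flush to the same five peers.
        System.out.println("sender A: " + flushOrder(5, 1L));
        System.out.println("sender B: " + flushOrder(5, 2L));
    }
}
```

Every peer still receives exactly one flush per round; only the visiting order changes.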



> Investigate communication improvements
> --
>
> Key: GIRAPH-12
> URL: https://issues.apache.org/jira/browse/GIRAPH-12
> Project: Giraph
>  Issue Type: Improvement
>  Components: bsp
>Reporter: Avery Ching
>Assignee: Hyunsik Choi
>Priority: Minor
> Attachments: GIRAPH-12_1.patch
>
>
> Currently every worker will start up a thread to communicate with every other 
> worker.  Hadoop RPC is used for communication.  For instance, if there are 
> 400 workers, each worker will create 400 threads.  This ends up using a lot 
> of memory, even with the option  
> -Dmapred.child.java.opts="-Xss64k".  
> It would be good to investigate using frameworks like Netty or rolling 
> our own to improve this situation.  By moving away from Hadoop RPC, we would 
> also make compatibility with different Hadoop versions easier.





[jira] [Commented] (GIRAPH-30) NPE in ZooKeeperManager if base directory cannot be created

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13104220#comment-13104220
 ] 

Avery Ching commented on GIRAPH-30:
---

Taking too long for another committer, and Andrew did review it.  I have 
committed.  If this is an issue, please reopen.

> NPE in ZooKeeperManager if base directory cannot be created
> ---
>
> Key: GIRAPH-30
> URL: https://issues.apache.org/jira/browse/GIRAPH-30
> Project: Giraph
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Attachments: GIRAPH-30.2.patch, GIRAPH-30.patch
>
>
> If the base directory cannot be created, for example if running on secure 
> Hadoop and the user home directory does not exist, ZooKeeperManager will 
> throw an NPE when trying to list it. It would be better to throw an 
> IOException with an informative message.
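
The failure mode described above, a listing call that returns null and is then dereferenced, can be sketched as follows. This uses java.io.File as a stand-in so the example is self-contained; the real ZooKeeperManager code and the API it lists the directory with are not shown in this thread, and BaseDirListingSketch and listBaseDir are hypothetical names.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical illustration; not the GIRAPH-30 patch. */
class BaseDirListingSketch {

    /** Lists the base directory, failing with a descriptive IOException instead of an NPE. */
    static String[] listBaseDir(File baseDir) throws IOException {
        // File.list() returns null if the path is not a directory or an I/O error occurs.
        String[] children = baseDir.list();
        if (children == null) {
            throw new IOException("Cannot list base directory "
                + baseDir.getAbsolutePath()
                + " (it may not exist or may not be readable)");
        }
        return children;
    }

    public static void main(String[] args) {
        try {
            listBaseDir(new File("/no/such/dir-giraph-30"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```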





Re: Port to YARN: GIRAPH and HAMA

2011-09-13 Thread Edward J. Yoon
Interesting. In our community, someone is thinking about asynchronous
message processing for more efficient iteration[1], too.

As I mentioned to you before, we differ in slogan but not in kind. The
technical issues are nothing, Avery.

Anyway, ...

It would be nice if we can talk together continuously, for
collaborative competition. http://s.apache.org/HamaVsGiraph

1. http://markmail.org/thread/nrrevdrb5qc7ic5c

On Wed, Sep 14, 2011 at 2:47 AM, Avery Ching  wrote:
> Hi Vinod,
>
> Edward and I have chatted about this at times.  It sounds better in theory
> (both BSP based and adding support for MRv2) than in practice I think
> (underlying implementations are quite different).  Actually, I also believe
> that in the future, Giraph is not going to solely be BSP-based graph
> computing.  We are also thinking about other underlying computing models
> (i.e. streaming (asynchronous) graph processing - see
>
> http://mail-archives.apache.org/mod_mbox/incubator-giraph-user/201109.mbox/%3CCAEVHzWC8b-7RiBjkDiQKjiu-rVBz9=ogeoajxhbclcr5n3+...@mail.gmail.com%3E
>
> But I think today, the issues are the following:
>
> 1)  Giraph runs completely as a MapReduce job on Hadoop today.  This needs
> to be maintained to support our current users, who will not likely move to
> MRv2 for at least a year.
> 2)  The internals of Giraph are implemented differently than Hama and would
> take some time to port to.
> 3)  If we have various graph processing computing models (BSP based, streams
> or asynchronous, or a combination), then being on Hama brings little value
> for Giraph.
>
> Perhaps more practically, I wonder if it would be possible for someone from
> the Hama team to refactor our code a bit to support Hama-style BSP in
> Giraph?  Certainly would be a pretty cool project...
>
> Avery
>
> On 9/13/11 4:49 AM, Edward J. Yoon wrote:
>>
>> Quite a while ago, I implemented a clone of Google Pregel simply using
>> BSPLib[1] and decided to focus on the BSP computing engine.
>>
>> The Hama and Giraph projects differ in slogan but not in kind.
>>
>> If we are going to collaborate, Giraph should be implemented on top of
>> the Hama BSP computing engine.
>>
>> Otherwise, we will be back at square one again.
>>
>> 1. http://markmail.org/thread/4czcgtjupjvpqcqi
>>
>> On Sun, Sep 11, 2011 at 11:22 PM, Vinod Kumar Vavilapalli
>>   wrote:
>>>
>>> Crosspost to hama-dev and giraph-dev.
>>>
>>> It was only in my morning time that I was looking at HAMA-431, the port
>>> of
>>> Hama to YARN. And one of the tweets reminded me of JIRA issue GIRAPH-13
>>> which is about porting Giraph to YARN.
>>>
>>> I was also looking at the Giraph proposal for entry into Apache
>>> Incubator.
>>> There is an interesting section there:
>>> {quote}
>>> Relationships with Other Apache Products
>>>
>>> Giraph has some overlapping functionality with Apache Hama. However,
>>> there
>>> are some significant differences. Giraph focuses on graph-based bulk
>>> synchronous parallel (BSP) computing, while Apache Hama is more for
>>> general
>>> purposed BSP computing. Giraph runs on the Hadoop infrastructure, while
>>> Apache Hama uses its own computing framework.
>>> {quote}
>>>
>>> I agree with the point about Hama being a general purposed BSP and Giraph
>>> being completely graph oriented. But the latter one about the
>>> infrastructure
>>> is going to be moot with both Giraph and Hama trying to be ported over to
>>> YARN.
>>>
>>> So here's my billion dollar question: Is it possible to implement
>>> Giraph's
>>> graph based APIs over the Hama's bsp APIs which both run over a single
>>> Apache BSP implementation over YARN?
>>>
>>> I also do see the email thread regarding Hama and Giraph's future
>>> collaboration when Hadoop NextGen aka YARN comes in:
>>> http://s.apache.org/HamaVsGiraph. So are we ready for this yet?
>>>
>>> Disclaimer: I come from the Hadoop world, have no idea of Giraph's APIs
>>> or
>>> internals except that I see a bsp package in Giraph's source tree. I do
>>> know
>>> a tiny bit about Hama's APIs and internals but my expertise is only two
>>> days.
>>>
>>> Thanks,
>>> +Vinod
>>> (An elephant maintainer trying to see if a Giraffe can be made to ride
>>> over
>>> a hippopotamus riding over an elephant)
>>>
>>
>>
>
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon


[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103948#comment-13103948
 ] 

Jake Mannix commented on GIRAPH-31:
---

Sounds good to me!  "Lazy consensus" is pretty common to The Apache Way ( 
http://www.apache.org/foundation/voting.html#LazyConsensus ).

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff, GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103944#comment-13103944
 ] 

Avery Ching commented on GIRAPH-31:
---

How about I wait until tonight (say after 7 pm) sometime to commit this?  In 
case anyone has any last thoughts...

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff, GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





Re: Port to YARN: GIRAPH and HAMA

2011-09-13 Thread Avery Ching
Maybe it's possible; it's hard to say what will happen in a year.  However, 
at the same time, porting an application from any of the projects to 
another shouldn't be too difficult, since the Pregel API is 
relatively simple.  As I mentioned in my original post, though, I 
imagine that Giraph will support non-BSP graph computing models as well 
in the future (which are less portable).


Avery

On 9/13/11 12:51 PM, Dan Brickley wrote:

On 13 September 2011 21:43, Dmitriy Ryaboy  wrote:

Dan,
Given how fast we are currently iterating on the API in Giraph, I think
agreeing on a common API across 3 projects is a bit premature at this stage,
unfortunately..

Current velocity aside, ... could such an interface be plausible? e.g.
this time next year?

Dan




Re: Port to YARN: GIRAPH and HAMA

2011-09-13 Thread Dan Brickley
On 13 September 2011 21:43, Dmitriy Ryaboy  wrote:
> Dan,
> Given how fast we are currently iterating on the API in Giraph, I think
> agreeing on a common API across 3 projects is a bit premature at this stage,
> unfortunately..

Current velocity aside, ... could such an interface be plausible? e.g.
this time next year?

Dan


Re: Port to YARN: GIRAPH and HAMA

2011-09-13 Thread Dmitriy Ryaboy
Dan,
Given how fast we are currently iterating on the API in Giraph, I think
agreeing on a common API across 3 projects is a bit premature at this stage,
unfortunately..

D

On Tue, Sep 13, 2011 at 11:20 AM, Dan Brickley  wrote:

> On 13 September 2011 19:47, Avery Ching  wrote:
>
> > Perhaps more practically, I wonder if it would be possible for someone
> from
> > the Hama team to refactor our code a bit to support Hama-style BSP in
> > Giraph?  Certainly would be a pretty cool project...
>
> Maybe this is crazy, but: I was wondering...  Pregel's basic API
> approach is pretty straightforward, gloriously simple even. Could we
> have platform-neutral APIs that allowed portability of applications
> between  Pregel-based platforms? At least for Java...
>
> Right now, those of us who are more 'application people' than platform
> developers, are left searching around on 'pregel opensource' and have
> to try to guess which of the various Pregel-esque platforms is
> looking most healthy. For example, my summer vacation project was
> checking out GoldenOrbOS. Yet by the time I got back, the Mahout list
> was buzzing with discussion of Giraph, so I took a look at that (and
> was pleasantly surprised).
>
> There is clearly a lot of energy and creativity right now going into
> this kind of distributed graph processing platform. That suggests to
> me that *finalising* cross-platform APIs would be premature. But it is
> also a time when platforms have a certain amount of flexibility that
> they will lose as they get adopted and embedded within products and
> processes. Could a Pregel-like Java API be agreed between platforms
> (e.g. let's consider Giraph, Hama, GoldenOrbOS), so that those of us
> investigating applications could proceed with some hope of later
> portability? This might be cheaper than trying to persuade Giraph to
> rebuild on top of Hama, or suchlike. Anyone care to make a first pass
> at suggesting some common interfaces?
>
> cheers,
>
> Dan
>



-- 
Dmitriy V Ryaboy
Twitter Analytics
http://twitter.com/squarecog


[jira] [Updated] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jake Mannix updated GIRAPH-31:
--

Attachment: GIRAPH-31.diff

Updated patch - remove isSorted(), document the fact that the iterator may or 
may not be sorted (and in fact is, in Vertex), and that users may subclass 
either Vertex *or* MutableVertex.  

I have not tested subclassing BasicVertex, which I suspect would fail in 
various ways, as VertexReader, GraphMapper, and some other classes may expect 
to get a MutableVertex for some methods.

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff, GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





Re: Port to YARN: GIRAPH and HAMA

2011-09-13 Thread Dan Brickley
On 13 September 2011 19:47, Avery Ching  wrote:

> Perhaps more practically, I wonder if it would be possible for someone from
> the Hama team to refactor our code a bit to support Hama-style BSP in
> Giraph?  Certainly would be a pretty cool project...

Maybe this is crazy, but: I was wondering...  Pregel's basic API
approach is pretty straightforward, gloriously simple even. Could we
have platform-neutral APIs that allowed portability of applications
between  Pregel-based platforms? At least for Java...

Right now, those of us who are more 'application people' than platform
developers, are left searching around on 'pregel opensource' and have
to try to guess which of the various Pregel-esque platforms is
looking most healthy. For example, my summer vacation project was
checking out GoldenOrbOS. Yet by the time I got back, the Mahout list
was buzzing with discussion of Giraph, so I took a look at that (and
was pleasantly surprised).

There is clearly a lot of energy and creativity right now going into
this kind of distributed graph processing platform. That suggests to
me that *finalising* cross-platform APIs would be premature. But it is
also a time when platforms have a certain amount of flexibility that
they will lose as they get adopted and embedded within products and
processes. Could a Pregel-like Java API be agreed between platforms
(e.g. let's consider Giraph, Hama, GoldenOrbOS), so that those of us
investigating applications could proceed with some hope of later
portability? This might be cheaper than trying to persuade Giraph to
rebuild on top of Hama, or suchlike. Anyone care to make a first pass
at suggesting some common interfaces?

cheers,

Dan


Re: Port to YARN: GIRAPH and HAMA

2011-09-13 Thread Avery Ching

Hi Vinod,

Edward and I have chatted about this at times.  It sounds better in 
theory (both BSP based and adding support for MRv2) than in practice I 
think (underlying implementations are quite different).  Actually, I 
also believe that in the future, Giraph is not going to solely be 
BSP-based graph computing.  We are also thinking about other underlying 
computing models (i.e. streaming (asynchronous) graph processing - see


http://mail-archives.apache.org/mod_mbox/incubator-giraph-user/201109.mbox/%3CCAEVHzWC8b-7RiBjkDiQKjiu-rVBz9=ogeoajxhbclcr5n3+...@mail.gmail.com%3E

But I think today, the issues are the following:

1)  Giraph runs completely as a MapReduce job on Hadoop today.  This 
needs to be maintained to support our current users, who will not likely 
move to MRv2 for at least a year.
2)  The internals of Giraph are implemented differently than Hama and 
would take some time to port to.
3)  If we have various graph processing computing models (BSP based, 
streams or asynchronous, or a combination), then being on Hama brings 
little value for Giraph.


Perhaps more practically, I wonder if it would be possible for someone 
from the Hama team to refactor our code a bit to support Hama-style BSP 
in Giraph?  Certainly would be a pretty cool project...


Avery

On 9/13/11 4:49 AM, Edward J. Yoon wrote:

Quite a while ago, I implemented a clone of Google Pregel simply using
BSPLib[1] and decided to focus on the BSP computing engine.

The Hama and Giraph projects differ in slogan but not in kind.

If we are going to collaborate, Giraph should be implemented on top of
the Hama BSP computing engine.

Otherwise, we will be back at square one again.

1. http://markmail.org/thread/4czcgtjupjvpqcqi

On Sun, Sep 11, 2011 at 11:22 PM, Vinod Kumar Vavilapalli
  wrote:

Crosspost to hama-dev and giraph-dev.

It was only in my morning time that I was looking at HAMA-431, the port of
Hama to YARN. And one of the tweets reminded me of JIRA issue GIRAPH-13
which is about porting Giraph to YARN.

I was also looking at the Giraph proposal for entry into Apache Incubator.
There is an interesting section there:
{quote}
Relationships with Other Apache Products

Giraph has some overlapping functionality with Apache Hama. However, there
are some significant differences. Giraph focuses on graph-based bulk
synchronous parallel (BSP) computing, while Apache Hama is more for general
purposed BSP computing. Giraph runs on the Hadoop infrastructure, while
Apache Hama uses its own computing framework.
{quote}

I agree with the point about Hama being a general purposed BSP and Giraph
being completely graph oriented. But the latter one about the infrastructure
is going to be moot with both Giraph and Hama trying to be ported over to
YARN.

So here's my billion dollar question: Is it possible to implement Giraph's
graph based APIs over the Hama's bsp APIs which both run over a single
Apache BSP implementation over YARN?

I also do see the email thread regarding Hama and Giraph's future
collaboration when Hadoop NextGen aka YARN comes in:
http://s.apache.org/HamaVsGiraph. So are we ready for this yet?

Disclaimer: I come from the Hadoop world, have no idea of Giraph's APIs or
internals except that I see a bsp package in Giraph's source tree. I do know
a tiny bit about Hama's APIs and internals but my expertise is only two days.

Thanks,
+Vinod
(An elephant maintainer trying to see if a Giraffe can be made to ride over
a hippopotamus riding over an elephant)








[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103815#comment-13103815
 ] 

Jake Mannix commented on GIRAPH-31:
---

Noticed one more thing: if people do subclass Vertex, we need to change 
destEdgeMap to be protected, as we don't provide a getter anymore, so that 
subclasses which want to do range queries or whatnot can do so.
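
A minimal sketch of why a protected destEdgeMap matters: with no public getter, a subclass is the only place that can still use the sorted map directly, for example through subMap. The field name destEdgeMap comes from the comment above; the two classes, the Long/Double key and value types, and the method countEdgesInRange are assumptions made for illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical base class; only the field name destEdgeMap comes from the thread. */
class BaseVertexSketch {
    // Protected rather than private, so subclasses can use SortedMap operations.
    protected final SortedMap<Long, Double> destEdgeMap = new TreeMap<>();

    void addEdge(long targetId, double value) {
        destEdgeMap.put(targetId, value);
    }
}

/** Hypothetical subclass that needs a range query over target ids. */
class RangeQueryVertexSketch extends BaseVertexSketch {

    /** Counts edges whose target id lies in [fromInclusive, toExclusive). */
    int countEdgesInRange(long fromInclusive, long toExclusive) {
        return destEdgeMap.subMap(fromInclusive, toExclusive).size();
    }

    public static void main(String[] args) {
        RangeQueryVertexSketch v = new RangeQueryVertexSketch();
        v.addEdge(1L, 0.1);
        v.addEdge(5L, 0.5);
        v.addEdge(9L, 0.9);
        System.out.println(v.countEdgesInRange(2L, 9L)); // 1 (only target 5 is in [2, 9))
    }
}
```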

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103798#comment-13103798
 ] 

Jake Mannix commented on GIRAPH-31:
---

+1 to that, given your argument on the current use of the class.  There may come a 
time when we have generic things going on in GraphMapper or BspServiceWorker 
which need to do special optimized things to sorted vertices, and at that time 
we can add an "isSorted()" or "getSortedIterator()" method.

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





Re: Port to YARN: GIRAPH and HAMA

2011-09-13 Thread Edward J. Yoon
Quite a while ago, I implemented a clone of Google Pregel simply using
BSPLib[1] and decided to focus on the BSP computing engine.

The Hama and Giraph projects differ in slogan but not in kind.

If we are going to collaborate, Giraph should be implemented on top of
the Hama BSP computing engine.

Otherwise, we will be back at square one again.

1. http://markmail.org/thread/4czcgtjupjvpqcqi

On Sun, Sep 11, 2011 at 11:22 PM, Vinod Kumar Vavilapalli
 wrote:
> Crosspost to hama-dev and giraph-dev.
>
> It was only in my morning time that I was looking at HAMA-431, the port of
> Hama to YARN. And one of the tweets reminded me of JIRA issue GIRAPH-13
> which is about porting Giraph to YARN.
>
> I was also looking at the Giraph proposal for entry into Apache Incubator.
> There is an interesting section there:
> {quote}
> Relationships with Other Apache Products
>
> Giraph has some overlapping functionality with Apache Hama. However, there
> are some significant differences. Giraph focuses on graph-based bulk
> synchronous parallel (BSP) computing, while Apache Hama is more for general
> purposed BSP computing. Giraph runs on the Hadoop infrastructure, while
> Apache Hama uses its own computing framework.
> {quote}
>
> I agree with the point about Hama being a general purposed BSP and Giraph
> being completely graph oriented. But the latter one about the infrastructure
> is going to be moot with both Giraph and Hama trying to be ported over to
> YARN.
>
> So here's my billion dollar question: Is it possible to implement Giraph's
> graph based APIs over the Hama's bsp APIs which both run over a single
> Apache BSP implementation over YARN?
>
> I also do see the email thread regarding Hama and Giraph's future
> collaboration when Hadoop NextGen aka YARN comes in:
> http://s.apache.org/HamaVsGiraph. So are we ready for this yet?
>
> Disclaimer: I come from the Hadoop world, have no idea of Giraph's APIs or
> internals except that I see a bsp package in Giraph's source tree. I do know
> a tiny bit about Hama's APIs and internals but my expertise is only two days.
>
> Thanks,
> +Vinod
> (An elephant maintainer trying to see if a Giraffe can be made to ride over
> a hippopotamus riding over an elephant)
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon


[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103793#comment-13103793
 ] 

Avery Ching commented on GIRAPH-31:
---

After "iterating" on it, given that we don't have a well-defined use case for a 
sorted iterator and those APIs I suggested are a little nasty, I think I 
prefer the following:

Each Vertex implementation should implement Iterable as you both suggest, but I 
think following the Java utils style of sorted or not feels the most natural.  
We can describe the iteration order via javadoc and we can have multiple Vertex 
implementations, e.g. SortedPrimitiveVertex, HashPrimitiveVertex, etc.  Somehow 
isSorted() feels a little yucky.  Examples from the java.util Set implementations:

TreeSet:
Iterator<E> iterator()  Returns an iterator over the elements in this set in 
ascending order.

HashSet:
Iterator<E> iterator()  Returns an iterator over the elements in this set.
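
The java.util behaviour referred to above can be seen with a few lines. The helper class IterationOrderSketch is made up for this illustration; TreeSet and HashSet are the standard library classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustration of documented iteration order in java.util sets. */
class IterationOrderSketch {

    /** Copies a set into a list in whatever order its iterator yields. */
    static List<Integer> inIterationOrder(Set<Integer> set) {
        List<Integer> out = new ArrayList<>();
        for (Integer x : set) {
            out.add(x);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(30, 10, 20);
        // TreeSet documents ascending order, so this prints [10, 20, 30].
        System.out.println(inIterationOrder(new TreeSet<>(ids)));
        // HashSet makes no ordering promise; callers must not rely on what this prints.
        System.out.println(inIterationOrder(new HashSet<>(ids)));
    }
}
```

The ordering guarantee lives in each class's javadoc rather than in a runtime flag, which is the style being proposed for the Vertex implementations.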

> Hide the SortedMap> in Vertex from client visibility (impl. 
> detail), replace with appropriate accessor methods
> ---
>
> Key: GIRAPH-31
> URL: https://issues.apache.org/jira/browse/GIRAPH-31
> Project: Giraph
>  Issue Type: Improvement
>  Components: graph
>Affects Versions: 0.70.0
>Reporter: Jake Mannix
>Assignee: Jake Mannix
> Attachments: GIRAPH-31.diff
>
>
> As discussed on the list, and on GIRAPH-28, the SortedMap> is an 
> implementation detail which needs not be exposed to application developers - 
> they need to iterate over the edges, and possibly access them one-by-one, and 
> remove them (in the Mutable case), but they don't need the SortedMap, and 
> creating primitive-optimized BasicVertex implementations is hampered by the 
> fact that clients expect this Map to exist.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103745#comment-13103745
 ] 

Jake Mannix commented on GIRAPH-31:
---

And implementations that can provide both a sorted iterator which isn't 
prohibitively expensive and a much faster unsorted iterator can choose whether 
to return true or false from the "isSorted()" method, and provide another 
method of the type you're suggesting. 






[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103741#comment-13103741
 ] 

Jake Mannix commented on GIRAPH-31:
---

Right, as many implementations will just 'throw new 
UnsupportedOperationException("We don't sort!");'





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103740#comment-13103740
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-31:
-

Avery,
It seems like requiring all BasicVertex implementations to implement a sorted 
iterator even when they don't need it is a bit heavy-handed.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103735#comment-13103735
 ] 

Avery Ching commented on GIRAPH-31:
---

You sure you don't want to just provide the interfaces 

Iterator<Edge<I, E>> getOutEdgeIterator();
Iterator<Edge<I, E>> getSortedOutEdgeIterator();

or 

Iterator<I> getOutEdgeIterator();
Iterator<I> getSortedOutEdgeIterator();

It would do away with this issue of sorted...and still keep iterable, but 
sorted or not, it's up to the implementation.
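
A rough sketch of what the two-method variant could look like, assuming a 
hypothetical interface OutEdgeAccessSketch and a list-backed implementation 
(neither is from the patch, and iterating over target ids rather than Edge 
objects is an assumption made to keep the example short):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical accessor interface; I is the vertex id type. */
interface OutEdgeAccessSketch<I extends Comparable<I>> {
    /** Iterates over out-edge targets in whatever order is cheapest. */
    Iterator<I> getOutEdgeIterator();

    /** Iterates over out-edge targets in ascending order. */
    Iterator<I> getSortedOutEdgeIterator();
}

/** Hypothetical implementation that stores targets in insertion order. */
class ListBackedVertexSketch<I extends Comparable<I>>
        implements OutEdgeAccessSketch<I> {
    private final List<I> targets = new ArrayList<>();

    void addEdge(I target) {
        targets.add(target);
    }

    @Override
    public Iterator<I> getOutEdgeIterator() {
        // Fast path: no copy, no sort, read-only view of the stored order.
        return Collections.unmodifiableList(targets).iterator();
    }

    @Override
    public Iterator<I> getSortedOutEdgeIterator() {
        // Slow path: sort a copy so the stored order is left untouched.
        List<I> copy = new ArrayList<>(targets);
        Collections.sort(copy);
        return copy.iterator();
    }
}
```

The cost of this design is that every implementation must supply the sorted 
method, even when sorting is expensive or never requested.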





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Jake Mannix (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103734#comment-13103734
 ] 

Jake Mannix commented on GIRAPH-31:
---

Avery, Dmitriy - after thinking about it, I think both true and false are 
wrong!  BasicVertex shouldn't implement this method at all; leave it abstract, 
and subclasses which implement iterator() are forced to also tell users whether 
they chose to implement it sorted or not.  
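
A minimal sketch of that arrangement, assuming hypothetical class names 
(BasicVertexSketch, TreeVertexSketch, HashVertexSketch) and long target ids 
that are not taken from the patch:

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.TreeSet;

/** Hypothetical base class: no default answer for isSorted(). */
abstract class BasicVertexSketch implements Iterable<Long> {
    /** Each subclass must state whether iterator() is sorted. */
    abstract boolean isSorted();
}

/** Hypothetical subclass backed by a TreeSet, so iteration is sorted. */
class TreeVertexSketch extends BasicVertexSketch {
    private final TreeSet<Long> targets = new TreeSet<>();

    void addEdge(long target) {
        targets.add(target);
    }

    @Override
    boolean isSorted() {
        return true;
    }

    @Override
    public Iterator<Long> iterator() {
        return targets.iterator();
    }
}

/** Hypothetical subclass backed by a HashSet, so no order is promised. */
class HashVertexSketch extends BasicVertexSketch {
    private final HashSet<Long> targets = new HashSet<>();

    void addEdge(long target) {
        targets.add(target);
    }

    @Override
    boolean isSorted() {
        return false;
    }

    @Override
    public Iterator<Long> iterator() {
        return targets.iterator();
    }
}
```

Because isSorted() is abstract, a subclass that forgets to declare its ordering 
fails to compile rather than silently inheriting a possibly wrong default.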





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Claudio Martella (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103689#comment-13103689
 ] 

Claudio Martella commented on GIRAPH-31:


One question: how can I provide my own implementation of the edge-containing 
data structure if addEdge is final? Maybe we should drop the final?





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103442#comment-13103442
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-31:
-

I was just commenting on the javadoc, not the implementation. Though now that 
you say that, I think you are right: false is the safer thing to do.





[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103436#comment-13103436
 ] 

Avery Ching commented on GIRAPH-31:
---

committer +1.  A few minor formatting issues (missing javadoc and over 80 char 
lines - I can fix before committing), but otherwise great!  I agree with 
Dmitriy's comment that the default should be false.  We should probably wait 
(maybe a day) for other folks to chime in for this one since it's a user facing 
interface.





[jira] [Commented] (GIRAPH-12) Investigate communication improvements

2011-09-13 Thread Avery Ching (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103422#comment-13103422
 ] 

Avery Ching commented on GIRAPH-12:
---

Hyunsik, just to update, I grabbed your patch and it passed the unit tests on 
my machine.  Then I ran it on a cluster at Yahoo!.  

I didn't have time to make a messaging benchmark, so I ran PageRankBenchmark.  
I ran with 100 workers, 1 M vertices, 3 supersteps, and 10 edges per vertex.

Here are 2 runs with the original code:

11/09/13 07:02:08 INFO mapred.JobClient:   Giraph Timers
11/09/13 07:02:08 INFO mapred.JobClient: Total (milliseconds)=46709
11/09/13 07:02:08 INFO mapred.JobClient: Superstep 3 (milliseconds)=1682
11/09/13 07:02:08 INFO mapred.JobClient: Setup (milliseconds)=3228
11/09/13 07:02:08 INFO mapred.JobClient: Shutdown (milliseconds)=1223
11/09/13 07:02:08 INFO mapred.JobClient: Vertex input superstep (milliseconds)=3578
11/09/13 07:02:08 INFO mapred.JobClient: Superstep 0 (milliseconds)=16222
11/09/13 07:02:08 INFO mapred.JobClient: Superstep 2 (milliseconds)=12302
11/09/13 07:02:08 INFO mapred.JobClient: Superstep 1 (milliseconds)=8467

11/09/13 07:14:51 INFO mapred.JobClient:   Giraph Timers
11/09/13 07:14:51 INFO mapred.JobClient: Total (milliseconds)=51475
11/09/13 07:14:51 INFO mapred.JobClient: Superstep 3 (milliseconds)=1348
11/09/13 07:14:51 INFO mapred.JobClient: Setup (milliseconds)=7233
11/09/13 07:14:51 INFO mapred.JobClient: Shutdown (milliseconds)=884
11/09/13 07:14:51 INFO mapred.JobClient: Vertex input superstep (milliseconds)=3284
11/09/13 07:14:51 INFO mapred.JobClient: Superstep 0 (milliseconds)=22213
11/09/13 07:14:51 INFO mapred.JobClient: Superstep 2 (milliseconds)=8553
11/09/13 07:14:51 INFO mapred.JobClient: Superstep 1 (milliseconds)=7955


Here are 2 runs with your code:

11/09/13 07:06:56 INFO mapred.JobClient:   Giraph Timers
11/09/13 07:06:56 INFO mapred.JobClient: Total (milliseconds)=51935
11/09/13 07:06:56 INFO mapred.JobClient: Superstep 3 (milliseconds)=1150
11/09/13 07:06:56 INFO mapred.JobClient: Setup (milliseconds)=3338
11/09/13 07:06:56 INFO mapred.JobClient: Shutdown (milliseconds)=833
11/09/13 07:06:56 INFO mapred.JobClient: Vertex input superstep (milliseconds)=3401
11/09/13 07:06:56 INFO mapred.JobClient: Superstep 0 (milliseconds)=17297
11/09/13 07:06:56 INFO mapred.JobClient: Superstep 2 (milliseconds)=14384
11/09/13 07:06:56 INFO mapred.JobClient: Superstep 1 (milliseconds)=11528

11/09/13 07:12:09 INFO mapred.JobClient:   Giraph Timers
11/09/13 07:12:09 INFO mapred.JobClient: Total (milliseconds)=51985
11/09/13 07:12:09 INFO mapred.JobClient: Superstep 3 (milliseconds)=1362
11/09/13 07:12:09 INFO mapred.JobClient: Setup (milliseconds)=3776
11/09/13 07:12:09 INFO mapred.JobClient: Shutdown (milliseconds)=710
11/09/13 07:12:09 INFO mapred.JobClient: Vertex input superstep (milliseconds)=3771
11/09/13 07:12:09 INFO mapred.JobClient: Superstep 0 (milliseconds)=17741
11/09/13 07:12:09 INFO mapred.JobClient: Superstep 2 (milliseconds)=13068
11/09/13 07:12:09 INFO mapred.JobClient: Superstep 1 (milliseconds)=11551
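
To put the four "Total" timers side by side, here is a small sketch that 
averages them and computes the relative change.  The class and method names are 
illustrative only; the inputs are the totals from the logs above.  With two 
runs per variant this is only a rough indication, not a statistically 
meaningful result.

```java
/** Illustrative helper for comparing the "Total (milliseconds)" timers. */
class TimerComparison {
    /** Arithmetic mean of the given millisecond values. */
    static double mean(long... values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    /** Relative change of other versus base, in percent. */
    static double percentChange(double base, double other) {
        return 100.0 * (other - base) / base;
    }
}
```

Calling TimerComparison.mean(46709, 51475) for the original code gives 49092 
ms, and TimerComparison.mean(51935, 51985) for the patched code gives 51960 ms, 
a difference of roughly 5.8% in the mean total.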

In my limited testing, the numbers aren't too different.  I also see that the 
connections are maintained throughout the application run as you mentioned.  So 
the only tradeoff is possibly the reduced parallelization of message sending 
(user chosen vs all threads).  I like the approach and think it's an 
improvement (controllable threads).  My only comment concerns the following 
code block.

for (PeerConnection pc : peerConnections.values()) {
    futures.add(executor.submit(new PeerFlushExecutor(pc)));
}

It would probably be good to randomize the order of the PeerConnection objects 
to avoid hotspots on the receiving side?
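
One way to do that, sketched here with a hypothetical helper (FlushOrderSketch 
and shuffledCopy are not part of the patch), is to copy the values into a list 
and shuffle it with java.util.Collections.shuffle before submitting the flush 
tasks:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for choosing a random flush order. */
class FlushOrderSketch {
    /** Returns the elements in random order without modifying the input. */
    static <T> List<T> shuffledCopy(Collection<T> connections, Random random) {
        List<T> order = new ArrayList<>(connections);
        Collections.shuffle(order, random);
        return order;
    }

    // Assumed usage inside the flush loop quoted above (types from the patch):
    //
    //   for (PeerConnection pc :
    //           shuffledCopy(peerConnections.values(), new Random())) {
    //       futures.add(executor.submit(new PeerFlushExecutor(pc)));
    //   }
}
```

Each worker would then visit its peers in a different order, so the workers 
are less likely to all flush to the same receiver at the same moment.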



[jira] [Commented] (GIRAPH-31) Hide the SortedMap> in Vertex from client visibility (impl. detail), replace with appropriate accessor methods

2011-09-13 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103412#comment-13103412
 ] 

Dmitriy V. Ryaboy commented on GIRAPH-31:
-

non-committer +1.

Please change the javadoc for providesSortedIterator to not just say "@return 
true" -- implementations that override this to return false might forget to 
provide their own javadoc, inherit this one, and thus claim behavior opposite 
to what they actually do.
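
For illustration, a sketch of javadoc wording that stays accurate when 
inherited.  The class names are hypothetical; only the method name 
providesSortedIterator comes from the discussion, and its real signature in the 
patch may differ.

```java
/** Hypothetical base class carrying the shared javadoc. */
class DocumentedVertexSketch {
    /**
     * Reports whether iterator() visits the edges in sorted order.
     *
     * @return true if iteration is in ascending target-id order,
     *         false if no particular order is guaranteed
     */
    boolean providesSortedIterator() {
        return true;
    }
}

/** Hypothetical subclass that overrides without writing new javadoc. */
class UnsortedVertexSketch extends DocumentedVertexSketch {
    @Override
    boolean providesSortedIterator() {
        // The inherited javadoc describes both outcomes, so it is still true.
        return false;
    }
}
```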
