[jira] [Created] (GIRAPH-155) Allow creation of graph by adding edges that span multiple workers

2012-03-15 Thread Dionysios Logothetis (Created) (JIRA)
Allow creation of graph by adding edges that span multiple workers
--

 Key: GIRAPH-155
 URL: https://issues.apache.org/jira/browse/GIRAPH-155
 Project: Giraph
  Issue Type: New Feature
  Components: graph, lib
Affects Versions: 0.1.0
Reporter: Dionysios Logothetis


Currently a graph is created only by adding vertices. The typical way is to 
read input text files line-by-line with each line describing a vertex (its 
value, its edges etc). The current API allows for the creation of a vertex only 
if all the information for the vertex is available in a single line.

However, it's common to have graphs described in the form of edges. Edges might 
span multiple lines in an input file or even span multiple workers. The current 
API doesn't allow this. In the input superstep, a vertex must be created by a 
single worker.

Instead, it should be possible for multiple workers to mutate the graph during 
the input superstep.

This has the following implications:
(i) Instead of just instantiating a vertex, a vertex reader should be able to 
issue vertex addition and edge addition requests.
(ii) Multiple workers might try to create the same vertex. Any conflicts should 
be handled with a VertexResolver. So the resolver has to be instantiated before 
load time.
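
The two implications above can be sketched in plain Java. This is only an
illustration under stated assumptions, not the Giraph API: PartialVertex,
Resolver, MERGE_EDGES, readSplit and merge are hypothetical names, the input
is assumed to be one "source target" pair per line, and merging edge lists is
just one possible conflict policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these types are part of the Giraph API.
final class EdgeInputSketch {

    /** A vertex as one worker sees it: an id plus the out-edges that worker has read. */
    static final class PartialVertex {
        final long id;
        final List<Long> targets = new ArrayList<>();

        PartialVertex(long id) {
            this.id = id;
        }
    }

    /** Resolves conflicts when several workers ask to create the same vertex. */
    interface Resolver {
        PartialVertex resolve(PartialVertex existing, PartialVertex incoming);
    }

    /** One possible policy (an assumption): keep the first vertex, append the newcomer's edges. */
    static final Resolver MERGE_EDGES = (existing, incoming) -> {
        if (existing == null) {
            return incoming;
        }
        existing.targets.addAll(incoming.targets);
        return existing;
    };

    /** A worker turns its "source target" lines into vertex and edge addition requests. */
    static Map<Long, PartialVertex> readSplit(List<String> lines) {
        Map<Long, PartialVertex> requests = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            long source = Long.parseLong(parts[0]);
            long target = Long.parseLong(parts[1]);
            // Edge addition request; it implies a vertex addition for both endpoints.
            requests.computeIfAbsent(source, id -> new PartialVertex(id)).targets.add(target);
            requests.computeIfAbsent(target, id -> new PartialVertex(id));
        }
        return requests;
    }

    /** Applies every worker's requests, routing duplicate vertices through the resolver. */
    static Map<Long, PartialVertex> merge(List<Map<Long, PartialVertex>> perWorker,
                                          Resolver resolver) {
        Map<Long, PartialVertex> graph = new TreeMap<>();
        for (Map<Long, PartialVertex> requests : perWorker) {
            for (PartialVertex vertex : requests.values()) {
                graph.put(vertex.id, resolver.resolve(graph.get(vertex.id), vertex));
            }
        }
        return graph;
    }

    public static void main(String[] args) {
        // The edges of vertex 1 are split across the input of two workers.
        Map<Long, PartialVertex> workerA = readSplit(Arrays.asList("1 2", "2 3"));
        Map<Long, PartialVertex> workerB = readSplit(Arrays.asList("1 3"));
        Map<Long, PartialVertex> graph = merge(Arrays.asList(workerA, workerB), MERGE_EDGES);
        for (PartialVertex vertex : graph.values()) {
            System.out.println(vertex.id + " -> " + vertex.targets);
        }
    }
}
```

The resolver is passed into merge before any input is applied, which mirrors
the point that it has to exist before load time.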



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (GIRAPH-156) Users should be able to set simple 'custom arguments' via org.apache.giraph.GiraphRunner

2012-03-15 Thread Sebastian Schelter (Created) (JIRA)
Users should be able to set simple 'custom arguments' via 
org.apache.giraph.GiraphRunner


 Key: GIRAPH-156
 URL: https://issues.apache.org/jira/browse/GIRAPH-156
 Project: Giraph
  Issue Type: Improvement
  Components: conf and scripts
Affects Versions: 0.1.0
Reporter: Sebastian Schelter
Assignee: Sebastian Schelter


Some vertices need custom arguments to run. The SimpleShortestPathsVertex for 
example needs to know the source vertex for the computation which is saved in 
the job's Configuration as _SimpleShortestPathsVertex.sourceId_. Users should 
be able to apply such simple custom arguments via GiraphRunner. 

I propose to add a new option _--customArguments_ where users can supply 
arguments in the form _param1=value1,param2=value2_ for this.
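
A rough sketch of how such an argument string could be parsed, assuming the
_param1=value1,param2=value2_ form proposed above. The class and method names
are hypothetical and this is not the GiraphRunner implementation; in a real
runner each resulting pair would presumably be copied into the job's
Configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of GiraphRunner.
final class CustomArgumentsSketch {

    /** Splits "param1=value1,param2=value2" into an ordered key/value map. */
    static Map<String, String> parse(String spec) {
        Map<String, String> result = new LinkedHashMap<>();
        if (spec == null || spec.trim().isEmpty()) {
            return result;
        }
        for (String pair : spec.split(",")) {
            // Split on the first '=' only, so values may themselves contain '='.
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected param=value but got: " + pair);
            }
            result.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Example value for the source vertex mentioned above; the id 2 is made up.
        Map<String, String> parsed = parse("SimpleShortestPathsVertex.sourceId=2");
        for (Map.Entry<String, String> entry : parsed.entrySet()) {
            // A real runner would set the pair on the job configuration here.
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

One limitation of this form is that values cannot contain a comma, since the
comma is the pair separator.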





Re: How to contribute page

2012-03-15 Thread Jakob Homan
That's fine.  Can we update the site to point to the wiki (and
harmonize the content), so we don't have duplicate, soon-to-diverge
information?  If so, I'll try to do this pretty soon.

On Wed, Mar 14, 2012 at 11:37 PM, Avery Ching ach...@apache.org wrote:
 Main differences are the 'mvn verify' and running single node unit tests.
  It's easier for us to manage on confluence compared to maintaining the site
 =).

 Avery


 On 3/14/12 11:59 AM, Jakob Homan wrote:

 This page looks very similar in content to the Generating Patches and
 Getting Involved sections on the main site:
 https://incubator.apache.org/giraph/  Are there any significant
 differences?

 On Wed, Mar 14, 2012 at 10:25 AM, Sebastian Schelters...@apache.org
  wrote:

 I added the 'Be involved' part from Mahout's [1] 'How to contribute'
 page. Maybe we could even copy a little more from there :)

 Best,
 Sebastian

 [1] https://cwiki.apache.org/MAHOUT/how-to-contribute.html

 On 14.03.2012 17:39, Avery Ching wrote:

 Yes, that is thanks to Sebastian.  We should probably make that another
 confluence page though based on his notes.  Anyone want to do it? =)

 Avery

 On 3/14/12 7:43 AM, Benjamin Heitmann wrote:

 On 14 Mar 2012, at 07:08, Avery Ching wrote:

 I just added a How to contribute page.

 https://cwiki.apache.org/confluence/display/GIRAPH/How+to+Contribute

 Thanks for setting up this page!

 Also, the link about running Giraph's unit tests in pseudo-distributed
 mode [1] is very interesting.



 [1]
 http://ssc.io/running-giraphs-unit-tests-in-pseudo-distributed-mode/




[jira] [Updated] (GIRAPH-154) Worker ports are not synched properly with its peers

2012-03-15 Thread Zhiwei Gu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/GIRAPH-154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiwei Gu updated GIRAPH-154:
-

Attachment: GIRAPH-154.patch

passed unit test and grid test.

 Worker ports are not synched properly with its peers
 

 Key: GIRAPH-154
 URL: https://issues.apache.org/jira/browse/GIRAPH-154
 Project: Giraph
  Issue Type: Bug
  Components: bsp
Affects Versions: 0.2.0
Reporter: Zhiwei Gu
Assignee: Zhiwei Gu
 Attachments: GIRAPH-154.patch


 When a worker tries multiple ports to set up the RPC server, the final port is 
 not synced with its peer workers properly, so peer workers send messages 
 to the default port.
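
The intended behaviour can be sketched as follows. This is a hypothetical
illustration, not the code in GIRAPH-154.patch: the names are made up, the
first candidate port is assumed to be base port plus task id, and the step of
1000 per bind attempt is only inferred from the ports in the logs. The actual
bind call is replaced by a predicate so the example runs without opening
sockets.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch, not the code from GIRAPH-154.patch.
final class WorkerPortSketch {

    // Assumption: each failed bind attempt moves the candidate port up by 1000.
    static final int PORT_INCREMENT = 1000;

    /** Tries candidate ports in order and returns the one that was actually bound. */
    static int bindWithRetries(int basePort, int taskId, int maxAttempts, IntPredicate tryBind) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int candidate = basePort + taskId + attempt * PORT_INCREMENT;
            if (tryBind.test(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("No free port after " + maxAttempts + " attempts");
    }

    /** Builds the worker description that peers read; it must carry the bound port. */
    static String describeWorker(String hostname, int taskId, int boundPort) {
        return "Worker(hostname=" + hostname + ", MRpartition=" + taskId
            + ", port=" + boundPort + ")";
    }

    public static void main(String[] args) {
        // Pretend the default port for task 161 is already taken.
        IntPredicate portIsFree = port -> port != 35061;
        int bound = bindWithRetries(34900, 161, 20, portIsFree);
        // Advertise the port returned by the bind loop, not the first candidate.
        System.out.println(describeWorker("worker-host", 161, bound));
    }
}
```

The point of the sketch is the ordering: the worker description is built only
after the bind loop has returned, so peers never see a port the server is not
listening on.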
 Here are some logs:
 
 Base port: 34900
 
 
 log for worker 161:
 
 IPC Server handler 98 on 36061: starting
 BasicRPCCommunications: Started RPC communication server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:36061 with 100 handlers and 199 
 flush threads on bind attempt 1
 IPC Server handler 99 on 36061: starting
 setup: Registering health of this worker...
 getJobState: Job state already exists 
 (/_hadoopBsp/job_201203130609_14838/_masterJobState)
 getApplicationAttempt: Node 
 /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir already exists!
 getApplicationAttempt: Node 
 /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir already exists!
 registerHealth: Created my health node for attempt=0, superstep=-1 with 
 /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/gsta32085.tan.ygrid.yahoo.com_161
  and workerInfo= Worker(hostname=gsta32085.tan.ygrid.yahoo.com, 
 MRpartition=161, port=35061)
 process: partitionAssignmentsReadyChanged (partitions are assigned)
 startSuperstep: Ready for computation on superstep -1 since worker selection 
 and vertex range assignments are done in 
 /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir/0/_superstepDir/-1/_partitionAssignments
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 0 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 1 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 2 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 3 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 4 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 5 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 6 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 7 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 8 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 9 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 10 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 11 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 12 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 13 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 14 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 15 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 16 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 17 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 18 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 19 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 20 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 21 time(s).
 Retrying connect to server: 
 gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 22 time(s).
 Retrying connect to server: