[jira] [Commented] (GIRAPH-77) Coordinator should expose a web interface with progress, vertex region assignments, etc.

2011-11-17 Thread Arun Suresh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151910#comment-13151910
 ] 

Arun Suresh commented on GIRAPH-77:
---

This looks very related to 
[GIRAPH-76|https://issues.apache.org/jira/browse/GIRAPH-76], since I will be 
refactoring GraphMapper. I can take this up as well if you haven't already 
started.

 Coordinator should expose a web interface with progress, vertex region 
 assignments, etc.
 

 Key: GIRAPH-77
 URL: https://issues.apache.org/jira/browse/GIRAPH-77
 Project: Giraph
  Issue Type: New Feature
Reporter: Jakob Homan

 It would be nice if the coordinator worker had a web interface that showed 
 progress, splits, etc. during job execution. Right now it would duplicate 
 information currently exposed through task status, but with the move to 
 YARN it will be a necessity. It would be great if we could do this in a 
 modern way to avoid the screen-scraping currently used to get information 
 from most other Hadoop projects' web interfaces. The coordinator could 
 announce its address at the beginning or via status updates.
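
 A minimal sketch of what a "modern", machine-readable status endpoint could 
 look like, using only the JDK's built-in HTTP server. Everything here is 
 hypothetical (class name, JSON fields, port announcement), not existing 
 Giraph code:

 import com.sun.net.httpserver.HttpServer;
 import java.io.OutputStream;
 import java.net.InetSocketAddress;
 import java.nio.charset.StandardCharsets;

 // Hypothetical sketch: serve coordinator progress as JSON instead of
 // HTML meant for screen-scraping. None of these names exist in Giraph.
 public class CoordinatorStatusServer {
     public static void main(String[] args) throws Exception {
         HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
         server.createContext("/status", exchange -> {
             // In a real coordinator these values would come from the
             // superstep bookkeeping; they are hard-coded here.
             String json = "{\"superstep\": 3, \"workersReporting\": 10, "
                         + "\"vertexRangesAssigned\": true}";
             byte[] body = json.getBytes(StandardCharsets.UTF_8);
             exchange.getResponseHeaders().set("Content-Type", "application/json");
             exchange.sendResponseHeaders(200, body.length);
             try (OutputStream os = exchange.getResponseBody()) {
                 os.write(body);
             }
         });
         server.start();
         // The coordinator could announce this address via its task status.
         System.out.println("Status endpoint on port " + server.getAddress().getPort());
     }
 }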

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-96) Support for Graphs with Huge adjacency lists

2011-11-17 Thread Arun Suresh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152087#comment-13152087
 ] 

Arun Suresh commented on GIRAPH-96:
---

Looks like Claudio beat me to it with a similar suggestion in 
[GIRAPH-94|https://issues.apache.org/jira/browse/GIRAPH-94].

My proposal was more for a standard means of storing vertex/adjacency-list 
information. The Giraph framework would handle the storage and would expose 
APIs which the VertexReader can use to store the information as it reads the 
graph. The user would then not be required to subclass a Vertex class and 
implement the initialize() method. All adjacency-list/vertex manipulation would 
go through the common data store.
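
To make the idea concrete, here is a hypothetical sketch of the kind of API 
the framework could expose to the reader; none of these names exist in Giraph:

import java.io.IOException;

// Hypothetical API sketch -- nothing here exists in Giraph today.
// The framework owns edge storage; the reader just appends to it as it
// parses the input, and vertices stream their neighbors back lazily.
public interface GraphStore<I, V, E> {
    void putVertex(I vertexId, V vertexValue) throws IOException;
    void addEdge(I sourceId, I targetId, E edgeValue) throws IOException;
    // Lazily streamed, so a huge adjacency list never needs to fit in memory.
    Iterable<I> neighbors(I vertexId) throws IOException;
    E edgeValue(I sourceId, I targetId) throws IOException;
}

The reader would call addEdge() once per parsed edge instead of accumulating 
a HashMap to hand to Vertex.initialize().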

 Support for Graphs with Huge adjacency lists
 

 Key: GIRAPH-96
 URL: https://issues.apache.org/jira/browse/GIRAPH-96
 Project: Giraph
  Issue Type: Improvement
Reporter: Arun Suresh

 Currently the vertex initialize() method is passed the complete adjacency 
 list as a HashMap. All the current concrete implementations of Vertex iterate 
 over the adjacency list and recreate new data structures within the Vertex 
 instance to hold/manipulate the adjacency list. This would cease to be 
 feasible once the size of the adjacency list becomes really huge.
 I propose storing the adjacency list and all vertex information (and incoming 
 messages?) in a distributed data store such as HBase. The adjacency list can 
 be lazily loaded via HBase scans. I was thinking of an HBase schema where the 
 row ID is a concatenation of VertexID+OutboundVertexId, with a single column 
 containing the edge.
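
 A sketch of what reading one vertex's adjacency list under that schema could 
 look like, using the HBase client API of that era. The table name, column 
 family, and key separator are all made up:

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.client.HTable;
 import org.apache.hadoop.hbase.client.Result;
 import org.apache.hadoop.hbase.client.ResultScanner;
 import org.apache.hadoop.hbase.client.Scan;
 import org.apache.hadoop.hbase.util.Bytes;

 // Sketch of the proposed schema (table/column names hypothetical):
 // row key = vertexId + separator + outboundVertexId, one column per edge.
 public class AdjacencyScan {
     public static void main(String[] args) throws Exception {
         Configuration conf = HBaseConfiguration.create();
         HTable table = new HTable(conf, "adjacency");
         // All edges of a vertex share the vertexId prefix, so a bounded
         // scan streams the adjacency list lazily instead of loading it whole.
         byte[] start = Bytes.toBytes("v42|");
         byte[] stop = Bytes.toBytes("v42|~");   // crude prefix upper bound
         Scan scan = new Scan(start, stop);
         ResultScanner scanner = table.getScanner(scan);
         try {
             for (Result r : scanner) {
                 String rowKey = Bytes.toString(r.getRow());
                 byte[] edge = r.getValue(Bytes.toBytes("e"), Bytes.toBytes("edge"));
                 System.out.println(rowKey + " -> " + Bytes.toString(edge));
             }
         } finally {
             scanner.close();
             table.close();
         }
     }
 }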





[jira] [Commented] (GIRAPH-91) Large-memory improvements (Memory reduced vertex implementation, fast failure, added settings)

2011-11-16 Thread Arun Suresh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151823#comment-13151823
 ] 

Arun Suresh commented on GIRAPH-91:
---

Avery, I see that you have used two sorted ArrayLists. Couldn't a 
LinkedHashMap have been an alternative? I understand that getEdgeValue and 
hasEdgeValue would be faster with a sorted ArrayList, and ArrayLists are also 
more compact. But I was just wondering: in the event that the graph is truly 
large (millions of edges for a single vertex), would it make sense to have the 
entire edge list in memory in the first place? We might need a scheme where 
only a part of the list is in memory and chunks of the list are fetched on 
demand when the provided iterator calls next(). In that case we could have a 
hybrid array + linked list (a linked list of chunks of the edge list), as in 
the sketch below.
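
A minimal sketch of that chunked-iterator idea, assuming some fetch source 
(disk, HBase, ...) behind an invented ChunkFetcher interface; nothing here is 
existing Giraph code:

import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of the "linked list of edge-list chunks" idea:
// only one fixed-size chunk is resident; the next chunk is fetched on
// demand when the iterator exhausts the current one.
public class ChunkedEdgeIterator implements Iterator<Long> {
    public interface ChunkFetcher {
        /** Returns the next chunk of target vertex ids, or null when done. */
        long[] nextChunk();
    }

    private final ChunkFetcher fetcher;
    private long[] chunk;
    private int pos;

    public ChunkedEdgeIterator(ChunkFetcher fetcher) {
        this.fetcher = fetcher;
        this.chunk = fetcher.nextChunk();
    }

    @Override
    public boolean hasNext() {
        while (chunk != null && pos == chunk.length) {
            chunk = fetcher.nextChunk();   // lazily pull the next chunk
            pos = 0;
        }
        return chunk != null;
    }

    @Override
    public Long next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return chunk[pos++];
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}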

 Large-memory improvements (Memory reduced vertex implementation, fast 
 failure, added settings) 
 ---

 Key: GIRAPH-91
 URL: https://issues.apache.org/jira/browse/GIRAPH-91
 Project: Giraph
  Issue Type: Improvement
Reporter: Avery Ching
Assignee: Avery Ching
 Attachments: GIRAPH-91.diff


 The current vertex implementation uses a HashMap for storing the edges, which 
 is quite memory-heavy for large graphs. The default settings in Giraph need 
 to be improved for large graphs and heaps of 20G.





[jira] [Commented] (GIRAPH-76) Refactor worker logic from GraphMapper

2011-11-16 Thread Arun Suresh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-76?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151825#comment-13151825
 ] 

Arun Suresh commented on GIRAPH-76:
---

Yes, this does sound like a good idea. I could take a crack at it if you 
haven't already started.

 Refactor worker logic from GraphMapper
 --

 Key: GIRAPH-76
 URL: https://issues.apache.org/jira/browse/GIRAPH-76
 Project: Giraph
  Issue Type: Improvement
  Components: graph
Reporter: Jakob Homan

 The plumbing around executing vertices is hosted within the mapper, but could 
 be extracted to its own class and executed from the Mapper directly. This 
 would ease testing and make it easier to host in the new YARN infrastructure. 
 There's nothing mapper-specific about this code.
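
 One hypothetical shape the refactoring could take: the Mapper becomes a thin 
 shell, and a plain class that knows nothing about MapReduce owns the worker 
 lifecycle, so it can be unit-tested or hosted under YARN directly. GraphWorker 
 is an invented name and the method bodies are placeholders:

 import java.io.IOException;

 import org.apache.hadoop.io.NullWritable;
 import org.apache.hadoop.mapreduce.Mapper;

 public class GraphMapper
         extends Mapper<Object, Object, NullWritable, NullWritable> {

     /** Framework-agnostic worker plumbing extracted from the mapper. */
     public static class GraphWorker {
         public void setup(org.apache.hadoop.conf.Configuration conf) {
             // register with ZooKeeper, load vertex ranges, ...
         }
         public void execute() {
             // run supersteps until the computation halts
         }
         public void cleanup() {
             // save vertices, deregister, ...
         }
     }

     @Override
     public void run(Context context) throws IOException, InterruptedException {
         // The mapper just delegates; nothing below depends on MapReduce.
         GraphWorker worker = new GraphWorker();
         worker.setup(context.getConfiguration());
         worker.execute();
         worker.cleanup();
     }
 }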





[jira] [Commented] (GIRAPH-93) Hive input / output format

2011-11-16 Thread Arun Suresh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151851#comment-13151851
 ] 

Arun Suresh commented on GIRAPH-93:
---

Avery, this might not be an optimal solution, but just putting it out there. I 
understand Hive exposes a JDBC interface. One can use the JDBC interface and 
DBInputFormat 
(http://www.cloudera.com/blog/2009/03/database-access-with-hadoop/) to load 
data from a Hive table for a MapReduce job.
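
A sketch of the wiring, assuming the Hive 0.x-era JDBC driver class; the table 
and column names are made up, and a real caveat is whether Hive's driver 
accepts the ORDER BY / paging queries that DBInputFormat generates:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class HiveJdbcInput {

    // Maps one row of the (hypothetical) "edges" table.
    public static class EdgeRecord implements Writable, DBWritable {
        long sourceId;
        long targetId;
        double weight;

        public void readFields(ResultSet rs) throws SQLException {
            sourceId = rs.getLong("source_id");
            targetId = rs.getLong("target_id");
            weight = rs.getDouble("weight");
        }
        public void write(PreparedStatement ps) throws SQLException {
            ps.setLong(1, sourceId);
            ps.setLong(2, targetId);
            ps.setDouble(3, weight);
        }
        public void readFields(DataInput in) throws IOException {
            sourceId = in.readLong();
            targetId = in.readLong();
            weight = in.readDouble();
        }
        public void write(DataOutput out) throws IOException {
            out.writeLong(sourceId);
            out.writeLong(targetId);
            out.writeDouble(weight);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Driver class and URL are Hive-version-dependent.
        DBConfiguration.configureDB(conf,
                "org.apache.hadoop.hive.jdbc.HiveDriver",
                "jdbc:hive://localhost:10000/default");
        Job job = new Job(conf, "hive-jdbc-input");
        job.setInputFormatClass(DBInputFormat.class);
        DBInputFormat.setInput(job, EdgeRecord.class,
                "edges",                          // table name
                null,                             // WHERE conditions
                "source_id",                      // ORDER BY column
                "source_id", "target_id", "weight");
        // ... set mapper/output classes, then job.waitForCompletion(true)
    }
}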

 Hive input / output format
 --

 Key: GIRAPH-93
 URL: https://issues.apache.org/jira/browse/GIRAPH-93
 Project: Giraph
  Issue Type: New Feature
Reporter: Avery Ching
Assignee: Avery Ching

 It would be great to be able to load/store data from/to Hive tables.  





[jira] [Commented] (GIRAPH-61) Worker's early failure will cause the whole system fail

2011-11-15 Thread Arun Suresh (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150646#comment-13150646
 ] 

Arun Suresh commented on GIRAPH-61:
---

Since a Giraph job is a SINGLE Hadoop job, and since the MasterThread has no 
option but to fail the current job if all workers don't respond (it cannot 
spawn new workers or restart existing workers), there might not be a way 
around this in the current scheme.

But consider what would happen if the Giraph job were composed of two Hadoop 
jobs. The first job would start just the master map task, which takes care of 
job initialization, starting ZooKeeper, etc., and finally kicks off a second 
Hadoop job. The second Hadoop job would be similar to the current Giraph job 
but would spawn only worker tasks. The master task from the first job stays 
alive until the algorithm completes (see the sketch below).
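
Roughly, the master task could look like this. MasterMapper and 
barrierTimedOut() are invented names, and the bodies are placeholders rather 
than working Giraph code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical sketch of the two-job scheme: the single master map task
// initializes the computation, then submits and babysits a worker-only job.
public class MasterMapper extends Mapper<Object, Object, Object, Object> {

    private boolean barrierTimedOut() {
        return false;  // placeholder: check the ZooKeeper superstep barrier
    }

    @Override
    public void run(Context context) throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        // ... start ZooKeeper, write initial assignments, etc. ...
        try {
            Job workerJob = new Job(conf, "giraph-workers");
            // workerJob.setMapperClass(WorkerMapper.class); and so on
            workerJob.submit();
            while (!workerJob.isComplete()) {
                context.progress();              // keep the master task alive
                if (barrierTimedOut()) {
                    workerJob.killJob();         // restart JUST the worker
                    workerJob = new Job(conf, "giraph-workers");
                    workerJob.submit();          // job, not the whole thing
                }
                Thread.sleep(5000);
            }
        } catch (ClassNotFoundException e) {
            throw new IOException(e);
        }
    }
}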

Pig apparently does something similar: it starts a single map task first, 
which in turn spawns multiple subtasks.

Pros:
* The whole Giraph job need not fail if a worker fails at startup as above. 
Under the new scheme, the master will detect at the start of the superstep 
that not all workers have responded and can opt to completely restart JUST 
the second job.
* The GraphMapper class can be split into a MasterMapper and a WorkerMapper, 
which might make things a bit cleaner. (There are a couple of places where an 
if (mapFunctions == MapFunctions.MASTER_ONLY) check is required to 
differentiate a master task from a worker task; this can cleanly be refactored 
into two classes, each with a specific responsibility.)
* At some point we will probably move to YARN, where Giraph would be userland 
code decoupled from MapReduce itself. In that case, the master would map to 
the ApplicationMaster, and this refactoring would allow Giraph to be 
retrofitted to YARN more easily.

 Worker's early failure will cause the whole system fail
 ---

 Key: GIRAPH-61
 URL: https://issues.apache.org/jira/browse/GIRAPH-61
 Project: Giraph
  Issue Type: Bug
  Components: bsp
Affects Versions: 0.70.0
Reporter: Zhiwei Gu
Priority: Critical

 When an early failure happens to a worker, the whole system fails.
 Observed failed worker:
State: Creating RPC threads failed
Result: It causes the worker to fail; however, the master has already 
 recorded and reserved these splits for this worker (identified by 
 InetAddress). Thus, although Hadoop reschedules a mapper for this worker, the 
 master is still waiting for the old worker's response, and finally the 
 master fails.
 [Failed worker logs:]
 2011-10-24 18:19:51,051 INFO org.apache.giraph.graph.BspService: process: 
 vertexRangeAssignmentsReadyChanged (vertex ranges are assigned)
 2011-10-24 18:19:51,060 INFO org.apache.giraph.graph.BspServiceWorker: 
 startSuperstep: Ready for computation on superstep 1 since worker selection 
 and vertex range assignments are done in 
 /_hadoopBsp/job_201108260911_842943/_applicationAttemptsDir/0/_superstepDir/1/_vertexRangeAssignments
 2011-10-24 18:19:51,078 INFO org.apache.giraph.graph.BspServiceWorker: 
 getAggregatorValues: no aggregators in 
 /_hadoopBsp/job_201108260911_842943/_applicationAttemptsDir/0/_superstepDir/0/_mergedAggregatorDir
  on superstep 1
 2011-10-24 18:19:53,974 INFO org.apache.giraph.graph.GraphMapper: map: 
 totalMem=84213760 maxMem=2067988480 freeMem=65069808
 2011-10-24 18:19:53,974 INFO org.apache.giraph.comm.BasicRPCCommunications: 
 flush: starting...
 2011-10-24 18:19:54,022 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
 Initializing logs' truncater with mapRetainSize=102400 and 
 reduceRetainSize=102400
 2011-10-24 18:19:54,023 FATAL org.apache.hadoop.mapred.Child: Error running 
 child : java.lang.OutOfMemoryError: unable to create new native thread
   at java.lang.Thread.start0(Native Method)
   at java.lang.Thread.start(Thread.java:597)
   at java.lang.UNIXProcess$1.run(UNIXProcess.java:141)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.lang.UNIXProcess.init(UNIXProcess.java:103)
   at java.lang.ProcessImpl.start(ProcessImpl.java:65)
   at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
   at org.apache.hadoop.util.Shell.run(Shell.java:182)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
   at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
   at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:540)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.access$100(RawLocalFileSystem.java:37)
   at