[
https://issues.apache.org/jira/browse/HAMA-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011181#comment-13011181
]
Alois Cochard edited comment on HAMA-359 at 3/25/11 12:47 PM:
--------------------------------------------------------------
>>BUT this could be a large disadvantage too, in the case if a groom is not
>>running on a server where the data is actually stored.
>> In MapReduce this is called a non-local task, so you have to copy the data
>> to the local datanode
That is exactly what I had in mind when I mentioned the data locality problem.
>> Using a GraphDB and an interface like Blueprints is like using a MySQL
>> Database with JDBC inside a distributed environment. It is possible, but
>> IMHO it is not optimal.
+1, same conclusion here. I would even call it a horrible abstraction
inversion.
>>>>I looked at the graph-hbase, but it's just a graph-friendly API layer on
>>>>top of HBase. It'll make you feel complex.
OK, so it adds no value at all, and it would take away the flexibility of
storing the adjacency matrix the way you want.
To conclude, it seems more important to know *which* data structure to use
than *where* to store it.
Once you are sure which structure to use (e.g. an adjacency matrix), you can
choose the best system to store/access it (SequenceFile/HBase/...) and change
it if necessary without impacting the algorithm.
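To make the point concrete, here is a minimal sketch of that separation. It is plain sequential Python, not Hama BSP code, and every name in it (AdjacencyStore, MatrixStore, ListStore, shortest_distances) is hypothetical and chosen only for illustration. The algorithm is an ordinary Dijkstra that reads the graph through a single neighbors() call, so the storage behind it can be swapped freely.

```python
from abc import ABC, abstractmethod
import heapq


class AdjacencyStore(ABC):
    """Hypothetical storage abstraction: the algorithm only calls neighbors()."""

    @abstractmethod
    def neighbors(self, vertex):
        """Return an iterable of (neighbor, weight) pairs for `vertex`."""


class MatrixStore(AdjacencyStore):
    """Adjacency matrix backend; None marks a missing edge."""

    def __init__(self, vertices, matrix):
        self.vertices = vertices
        self.index = {v: i for i, v in enumerate(vertices)}
        self.matrix = matrix

    def neighbors(self, vertex):
        row = self.matrix[self.index[vertex]]
        return [(self.vertices[j], w) for j, w in enumerate(row) if w is not None]


class ListStore(AdjacencyStore):
    """Adjacency list backend; stands in for any other storage system."""

    def __init__(self, adjacency):
        self.adjacency = adjacency

    def neighbors(self, vertex):
        return self.adjacency.get(vertex, [])


def shortest_distances(store, source):
    """Sequential Dijkstra (non-negative weights) over any AdjacencyStore."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in store.neighbors(u):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# The same small graph in both representations: A->B (1), A->C (4), B->C (2).
matrix_store = MatrixStore(
    ["A", "B", "C"],
    [[None, 1, 4],
     [None, None, 2],
     [None, None, None]],
)
list_store = ListStore({"A": [("B", 1), ("C", 4)], "B": [("C", 2)]})
```

Calling shortest_distances(matrix_store, "A") and shortest_distances(list_store, "A") goes through the same algorithm code; only the store object differs. A SequenceFile or HBase backed store would, under this assumption, just be another subclass.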
Thanks!
was (Author: alois.cochard):
>>BUT this could be a large disadvantage too, in the case if a groom is not
>>running on a server where the data is actually stored.
>> In MapReduce this is called a non-local task, so you have to copy the data
>> to the local datanode
That is exactly what I had in mind when I mentioned the data locality problem.
>> Using a GraphDB and an interface like Blueprints is like using a MySQL
>> Database with JDBC inside a distributed environment. It is possible, but
>> IMHO it is not optimal.
+1, same conclusion here. I would even call it a horrible abstraction
inversion.
>>>>I looked at the graph-hbase, but it's just a graph-friendly API layer on
>>>>top of HBase. It'll make you feel complex.
OK, so it adds no value at all, and it would take away the flexibility of
storing the adjacency matrix the way you want.
To conclude, it seems more important to know *which* data structure to use
than *where* to store it.
Once you are sure which structure to use (e.g. an adjacency list), you can
choose the best system to store/access it (SequenceFile/HBase/...) and change
it if necessary without impacting the algorithm.
Thanks!
> Development of Shortest Path Finding Algorithm
> ----------------------------------------------
>
> Key: HAMA-359
> URL: https://issues.apache.org/jira/browse/HAMA-359
> Project: Hama
> Issue Type: New Feature
> Components: examples
> Affects Versions: 0.2.0
> Reporter: Edward J. Yoon
> Labels: gsoc, gsoc2011, mentor
> Fix For: 0.3.0
>
> Original Estimate: 2016h
> Remaining Estimate: 2016h
>
> The goal of this project is development of parallel algorithm for finding a
> Shortest Path using Hama BSP.