[
https://issues.apache.org/jira/browse/HADOOP-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on HADOOP-2878 started by Edward Yoon.
> Hama code contribution
> ----------------------
>
> Key: HADOOP-2878
> URL: https://issues.apache.org/jira/browse/HADOOP-2878
> Project: Hadoop Core
> Issue Type: New Feature
> Environment: All environment
> Reporter: Edward Yoon
> Assignee: Edward Yoon
> Priority: Minor
> Attachments: hama.tar.gz
>
>
> *Introduction*
> Hama will develop a high-performance, large-scale parallel matrix
> computation package based on Hadoop Map/Reduce. It will be useful for
> massively large-scale numerical analysis and data mining tasks that need the
> intensive computational power of matrix inversion, e.g. linear regression,
> PCA, and SVM. It will also be useful for many scientific applications, e.g.
> physics computations, linear algebra, computational fluid dynamics,
> statistics, graphics rendering, and many more.
> The Hama approach proposes using the 3-dimensional Row, Column (Qualifier),
> and Time space and the multi-dimensional column families of HBase (a BigTable
> clone), which can store large sparse matrices of various types (e.g.
> triangular matrix, 3D matrix, etc.). Its auto-partitioned sparsity
> sub-structure will be efficiently managed and served by HBase. Row and
> column operations can be done in linear time, where several algorithms, such
> as structured Gaussian elimination or iterative methods, run in O(the number
> of non-zero elements in the matrix / number of mappers) time on Hadoop
> Map/Reduce.
> So, it has a strong relationship with the Hadoop project, and it would be
> great if Hama could become a contrib project of Hadoop.
> *Current Status*
> In its current state, Hama is buggy and needs filling out, but a
> generalized matrix interface and basic linear algebra operations have been
> implemented within a large prototype system. In the future, we need new
> parallel algorithms based on Map/Reduce to improve the performance of heavy
> decompositions and factorizations. It also needs tools to compose an
> arbitrary matrix from only certain data filtered from the HBase array
> structure. It would be great if we could collaborate with the Hadoop members.
> *Members*
> We hold master's (or Ph.D.) degrees in mathematics and computer science.
> - Edward Yoon (edward AT udanax DOT org)
> - Chanwit Kaewkasi (chanwit AT gmail DOT com)
> - Min Cha (minslovey AT gmail DOT com)
> - Antonio Suh (bluesvm AT gmail DOT com)
> At least Min Cha and I will be involved full-time in this work.
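
The storage and cost model in the Introduction quoted above can be illustrated with a small sketch. It is not taken from hama.tar.gz: every function name here is hypothetical, a plain Python dict stands in for an HBase table (row key -> {column qualifier -> value}), and the "mappers" are simulated by an ordinary loop. It assumes a sparse matrix-vector product as the example operation and shows that each simulated mapper touches only the non-zero cells of its own rows, which is where a cost of roughly (non-zero elements / number of mappers) would come from when the rows are evenly loaded.

```python
# Hypothetical in-memory stand-in for an HBase table:
# row key -> {column qualifier -> value}; only non-zero cells are stored.
def to_sparse_rows(dense):
    rows = {}
    for i, row in enumerate(dense):
        cells = {j: v for j, v in enumerate(row) if v != 0}
        if cells:
            rows[i] = cells
    return rows


def partition_rows(rows, num_mappers):
    """Assign stored rows to mappers round-robin by sorted row key.

    Assumption: rows have similar numbers of non-zero cells, so this
    gives each mapper about nnz / num_mappers cells to process.
    """
    parts = [dict() for _ in range(num_mappers)]
    for n, key in enumerate(sorted(rows)):
        parts[n % num_mappers][key] = rows[key]
    return parts


def map_multiply(part, vector):
    """Map step: emit (row, dot product), visiting only non-zero cells."""
    for i, cells in part.items():
        yield i, sum(v * vector[j] for j, v in cells.items())


def reduce_collect(emitted, num_rows):
    """Reduce step: gather per-row results into a dense output vector."""
    out = [0] * num_rows
    for i, value in emitted:
        out[i] += value
    return out


def sparse_matvec(dense, vector, num_mappers=2):
    rows = to_sparse_rows(dense)
    emitted = []
    for part in partition_rows(rows, num_mappers):
        emitted.extend(map_multiply(part, vector))
    return reduce_collect(emitted, len(dense))
```

For example, sparse_matvec([[2, 0, 0], [0, 0, 3], [0, 0, 0], [1, 0, 4]], [1, 2, 3]) stores only four cells, skips the all-zero row entirely, and returns [2, 9, 0, 13].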
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.