[jira] Work started: (HADOOP-2878) Hama code contribution

Edward Yoon (JIRA) Sun, 16 Mar 2008 18:35:08 -0700

     [ 
https://issues.apache.org/jira/browse/HADOOP-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Work on HADOOP-2878 started by Edward Yoon.

> Hama code contribution
> ----------------------
>
>                 Key: HADOOP-2878
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2878
>             Project: Hadoop Core
>          Issue Type: New Feature
>         Environment: All environment
>            Reporter: Edward Yoon
>            Assignee: Edward Yoon
>            Priority: Minor
>         Attachments: hama.tar.gz
>
>
> *Introduction*
> Hama aims to develop a high-performance, large-scale parallel matrix 
> computation package based on Hadoop Map/Reduce. It will be useful for 
> large-scale numerical analysis and data mining tasks that need the 
> intensive computation power of matrix inversion, e.g. linear regression, PCA, 
> and SVM. It will also be useful for many scientific applications, e.g. 
> physics computations, linear algebra, computational fluid dynamics, 
> statistics, graphics rendering, and many more.
> The Hama approach proposes using the 3-dimensional Row, Column (Qualifier), 
> and Time space and the multi-dimensional column families of HBase (a 
> BigTable clone), which can store large sparse matrices of various types 
> (e.g. triangular matrices, 3D matrices, etc.). Their auto-partitioned sparse 
> sub-structure will be efficiently managed and served by HBase. Row and 
> column operations can be done in linear time, and several algorithms, such 
> as structured Gaussian elimination or iterative methods, run in O(the number 
> of non-zero elements in the matrix / number of mappers) time on Hadoop 
> Map/Reduce.
> Hama therefore has a strong relationship with the Hadoop project, and it 
> would be great if Hama could become a contrib project of Hadoop.
> *Current Status*
> In its current state, Hama is buggy and needs filling out, but a 
> generalized matrix interface and basic linear algebra operations have been 
> implemented within a large prototype system. In the future, we need new 
> parallel algorithms based on Map/Reduce to improve the performance of heavy 
> decompositions and factorizations. It also needs tools to compose an 
> arbitrary matrix from only certain data filtered from the HBase array 
> structure.
> It would be great if we could collaborate with the Hadoop members.
> *Members*
> We hold master's (or Ph.D.) degrees in mathematics and computer science.
> - Edward Yoon (edward AT udanax DOT org)
> - Chanwit Kaewkasi (chanwit AT gmail DOT com)
> - Min Cha (minslovey AT gmail DOT com)
> - Antonio Suh (bluesvm AT gmail DOT com) 
> At least Min Cha and I will be involved full-time in this work.
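
The description above says that row and column operations touch only the stored non-zero cells, with the work divided across mappers. The following is a minimal plain-Python sketch of that idea, not Hama's actual code: the row-keyed dictionary stands in for an HBase table, and the names `partition_rows`, `map_multiply`, and `sparse_matvec` as well as the round-robin partitioning are assumptions made purely for illustration.

```python
def partition_rows(sparse_rows, num_mappers):
    """Split rows across mappers (hypothetical round-robin scheme)."""
    parts = [dict() for _ in range(num_mappers)]
    for i, row_key in enumerate(sorted(sparse_rows)):
        parts[i % num_mappers][row_key] = sparse_rows[row_key]
    return parts


def map_multiply(partition, vector):
    """Map step: visit only the stored (non-zero) cells of each row."""
    for row_key, cells in partition.items():
        yield row_key, sum(value * vector[col] for col, value in cells.items())


def sparse_matvec(sparse_rows, vector, num_mappers=2):
    """Run every partition in turn and merge the per-row results."""
    result = {}
    for partition in partition_rows(sparse_rows, num_mappers):
        result.update(map_multiply(partition, vector))
    return result


# Rows are keyed by row index; each row maps column index -> non-zero value.
matrix = {0: {0: 2.0, 2: 1.0}, 1: {1: 3.0}, 2: {0: 4.0, 2: 5.0}}
vector = [1.0, 2.0, 3.0]
product = sparse_matvec(matrix, vector)
```

Each partition does work proportional to the non-zero cells it holds, so with evenly spread rows the per-mapper cost is roughly the number of non-zero elements divided by the number of mappers. Here the partitions run sequentially; in a real Map/Reduce job they would run in parallel.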

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
