[ 
https://issues.apache.org/jira/browse/HADOOP-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578101#action_12578101
 ] 

Edward Yoon commented on HADOOP-2878:
-------------------------------------

After discuss with mahout-dev, I made a proposal to mahout.
http://www.nabble.com/-jira--Created%3A-%28MAHOUT-16%29-Hama-contrib-package-for-the-mahout-to15998717.html

I would appreciate any advice you could give me.

> Hama code contribution
> ----------------------
>
>                 Key: HADOOP-2878
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2878
>             Project: Hadoop Core
>          Issue Type: New Feature
>         Environment: All environment
>            Reporter: Edward Yoon
>            Assignee: Edward Yoon
>            Priority: Minor
>
> *Introduction*
> Hama will develop a high-performance and large-scale parallel matrix 
> computational package based on Hadoop Map/Reduce. It will be useful for a 
> massively large-scale Numerical Analysis and Data Mining, which need the 
> intensive computation power of matrix inversion, e.g. linear regression, PCA, 
> SVM and etc. It will be also useful for many scientific applications, e.g. 
> physics computations, linear algebra, computational fluid dynamics, 
> statistics, graphic rendering and many more.
> Hama approach proposes the use of 3-dimensional Row and Column (Qualifier), 
> Time space and multi-dimensional Columnfamilies of Hbase (BigTable Clone), 
> which is able to store large sparse and various type of matrices (e.g. 
> Triangular Matrix, 3D Matrix, and etc.). its auto-partitioned sparsity 
> sub-structure will be efficiently managed and serviced by Hbase. Row and 
> Column operations can be done in linear-time, where several algorithms, such 
> as structured Gaussian elimination or iterative methods, run in O(the number 
> of non-zero elements in the matrix / number of mappers) time on Hadoop 
> Map/Reduce.
> So, it has a strong relationship with the hadoop project, and it would be 
> great if the "hama" can become a contrib project of the hadoop
> *Current Status*
> In its current state, the 'hama' is buggy and needs filling out, but 
> generalized matrix interface and basic linear algebra operations was 
> implemented within a large prototype system. In the future, We need new 
> parallel algorithms based on Map/Reduce for performance of heavy 
> decompositions and factorizations. It also needs tools to compose an 
> arbitrary matrix only with certain data filtered from hbase array structure.
> It would be great if we can collaboration with the hadoop members.
> *Members*
> The initial set of committers includes folks from the Hadoop & Hbase 
> communities, and We have a master's (or Ph.D) degrees in the mathematics and 
> computer science.
> - Edward Yoon (edward AT udanax DOT org)
> - Chanwit Kaewkasi (chanwit AT gmail DOT com)
> - Min Cha (minslovey AT gmail DOT com)
> At least, I and Min Cha will be involved full-time with this work.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to