[ 
https://issues.apache.org/jira/browse/HADOOP-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Yoon updated HADOOP-2878:
--------------------------------

    Description: 
*Introduction*

Hama will develop a high-performance, large-scale parallel matrix computation 
package based on Hadoop Map/Reduce. It will be useful for large-scale numerical 
analysis and data mining tasks that need the intensive computational power of 
matrix inversion, e.g. linear regression, PCA, and SVM. It will also be useful 
for many scientific applications, e.g. physics computations, linear algebra, 
computational fluid dynamics, statistics, graphics rendering, and many more.

The Hama approach proposes the use of the three dimensions of Hbase (a BigTable 
clone), namely row, column (qualifier), and timestamp, together with its 
multi-dimensional column families, which can store large sparse matrices of 
various types (e.g. triangular matrices, 3D matrices, etc.). The 
auto-partitioned sparsity sub-structure will be efficiently managed and served 
by Hbase. Row and column operations can be done in linear time, and several 
algorithms, such as structured Gaussian elimination or iterative methods, run 
in O(number of non-zero elements in the matrix / number of mappers) time on 
Hadoop Map/Reduce.
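The storage scheme above can be sketched in plain Java. This is an illustrative assumption, not Hama's actual code: the class name SparseRowSketch and the "column:<j>" qualifier format are hypothetical. The point it shows is that each matrix row becomes one record keyed by column qualifiers, only non-zero cells are stored, and so a row operation costs time proportional to the row's non-zero count rather than to the matrix dimension.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of one sparse matrix row as a BigTable-style
// record: column qualifiers "column:<j>" map to the non-zero values
// a(i,j); absent qualifiers are implicitly zero.
class SparseRowSketch {
    private final Map<String, Double> cells = new HashMap<>();

    public void set(int j, double value) {
        cells.put("column:" + j, value);   // sparse: only non-zeros stored
    }

    public double get(int j) {
        return cells.getOrDefault("column:" + j, 0.0);  // absent cell == 0
    }

    // A row operation (here, scaling) touches only the stored non-zeros,
    // which is what makes it linear in nnz rather than in the dimension.
    public void scale(double k) {
        cells.replaceAll((qualifier, v) -> v * k);
    }

    public int nonZeroCount() {
        return cells.size();
    }
}
```

Under this keying, a row of a million columns with two non-zero entries costs two cell writes and two cell updates, regardless of the nominal dimension.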

So Hama has a strong relationship with the Hadoop project, and it would be 
great if Hama could become a contrib project of Hadoop.

*Current Status*

In its current state, Hama is buggy and needs filling out, but a generalized 
matrix interface and basic linear algebra operations have been implemented 
within a large prototype system. In the future, we will need new parallel 
algorithms based on Map/Reduce to make heavy decompositions and factorizations 
perform well. We also need tools to compose an arbitrary matrix from selected 
data filtered out of the Hbase array structure.
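A minimal sketch of what such a generalized matrix interface might look like. The names here are illustrative assumptions, not Hama's actual API, and the in-memory array backing is a stand-in; the real package would back the interface with Hbase and run operations like addition and multiplication as Map/Reduce jobs.

```java
// Hypothetical generalized matrix interface: any backing store
// (in-memory, Hbase, ...) exposes the same element-level contract.
interface MatrixSketch {
    int rows();
    int columns();
    double get(int i, int j);
    void set(int i, int j, double v);
}

// Simple in-memory implementation, standing in for an Hbase-backed one.
class DenseMatrix implements MatrixSketch {
    private final double[][] a;

    DenseMatrix(int m, int n) {
        a = new double[m][n];
    }

    public int rows() { return a.length; }
    public int columns() { return a[0].length; }
    public double get(int i, int j) { return a[i][j]; }
    public void set(int i, int j, double v) { a[i][j] = v; }

    // One basic linear algebra operation of the kind the prototype
    // implements: element-wise matrix addition against the interface,
    // so it works for any MatrixSketch implementation.
    static DenseMatrix add(MatrixSketch x, MatrixSketch y) {
        DenseMatrix c = new DenseMatrix(x.rows(), x.columns());
        for (int i = 0; i < x.rows(); i++)
            for (int j = 0; j < x.columns(); j++)
                c.set(i, j, x.get(i, j) + y.get(i, j));
        return c;
    }
}
```

Coding operations against the interface rather than a concrete class is what would let the same algorithm run over both a local test matrix and a distributed Hbase-backed one.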

It would be great if we could collaborate with the Hadoop members.

*Members*

We hold master's (or Ph.D.) degrees in mathematics and computer science.

- Edward Yoon (edward AT udanax DOT org)
- Chanwit Kaewkasi (chanwit AT gmail DOT com)
- Min Cha (minslovey AT gmail DOT com)
- Antonio Suh (bluesvm AT gmail DOT com) 

At least Min Cha and I will be involved full-time in this work.

  was:
*Introduction*

Hama will develop a high-performance, large-scale parallel matrix computation 
package based on Hadoop Map/Reduce. It will be useful for large-scale numerical 
analysis and data mining tasks that need the intensive computational power of 
matrix inversion, e.g. linear regression, PCA, and SVM. It will also be useful 
for many scientific applications, e.g. physics computations, linear algebra, 
computational fluid dynamics, statistics, graphics rendering, and many more.

The Hama approach proposes the use of the three dimensions of Hbase (a BigTable 
clone), namely row, column (qualifier), and timestamp, together with its 
multi-dimensional column families, which can store large sparse matrices of 
various types (e.g. triangular matrices, 3D matrices, etc.). The 
auto-partitioned sparsity sub-structure will be efficiently managed and served 
by Hbase. Row and column operations can be done in linear time, and several 
algorithms, such as structured Gaussian elimination or iterative methods, run 
in O(number of non-zero elements in the matrix / number of mappers) time on 
Hadoop Map/Reduce.

So Hama has a strong relationship with the Hadoop project, and it would be 
great if Hama could become a contrib project of Hadoop.

*Current Status*

In its current state, Hama is buggy and needs filling out, but a generalized 
matrix interface and basic linear algebra operations have been implemented 
within a large prototype system. In the future, we will need new parallel 
algorithms based on Map/Reduce to make heavy decompositions and factorizations 
perform well. We also need tools to compose an arbitrary matrix from selected 
data filtered out of the Hbase array structure.

It would be great if we could collaborate with the Hadoop members.

*Members*

The initial set of committers includes folks from the Hadoop and Hbase 
communities, and we hold master's (or Ph.D.) degrees in mathematics and 
computer science.

- Edward Yoon (edward AT udanax DOT org)
- Chanwit Kaewkasi (chanwit AT gmail DOT com)
- Min Cha (minslovey AT gmail DOT com)

At least Min Cha and I will be involved full-time in this work.


Antonio Suh has joined this project. He is a colleague of mine.

> Hama code contribution
> ----------------------
>
>                 Key: HADOOP-2878
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2878
>             Project: Hadoop Core
>          Issue Type: New Feature
>         Environment: All environment
>            Reporter: Edward Yoon
>            Assignee: Edward Yoon
>            Priority: Minor
>         Attachments: hama.tar.gz
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
