As you might already know, the proposal has been accepted by the Apache Incubator PMC. :) I'll migrate o.a.h.ml.ann and o.a.h.ml.perceptron to the Apache Horn (Incubating) repository.
Thanks.

On Thu, Aug 6, 2015 at 11:20 AM, Minho Kim <[email protected]> wrote:

> +1
> I would like to participate too. :-)
>
> 2015-08-06 6:12 GMT+09:00 Behroz Sikander <[email protected]>:
>
>> +1
>> I would also like to participate :)
>>
>> On Wed, Aug 5, 2015 at 5:52 AM, Edward J. Yoon <[email protected]>
>> wrote:
>>
>> > Guys,
>> >
>> > I plan to submit a 'DNN platform on top of Apache Hama' proposal as
>> > below. I know the Hama community is somewhat small, but the main
>> > reason is that this domain-specific project is not a good fit for
>> > the Apache Hama community. Recruiting volunteers is also a hard
>> > problem. I expect this will become a very nice use case of Apache
>> > Hama.
>> >
>> > If you have any suggestions or other opinions, please let me know.
>> > Also, if you want to participate in this project, please feel free
>> > to add your name here.
>> >
>> > Thanks!
>> >
>> > --
>> > == Abstract ==
>> >
>> > (tentatively named "Horn [hɔ:n]"; the Korean meaning of Horn is
>> > "Spirit") is a neuron-centric programming API and execution
>> > framework for large-scale deep learning, built on top of Apache
>> > Hama.
>> >
>> > == Proposal ==
>> >
>> > The goal of Horn is to provide a neuron-centric programming API
>> > that allows users to easily define the characteristics of an
>> > artificial neural network model and its structure, and an execution
>> > framework that leverages the heterogeneous resources of Hama and
>> > Hadoop YARN clusters.
>> >
>> > == Background ==
>> >
>> > The initial ANN code was developed in the Apache Hama project by a
>> > committer, Yexi Jiang (Facebook), in 2013. The motivation behind
>> > this work is to build a framework that provides more intuitive
>> > programming APIs, like Google's MapReduce or Pregel, and supports
>> > applications that need large models with huge memory consumption
>> > in a distributed way.
>> >
>> > == Rationale ==
>> >
>> > While many open source deep learning packages are still
>> > data-parallel or model-parallel only, we aim to support both data
>> > and model parallelism, as well as a fault-tolerant system design.
>> > The basic idea of data and model parallelism is to use a remote
>> > parameter server to parallelize model creation and distribute
>> > training across machines, and the BSP framework of Apache Hama to
>> > perform asynchronous mini-batches. Within a single BSP job, each
>> > task group works asynchronously using region barrier
>> > synchronization instead of global barrier synchronization, and
>> > trains a large-scale neural network model using its assigned data
>> > sets in the BSP paradigm. This architecture is inspired by Google's
>> > DistBelief (Jeff Dean et al., 2012).
>> >
>> > == Initial Goals ==
>> >
>> > Some current goals include:
>> >
>> > * builds a new community
>> > * provides more intuitive programming APIs
>> > * needs both data and model parallelism support
>> > * must run natively on both Hama and Hadoop2
>> > * also needs GPU and InfiniBand support
>> >
>> > == Current Status ==
>> >
>> > === Meritocracy ===
>> >
>> > The core developers understand what it means to have a process
>> > based on meritocracy. We will make a continuous effort to build an
>> > environment that supports this, encouraging community members to
>> > contribute.
>> >
>> > === Community ===
>> >
>> > A small community has formed within the Apache Hama project and at
>> > some companies, such as an instant messenger service company and a
>> > mobile manufacturing company. Many people are also interested in
>> > the large-scale deep learning platform itself. By bringing Horn
>> > into Apache, we believe that the community will grow even bigger.
>> >
>> > === Core Developers ===
>> >
>> > Edward J. Yoon, Thomas Jungblut, and Dongjin Lee
>> >
>> > == Known Risks ==
>> >
>> > === Orphaned Products ===
>> >
>> > Apache Hama is already a core open source component at Samsung
>> > Electronics, and Horn will also be used by Samsung Electronics, so
>> > there is no direct risk of this project being orphaned.
>> >
>> > === Inexperience with Open Source ===
>> >
>> > Some of the developers are very new, and the others have experience
>> > using and/or working on Apache open source projects.
>> >
>> > === Homogeneous Developers ===
>> >
>> > The initial committers are from different organizations, such as
>> > Microsoft, Samsung Electronics, and Line Plus.
>> >
>> > === Reliance on Salaried Developers ===
>> >
>> > Other developers will also start working on the project in their
>> > spare time.
>> >
>> > === Relationships with Other Apache Products ===
>> >
>> > * Horn is based on Apache Hama
>> > * Apache Zookeeper is used for distributed locking service
>> > * Runs natively on Apache Hadoop and Mesos
>> > * Horn may overlap somewhat with the Singa podling.
>> >
>> > === An Excessive Fascination with the Apache Brand ===
>> >
>> > Horn itself will hopefully benefit from Apache, in terms of
>> > attracting a community and establishing a solid group of
>> > developers, and also from its relationship with Apache Hama, a
>> > general-purpose BSP computing engine. These are the main reasons
>> > for us to send this proposal.
>> >
>> > == Documentation ==
>> >
>> > The initial plan for Horn can be found at
>> > http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
>> >
>> > == Initial Source ==
>> >
>> > The initial source code has been released as part of the Apache
>> > Hama project, developed under the Apache Software Foundation. The
>> > source code is currently hosted at
>> >
>> > https://svn.apache.org/repos/asf/hama/trunk/ml/src/main/java/org/apache/hama/ml/ann/
>> >
>> > == Cryptography ==
>> >
>> > Not applicable.
>> >
>> > == Required Resources ==
>> >
>> > Mailing Lists
>> >
>> > * horn-private
>> > * horn-dev
>> >
>> > Subversion Directory
>> >
>> > * Git is the preferred source control system: git://git.apache.org/horn
>> >
>> > Issue Tracking
>> >
>> > * a JIRA issue tracker, HORN
>> >
>> > == Initial Committers and Affiliations ==
>> >
>> > * Thomas Jungblut (tjungblut at apache dot org)
>> > * Edward J. Yoon (edwardyoon at apache dot org)
>> > * Dongjin Lee (dongjin.lee.kr at gmail dot com)
>> > * Minho Kim (minwise.kim at samsung dot com)
>> > * TODO
>> >
>> > == Affiliations ==
>> >
>> > * Thomas Jungblut (Microsoft)
>> > * Edward J. Yoon (Samsung Electronics)
>> > * Dongjin Lee (LINE Plus)
>> > * Minho Kim (Samsung Electronics)
>> > * TODO
>> >
>> > == Sponsors ==
>> >
>> > Champion
>> >
>> > * Edward J. Yoon <edwardyoon at apache dot org>
>> >
>> > Nominated Mentors
>> >
>> > * TODO
>> >
>> > Sponsoring Entity
>> >
>> > The Apache Incubator
>> >
>> > --
>> > Best Regards, Edward J. Yoon
>> >
>>

--
Best Regards, Edward J. Yoon
