Re: [DISCUSSION] Spinoff ANN package

Edward J. Yoon Tue, 08 Sep 2015 17:59:03 -0700

As you might already know, the proposal has been accepted by the Apache
Incubator PMC. :) I'll migrate the o.a.h.ml.ann and
o.a.h.ml.perceptron packages to the Apache Horn (Incubating) repository.



Thanks.

On Thu, Aug 6, 2015 at 11:20 AM, Minho Kim <[email protected]> wrote:
> +1
> I would like to participate too. :-)
>
> 2015-08-06 6:12 GMT+09:00 Behroz Sikander <[email protected]>:
>
>> +1
>> I would also like to participate :)
>>
>> On Wed, Aug 5, 2015 at 5:52 AM, Edward J. Yoon <[email protected]>
>> wrote:
>>
>> > Guys,
>> >
>> > I plan to submit a 'DNN platform on top of Apache Hama' proposal as
>> > below. I know the Hama community is somewhat small, but the main
>> > reason is that this domain-specific project is not a good fit for the
>> > Apache Hama community. Recruiting volunteers is also a hard problem.
>> > I expect this will become a very nice use case of Apache Hama.
>> >
>> > If you have any suggestions or other opinions, please let me know.
>> > Also, if you want to participate in this project, please feel free to
>> > add your name here.
>> >
>> > Thanks!
>> >
>> > --
>> > == Abstract ==
>> >
>> > Horn (a tentative name, pronounced [hɔ:n]; the Korean meaning of
>> > Horn is "Spirit") is a neuron-centric programming API and execution
>> > framework for large-scale deep learning, built on top of Apache Hama.
>> >
>> > == Proposal ==
>> >
>> > The goal of Horn is to provide a neuron-centric programming API that
>> > lets users easily define the characteristics and structure of an
>> > artificial neural network model, together with an execution framework
>> > that leverages the heterogeneous resources of Hama and Hadoop YARN
>> > clusters.
>> >
>> > == Background ==
>> >
>> > The initial ANN code was developed within the Apache Hama project by
>> > a committer, Yexi Jiang (Facebook), in 2013. The motivation behind
>> > this work is to build a framework that provides more intuitive
>> > programming APIs, like Google's MapReduce or Pregel, and supports
>> > applications that need large models with huge memory consumption in a
>> > distributed way.
>> >
>> > == Rationale ==
>> >
>> > While many open source deep learning packages are still
>> > data-parallel or model-parallel only, we aim to support both data and
>> > model parallelism, along with a fault-tolerant system design. The
>> > basic idea of data and model parallelism is to use a remote parameter
>> > server to parallelize model creation and distribute training across
>> > machines, and to use the BSP framework of Apache Hama to perform
>> > asynchronous mini-batches. Within a single BSP job, each task group
>> > works asynchronously using region barrier synchronization instead of
>> > global barrier synchronization, and trains a large-scale neural
>> > network model on its assigned data sets in the BSP paradigm. This
>> > architecture is inspired by Google's DistBelief (Jeff Dean et al.,
>> > 2012).
>> >
>> > == Initial Goals ==
>> >
>> > Some current goals include:
>> >
>> >  * build a new community
>> >  * provide more intuitive programming APIs
>> >  * support both data and model parallelism
>> >  * run natively on both Hama and Hadoop2
>> >  * support GPUs and InfiniBand
>> >
>> > == Current Status ==
>> >
>> > === Meritocracy ===
>> >
>> > The core developers understand what it means to have a process based
>> > on meritocracy. We will make a continuous effort to build an
>> > environment that supports this, encouraging community members to
>> > contribute.
>> >
>> > === Community ===
>> >
>> > A small community has formed within the Apache Hama project and at
>> > some companies, such as an instant messenger service company and a
>> > mobile manufacturing company. Many people are also interested in the
>> > large-scale deep learning platform itself. By bringing Horn into
>> > Apache, we believe that the community will grow even bigger.
>> >
>> > === Core Developers ===
>> >
>> > Edward J. Yoon, Thomas Jungblut, and Dongjin Lee
>> >
>> > == Known Risks ==
>> >
>> > === Orphaned Products ===
>> >
>> > Apache Hama is already a core open source component at Samsung
>> > Electronics, and Horn will also be used by Samsung Electronics, so
>> > there is no direct risk of this project being orphaned.
>> >
>> > === Inexperience with Open Source ===
>> >
>> > Some of the developers are very new to open source, while the others
>> > have experience using and/or working on Apache open source projects.
>> >
>> > === Homogeneous Developers ===
>> >
>> > The initial committers are from different organizations, such as
>> > Microsoft, Samsung Electronics, and LINE Plus.
>> >
>> > === Reliance on Salaried Developers ===
>> >
>> > Other developers will also start working on the project in their spare
>> > time.
>> >
>> > === Relationships with Other Apache Products ===
>> >
>> >  * Horn is based on Apache Hama
>> >  * Apache ZooKeeper is used as a distributed locking service
>> >  * Horn runs natively on Apache Hadoop and Mesos
>> >  * Horn may overlap somewhat with the Singa podling
>> >
>> > === An Excessive Fascination with the Apache Brand ===
>> >
>> > Horn itself will hopefully benefit from Apache in terms of
>> > attracting a community and establishing a solid group of developers,
>> > and also from its relationship with Apache Hama, a general-purpose
>> > BSP computing engine. These are the main reasons we are sending this
>> > proposal.
>> >
>> > == Documentation ==
>> >
>> > The initial plan for Horn can be found at
>> > http://blog.udanax.org/2015/06/googles-distbelief-clone-project-on.html
>> >
>> > == Initial Source ==
>> >
>> > The initial source code has been released as part of the Apache Hama
>> > project, developed under the Apache Software Foundation. The source
>> > code is currently hosted at
>> >
>> > https://svn.apache.org/repos/asf/hama/trunk/ml/src/main/java/org/apache/hama/ml/ann/
>> >
>> > == Cryptography ==
>> >
>> > Not applicable.
>> >
>> > == Required Resources ==
>> >
>> > Mailing Lists
>> >
>> >  * horn-private
>> >  * horn-dev
>> >
>> > Subversion Directory
>> >
>> >  * Git is the preferred source control system: git://git.apache.org/horn
>> >
>> > Issue Tracking
>> >
>> >  * a JIRA issue tracker, HORN
>> >
>> > == Initial Committers and Affiliations ==
>> >
>> >  * Thomas Jungblut (tjungblut at apache dot org)
>> >  * Edward J. Yoon (edwardyoon at apache dot org)
>> >  * Dongjin Lee (dongjin.lee.kr at gmail dot com)
>> >  * Minho Kim (minwise.kim at samsung dot com)
>> >  * TODO
>> >
>> > == Affiliations ==
>> >
>> >  * Thomas Jungblut (Microsoft)
>> >  * Edward J. Yoon (Samsung Electronics)
>> >  * Dongjin Lee (LINE Plus)
>> >  * Minho Kim (Samsung Electronics)
>> >  * TODO
>> >
>> > == Sponsors ==
>> >
>> > Champion
>> >
>> >  * Edward J. Yoon <edwardyoon at apache dot org>
>> >
>> > Nominated Mentors
>> >
>> >  * TODO
>> >
>> > Sponsoring Entity
>> >
>> > The Apache Incubator
>> >
>> > --
>> > Best Regards, Edward J. Yoon
>> >
>>



-- 
Best Regards, Edward J. Yoon
