Re: [DISCUSS] Making submarine to different release model like Ozone

Bharat Viswanadham Fri, 01 Feb 2019 11:05:18 -0800

Thank You Wangda for driving this discussion.
+1 for a separate release for submarine.
Having its own release cadence will help the project iterate and grow at a
faster pace, get new features into users' hands sooner, and collect their
feedback quickly.



Thanks,
Bharat




On 2/1/19, 10:54 AM, "Ajay Kumar" <[email protected]> wrote:

    +1, thanks for driving this. With the rise of use cases running ML alongside
    traditional applications, this will be of great help.
    
    Thanks,
    Ajay   
    
    On 2/1/19, 10:49 AM, "Suma Shivaprasad" <[email protected]> wrote:
    
        +1. Thanks for bringing this up Wangda.
        
        Makes sense to have Submarine follow its own release cadence given the
        good momentum/adoption so far. Also, making it run with older versions
        of Hadoop would drive higher adoption.
        
        Suma
        
        On Fri, Feb 1, 2019 at 9:40 AM Eric Yang <[email protected]> wrote:
        
        > Submarine is an application built for the YARN framework, but it does
        > not have a strong dependency on YARN development.  For this kind of
        > project, it would be best to enter the Apache Incubator to create a
        > new community.  Apache Commons is the only project other than the
        > Incubator that has independent release cycles.  The collection is
        > large, and the project goal is ambitious.  No one really knows which
        > components work with each other in Apache Commons.  Hadoop is a much
        > more focused project, a distributed computing framework and not an
        > incubation sandbox.  To stay aligned with Hadoop goals, we want to
        > prevent the Hadoop project from being overloaded while allowing good
        > ideas to be carried forward in the Apache Incubator.  Putting on my
        > Apache Member hat, my vote is -1 to allowing more independent
        > subproject release cycles in the Hadoop project that do not align
        > with Hadoop project goals.
        >
        > The Apache Incubator process is highly recommended for Submarine:
        > https://incubator.apache.org/policy/process.html This would allow
        > Submarine to develop for older versions of Hadoop, much as Spark works
        > with multiple versions of Hadoop.
        >
        > Regards,
        > Eric
        >
        > On 1/31/19, 10:51 PM, "Weiwei Yang" <[email protected]> wrote:
        >
        >     Thanks for proposing this Wangda, my +1 as well.
        >     It is amazing to see the progress made in Submarine last year; the
        >     community is growing fast and is quite collaborative. I can see the
        >     reasons to get it released faster on its own cycle. At the same
        >     time, the Ozone way works very well.
        >
        >     —
        >     Weiwei
        >     On Feb 1, 2019, 10:49 AM +0800, Xun Liu <[email protected]>, wrote:
        >     > +1
        >     >
        >     > Hello everyone,
        >     >
        >     > I am Xun Liu, the head of the machine learning team at Netease
        >     > Research Institute. I quite agree with Wangda.
        >     >
        >     > Our team is very grateful to have received the Submarine machine
        >     > learning engine from the community.
        >     > We are heavy users of Submarine.
        >     > Because Submarine fits the direction of our big data team's
        >     > Hadoop technology stack, it avoids the need to invest more
        >     > manpower in learning other container scheduling systems.
        >     > The important thing is that we can use a common YARN cluster to
        >     > run machine learning, which makes the utilization of server
        >     > resources more efficient and preserves much of the human and
        >     > material investment we have made in previous years.
        >     >
        >     > Our team has finished testing and deploying Submarine and will
        >     > provide the service to our e-commerce department
        >     > (http://www.kaola.com/) shortly.
        >     >
        >     > We also plan to provide the Submarine engine in our existing YARN
        >     > cluster in the next six months, because many of our product
        >     > departments need machine learning services, for example:
        >     > 1) Game department (http://game.163.com/) needs AI battle training,
        >     > 2) News department (http://www.163.com) needs news recommendation,
        >     > 3) Mailbox department (http://www.163.com) requires anti-spam and
        >     >    illegal-content detection,
        >     > 4) Music department (https://music.163.com/) requires music
        >     >    recommendation,
        >     > 5) Education department (http://www.youdao.com) requires voice
        >     >    recognition,
        >     > 6) Massive Open Online Courses (https://open.163.com/) requires
        >     >    multilingual translation, and so on.
        >     >
        >     > If Submarine can be released independently like Ozone, it will
        >     > help us quickly get the latest features and improvements, and it
        >     > will be a great help to our team and users.
        >     >
        >     > Thanks, Hadoop community!
        >     >
        >     >
        >     > > On Feb 1, 2019, at 2:53 AM, Wangda Tan <[email protected]> wrote:
        >     > >
        >     > > Hi devs,
        >     > >
        >     > > Since we started the Submarine-related effort last year, we have
        >     > > received a lot of feedback. Several companies (such as Netease,
        >     > > China Mobile, etc.) are trying to deploy Submarine to their Hadoop
        >     > > clusters along with big data workloads. LinkedIn also has a strong
        >     > > interest in contributing a Submarine TonY
        >     > > (https://github.com/linkedin/TonY) runtime to allow users to use
        >     > > the same interface.
        >     > >
        >     > > From what I can see, there are several issues with putting
        >     > > Submarine under the yarn-applications directory and having the same
        >     > > release cycle as Hadoop:
        >     > >
        >     > > 1) We started the 3.2.0 release in Sep 2018, but the release was
        >     > > done in Jan 2019. Because of unpredictable blockers and security
        >     > > issues, it got delayed a lot. We need to iterate on Submarine fast
        >     > > at this point.
        >     > >
        >     > > 2) We also see a lot of demand to use Submarine on older Hadoop
        >     > > releases such as 2.x. Many companies may not upgrade Hadoop to 3.x
        >     > > in a short time, but their requirement to run deep learning is
        >     > > urgent. We should decouple Submarine from the Hadoop version.
        >     > >
        >     > > And why do we want to keep it within Hadoop? First, Submarine
        >     > > includes some innovative parts, such as enhancements to the user
        >     > > experience for YARN services/containerization support, which we can
        >     > > add back to Hadoop later to address common requirements. In
        >     > > addition, there is a big overlap in the community developing and
        >     > > using it.
        >     > >
        >     > > There are several proposals we went through during the Ozone
        >     > > merge-to-trunk discussion:
        >     > >
        >     > > https://mail-archives.apache.org/mod_mbox/hadoop-common-dev/201803.mbox/%3ccahfhakh6_m3yldf5a2kq8+w-5fbvx5ahfgs-x1vajw8gmnz...@mail.gmail.com%3E
        >     > >
        >     > > I propose to adopt the Ozone model: the same master branch, a
        >     > > different release cycle, and a different release branch. It is a
        >     > > great example of the agile releases we can do (2 Ozone releases
        >     > > after Oct 2018) with less overhead to set up CI, projects, etc.
        >     > >
        >     > > *Links:*
        >     > > - JIRA: https://issues.apache.org/jira/browse/YARN-8135
        >     > > - Design doc
        >     > >   <https://docs.google.com/document/d/199J4pB3blqgV9SCNvBbTqkEoQdjoyGMjESV4MktCo0k/edit>
        >     > > - User doc
        >     > >   <https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine/Index.html>
        >     > >   (3.2.0 release)
        >     > > - Blog posts: {Submarine} : Running deep learning workloads on Apache Hadoop
        >     > >   <https://hortonworks.com/blog/submarine-running-deep-learning-workloads-apache-hadoop/>
        >     > >   (Chinese Translation: Link <https://www.jishuwen.com/d/2Vpu>)
        >     > > - Talks: Strata Data Conf NY
        >     > >   <https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68289>
        >     > >
        >     > > Thoughts?
        >     > >
        >     > > Thanks,
        >     > > Wangda Tan
        >     >
        >     >
        >     >
        >     > ---------------------------------------------------------------------
        >     > To unsubscribe, e-mail: [email protected]
        >     > For additional commands, e-mail: [email protected]
        >     >
        >
        >
        >
        
    
    
