Re: MaaS + Apache Twill ?

2017-04-18 Thread James Sirota
I would stick to MaaS.  We already have service discovery and scaling built
into MaaS, so we really don't have a need for the features that Twill offers.
I think that, had we known that Twill existed a few years ago, we would have
considered it, but given where we are today I don't see a reason to switch.

Thanks,
James 

18.04.2017, 12:30, "Nick Allen":
>>   Can Twill handle long-running applications?
>
> I did see this mentioned in the slide deck for "Big Data North America
> 2016" around slide 33 or so. The deck is referenced from their home page.
> If we blindly trust the slide deck, then yes, it does support this. :)
>
> On Tue, Apr 18, 2017 at 3:12 PM, Casey Stella  wrote:
>
>>  Some things to look for that address some of the complexities within MaaS:
>>
>> - Can Twill handle long-running applications?
>> - Does Twill provide any sort of name service abstraction or somesuch
>> that lets us communicate with the app master after app deployment?
>> - Does Twill provide any mechanism for impersonation (i.e. we can run
>> the MaaS twill app as metron, but submit and run models as $user)?
>>
>>  On Tue, Apr 18, 2017 at 1:50 PM, Nick Allen  wrote:
>>
>>  > I ran across the Apache Twill [1] project recently whose goal is to reduce
>>  > the complexity of developing distributed applications that run on YARN. My
>>  > first thought is that it might offer additional capabilities and/or
>>  > simplify our current MaaS implementation.
>>  >
>>  > Here is a list of features provided by Twill that I think might be useful
>>  > for MaaS.
>>  >
>>  > - Service discovery
>>  > - Elastic scaling
>>  > - High Availability
>>  > - Placement policies - Which rack/host should the model run on?
>>  > - Security - Kerberos ticket refresh?
>>  >
>>  > Just wanted to float the thought in the community and see if anyone has
>>  > experience with Twill. I need to do some more research myself.
>>  >
>>  > [1] http://twill.apache.org/
>>  >

--- 
Thank you,

James Sirota
PPMC- Apache Metron (Incubating)
jsirota AT apache DOT org


Re: MaaS + Apache Twill ?

2017-04-18 Thread Casey Stella
One more thought: I am definitely not opposed to making MaaS independent of
the distributed resource manager.  I think what Otto is suggesting isn't a bad
thread to pull on.  Right now, MaaS is inherently tied to YARN and it'd be
nice to make that dependency pluggable.  This would allow us to use other
platforms such as Lambda or Kubernetes or whatever for model deployment,
which would be really neat.
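
To make "pluggable" a bit more concrete, here is a rough sketch in Python of
what such a seam could look like.  Nothing like this exists in MaaS today:
ModelDeployer, InMemoryDeployer, get_deployer and the shape of the endpoint
URLs are all hypothetical names made up for illustration, and the in-memory
backend only marks the spot where a YARN, Kubernetes or Lambda backend might
go.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class ModelDeployer(ABC):
    """Hypothetical contract that each resource-manager backend would implement."""

    @abstractmethod
    def deploy(self, name: str, version: str, instances: int) -> List[str]:
        """Start model instances and return their REST endpoint URLs."""

    @abstractmethod
    def remove(self, name: str, version: str) -> None:
        """Stop every instance of the given model version."""

    @abstractmethod
    def endpoints(self, name: str, version: str) -> List[str]:
        """Discovery: list the endpoints currently registered for a model."""


class InMemoryDeployer(ModelDeployer):
    """Illustrative stand-in; it only records endpoints, it starts nothing."""

    def __init__(self, host: str = "localhost", base_port: int = 9000) -> None:
        self._host = host
        self._next_port = base_port
        self._endpoints: Dict[Tuple[str, str], List[str]] = {}

    def deploy(self, name: str, version: str, instances: int) -> List[str]:
        urls = []
        for _ in range(instances):
            # The "/apply" path is an assumption made for this sketch.
            urls.append(f"http://{self._host}:{self._next_port}/apply")
            self._next_port += 1
        self._endpoints.setdefault((name, version), []).extend(urls)
        return urls

    def remove(self, name: str, version: str) -> None:
        self._endpoints.pop((name, version), None)

    def endpoints(self, name: str, version: str) -> List[str]:
        return list(self._endpoints.get((name, version), []))


# Backends are looked up by name, so adding one does not touch the callers.
BACKENDS = {"memory": InMemoryDeployer}


def get_deployer(kind: str) -> ModelDeployer:
    """Return a deployer for the configured backend name."""
    try:
        return BACKENDS[kind]()
    except KeyError:
        raise ValueError(f"unknown deployment backend: {kind}") from None
```

With something along these lines, the rest of MaaS would only ever call
get_deployer(...) and the three methods, and the choice of resource manager
would come down to a configuration value.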

On Tue, Apr 18, 2017 at 3:02 PM, Casey Stella  wrote:

> Regarding model performance, I've thought about that a bit.  What I'd like
> to see MaaS be able to do is provide an API through which the models can
> send events to Kafka, so that a model provides telemetry like any other
> source: performance statistics, raw results for downstream analysis.  We
> have a system capable of analyzing telemetry data, so it seems to me that
> MaaS should dogfood the system in which it runs.  I think we should build
> an API by which the model can communicate with Kafka.
>
> One of the things that I very much like about MaaS is that it's light on
> opinions about the language and library.  I think the API could be as
> simple as a log file that is monitored, with the MaaS runner providing the
> proxy to Kafka.  I do think that it should be easier to write models, but I
> think that should be solved through applications that turn model collateral
> into REST APIs if they conform to certain standards (e.g. PMML, Spark MLlib
> serialized models) and allow users the freedom to engage with MaaS via
> their own mechanism in the language of their choice if their situation
> doesn't conform to our expectations.
>
> On Tue, Apr 18, 2017 at 2:50 PM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
>
>> Completely agree. It would be great to get more info from your experience.
>>
>> To an extent what we have at the moment is very much about isolating the
>> implementation of a model from the deployment and discovery mechanism, and
>> to my mind we should very much keep that to enable any kind of model to
>> plug in. The other thing worth discussing here would be how we can wrap
>> around a model while maintaining the loose coupling, to provide things like
>> generalised performance metrics for the models. Any thoughts on that front,
>> anyone?
>>
>>
>> > On 18 Apr 2017, at 11:45, Otto Fowler wrote:
>> >
>> > I will have to go back to my notes.  There was a day or so when I went
>> > through the code and was thinking of a couple of things, but that was a
>> > while ago.
>> >
>> > Off the top of my head, I would want something factored enough, or
>> > loosely coupled enough that it was not dependent on twill or maas or
>> > anything else.  This would not impose implementation on the services.
>> > This would have to revolve around a discovery/registration api and a
>> > rest interface contract.
>> >
>> > Does that make sense?
>> >
>> > On April 18, 2017 at 14:34:28, Simon Elliston Ball
>> > (si...@simonellistonball.com) wrote:
>> >
>> >> Right, how about some sort of REST API? Or through Ambari? What would
>> >> you say was the best way to start the service, and of course to submit
>> >> model artefacts?
>> >>
>> >> Simon
>> >>
>> >>> On 18 Apr 2017, at 11:33, Otto Fowler wrote:
>> >>>
>> >>> In my mind, I didn’t want to deploy the service as a bash script or
>> >>> wrapped in one, if I recall correctly.
>> >>>
>> >>> On April 18, 2017 at 14:27:52, Simon Elliston Ball
>> >>> (si...@simonellistonball.com) wrote:
>> >>>
>> >>>> Any particular issues, or things that didn’t work, Otto?
>> >>>>
>> >>>> Simon
>> >>>>
>> >>>> > On 18 Apr 2017, at 11:26, Otto Fowler wrote:
>> >>>> >
>> >>>> > I’ll try to take a look. There are a couple of things I wanted to
>> >>>> > do with MaaS but could not figure out because of a couple of
>> >>>> > limitations. I’d like to see if Twill offers more flexibility.
>> >>>> >
>> >>>> > On April 18, 2017 at 13:50:39, Nick Allen (n...@nickallen.org)
>> >>>> > wrote:
>> >>>> >
>> >>>> > I ran across the Apache Twill [1] project recently whose goal is
>> >>>> > to reduce the complexity of developing distributed applications
>> >>>> > that run on YARN. My first thought is that it might offer
>> >>>> > additional capabilities and/or simplify our current MaaS
>> >>>> > implementation.
>> >>>> >
>> >>>> > Here is a list of features provided by Twill that I think might
>> >>>> > be useful for MaaS.
>> >>>> >
>> >>>> > - Service discovery
>> >>>> > - Elastic scaling
>> >>>> > - High Availability
>> >>>> > - Placement policies - Which rack/host should the model run on?
>> >>>> > - Security - Kerberos ticket refresh?
>> >>>> >
>> >>>> > Just wanted to float the thought in the community and see if
>> >>>> > anyone has experience with Twill. I need to do some more research
>> >>>> > myself.
>> >>>> >
>> >>>> > [1] http://twill.apache.org/

Re: MaaS + Apache Twill ?

2017-04-18 Thread Casey Stella
Regarding model performance, I've thought about that a bit.  What I'd like
to see MaaS be able to do is provide an API through which the models can
send events to Kafka, so that a model provides telemetry like any other
source: performance statistics, raw results for downstream analysis.  We
have a system capable of analyzing telemetry data, so it seems to me that
MaaS should dogfood the system in which it runs.  I think we should build
an API by which the model can communicate with Kafka.

One of the things that I very much like about MaaS is that it's light on
opinions about the language and library.  I think the API could be as
simple as a log file that is monitored, with the MaaS runner providing the
proxy to Kafka.  I do think that it should be easier to write models, but I
think that should be solved through applications that turn model collateral
into REST APIs if they conform to certain standards (e.g. PMML, Spark MLlib
serialized models) and allow users the freedom to engage with MaaS via
their own mechanism in the language of their choice if their situation
doesn't conform to our expectations.
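
As a rough illustration of the log-file idea, and not a description of
anything MaaS does today, the runner side could be sketched in Python along
these lines.  The function names (follow, to_event, proxy) and the event
shape with "model" and "payload" keys are assumptions made up for this
sketch.  The send argument is any callable that takes a string; in a real
setup it might wrap a Kafka producer, but no Kafka client is used here.

```python
import json
import time
from typing import Callable, Iterator, TextIO


def follow(log: TextIO, poll_seconds: float = 0.5,
           stop_when_idle: bool = False) -> Iterator[str]:
    """Yield complete lines as they are appended to an open log file."""
    pending = ""
    while True:
        chunk = log.readline()
        if chunk:
            # Buffer partial writes until the model finishes the line.
            pending += chunk
            if pending.endswith("\n"):
                yield pending.rstrip("\n")
                pending = ""
            continue
        if stop_when_idle:
            return
        time.sleep(poll_seconds)


def to_event(model_name: str, line: str) -> dict:
    """Wrap one log line as an event; non-JSON lines are kept as raw text."""
    try:
        payload = json.loads(line)
    except json.JSONDecodeError:
        payload = {"raw": line}
    return {"model": model_name, "payload": payload}


def proxy(log: TextIO, model_name: str, send: Callable[[str], None],
          **follow_args) -> int:
    """Forward every non-blank log line to `send`; return the number sent."""
    sent = 0
    for line in follow(log, **follow_args):
        if line.strip():
            send(json.dumps(to_event(model_name, line)))
            sent += 1
    return sent
```

The model itself only has to append lines to a file, in whatever language it
is written in, which keeps MaaS light on opinions.  Passing
stop_when_idle=True makes the loop return at end of file, which is handy for
trying it out against a finished log.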

On Tue, Apr 18, 2017 at 2:50 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Completely agree. It would be great to get more info from your experience.
>
> To an extent what we have at the moment is very much about isolating the
> implementation of a model from the deployment and discovery mechanism, and
> to my mind we should very much keep that to enable any kind of model to
> plug in. The other thing worth discussing here would be how we can wrap
> around a model while maintaining the loose coupling, to provide things like
> generalised performance metrics for the models. Any thoughts on that front,
> anyone?
>
>
> > On 18 Apr 2017, at 11:45, Otto Fowler wrote:
> >
> > I will have to go back to my notes.  There was a day or so when I went
> > through the code and was thinking of a couple of things, but that was a
> > while ago.
> >
> > Off the top of my head, I would want something factored enough, or
> > loosely coupled enough that it was not dependent on twill or maas or
> > anything else.  This would not impose implementation on the services.
> > This would have to revolve around a discovery/registration api and a
> > rest interface contract.
> >
> > Does that make sense?
> >
> > On April 18, 2017 at 14:34:28, Simon Elliston Ball
> > (si...@simonellistonball.com) wrote:
> >
> >> Right, how about some sort of REST API? Or through Ambari? What would
> >> you say was the best way to start the service, and of course to submit
> >> model artefacts?
> >>
> >> Simon
> >>
> >>> On 18 Apr 2017, at 11:33, Otto Fowler wrote:
> >>>
> >>> In my mind, I didn’t want to deploy the service as a bash script or
> >>> wrapped in one, if I recall correctly.
> >>>
> >>> On April 18, 2017 at 14:27:52, Simon Elliston Ball
> >>> (si...@simonellistonball.com) wrote:
> >>>
> >>>> Any particular issues, or things that didn’t work, Otto?
> >>>>
> >>>> Simon
> >>>>
> >>>> > On 18 Apr 2017, at 11:26, Otto Fowler wrote:
> >>>> >
> >>>> > I’ll try to take a look. There are a couple of things I wanted to
> >>>> > do with MaaS but could not figure out because of a couple of
> >>>> > limitations. I’d like to see if Twill offers more flexibility.
> >>>> >
> >>>> > On April 18, 2017 at 13:50:39, Nick Allen (n...@nickallen.org)
> >>>> > wrote:
> >>>> >
> >>>> > I ran across the Apache Twill [1] project recently whose goal is
> >>>> > to reduce the complexity of developing distributed applications
> >>>> > that run on YARN. My first thought is that it might offer
> >>>> > additional capabilities and/or simplify our current MaaS
> >>>> > implementation.
> >>>> >
> >>>> > Here is a list of features provided by Twill that I think might
> >>>> > be useful for MaaS.
> >>>> >
> >>>> > - Service discovery
> >>>> > - Elastic scaling
> >>>> > - High Availability
> >>>> > - Placement policies - Which rack/host should the model run on?
> >>>> > - Security - Kerberos ticket refresh?
> >>>> >
> >>>> > Just wanted to float the thought in the community and see if
> >>>> > anyone has experience with Twill. I need to do some more research
> >>>> > myself.
> >>>> >
> >>>> > [1] http://twill.apache.org/
>