Re: [VOTE] Move Apache Mesos to Attic

2021-04-07 Thread DhilipKumar Sankaranarayanan
+1

All the very best, everyone.
Really sad to see this great project go to the attic :-( but also excited
about its potential forks.

Regards,
Dhilip

On Tue, 6 Apr, 2021, 3:07 PM Benjamin Mahler,  wrote:

> +1 (binding)
>
> Thanks to all who contributed to the project.
>
> On Mon, Apr 5, 2021 at 1:58 PM Vinod Kone  wrote:
>
>> Hi folks,
>>
>> Based on the recent conversations
>> <
>> https://lists.apache.org/thread.html/raed89cc5ab78531c48f56aa1989e1e7eb05f89a6941e38e9bc8803ff%40%3Cuser.mesos.apache.org%3E
>> >
>> on our mailing list, it seems to me that the majority consensus among the
>> existing PMC is to move the project to the attic <
>> https://attic.apache.org/>
>> and let the interested community members collaborate on a fork in Github.
>>
>> I would like to call a vote to dissolve the PMC and move the project to
>> the
>> attic.
>>
>> Please reply to this thread with your vote. Only binding votes from
>> PMC/committers count towards the final tally but everyone in the community
>> is encouraged to vote. See process here
>> .
>>
>> Thanks,
>>
>


[Bangalore Mesos & cncf User Group Meetup] Microservices and Serverless - Joint event with other meetup groups

2017-04-16 Thread DhilipKumar Sankaranarayanan
Hello Mesos & DCOS users,

For the first time, several meetup groups (in Bangalore, India) have come
together to organise a single event on Microservices and Serverless.  We
have an excellent line-up of speakers from different organisations around
Bangalore.  The event covers a spectrum of technologies such as AWS,
Azure, Mesos, gRPC, OpenTracing, and OpenWhisk.  Walmart Labs, Bangalore
has generously agreed to host the event.

Please join us on 29th April 2017 @ Walmart Labs; let's build an engaging
cloud computing community in Bangalore.

For further details:
https://www.meetup.com/Bangalore-Mesos-cncf-User-Group/events/238458896/

Regards,
Dhilip


Re: Welcome Neil Conway as Mesos Committer and PMC member!

2017-01-21 Thread DhilipKumar Sankaranarayanan
Congratulations Neil.

On 21 Jan 2017 14:13, "Guangya Liu"  wrote:

> Congrats Neil!!
>
> On Sat, 21 Jan 2017 at 12:34 Vinod Kone  wrote:
>
>> Hi folks,
>>
>> Please welcome Neil Conway as the newest committer and PMC member of the
>> Apache Mesos project.
>>
>> Neil has been an active contributor to Mesos for more than a year now. As
>> part of his work, he has contributed some major features (Partition aware
>> frameworks, floating point operations for resources). Neil also took the
>> initiative to improve the documentation of our project and shepherded
>> several improvements over time. Doing that even without being a committer
>> shows that he takes ownership of the project seriously.
>>
>> Here is his more formal checklist for your perusal.
>>
>> https://docs.google.com/document/d/137MYwxEw9QCZRH09CXfn1544p1LuMuoj9LxS-sk2_F4/edit
>>
>> Thanks,
>> Vinod
>>
>>
>>


Re: Initial Design Document Apache Mesos Federation (JIRA 3548)

2016-08-09 Thread DhilipKumar Sankaranarayanan
Hi,

In the proposed design, the PE (Policy Engine) will recommend a new DC the
framework can connect to at the time of bursting.  It is up to the
framework to perform contraction or suppress itself from the public cloud
and use offers from its own datacenter.

Yes, the second part is true: which DC is preferred for which workload is
entirely up to the Policy Engine's implementation.  The idea is that the
framework will now connect to multiple Mesos Masters and receive offers
from them at the same time.

Which Master it will burst to is decided by the PE; how to deal with the
offers after connecting to that master is decided by the Framework itself.
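
For illustration, here is a minimal sketch (in Go) of how a framework might
query the PE before bursting.  The design document does not fix the PE's
REST schema, so the /v1/recommendation endpoint and the field names below
are hypothetical placeholders, not part of the proposal:

    package main

    import (
        "encoding/json"
        "fmt"
        "net/http"
    )

    // Recommendation is a hypothetical Policy Engine response naming the
    // next Mesos master a framework should burst to.
    type Recommendation struct {
        Master string `json:"master"` // host:port of the recommended DC's master
        Reason string `json:"reason"` // e.g. "local-dc-saturated"
    }

    func nextBurstTarget(peURL string) (*Recommendation, error) {
        resp, err := http.Get(peURL + "/v1/recommendation")
        if err != nil {
            return nil, err
        }
        defer resp.Body.Close()

        var rec Recommendation
        if err := json.NewDecoder(resp.Body).Decode(&rec); err != nil {
            return nil, err
        }
        return &rec, nil
    }

    func main() {
        rec, err := nextBurstTarget("http://policy-engine.example:8080")
        if err != nil {
            fmt.Println("PE unreachable; staying in the local DC:", err)
            return
        }
        // The framework, not the PE, decides what to do with offers from
        // this master once connected (contraction, suppression, etc.).
        fmt.Println("bursting to master:", rec.Master, "reason:", rec.Reason)
    }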

Hi All,

A quick update on the survey: so far 25 people have responded, and 13 of
them prefer Hierarchical.  It's good to know that 52% of them prefer not to
change the framework much, whereas 48% of them don't mind changing the
framework code reasonably to connect to multiple masters directly.  I hope
this gets the Mesos committers' attention to progress further in this
project. :-)

On Sun, Aug 7, 2016 at 10:19 AM, Lawrence Rau <larry...@mac.com> wrote:

> if you “burst” into another datacenter because of unavailable “local”
> resources do you contract when resources are free?  If your purpose of
> bursting is the use case of an on-prem site bursting into a public provider
> it would seem logical that contraction would be desired, so that public
> (and presumably more costly) resources are released.  This is amplified
> perhaps if you end up in multiple DC’s if on each local resource contention
> a different “foreign” cluster is chosen to burst into — unless you further
> plan to bias towards keeping your “burst tasks” in a same DC if possible
> (e.g. given DC-{A,B,C} once you burst from DC-A to DC-B you’d prefer B over
> C for future task launches) or maybe you don’t care (or maybe this is a
> policy?)
>
>
>
> On Jul 15, 2016, at 6:46 PM, DhilipKumar Sankaranarayanan <
> s.dhilipku...@gmail.com> wrote:
>
> Hi All,
>
> I got a chance to bring this up during yesterday's Community Sync.  It was
> great discussing with you all.
>
> As general feedback, the role of the policy engine in the design needs to
> be clearer; I will update the document with more information on the PE
> very soon.
>
> We are yet to get more insight on the licensing issues, like bringing a
> Mozilla 2.0 library into an Apache 2.0 project.
>
> It will be fantastic to get more thoughts on this from the community, so
> please share if you or your organisation have thought about it.
>
> Hi Alex,
>
> Thanks again.
>
> a) Yes, you are correct, that's exactly what we thought: a Framework could
> simply query and learn about its next step (bursting or load balancing).
> b) We are currently thinking that the Framework will run in only one
> place and should be able to connect to other datacenters.  Each
> datacenter could have some Frameworks running locally and some as part
> of a federation.
>
> Regards,
> Dhilip
>
>
> On Thu, Jul 14, 2016 at 9:17 AM, Alexander Gallego <agall...@concord.io>
> wrote:
>
>>
>>
>> On Thu, Jul 14, 2016 at 2:40 AM, DhilipKumar Sankaranarayanan <
>> s.dhilipku...@gmail.com> wrote:
>>
>>> Hi Alex,
>>>
>>> Thanks for taking a look.  We have simplified the design since the
>>> conference.  The Allocation and Anonymous modules were only helping us to
>>> control the offers sent to the frameworks.  Now we think that Roles and
>>> Quota in Mesos elegantly solve this problem, and we could take advantage
>>> of it.
>>>
>>
>> Sounds good. Given that the design is entirely different now, can you
>> share some of these thoughts?
>>
>>
>>>
>>> The current design does not propose Mesos Modules; the POC we
>>> demonstrated at MesosCon is slightly out of date in that respect.
>>>
>>> The current design only enforces that any Policy Engine implementation
>>> should honour certain REST APIs.  This also takes Consul out of the
>>> picture, but at Huawei our implementation would pretty much consider
>>> Consul or something similar.
>>>
>>> 1) Failure semantics
>>> I do agree it is not straightforward to declare that a DC is lost just
>>> because the framework lost the connection intermittently.  By probing the
>>> 'Gossiper' we would know that the DC is still active but just not
>>> reachable to us; in that case it's worth the wait.  Only if the DC in
>>> question is not reachable from every other DC could we come to such a
>>> conclusion.
>>>
>>>
>>
>> how do you envision frameworks integrating w/ this? Are you saying that
>> frameworks should poll the HTTP endpoint of the Gossiper?

Re: Initial Design Document Apache Mesos Federation (JIRA 3548)

2016-08-03 Thread DhilipKumar Sankaranarayanan
Hi All,

Only 9 people have responded to the survey so far.  Your response is really
important for understanding the community's preference.

Thanks in Advance,
Dhilip

On Mon, Aug 1, 2016 at 4:37 PM, DhilipKumar Sankaranarayanan <
s.dhilipku...@gmail.com> wrote:

> Hi All,
>
> Sorry for the long gap.  We had an interesting discussion last week at
> Mesosphere HQ again on this topic before the Mesos SF Meetup.
>
> The discussion revolved around several areas and suggestions on the
> proposed design.
>
> One of the main items that popped up was the approach through which we
> should achieve Mesos Federation.  The intent was to take the approach that
> will be most sensible for the community and easy for most to adopt.
>
> *Approach 1:* (Peer to Peer with a separate policy Engine) Already
> Proposed Design
> *Approach 2:*  (Hierarchical Design) Design similar to Kubernetes
> Federation where we introduce a Federation Layer in-between Framework and
> the Masters.
>
> Both designs have their unique advantages and disadvantages.  So here
> is the survey link; please provide your feedback.  This should set the
> ball rolling for us.
>
> https://goo.gl/forms/DpVRV9Zh3kunhJkP2
>
> If you have a third approach to include, please write to me; I'll be
> happy to add it to the survey.
>
> Regardless of the design chosen, the following enhancement to the master
> would help reduce "offers" traffic across continents.
>
> Enhancement: a framework will be able to send RequestResource(constraints)
> to the master; the master then sends only those offers that match the
> constraints.
>
> Regards,
> Dhilip
>
>
>
>
> On Fri, Jul 15, 2016 at 3:46 PM, DhilipKumar Sankaranarayanan <
> s.dhilipku...@gmail.com> wrote:
>
>> Hi All,
>>
>> I got a chance to bring this up during yesterday's Community Sync.  It was
>> great discussing with you all.
>>
>> As general feedback, the role of the policy engine in the design needs to
>> be clearer; I will update the document with more information on the PE
>> very soon.
>>
>> We are yet to get more insight on the licensing issues, like bringing a
>> Mozilla 2.0 library into an Apache 2.0 project.
>>
>> It will be fantastic to get more thoughts on this from the community, so
>> please share if you or your organisation have thought about it.
>>
>> Hi Alex,
>>
>> Thanks again.
>>
>> a) Yes, you are correct, that's exactly what we thought: a Framework could
>> simply query and learn about its next step (bursting or load balancing).
>> b) We are currently thinking that the Framework will run in only one
>> place and should be able to connect to other datacenters.  Each
>> datacenter could have some Frameworks running locally and some as part
>> of a federation.
>>
>> Regards,
>> Dhilip
>>
>>
>> On Thu, Jul 14, 2016 at 9:17 AM, Alexander Gallego <agall...@concord.io>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jul 14, 2016 at 2:40 AM, DhilipKumar Sankaranarayanan <
>>> s.dhilipku...@gmail.com> wrote:
>>>
>>>> Hi Alex,
>>>>
>>>> Thanks for taking a look.  We have simplified the design since the
>>>> conference.  The Allocation and Anonymous modules were only helping us to
>>>> control the offers sent to the frameworks.  Now we think that Roles and
>>>> Quota in Mesos elegantly solve this problem, and we could take advantage
>>>> of it.
>>>>
>>>
>>> Sounds good. Given that the design is entirely different now, can you
>>> share some of these thoughts?
>>>
>>>
>>>>
>>>> The current design does not propose Mesos Modules; the POC we
>>>> demonstrated at MesosCon is slightly out of date in that respect.
>>>>
>>>> The current design only enforces that any Policy Engine implementation
>>>> should honour certain REST APIs.  This also takes Consul out of the
>>>> picture, but at Huawei our implementation would pretty much consider
>>>> Consul or something similar.
>>>>
>>>> 1) Failure semantics
>>>> I do agree it is not straightforward to declare that a DC is lost just
>>>> because the framework lost the connection intermittently.  By probing the
>>>> 'Gossiper' we would know that the DC is still active but just not
>>>> reachable to us; in that case it's worth the wait.  Only if the DC in
>>>> question is not reachable from every other DC could we come to such a
>>>> conclusion.
>>>>
>>>>
>>>

[Meetup] Bangalore's first Mesos Meetup

2016-08-01 Thread DhilipKumar Sankaranarayanan
Hi All,

Happy to announce Bangalore's first Mesos Meetup at the Huawei R&D campus.
All are welcome. Please RSVP at the link below:

https://www.meetup.com/Bangalore-Mesos-cncf-User-Group/events/228745899/

If anyone wants to present or talk about Apache Mesos, please let us know.

Regards,
Dhilip


Re: Initial Design Document Apache Mesos Federation (JIRA 3548)

2016-08-01 Thread DhilipKumar Sankaranarayanan
Hi All,

Sorry for the long gap.  We had an interesting discussion last week at
Mesosphere HQ again on this topic before the Mesos SF Meetup.

The discussion revolved around several areas and suggestions on the
proposed design.

One of the main items that popped up was the approach through which we
should achieve Mesos Federation.  The intent was to take the approach that
will be most sensible for the community and easy for most to adopt.

*Approach 1:* (Peer to Peer with a separate policy Engine) Already Proposed
Design
*Approach 2:*  (Hierarchical Design) Design similar to Kubernetes
Federation where we introduce a Federation Layer in-between Framework and
the Masters.

Both designs have their unique advantages and disadvantages.  So here
is the survey link; please provide your feedback.  This should set the
ball rolling for us.

https://goo.gl/forms/DpVRV9Zh3kunhJkP2

If you have a third approach to include, please write to me; I'll be
happy to add it to the survey.

Regardless of the design chosen, the following enhancement to the master
would help reduce "offers" traffic across continents.

Enhancement: a framework will be able to send RequestResource(constraints)
to the master; the master then sends only those offers that match the
constraints.
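
To make the proposal concrete, here is a minimal sketch (in Go) of what a
framework-side call might look like.  This is not an existing Mesos API;
the /api/v1/requestResources endpoint and the payload fields are invented
purely to illustrate the proposed enhancement:

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    // ResourceRequest is a hypothetical constraint payload; the master
    // would reply only with offers matching these constraints.
    type ResourceRequest struct {
        FrameworkID string            `json:"framework_id"`
        CPUs        float64           `json:"cpus"`
        MemMB       float64           `json:"mem_mb"`
        Attributes  map[string]string `json:"attributes"` // e.g. {"dc": "us-west"}
    }

    func requestResources(masterURL string, req ResourceRequest) error {
        body, err := json.Marshal(req)
        if err != nil {
            return err
        }
        resp, err := http.Post(masterURL+"/api/v1/requestResources",
            "application/json", bytes.NewReader(body))
        if err != nil {
            return err
        }
        defer resp.Body.Close()
        fmt.Println("master accepted constraints:", resp.Status)
        return nil
    }

    func main() {
        req := ResourceRequest{
            FrameworkID: "my-framework",
            CPUs:        4,
            MemMB:       8192,
            Attributes:  map[string]string{"dc": "local"},
        }
        if err := requestResources("http://master.local:5050", req); err != nil {
            fmt.Println("request failed:", err)
        }
    }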

Regards,
Dhilip




On Fri, Jul 15, 2016 at 3:46 PM, DhilipKumar Sankaranarayanan <
s.dhilipku...@gmail.com> wrote:

> Hi All,
>
> I got a chance to bring this up during yesterday's Community Sync.  It was
> great discussing with you all.
>
> As general feedback, the role of the policy engine in the design needs to
> be clearer; I will update the document with more information on the PE
> very soon.
>
> We are yet to get more insight on the licensing issues, like bringing a
> Mozilla 2.0 library into an Apache 2.0 project.
>
> It will be fantastic to get more thoughts on this from the community, so
> please share if you or your organisation have thought about it.
>
> Hi Alex,
>
> Thanks again.
>
> a) Yes, you are correct, that's exactly what we thought: a Framework could
> simply query and learn about its next step (bursting or load balancing).
> b) We are currently thinking that the Framework will run in only one
> place and should be able to connect to other datacenters.  Each
> datacenter could have some Frameworks running locally and some as part
> of a federation.
>
> Regards,
> Dhilip
>
>
> On Thu, Jul 14, 2016 at 9:17 AM, Alexander Gallego <agall...@concord.io>
> wrote:
>
>>
>>
>> On Thu, Jul 14, 2016 at 2:40 AM, DhilipKumar Sankaranarayanan <
>> s.dhilipku...@gmail.com> wrote:
>>
>>> Hi Alex,
>>>
>>> Thanks for taking a look.  We have simplified the design since the
>>> conference.  The Allocation and Anonymous modules were only helping us to
>>> control the offers sent to the frameworks.  Now we think that Roles and
>>> Quota in Mesos elegantly solve this problem, and we could take advantage
>>> of it.
>>>
>>
>> Sounds good. Given that the design is entirely different now, can you
>> share some of these thoughts?
>>
>>
>>>
>>> The current design does not propose Mesos Modules; the POC we
>>> demonstrated at MesosCon is slightly out of date in that respect.
>>>
>>> The current design only enforces that any Policy Engine implementation
>>> should honour certain REST APIs.  This also takes Consul out of the
>>> picture, but at Huawei our implementation would pretty much consider
>>> Consul or something similar.
>>>
>>> 1) Failure semantics
>>> I do agree it is not straightforward to declare that a DC is lost just
>>> because the framework lost the connection intermittently.  By probing the
>>> 'Gossiper' we would know that the DC is still active but just not
>>> reachable to us; in that case it's worth the wait.  Only if the DC in
>>> question is not reachable from every other DC could we come to such a
>>> conclusion.
>>>
>>>
>>
>> how do you envision frameworks integrating w/ this? Are you saying that
>> frameworks should poll the HTTP endpoint of the Gossiper?
>>
>>
>>
>>> 2) Can you share more details about the allocator modules?
>>> As mentioned earlier, these modules are no longer relevant; we have a
>>> much simpler way to achieve this.
>>>
>>> 3) High Availability
>>> I think you are talking about the section below?
>>> "Sequence Diagram for High Availability
>>>
>>> (In case of local datacenter failure)
>>> Very similar to the cloud bursting use-case scenario."
>>> The sequence diagram only represents the flow of events in case the
>>> current datacenter fails and the framework needs to connect to a new one.

Re: Initial Design Document Apache Mesos Federation (JIRA 3548)

2016-07-15 Thread DhilipKumar Sankaranarayanan
Hi All,

I got a chance to bring this up during yesterday's Community Sync.  It was
great discussing with you all.

As general feedback, the role of the policy engine in the design needs to
be clearer; I will update the document with more information on the PE
very soon.

We are yet to get more insight on the licensing issues, like bringing a
Mozilla 2.0 library into an Apache 2.0 project.

It will be fantastic to get more thoughts on this from the community, so
please share if you or your organisation have thought about it.

Hi Alex,

Thanks again.

a) Yes, you are correct, that's exactly what we thought: a Framework could
simply query and learn about its next step (bursting or load balancing).
b) We are currently thinking that the Framework will run in only one place
and should be able to connect to other datacenters.  Each datacenter could
have some Frameworks running locally and some as part of a federation.

Regards,
Dhilip


On Thu, Jul 14, 2016 at 9:17 AM, Alexander Gallego <agall...@concord.io>
wrote:

>
>
> On Thu, Jul 14, 2016 at 2:40 AM, DhilipKumar Sankaranarayanan <
> s.dhilipku...@gmail.com> wrote:
>
>> Hi Alex,
>>
>> Thanks for taking a look.  We have simplified the design since the
>> conference.  The Allocation and Anonymous modules were only helping us to
>> control the offers sent to the frameworks.  Now we think that Roles and
>> Quota in Mesos elegantly solve this problem, and we could take advantage
>> of it.
>>
>
> Sounds good. Given that the design is entirely different now, can you
> share some of these thoughts?
>
>
>>
>> The current design does not propose Mesos Modules; the POC we
>> demonstrated at MesosCon is slightly out of date in that respect.
>>
>> The current design only enforces that any Policy Engine implementation
>> should honour certain REST APIs.  This also takes Consul out of the
>> picture, but at Huawei our implementation would pretty much consider
>> Consul or something similar.
>>
>> 1) Failure semantics
>> I do agree it is not straightforward to declare that a DC is lost just
>> because the framework lost the connection intermittently.  By probing the
>> 'Gossiper' we would know that the DC is still active but just not
>> reachable to us; in that case it's worth the wait.  Only if the DC in
>> question is not reachable from every other DC could we come to such a
>> conclusion.
>>
>>
>
> how do you envision frameworks integrating w/ this? Are you saying that
> frameworks should poll the HTTP endpoint of the Gossiper?
>
>
>
>> 2) Can you share more details about the allocator modules?
>> As mentioned earlier, these modules are no longer relevant; we have a
>> much simpler way to achieve this.
>>
>> 3) High Availability
>> I think you are talking about the section below?
>> "Sequence Diagram for High Availability
>>
>> (In case of local datacenter failure)
>> Very similar to the cloud bursting use-case scenario."
>> The sequence diagram only represents the flow of events in case the
>> current datacenter fails and the framework needs to connect to a new one.
>> It is not talking about the approach you mentioned.  I will update the doc
>> with a couple more diagrams soon to make it more understandable.  We would
>> certainly like to have a federated K/V storage layer across the DCs, which
>> is why Consul was considered in the first place.
>>
>>
> Does this mean that you have to run the actual framework code in all of
> the DCs, or have you yet to iron this out?
>
>
>
>
>> 4) Metrics / Monitoring - probably down the line
>> The experimental version of the gossiper already queries the master at a
>> frequent interval and exchanges the results amongst the gossipers.
>>
>> Ultimately, DC federation is a hard problem to solve.  We have plenty of
>> use cases, which is why we wanted to reach out to the community, share our
>> experience, and build something that is useful for all of us.
>>
>>
> Thanks !! excited about this work.
>
>
>> Regards,
>> Dhilip
>>
>>
>> On Wed, Jul 13, 2016 at 7:58 PM, Alexander Gallego <agall...@concord.io>
>> wrote:
>>
>>> This is very cool work; I had a chat w/ another company thinking about
>>> doing the exact same thing.
>>>
>>> I think the proposal is missing several details that make it hard to
>>> evaluate on paper (also saw your presentation).
>>>
>>>
>>> 1) Failure semantics seem to be the same in the proposed design.
>>>
>>>
>>> As a framework author, how do you suggest you deal w/ tasks on multiple
>>> clusters? I.e., I feel like there have to be richer semantics about the
>>> task, at least on the mesos.proto level, where the state is
>>> STATUS_FAILED_DC_OUTAGE or something along those lines.

Re: Initial Design Document Apache Mesos Federation (JIRA 3548)

2016-07-14 Thread DhilipKumar Sankaranarayanan
Hi Jeff,

Thanks for taking a look.  The current design does not enforce Consul, but
I'm not sure using golang conflicts with the Apache ecosystem.  I'm very new
to the ASF, but can I not incubate a project written in golang under the ASF?

I would love to hear more from others too.

Regards,
Dhilip

On Wed, Jul 13, 2016 at 8:46 PM, Jeff Schroeder <jeffschroe...@computer.org>
wrote:

> Would this mean introducing golang and as a result, consul, into mesos
> proper? Seems like a bit of an odd dependency when everything currently
> uses existing ASF projects.
>
> On Wed, Jul 13, 2016 at 5:11 PM, DhilipKumar Sankaranarayanan <
> s.dhilipku...@gmail.com> wrote:
>
>> Hi All,
>>
>> Please find the initial version of the Design Document
>> <https://docs.google.com/document/d/1U4IY_ObAXUPhtTa-0Rw_5zQxHDRnJFe5uFNOQ0VUcLg/edit?usp=sharing>
>> for Federating Mesos Clusters.
>>
>>
>> https://docs.google.com/document/d/1U4IY_ObAXUPhtTa-0Rw_5zQxHDRnJFe5uFNOQ0VUcLg/edit?usp=sharing
>>
>> We at Huawei have been working on this federation project for the past few
>> months.  We also got an opportunity to present this at the recent MesosCon
>> 2016.  From the further discussions and feedback we have received so far,
>> we have greatly simplified the design.
>>
>> Also, I see that no one is assigned to this JIRA now; could I get it
>> assigned to myself?  It would be great to know if there is anyone willing
>> to shepherd this too.
>>
>> I would also like to bring this up in the community Sync that happens
>> tomorrow.
>>
>> We would love to hear your thoughts.  We will be glad to collaborate
>> with you on the implementation.
>>
>> Regards,
>> Dhilip
>>
>>
>> Reference:
>> JIRA: https://issues.apache.org/jira/browse/MESOS-3548
>> Slides:
>> http://www.slideshare.net/mKrishnaKumar1/federated-mesos-clusters-for-global-data-center-designs
>> Video :
>> https://www.youtube.com/watch?v=kqyVQzwwD5E&index=17&list=PLGeM09tlguZQVL7ZsfNMffX9h1rGNVqnC
>>
>>
>
>
> --
> Jeff Schroeder
>
> Don't drink and derive, alcohol and analysis don't mix.
> http://www.digitalprognosis.com
>


Re: Initial Design Document Apache Mesos Federation (JIRA 3548)

2016-07-14 Thread DhilipKumar Sankaranarayanan
Hi Alex,

Thanks for taking a look.  We have simplified the design since the
conference.  The Allocation and Anonymous modules were only helping us to
control the offers sent to the frameworks.  Now we think that Roles and
Quota in Mesos elegantly solve this problem, and we could take advantage of
it.

The current design does not propose Mesos Modules; the POC we demonstrated
at MesosCon is slightly out of date in that respect.

The current design only enforces that any Policy Engine implementation
should honour certain REST APIs.  This also takes Consul out of the
picture, but at Huawei our implementation would pretty much consider Consul
or something similar.

1) Failure semantics
I do agree it is not straightforward to declare that a DC is lost just
because the framework lost the connection intermittently.  By probing the
'Gossiper' we would know that the DC is still active but just not reachable
to us; in that case it's worth the wait.  Only if the DC in question is not
reachable from every other DC could we come to such a conclusion.
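
As an illustration of that probing logic, here is a minimal sketch (in Go),
assuming a Gossiper that exposes per-DC health over HTTP.  The
/v1/dc/<name>/health endpoint and the "reachable_from" field are
hypothetical, not part of the current design:

    package main

    import (
        "encoding/json"
        "fmt"
        "net/http"
    )

    // DCHealth reports whether a datacenter is seen alive by other members
    // of the federation, even when it is unreachable from our own DC.
    type DCHealth struct {
        Alive         bool     `json:"alive"`
        ReachableFrom []string `json:"reachable_from"`
    }

    // shouldFailOver returns true only when no other DC can reach the
    // target, i.e. the DC is really down rather than just partitioned
    // away from us (in which case it's worth the wait).
    func shouldFailOver(gossiperURL, dc string) (bool, error) {
        resp, err := http.Get(gossiperURL + "/v1/dc/" + dc + "/health")
        if err != nil {
            return false, err
        }
        defer resp.Body.Close()

        var h DCHealth
        if err := json.NewDecoder(resp.Body).Decode(&h); err != nil {
            return false, err
        }
        return !h.Alive && len(h.ReachableFrom) == 0, nil
    }

    func main() {
        down, err := shouldFailOver("http://gossiper.local:9090", "dc-east")
        if err != nil {
            fmt.Println("gossiper unreachable:", err)
            return
        }
        fmt.Println("declare dc-east lost:", down)
    }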

2) Can you share more details about the allocator modules?
As mentioned earlier, these modules are no longer relevant; we have a much
simpler way to achieve this.

3) High Availability
I think you are talking about the section below?
"Sequence Diagram for High Availability

(In case of local datacenter failure)
Very similar to the cloud bursting use-case scenario."
The sequence diagram only represents the flow of events in case the current
datacenter fails and the framework needs to connect to a new one.  It is
not talking about the approach you mentioned.  I will update the doc with a
couple more diagrams soon to make it more understandable.  We would
certainly like to have a federated K/V storage layer across the DCs, which
is why Consul was considered in the first place.
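
For a flavour of what that federated K/V layer could look like, here is a
minimal sketch using the real Consul Go client
(github.com/hashicorp/consul/api).  The key layout and the idea of storing
the framework's active master are assumptions for illustration only:

    package main

    import (
        "fmt"

        "github.com/hashicorp/consul/api"
    )

    func main() {
        // Connect to the local Consul agent (default 127.0.0.1:8500).
        client, err := api.NewClient(api.DefaultConfig())
        if err != nil {
            panic(err)
        }
        kv := client.KV()

        // Record which master this framework is currently registered with;
        // replication across DCs would make this visible to peer sites.
        _, err = kv.Put(&api.KVPair{
            Key:   "federation/frameworks/my-framework/active-master",
            Value: []byte("master.dc-west:5050"),
        }, nil)
        if err != nil {
            panic(err)
        }

        // Any DC can then look up where the framework currently lives.
        pair, _, err := kv.Get("federation/frameworks/my-framework/active-master", nil)
        if err == nil && pair != nil {
            fmt.Println("active master:", string(pair.Value))
        }
    }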

4) Metrics / Monitoring - probably down the line
The experimental version of the gossiper already queries the master at a
frequent interval and exchanges the results amongst the gossipers.

Ultimately, DC federation is a hard problem to solve.  We have plenty of
use cases, which is why we wanted to reach out to the community, share our
experience, and build something that is useful for all of us.

Regards,
Dhilip

On Wed, Jul 13, 2016 at 7:58 PM, Alexander Gallego <agall...@concord.io>
wrote:

> This is very cool work; I had a chat w/ another company thinking about
> doing the exact same thing.
>
> I think the proposal is missing several details that make it hard to
> evaluate on paper (also saw your presentation).
>
>
> 1) Failure semantics seem to be the same in the proposed design.
>
>
> As a framework author, how do you suggest you deal w/ tasks on multiple
> clusters? I.e., I feel like there have to be richer semantics about the
> task, at least on the mesos.proto level, where the state is
> STATUS_FAILED_DC_OUTAGE or something along those lines.
>
> We respawn operators, and having this information may allow me, as a
> framework author, to wait a little longer before trying to declare that
> task dead (KILLED/FAILED/LOST) if I spawn it in a different data center.
>
> Would love to get details on how you were thinking of extending the
> failure semantics for multi datacenters.
>
>
> 2) Can you share more details about the allocator modules.
>
>
> After reading the proposal, I understand it as follows.
>
>
> [ gossiper ] -> [ allocator module ] -> [mesos master]
>
>
> Is this correct? If so, are you saying that you can tell the mesos master
> to run a task that was fulfilled by a framework on a different data
> center?
>
> Is the constraint that you are forced to run a scheduler per framework on
> each data center?
>
>
>
> 3) High availability
>
>
> High availability on a multi-DC layout means something entirely different.
> So are all frameworks now on standby on every other cluster? The problem I
> see with this is that the metadata stored by each framework to support HA
> now has to span multiple DCs. It would be nice to perhaps extend/expose an
> API at the mesos level for setting state.
>
> a) On the normal mesos layout, this key=value data store would be
> zookeeper.
>
> b) On the multi-DC layout it could be zookeeper per data center, but then
> one can piggyback on the gossiper to replicate that state in the other
> data centers.
>
>
> 4) Metrics / Monitoring - probably down the line, but it would be good to
> also piggyback some of the mesos master endpoints
> through the gossip architecture.
>
>
>
> Again very cool work, would love to get some more details on the actual
> implementation that you built plus some of the points above.
>
> - Alex
>
>
>
>
>
>
>
> On Wed, Jul 13, 2016 at 6:11 PM, DhilipKumar Sankaranarayanan <
> s.dhilipku...@gmail.com> wrote:
>

Initial Design Document Apache Mesos Federation (JIRA 3548)

2016-07-13 Thread DhilipKumar Sankaranarayanan
Hi All,

Please find the initial version of the Design Document for Federating
Mesos Clusters:

https://docs.google.com/document/d/1U4IY_ObAXUPhtTa-0Rw_5zQxHDRnJFe5uFNOQ0VUcLg/edit?usp=sharing

We at Huawei have been working on this federation project for the past few
months.  We also got an opportunity to present this at the recent MesosCon
2016.  From the further discussions and feedback we have received so far,
we have greatly simplified the design.

Also, I see that no one is assigned to this JIRA now; could I get it
assigned to myself?  It would be great to know if there is anyone willing
to shepherd this too.

I would also like to bring this up in the community Sync that happens
tomorrow.

We would love to hear your thoughts.  We will be glad to collaborate with
you on the implementation.

Regards,
Dhilip


Reference:
JIRA: https://issues.apache.org/jira/browse/MESOS-3548
Slides:
http://www.slideshare.net/mKrishnaKumar1/federated-mesos-clusters-for-global-data-center-designs
Video :
https://www.youtube.com/watch?v=kqyVQzwwD5E&index=17&list=PLGeM09tlguZQVL7ZsfNMffX9h1rGNVqnC


Re: A Redis Framework for Apache Mesos

2016-07-05 Thread DhilipKumar Sankaranarayanan
Thanks Avinash, it is done now.

On Mon, Jul 4, 2016 at 12:01 PM, Avinash Sridharan <avin...@mesosphere.io>
wrote:

>
>
> On Mon, Jul 4, 2016 at 12:11 AM, DhilipKumar Sankaranarayanan <
> s.dhilipku...@gmail.com> wrote:
>
>> *"*
>> *Would be nice to get on the frameworks page:*
>>
>> *http://mesos.apache.org/documentation/latest/frameworks/
>> <http://mesos.apache.org/documentation/latest/frameworks/> *
>> *?*
>> *"*
>> Sure, I would love to do that; where should I raise a PR?
>>
> I think you can raise a PR on github. The preferred route though is to
> generate an RB request on Apache review board (
> https://reviews.apache.org/r/) . One of the Shepherds could help with
> this.
> +Vinod ^^
>
>> *"Also is it already part of the DC/OS universe ?"*
>>
>> Yes, we have reasonable documentation
>> <https://github.com/mesos/mr-redis#dcos> with regard to DCOS
>> integration.  Please give it a try; it would be awesome to get your
>> feedback on that.
>>
>> Regards,
>> Dhilip
>>
>> On Sun, Jul 3, 2016 at 6:09 PM, Avinash Sridharan <avin...@mesosphere.io>
>> wrote:
>>
>>> Would be nice to get on the frameworks page:
>>> http://mesos.apache.org/documentation/latest/frameworks/
>>> ?
>>>
>>> Also is it already part of the DC/OS universe ?
>>>
>>> On Sun, Jul 3, 2016 at 2:29 PM, Christoph Heer <christ...@thelabmill.de>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> it looks really cool. Can you maybe explain why you use etcd for
>>>> leader election instead of the already existing ZooKeeper cluster?
>>>> Do you use some special etcd features?
>>>>
>>>> Best regards
>>>> Christoph
>>>>
>>>> > On 03 Jul 2016, at 17:49, tommy xiao <xia...@gmail.com> wrote:
>>>> >
>>>> > Cool. thanks for your sharing.
>>>> >
>>>> > 2016-07-03 23:44 GMT+08:00 DhilipKumar Sankaranarayanan <
>>>> s.dhilipku...@gmail.com>:
>>>> > Hello All,
>>>> >
>>>> >
>>>> >
>>>> > We have built a framework for provisioning redis-servers in Apache
>>>> Mesos-enabled infrastructure. It would be awesome to get the community's
>>>> feedback. While the Mesos ecosystem is strengthening its capability in the
>>>> storage layer, redis could be an addition to the lineup.  This is primarily
>>>> intended for providers who would like to host redis as a service in their
>>>> infrastructure.
>>>> >
>>>> >
>>>> >
>>>> > There is an elaborate README about the project which should help in
>>>> setting it up: https://github.com/mesos/mr-redis. Please let us know
>>>> if there is anything missing in the documentation by raising a PR or
>>>> opening an issue or even writing to us.
>>>> >
>>>> >
>>>> >
>>>> > This project is also packaged as mr-redis with DCOS, so it should be
>>>> pretty straightforward to install via the DCOS CLI or DCOS GUI.
>>>> >
>>>> >
>>>> >
>>>> > (A Step by Step guide is provided in the README for your convenience)
>>>> >
>>>> >
>>>> >
>>>> > Salient Features of this project include:
>>>> >
>>>> > 1)  Create multiple redis clusters (Master-Slave Cluster) with
>>>> ease
>>>> >
>>>> > 2)  Redis instances recover in seconds and not in minutes
>>>> >
>>>> > 3)  If a Master fails, a slave is automatically promoted as the
>>>> new master; all the old slaves then replicate from the new master, plus a
>>>> new slave is added to the cluster.  All this without using redis-sentinel
>>>> in your datacenter, and all of this happens in a couple of seconds.
>>>> >
>>>> > 4)  A CLI to perform basic operations such as create / status /
>>>> delete redis instances on the fly.  CLI is cross compiled for Windows and
>>>> Darwin users too.
>>>> >
>>>> > 5)  The scheduler is an HTTP REST server which can also serve the
>>>> simple Angular UI we have built to get started with.  Instructions on how
>>>> to set up the UI are here:
>>>> https://github.com/mesos/mr-redis/tree/master/ui/

Re: A Redis Framework for Apache Mesos

2016-07-04 Thread DhilipKumar Sankaranarayanan
We use etcd to store the state of the scheduler (mr-redis), not to perform
master election the way the Mesos Master does.  The scheduler itself is
stateless; it stores plenty of information about the redis instances in
etcd.  We already have a request <https://github.com/mesos/mr-redis/issues/17>
from a user in the community to start supporting ZooKeeper.  I haven't
started working on that yet; any volunteers are welcome.

Plus, etcd is not very hard to set up; probably a few commands
<https://github.com/coreos/etcd#getting-started> and you have a running
instance of etcd.
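
To make the stateless-scheduler idea concrete, here is a minimal sketch (in
Go) of persisting and recovering instance metadata.  It uses the real etcd
clientv3 API, but the key layout and the RedisInstance fields are
assumptions for illustration, not mr-redis's actual schema:

    package main

    import (
        "context"
        "encoding/json"
        "fmt"
        "time"

        clientv3 "go.etcd.io/etcd/client/v3"
    )

    // RedisInstance is a hypothetical record of one managed redis cluster.
    type RedisInstance struct {
        Name   string `json:"name"`
        Master string `json:"master"` // host:port of the current redis master
        Slaves int    `json:"slaves"`
    }

    func main() {
        cli, err := clientv3.New(clientv3.Config{
            Endpoints:   []string{"localhost:2379"},
            DialTimeout: 5 * time.Second,
        })
        if err != nil {
            panic(err)
        }
        defer cli.Close()

        ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
        defer cancel()

        // Persist instance state; the scheduler itself keeps nothing on disk.
        inst := RedisInstance{Name: "cache-1", Master: "10.0.0.5:6379", Slaves: 2}
        val, _ := json.Marshal(inst)
        if _, err := cli.Put(ctx, "/mrredis/instances/cache-1", string(val)); err != nil {
            panic(err)
        }

        // On restart, the scheduler can rebuild its view from etcd alone.
        resp, err := cli.Get(ctx, "/mrredis/instances/", clientv3.WithPrefix())
        if err != nil {
            panic(err)
        }
        for _, kv := range resp.Kvs {
            fmt.Printf("recovered %s -> %s\n", kv.Key, kv.Value)
        }
    }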

Hope this makes sense,
Dhilip

On Sun, Jul 3, 2016 at 2:29 PM, Christoph Heer <christ...@thelabmill.de>
wrote:

> Hi,
>
> it looks really cool. Can you maybe explain why you use etcd for leader
> election instead of the already existing ZooKeeper cluster? Do you
> use some special etcd features?
>
> Best regards
> Christoph
>
> > On 03 Jul 2016, at 17:49, tommy xiao <xia...@gmail.com> wrote:
> >
> > Cool. thanks for your sharing.
> >
> > 2016-07-03 23:44 GMT+08:00 DhilipKumar Sankaranarayanan <
> s.dhilipku...@gmail.com>:
> > Hello All,
> >
> >
> >
> > We have built a framework for provisioning redis-servers in Apache
> Mesos-enabled infrastructure. It would be awesome to get the community's
> feedback.  While the Mesos ecosystem is strengthening its capability in the
> storage layer, redis could be an addition to the lineup.  This is primarily
> intended for providers who would like to host redis as a service in their
> infrastructure.
> >
> >
> >
> > There is an elaborate README about the project which should help in
> setting it up: https://github.com/mesos/mr-redis. Please let us know if
> there is anything missing in the documentation by raising a PR or opening
> an issue or even writing to us.
> >
> >
> >
> > This project is also packaged as mr-redis with DCOS, so it should be
> pretty straightforward to install via the DCOS CLI or DCOS GUI.
> >
> >
> >
> > (A Step by Step guide is provided in the README for your convenience)
> >
> >
> >
> > Salient Features of this project include:
> >
> > 1)  Create multiple redis clusters (Master-Slave Cluster) with ease
> >
> > 2)  Redis instances recover in seconds and not in minutes
> >
> > 3)  If a Master fails, a slave is automatically promoted as the new
> master; all the old slaves then replicate from the new master, plus a new
> slave is added to the cluster.  All this without using redis-sentinel in
> your datacenter, and all of this happens in a couple of seconds.
> >
> > 4)  A CLI to perform basic operations such as create / status /
> delete redis instances on the fly.  CLI is cross compiled for Windows and
> Darwin users too.
> >
> > 5)  The scheduler is an HTTP REST server which can also serve the simple
> Angular UI we have built to get started with.  Instructions on how to set
> up the UI are here: https://github.com/mesos/mr-redis/tree/master/ui/app
> >
> >
> >
> > We also had an opportunity to talk about this during the recent MesosCon
> 2016:
> https://www.youtube.com/watch?v=xe-Gom5tOl0&index=32&list=PLGeM09tlguZQVL7ZsfNMffX9h1rGNVqnC
> >
> >
> >
> > Future Work:
> >
> > · Implement a proxy technique to expose one single endpoint for
> the redis instance.  (Work in progress)
> >
> > · Implement Memory Cgroups per redis PROCS
> >
> > · Add support for Redis 3.0 cluster instances (Adding shards to
> a running redis instance)
> >
> > · Implement integration test suite and benchmarking suite to the
> framework.
> >
> >
> >
> > Special thanks to the Adobe.io team, who expressed interest in
> collaborating in the development of this product. I’m sure it’s going to
> be great working with all of you folks.
> >
> >
> >
> > An advance happy Independence Day to America, and a happy week ahead to
> the rest of the world.
> >
> >
> >
> > Looking forward to hearing from you all,
> >
> > Dhilip
> >
> >
> >
> >
> >
> > --
> > Deshi Xiao
> > Twitter: xds2000
> > E-mail: xiaods(AT)gmail.com
>
>


Re: A Redis Framework for Apache Mesos

2016-07-04 Thread DhilipKumar Sankaranarayanan
*"*
*Would be nice to get on the frameworks page:*

*http://mesos.apache.org/documentation/latest/frameworks/
<http://mesos.apache.org/documentation/latest/frameworks/> *
*?*
*"*
Sure, I would love to do that; where should I raise a PR?

*"Also is it already part of the DC/OS universe ?"*

Yes, we have reasonable documentation
<https://github.com/mesos/mr-redis#dcos> with regard to DCOS integration.
Please give it a try; it would be awesome to get your feedback on that.

Regards,
Dhilip

On Sun, Jul 3, 2016 at 6:09 PM, Avinash Sridharan <avin...@mesosphere.io>
wrote:

> Would be nice to get on the frameworks page:
> http://mesos.apache.org/documentation/latest/frameworks/
> ?
>
> Also is it already part of the DC/OS universe ?
>
> On Sun, Jul 3, 2016 at 2:29 PM, Christoph Heer <christ...@thelabmill.de>
> wrote:
>
>> Hi,
>>
>> it looks really cool. Can you maybe explain why you use etcd for
>> leader election instead of the already existing ZooKeeper cluster?
>> Do you use some special etcd features?
>>
>> Best regards
>> Christoph
>>
>> > On 03 Jul 2016, at 17:49, tommy xiao <xia...@gmail.com> wrote:
>> >
>> > Cool. thanks for your sharing.
>> >
>> > 2016-07-03 23:44 GMT+08:00 DhilipKumar Sankaranarayanan <
>> s.dhilipku...@gmail.com>:
>> > Hello All,
>> >
>> >
>> >
>> > We have built a framework for provisioning redis-servers in Apache
>> Mesos-enabled infrastructure. It would be awesome to get the community's
>> feedback. While the Mesos ecosystem is strengthening its capability in the
>> storage layer, redis could be an addition to the lineup.  This is primarily
>> intended for providers who would like to host redis as a service in their
>> infrastructure.
>> >
>> >
>> >
>> > There is an elaborate README about the project which should help in
>> setting it up: https://github.com/mesos/mr-redis. Please let us know if
>> there is anything missing in the documentation by raising a PR or opening
>> an issue or even writing to us.
>> >
>> >
>> >
>> > This project is also packaged as mr-redis with DCOS, so it should be
>> pretty straightforward to install via the DCOS CLI or DCOS GUI.
>> >
>> >
>> >
>> > (A Step by Step guide is provided in the README for your convenience)
>> >
>> >
>> >
>> > Salient Features of this project include:
>> >
>> > 1)  Create multiple redis clusters (Master-Slave Cluster) with ease
>> >
>> > 2)  Redis instances recover in seconds and not in minutes
>> >
>> > 3)  If a Master fails, a slave is automatically promoted as the new
>> master; all the old slaves then replicate from the new master, plus a new
>> slave is added to the cluster.  All this without using redis-sentinel in
>> your datacenter, and all of this happens in a couple of seconds.
>> >
>> > 4)  A CLI to perform basic operations such as create / status /
>> delete redis instances on the fly.  CLI is cross compiled for Windows and
>> Darwin users too.
>> >
>> > 5)  The scheduler is an HTTP REST server which can also serve the
>> simple Angular UI we have built to get started with.  Instructions on how
>> to set up the UI are here:
>> https://github.com/mesos/mr-redis/tree/master/ui/app
>> >
>> >
>> >
>> > We also had an opportunity to talk about this during the recent
>> MesosCon 2016:
>> https://www.youtube.com/watch?v=xe-Gom5tOl0&index=32&list=PLGeM09tlguZQVL7ZsfNMffX9h1rGNVqnC
>> >
>> >
>> >
>> > Future Work:
>> >
>> > · Implement a proxy technique to expose one single endpoint for
>> the redis instance.  (Work in progress)
>> >
>> > · Implement Memory Cgroups per redis PROCS
>> >
>> > · Add support for Redis 3.0 cluster instances (Adding shards to
>> a running redis instance)
>> >
>> > · Implement integration test suite and benchmarking suite to
>> the framework.
>> >
>> >
>> >
>> > Special thanks to the Adobe.io team, who expressed interest in
>> collaborating in the development of this product. I’m sure it’s going to
>> be great working with all of you folks.
>> >
>> >
>> >
>> > An advance happy Independence Day to America, and a happy week ahead to
>> the rest of the world.
>> >
>> >
>> >
>> > Looking forward to hearing from you all,
>> >
>> > Dhilip
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Deshi Xiao
>> > Twitter: xds2000
>> > E-mail: xiaods(AT)gmail.com
>>
>>
>
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245
>


A Redis Framework for Apache Mesos

2016-07-03 Thread DhilipKumar Sankaranarayanan
Hello All,



We have built a framework for provisioning redis-servers in Apache
Mesos-enabled infrastructure. It would be awesome to get the community's
feedback. While the Mesos ecosystem is strengthening its capability in the
storage layer, redis could be an addition to the lineup.  This is primarily
intended for providers who would like to host redis as a service in their
infrastructure.



There is an elaborate README about the project which should help in setting
it up: https://github.com/mesos/mr-redis. Please let us know if there is
anything missing in the documentation by raising a PR or opening an issue
or even writing to us.



This project is also packaged as mr-redis with DCOS, so it should be pretty
straightforward to install via the DCOS CLI or DCOS GUI.



(A Step by Step guide is provided in the README for your convenience)



Salient Features of this project include:

1)  Create multiple redis clusters (Master-Slave Cluster) with ease

2)  Redis instances recover in seconds and not in minutes

3)  If a Master fails, a slave is automatically promoted as the new
master; all the old slaves then replicate from the new master, plus a new
slave is added to the cluster.  All this without using redis-sentinel in
your datacenter, and all of this happens in a couple of seconds.

4)  A CLI to perform basic operations such as create / status / delete
redis instances on the fly.  The CLI is cross-compiled for Windows and
Darwin users too.

5)  The scheduler is an HTTP REST server which can also serve the simple
Angular UI we have built to get started with.  Instructions on how to set
up the UI are here: https://github.com/mesos/mr-redis/tree/master/ui/app
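
As a flavour of driving the scheduler's REST API programmatically, here is
a minimal sketch (in Go).  The /v1/CREATE endpoint and the payload fields
are illustrative assumptions only; see the README for the actual API:

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "net/http"
    )

    // CreateRequest is a hypothetical payload asking the scheduler to
    // provision a new redis master with N slaves.
    type CreateRequest struct {
        Name   string `json:"name"`
        Memory int    `json:"memory"` // MB per redis-server
        Slaves int    `json:"slaves"` // replicas behind the master
    }

    func main() {
        req := CreateRequest{Name: "cache-1", Memory: 128, Slaves: 2}
        body, _ := json.Marshal(req)

        resp, err := http.Post("http://scheduler.local:8080/v1/CREATE",
            "application/json", bytes.NewReader(body))
        if err != nil {
            panic(err)
        }
        defer resp.Body.Close()
        fmt.Println("create request status:", resp.Status)
    }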



We also had an opportunity to talk about this during the recent MesosCon
2016:
https://www.youtube.com/watch?v=xe-Gom5tOl0&index=32&list=PLGeM09tlguZQVL7ZsfNMffX9h1rGNVqnC



Future Work:

· Implement a proxy technique to expose one single endpoint for each
redis instance.  (Work in progress)

· Implement memory cgroups per redis process

· Add support for Redis 3.0 cluster instances (adding shards to a
running redis instance)

· Implement an integration test suite and a benchmarking suite for the
framework.



Special thanks to the Adobe.io team, who expressed interest in collaborating
in the development of this product. I’m sure it’s going to be great working
with all of you folks.



An advance happy Independence Day to America, and a happy week ahead to the
rest of the world.



Looking forward to hearing from you all,

Dhilip