I thought I read that there was going to be a registry implementation backed by ZooKeeper; does anyone know why that was dropped?
Really excited to see the containerizer features rolling in, but at first
glance the quorum requirement looks like it makes Mesos a little harder to
operate ("This means adding or removing masters must be done carefully!") -
I understand the benefits but was hoping we could get by with the ZooKeeper
registry.

On 13 June 2014 03:49, Dave Lester <daveles...@gmail.com> wrote:
> Hi All,
>
> Below is a blog post that Ben Mahler wrote as release manager for Mesos
> 0.19.0; it was published on the Mesos site today.
>
> I know that not everyone follows @ApacheMesos Twitter (even though you
> should!), so I wanted to make sure it was also shared on the user@ list.
>
> Cheers,
> Dave
>
>
> Apache Mesos 0.19.0 Released
>
> The latest Mesos release, 0.19.0, is now available for download. This new
> version includes the following features and improvements:
>
> - The master now persists the list of registered slaves in a durable
>   replicated manner using the Registrar and the replicated log.
> - Alpha support for custom container technologies has been added with the
>   ExternalContainerizer.
> - Metrics reporting has been overhauled and is now exposed on
>   <ip:port>/metrics/snapshot.
> - Slave Authentication: optionally, only authenticated slaves can register
>   with the master.
> - Numerous bug fixes and stability improvements.
>
> Full release notes are available on JIRA.
>
> Registrar
>
> Mesos 0.19.0 introduces the “Registrar”: the master now persists the list of
> registered slaves in a durable replicated manner. The previous lack of
> durable state was an intentional design decision that simplified failover
> and allowed masters to be run and migrated with ease. However, the stateless
> design had issues:
>
> - In the event of a dual failure (slave fails while master is down), no lost
>   task notifications are sent. This leads to a task running according to the
>   framework but unknown to Mesos.
> - When a new master is elected, we may allow rogue slaves to re-register with
>   the master. This leads to tasks running on the slave that are not known to
>   the framework.
>
> Persisting the list of registered slaves allows failed over masters to
> detect slaves that do not re-register, and notify frameworks accordingly. It
> also allows us to prevent rogue slaves from re-registering; terminating the
> rogue tasks in the process.
>
> The state is persisted using the replicated log (available since 0.9.0).
>
> External Containerization
>
> As alluded to during the containerization / isolation refactor in 0.18.0,
> the ExternalContainerizer has landed in this release. This provides alpha
> level support for custom containerization.
>
> Developers can implement their own external containerizers to provide
> support for custom container technologies. Initial Docker support is now
> available through some community driven external containerizers: Docker
> Containerizer for Mesos by Tom Arnfeld and Deimos by Jason Dusek. Please
> reach out on the mailing lists with questions!
>
> Metrics
>
> Previously, Mesos components had to use custom metrics code and custom HTTP
> endpoints for exposing metrics. This made it difficult to expose additional
> system metrics and often required having an endpoint for each libprocess
> Process (Actor) for which metrics were desired. Having metrics spread across
> endpoints was operationally complex.
>
> We needed a consistent, simple, and global way to expose metrics, which led
> to the creation of a metrics library within libprocess. All metrics are now
> exposed via /metrics/snapshot. The /stats.json endpoint remains for
> backwards compatibility.
>
> Upgrading
>
> For backwards compatibility, the “Registrar” will be enabled in a phased
> manner. By default, the “Registrar” is write-only in 0.19.0 and will be
> read/write in 0.20.0.
>
> If running in high-availability mode with ZooKeeper, operators must now
> specify the --work_dir for the master, along with the --quorum size of the
> ensemble of masters. This means adding or removing masters must be done
> carefully! The best practice is to only ever add or remove a single master
> at a time and to allow a small amount of time for the replicated log to
> catch up on the new master. Maintenance documentation will be added to
> reflect this.
>
> Please refer to the upgrades document, which details how to perform an
> upgrade from 0.18.x.
>
> Future Work
>
> Thanks to the Registrar, reconciliation primitives can now be provided to
> ensure that the state of tasks between Mesos and frameworks is kept
> consistent. This will remove the need for frameworks to implement
> out-of-band task reconciliation to inspect the state of slaves.
> Reconciliation work is being tracked at MESOS-1407.
>
> The addition of state through the Registrar opens up a rich set of possible
> features that were previously not possible due to the lack of persistent
> state in the master. These include:
>
> - Cluster maintenance primitives (MESOS-1474)
> - Repair automation (MESOS-695)
> - Global resource reservations
>
> Getting Involved
>
> We encourage you to try out this release, and let us know what you think and
> if you hit any issues on the user mailing list. You can also get in touch
> with us via @ApacheMesos or via mailing lists and IRC.
>
> Thanks
>
> Thanks to the 32 contributors who made 0.19.0 possible:
>
> Ashutosh Jain, Adam B, Alexandra Sava, Anton Lindström, Archana kumari,
> Benjamin Hindman, Benjamin Mahler, Bernardo Gomez Palacio, Bernd Mathiske,
> Charlie Carson, Chengwei Yang, Chi Zhang, Dave Lester, Dominic Hamon, Ian
> Downes, Isabel Jimenez, Jake Farrell, Jameel Al-Aziz, Jiang Yan Xu, Jie Yu,
> Nikita Vetoshkin, Niklas Q. Nielsen, Ritwik Yadav, Sam Taha, Steven Phung,
> Till Toenshoff, Timothy St. Clair, Tobi Knaup, Tom Arnfeld, Tom Galloway,
> Vinod Kone, Vinson Lee
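
On the quorum point in the quoted Upgrading section: here is a minimal sketch of how the --quorum value might be derived from the number of masters. It assumes, as I understand the replicated log, that the quorum has to be a strict majority of the master ensemble. The helper names and the /var/lib/mesos path are hypothetical; only the --quorum and --work_dir flags come from the release notes.

```python
def quorum_size(num_masters: int) -> int:
    """Smallest strict majority of the master ensemble (assumed rule)."""
    if num_masters < 1:
        raise ValueError("need at least one master")
    return num_masters // 2 + 1


def master_flags(num_masters: int, work_dir: str) -> list:
    """Build the two flags the 0.19.0 notes say HA masters must now pass."""
    return [f"--quorum={quorum_size(num_masters)}", f"--work_dir={work_dir}"]


if __name__ == "__main__":
    # /var/lib/mesos is a hypothetical path, not taken from the release notes.
    print(" ".join(master_flags(3, "/var/lib/mesos")))
    # The required majority shifts as the ensemble grows.
    for n in (3, 4, 5):
        print(n, "masters -> quorum", quorum_size(n))
```

With three masters this gives --quorum=2, and a fourth master raises the majority to 3, which is presumably part of why the post recommends changing only one master at a time.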