Re: Talks at MesosCon 2015
Yes, I agree. I looked for live video on YouTube today, but found nothing there. A forum would definitely help us follow up and continue the discussion.

Thanks!
Kenneth

On Thu, Aug 20, 2015 at 11:26 AM, Haripriya Ayyalasomayajula aharipriy...@gmail.com wrote:

Hi all,

I'm at MesosCon 2015 today and was curious whether all the talks and presentations will be captured anywhere (Mesosphere blog, YouTube). It would be very helpful to have them recorded. Several interesting talks are scheduled at the same time, and it's not possible to attend them all. I strongly believe a forum for following up on the talks and topics presented here would be helpful.

Thanks.
--
Regards,
Haripriya Ayyalasomayajula
Re: StratOS: A Big Data platform for scientific computing
Good to know about it! Thanks!

On Tue, Mar 10, 2015 at 3:14 PM, Nathaniel Stickley idi...@gmail.com wrote:

Hello Mesos users,

I am not sure that this is the best place for this announcement, but I thought it would be worth a try...

The Multidisciplinary Image Processing Laboratory at the University of California, Riverside, is announcing a Mesos-based Big Data framework for scientific computing. The project is currently called StratOS (because it is closer to the user than Mesos). Although StratOS is primarily designed for scientists, it is useful for a much larger group of people because of its generality.

StratOS can be thought of as a step between classical batch processors, like TORQUE, and the modern framework, Apache Spark. It is an HDFS-aware framework that allows arbitrary command-line-driven applications to be used in a datacenter. Pre-existing code can be used without modification, and a Python module is provided for interactive use and scripting. The intuitive interface and compatibility with older software make it quite attractive to scientists who have limited coding skills and limited resources with which to hire software professionals.

The project page: https://bitbucket.org/stratos-project/stratos
The formal announcement, submitted to Astronomy and Computing: http://arxiv.org/abs/1503.02233

The project is in its infancy, but it is already being used to analyze a 'multiverse' simulation (an ensemble of cosmological simulations) at UC Riverside. Proper installation scripts have not yet been written, but people on this mailing list should have very little difficulty. Feel free to contribute!

Regards,
Nathaniel R. Stickley, Ph.D.
Assistant Project Scientist
Department of Physics and Astronomy
University of California, Riverside
Re: Mesos cluster auto scaling slaves
Thanks, Michael, for sharing this...

Kenneth

On Tue, Mar 3, 2015 at 11:31 AM, Michael Babineau michael.babin...@gmail.com wrote:

I took a pass at this at last year's MesosCon:

https://github.com/thefactory/autoscale-python
https://github.com/thefactory/autoscale-python/blob/master/examples/mesos_ec2.py

Lots of room for improvement, but it satisfies the basic requirement :)

On Fri, Feb 27, 2015 at 3:32 PM, Kenneth Su su.ke...@gmail.com wrote:

Thanks, Vinod and Andrew, for clarifying. Scaling Mesos slaves is what I am looking for.

Kenneth
Re: Mesos cluster auto scaling slaves
Hi Sharma,

Thanks for the quick response! Yes, before I sent out this question I searched Google and only found your related presentation at AWSconf 2014. Thanks for pointing out that some work is needed in the framework to do this. I am still looking for a guide or related documents on implementing this for my PoC trial.

Thanks again for the prompt response.

Kenneth

On Fri, Feb 27, 2015 at 2:14 PM, Sharma Podila spod...@netflix.com wrote:

Hello Kenneth,

There is a little bit of work needed in the framework to do autoscaling of the slave cluster. Theoretically, scaling up can be relatively easy: watch the utilization and add nodes. However, in order to scale down, the framework must support two things: some kind of bin packing, so it uses as few slaves as possible, and a callout identifying which slaves can be shut down.

I discussed how we achieve this at last year's MesosCon and also at AWS re:Invent, slides from which are at http://www.slideshare.net/spodila/aws-reinvent-2014-talk-scheduling-using-apache-mesos-in-the-cloud in case that helps you with ideas.

On Fri, Feb 27, 2015 at 12:52 PM, Kenneth Su su.ke...@gmail.com wrote:

Hi all,

I am new to Mesos/Mesosphere. I worked through a test from the tutorials and successfully built a cluster with a single master and two slaves, then dispatched tasks through Marathon to all slaves. It ran as expected, and it is great to be able to scale an app to as many instances as it needs.

However, a question came up. I tried to find information on how Mesos could automatically scale the number of slaves to match demand on the hardware/machines, but there seem to be few details on how that works or what the process is.

Do we need another layer that watches the cluster and provisions nodes on demand on a PaaS so that the new nodes automatically join the Mesos cluster, or can Mesos handle that kind of task itself?

I would appreciate any related information or documents.

Thanks!
Kenneth
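The two requirements Sharma describes (pack tasks onto as few slaves as possible, then report which slaves can be shut down) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Slave class, the function names, and the 0.8 utilization threshold are hypothetical, a simple best-fit heuristic stands in for "some kind of bin packing", and nothing here calls a real Mesos or cloud API or reflects how Netflix implements it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Slave:
    """Hypothetical model of one slave node; not a Mesos data structure."""
    name: str
    cpus: float                                        # total CPUs on the node
    tasks: List[float] = field(default_factory=list)   # CPUs used by each task

    @property
    def free(self) -> float:
        return self.cpus - sum(self.tasks)


def utilization(slaves: List[Slave]) -> float:
    """Fraction of all CPUs in the cluster currently assigned to tasks."""
    total = sum(s.cpus for s in slaves)
    used = sum(sum(s.tasks) for s in slaves)
    return used / total if total else 0.0


def place_task(slaves: List[Slave], cpus: float) -> Optional[Slave]:
    """Best-fit placement: pick the fullest slave that still has room,
    so that lightly used slaves drain and become removable."""
    fitting = [s for s in slaves if s.free >= cpus]
    if not fitting:
        return None  # no room anywhere: a signal to add nodes
    target = min(fitting, key=lambda s: s.free)
    target.tasks.append(cpus)
    return target


def should_scale_up(slaves: List[Slave], threshold: float = 0.8) -> bool:
    """Scale-up rule: add nodes once utilization exceeds the threshold."""
    return utilization(slaves) > threshold


def removable_slaves(slaves: List[Slave]) -> List[str]:
    """Scale-down callout: slaves with no tasks can be shut down."""
    return [s.name for s in slaves if not s.tasks]


# Three 4-CPU slaves; three small tasks all land on the same node.
cluster = [Slave("a", 4), Slave("b", 4), Slave("c", 4)]
for demand in (2, 1, 1):
    place_task(cluster, demand)
```

With this packing, slaves "b" and "c" stay empty and are the ones an external provisioning layer could terminate; a spread-out placement would have left every slave partly busy and none removable.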
Re: Mesos cluster auto scaling slaves
Thanks, Andrew! I will search for that. Good to know the Jenkins Mesos framework also does that work.

Kenneth

On Fri, Feb 27, 2015 at 2:37 PM, Andrew Langhorn and...@ajlanghorn.com wrote:

Thanks for the slides, Sharma. I'll have a look this weekend!

One thing you might find interesting, Kenneth, is the Jenkins Mesos framework, which does automatic slave provisioning and horizontal scaling.

Andrew

Sent from my iPhone

On 27 Feb 2015, at 21:16, Sharma Podila spod...@netflix.com wrote:

Hello Kenneth,

There is a little bit of work needed in the framework to do autoscaling of the slave cluster. Theoretically, scaling up can be relatively easy: watch the utilization and add nodes. However, in order to scale down, the framework must support two things: some kind of bin packing, so it uses as few slaves as possible, and a callout identifying which slaves can be shut down.

I discussed how we achieve this at last year's MesosCon and also at AWS re:Invent, slides from which are at http://www.slideshare.net/spodila/aws-reinvent-2014-talk-scheduling-using-apache-mesos-in-the-cloud in case that helps you with ideas.

On Fri, Feb 27, 2015 at 12:52 PM, Kenneth Su su.ke...@gmail.com wrote:

Hi all,

I am new to Mesos/Mesosphere. I worked through a test from the tutorials and successfully built a cluster with a single master and two slaves, then dispatched tasks through Marathon to all slaves. It ran as expected, and it is great to be able to scale an app to as many instances as it needs.

However, a question came up. I tried to find information on how Mesos could automatically scale the number of slaves to match demand on the hardware/machines, but there seem to be few details on how that works or what the process is.

Do we need another layer that watches the cluster and provisions nodes on demand on a PaaS so that the new nodes automatically join the Mesos cluster, or can Mesos handle that kind of task itself?

I would appreciate any related information or documents.

Thanks!
Kenneth
Re: Mesos cluster auto scaling slaves
Thanks, Vinod and Andrew, for clarifying. Scaling Mesos slaves is what I am looking for.

Kenneth
Re: Mesos Master / Slave communications issues
Hi Devin,

I am new to Mesos as well, and I ran into the same problem when I configured it. For reference, my fix was to use the actual master IP instead; the slave then picked it up and connected. I suspect that when the master is advertised as 127.0.0.1, the slave uses that address and connects to itself, which is why it never reaches the master.

Hope it helps!
Kenneth

On Tue, Feb 24, 2015 at 2:50 PM, Devin Carlen devin.car...@gmail.com wrote:

Hello all,

I'm new to Mesos but have recently started trying to stand up a cluster using BOSH. There is a BOSH release for it at https://github.com/cf-platform-eng/mesos-boshrelease that is under active development.

I was able to deploy the cluster successfully; however, the slaves are not communicating with the master. Upon investigation I found that leader election is happening properly with ZooKeeper. For this test I have only 1 Mesos master, 3 Mesos slaves, and 1 ZooKeeper instance. All are running on their own VMs.

The single master gets elected upon startup:

I0224 21:20:40.716702 12024 contender.cpp:243] New candidate (id='0') has entered the contest for leadership
I0224 21:20:40.717182 12024 detector.cpp:134] Detected a new leader: (id='0')
I0224 21:20:40.717718 12030 group.cpp:629] Trying to get '/mesos/info_00' in ZooKeeper
I0224 21:20:40.79 12030 detector.cpp:351] A new leading master (UPID=master@127.0.0.1:80) is detected
I0224 21:20:40.722367 12030 master.cpp:734] The newly elected leader is master@127.0.0.1:80
I0224 21:20:40.722394 12030 master.cpp:742] Elected as the leading master!

I thought it odd that the IP listed here is 127.0.0.1. I have not specified localhost anywhere, and I explicitly specify --ip=0.0.0.0 in my mesos-master command.
The slave sees the election happen, but then appears to connect to 127.0.0.1:80:

I0224 21:24:18.892083 17316 detector.cpp:134] Detected a new leader: (id='0')
I0224 21:24:18.892290 17316 group.cpp:629] Trying to get '/mesos/info_00' in ZooKeeper
I0224 21:24:18.894039 17316 detector.cpp:351] A new leading master (UPID=master@127.0.0.1:80) is detected
I0224 21:24:18.894130 17316 slave.cpp:500] New master detected at master@127.0.0.1:80
I0224 21:24:18.894383 17316 slave.cpp:525] Detecting new master
I0224 21:24:18.894443 17316 status_update_manager.cpp:162] New master detected at master@127.0.0.1:80
I0224 21:24:18.894630 17320 slave.cpp:1957] master@127.0.0.1:80 exited
W0224 21:24:18.894665 17320 slave.cpp:1960] Master disconnected! Waiting for a new master to be elected

At this point the slave never successfully connects. Just to verify, I also checked what ZooKeeper was reporting:

$ /zkCli.sh get /mesos/info_00
201502242120-16777343-80-12000��Pmaster@127.0.0.1:80
cZxid = 0x20
ctime = Tue Feb 24 21:20:40 UTC 2015
mZxid = 0x20
mtime = Tue Feb 24 21:20:40 UTC 2015
pZxid = 0x20
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x14bbd711b6e0012
dataLength = 60
numChildren = 0

So somehow the IP 127.0.0.1 is written instead of the correct IP. Any thoughts on how I can fix this?

Best,
Devin
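The symptom in this thread, a master advertised through ZooKeeper as master@127.0.0.1:80, can be checked for mechanically. The sketch below parses a UPID string of the form shown in the logs and flags addresses that a slave on another VM could never reach. It uses only Python's standard ipaddress module; the function names are hypothetical, and the 10.0.0.5:5050 address in the usage note is an invented example, not a value from the thread.

```python
import ipaddress
from typing import Tuple, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_upid(upid: str) -> Tuple[str, IPAddress, int]:
    """Split a UPID such as 'master@127.0.0.1:80' into (id, ip, port)."""
    ident, _, address = upid.partition("@")
    host, _, port = address.rpartition(":")
    return ident, ipaddress.ip_address(host), int(port)


def is_remotely_reachable(upid: str) -> bool:
    """Return False when the advertised address is loopback (127.0.0.1)
    or unspecified (0.0.0.0); a slave that follows such an address ends
    up talking to itself rather than to the master."""
    _, ip, _ = parse_upid(upid)
    return not (ip.is_loopback or ip.is_unspecified)
```

Running is_remotely_reachable("master@127.0.0.1:80") on the value from the logs returns False, whereas an address such as the hypothetical "master@10.0.0.5:5050" passes. This matches Kenneth's suggestion of giving the master its actual IP rather than a loopback or wildcard address.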