Re: May board report
Justin,

Thanks for the detailed feedback. As someone involved with the project, here is my understanding:

1. We do have a hangout to discuss the project every 2 weeks. Is this not appropriate? It really helps to sync up with where we are in the process. Would it help if we had meeting minutes and posted them to the email list?

2. Most of the work and PRs are in a non-ASF git repo at the moment. I believe we were waiting for eBay to officially donate the code (tied up in legal, last I remember). Once the donation is official, we were going to change the package namespace and move to the ASF repo.

3. I would need to review in more detail.

We really appreciate your involvement and feedback! Looking forward to more clarification if you have thoughts on my questions.

ken

On May 6, 2015, at 11:56 PM, Justin Mclean jus...@classsoftware.com wrote:

Hi,

I've been appointed shepherd for Myriad this month, and while doing a review I noticed a couple of minor things that you may wish to address in your board report and/or on this dev list. Please note that this is just from a casual glance; you certainly know your project better than I do, and I may have missed something / there may not be any issues at all.

1. Conversations seem to be happening off list. [1]
2. A git repo has been created, but there's no code in it. [2] Pull requests are still happening on the non-ASF GitHub.
3. You may need some clarity around making releases in this way. [3] I suggest you carefully read [4] and [5]. Note that "Under no circumstances are unapproved builds a substitute for releases."

Again, I'm not involved in this project, and it's up to the PMC (with the Mentors' help) to decide what to do and not do, so feel free to ignore the above if you don't think there's any issue here.

Thanks,
Justin

1. http://mail-archives.apache.org/mod_mbox/incubator-myriad-dev/201504.mbox/%3ccakoqndusesuoko4yraqwsjblvjrhstbcqghcngwynuhzym7...@mail.gmail.com%3e
2. https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=summary
3. http://mail-archives.apache.org/mod_mbox/incubator-myriad-dev/201504.mbox/%3c985b60f3-eb09-4b54-8b4a-6a791c151...@mesosphere.io%3e
4. http://www.apache.org/dev/release.html#what-must-every-release-contain
5. http://www.apache.org/dev/release.html#what
Re: Recommending or requiring mesos dns?
John,

1. +1 to mesos-dns aware.

2. All tasks deployed by Mesos are already in mesos-dns, so all the NMs are there (we should make sure they have good names).

3. The RM is not usually started with Mesos… if it was, it would also be listed in mesos-dns; however, a process started outside Mesos is not currently added to mesos-dns. At some point mesos-dns will allow for out-of-band server registration… but it isn't there today.

4. I would like to see multiple YARN clusters on Mesos supported with multiple Myriads. Each Myriad would manage its cluster and would register with a unique framework id.

ken

On May 7, 2015, at 5:51 AM, John Omernik j...@omernik.com wrote:

I've implemented mesos-dns and use Marathon to launch my Myriad framework. It shows up as myriad.marathon.mesos and makes it easy to find which node the framework launched the ResourceManager on.

What if we made Myriad mesos-dns aware, and prior to launching the YARN RM, it could register in mesos-dns? This would cover both the IP addresses and the ports (we need to figure out multiple ports in mesos-dns). Then it could write out ports and host names in the NM configs by checking mesos-dns for which ports the ResourceManager is using.

Side question: when a NodeManager registers with the ResourceManager, are the ports the NM is running on completely up to the NM? I.e., I can run my NM web server on any port, and YARN just explains that to the RM on registration? Because then we need a mechanism at launch of the NM task to understand which ports Mesos has allocated to the NM and update the yarn-site for that NM before launch.

Perhaps mesos-dns as a requirement isn't needed, but I am trying to walk through options that get us closer to multiple YARN clusters on a Mesos cluster.

John

--
Sent from my iThing
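The config rewriting John describes — pointing each NM's yarn-site at the RM discovered through mesos-dns — could be sketched roughly as below. This is a minimal illustration, not anything Myriad does today: the rmEntries helper is hypothetical, though the property keys are standard YARN configuration names, and myriad.marathon.mesos is the name John reports mesos-dns assigning when Marathon launches the framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NmConfigSketch {
    // Build the yarn-site entries an NM would need, given the RM host
    // discovered via mesos-dns and the ports the RM registered there.
    static Map<String, String> rmEntries(String rmHost, int clientPort,
                                         int trackerPort, int webPort) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("yarn.resourcemanager.address", rmHost + ":" + clientPort);
        props.put("yarn.resourcemanager.resource-tracker.address", rmHost + ":" + trackerPort);
        props.put("yarn.resourcemanager.webapp.address", rmHost + ":" + webPort);
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> p = rmEntries("myriad.marathon.mesos", 8032, 8031, 8088);
        // Myriad would write these into the NM's yarn-site before launch.
        System.out.println(p.get("yarn.resourcemanager.address")); // prints myriad.marathon.mesos:8032
    }
}
```

The open question John raises remains the ports: mesos-dns A records only give the host, so a scheme like this still needs some way to learn which ports the RM was actually allocated.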
Re: dev Digest of: get.107_109
Hi,

> 1. we do have a hangout to discuss the project every 2 weeks. Is this not appropriate?

There's nothing wrong with that, and in fact I would assume it helps build community. It would certainly help if the minutes/summary were posted to the mailing list, preferably in plain text and preferably in the same thread; posting to a doc may mean that eventually that information is lost. That way, people new to the project or those not involved in the meetup can look at the email archives and get a better idea of what is going on. Any decisions, however, need to be made on the mailing list. [1]

Thanks,
Justin

1. http://www.apache.org/dev/pmc.html#mailing-list-naming-policy
Re: Recommending or requiring mesos dns?
Hi John,

Great thoughts on extending mesos-dns for the RM's discovery. Some of my own:

1. There are 5 primary interfaces the RM exposes that are bound to standard ports:
   a. RPC interface for clients that want to submit applications to YARN (port 8032).
   b. RPC interface for NMs to connect back/heartbeat to the RM (port 8031).
   c. RPC interface for App Masters to connect back/heartbeat to the RM (port 8030).
   d. RPC interface for admins to interact with the RM via CLI (port 8033).
   e. Web interface for the RM's UI (port 8088).

2. When we launch the RM using Marathon, it's probably better to state in Marathon's config that the RM will use the above ports. This is because, if the RM listens on random ports (as opposed to the standard ports listed above), then when the RM fails over, the new RM gets ports that might be different from the ones used by the old RM. This makes the RM's discovery hard, especially post failover.

3. It looks like what you are proposing is a way to update mesos-dns as to what ports the RM's services are listening on, and when the RM fails over, these ports would get updated in mesos-dns. Is my understanding correct? If yes, one challenge I see is that the clients that want to connect to the above listed RM interfaces also need to pull the changes to the RM's port numbers from mesos-dns dynamically. Not sure how that might be possible.

Regarding your question about NM ports:

1. The NM has the following ports:
   a. RPC port for App Masters to launch containers (this is a random port).
   b. RPC port for the localization service (port 8040).
   c. Web port for the NM's UI (port 8042).

2. Ports (a) and (c) are relayed to the RM when the NM registers with the RM. Port (b) is passed to a local container executor process via command line args.

3. As you rightly reckon, we need a mechanism at launch of the NM to pass the Mesos-allocated ports to the NM for the above interfaces. We can try to use the variable expansion mechanism Hadoop's Configuration has (http://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/conf/Configuration.html) to achieve this.

Thanks,
Santosh

On Thu, May 7, 2015 at 3:51 AM, John Omernik j...@omernik.com wrote:

I've implemented mesos-dns and use Marathon to launch my Myriad framework. It shows up as myriad.marathon.mesos and makes it easy to find which node the framework launched the ResourceManager on.

What if we made Myriad mesos-dns aware, and prior to launching the YARN RM, it could register in mesos-dns? This would cover both the IP addresses and the ports (we need to figure out multiple ports in mesos-dns). Then it could write out ports and host names in the NM configs by checking mesos-dns for which ports the ResourceManager is using.

Side question: when a NodeManager registers with the ResourceManager, are the ports the NM is running on completely up to the NM? I.e., I can run my NM web server on any port, and YARN just explains that to the RM on registration? Because then we need a mechanism at launch of the NM task to understand which ports Mesos has allocated to the NM and update the yarn-site for that NM before launch.

Perhaps mesos-dns as a requirement isn't needed, but I am trying to walk through options that get us closer to multiple YARN clusters on a Mesos cluster.

John

--
Sent from my iThing
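The variable-expansion suggestion works because Hadoop's Configuration substitutes ${other.key} references when a value is read. The following is a simplified, self-contained sketch of that mechanism — not Hadoop's actual implementation, which also consults system properties and bounds recursion depth — and myriad.nm.web.port is a hypothetical key Myriad might inject before launching the NM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExpansionSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Expand ${key} references in a value against the given properties,
    // in the spirit of Hadoop Configuration's variable expansion.
    // Unresolvable references are left in place.
    static String expand(String value, Map<String, String> props) {
        Matcher m = VAR.matcher(value);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String substitution = props.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(sb, Matcher.quoteReplacement(substitution));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        // Suppose Myriad injects the Mesos-allocated port under a
        // hypothetical key before launching the NM:
        props.put("myriad.nm.web.port", "31005");
        String addr = expand("0.0.0.0:${myriad.nm.web.port}", props);
        System.out.println(addr); // prints 0.0.0.0:31005
    }
}
```

With real Hadoop, the same effect would come from setting yarn.nodemanager.webapp.address to a value containing a ${...} reference and letting Configuration.get() resolve it.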
Re: Recommending or requiring mesos dns?
So I may be looking at this wrong, but where is the data for the RM stored if it does fail over? How will it know to pick up where it left off? This is just one area where my understanding is low.

That said, what about pre-allocating a second, failover RM somewhere on the cluster? (I am just tossing an idea out here, in that there are probably many reasons not to do this.) Here is how I could see it happening:

1. Myriad starts an RM, asking for 5 random available ports. Mesos replies, starting the RM, and reports to Myriad the 5 ports used for the services you listed below.

2. Myriad then checks a config value for the number of hot spares; let's say we specify 1. Myriad then puts in a resource request to Mesos for the CPU and memory required for the RM, but specifically asks for the same 5 ports allocated to the first. Basically it reserves a spot on another node with the same ports available. It may take a bit, but that availability should turn up. Until this request is met, the YARN cluster is in an HA-compromised position.

3. At this point, perhaps we start another instance of the RM right away (depends on my first question about where the RM stores info about jobs/applications), or the framework just holds the spot, waiting for a lack of heartbeat (failover condition) on the primary ResourceManager.

4. If we can run the spare with no issues, it's a simple update of the DNS record, and NodeManagers connect to the new RM (and another RM is pre-allocated for redundancy). If we can't actually execute the secondary RM until failover conditions, we can now execute the new RM, and the ports will be the same.

This may seem kludgey at first, but done correctly it may actually limit the length of failover time, since the RM is pre-allocated. RMs are not huge from a resource perspective, thus it may be a small cost for those who want failover and multiple clusters (and hence dynamic ports).

I will keep thinking this through, and would welcome feedback.
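Step 2 above hinges on Myriad accepting only an offer whose port resources contain all 5 of the primary RM's ports. A minimal sketch of that check, assuming ports arrive as inclusive ranges (as Mesos advertises them in a Value.Ranges resource); offerMatchesPorts is a hypothetical helper, not existing Myriad code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class HotSpareSketch {
    // An offer's port resource is a list of inclusive [lo, hi] ranges.
    // Accept the offer only if every port the primary RM holds is
    // covered by some range; otherwise keep waiting for a better offer.
    static boolean offerMatchesPorts(List<int[]> offeredRanges, Set<Integer> requiredPorts) {
        for (int port : requiredPorts) {
            boolean covered = false;
            for (int[] range : offeredRanges) {
                if (port >= range[0] && port <= range[1]) {
                    covered = true;
                    break;
                }
            }
            if (!covered) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // The 5 ports Mesos gave the primary RM (hypothetical values).
        Set<Integer> rmPorts = new TreeSet<>(Arrays.asList(31000, 31001, 31002, 31003, 31004));
        // A spare-node offer advertising ports 30999-31010 covers them all.
        List<int[]> offer = Arrays.asList(new int[]{30999, 31010});
        System.out.println(offerMatchesPorts(offer, rmPorts)); // prints true
    }
}
```

The cost of this scheme is the one John notes: an offer with those exact ports free may take a while to appear, during which the cluster runs without a warm spare.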
On Thursday, May 7, 2015, Santosh Marella smare...@maprtech.com wrote:

Hi John,

Great thoughts on extending mesos-dns for the RM's discovery. Some of my own:

1. There are 5 primary interfaces the RM exposes that are bound to standard ports:
   a. RPC interface for clients that want to submit applications to YARN (port 8032).
   b. RPC interface for NMs to connect back/heartbeat to the RM (port 8031).
   c. RPC interface for App Masters to connect back/heartbeat to the RM (port 8030).
   d. RPC interface for admins to interact with the RM via CLI (port 8033).
   e. Web interface for the RM's UI (port 8088).

2. When we launch the RM using Marathon, it's probably better to state in Marathon's config that the RM will use the above ports. This is because, if the RM listens on random ports (as opposed to the standard ports listed above), then when the RM fails over, the new RM gets ports that might be different from the ones used by the old RM. This makes the RM's discovery hard, especially post failover.

3. It looks like what you are proposing is a way to update mesos-dns as to what ports the RM's services are listening on, and when the RM fails over, these ports would get updated in mesos-dns. Is my understanding correct? If yes, one challenge I see is that the clients that want to connect to the above listed RM interfaces also need to pull the changes to the RM's port numbers from mesos-dns dynamically. Not sure how that might be possible.

Regarding your question about NM ports:

1. The NM has the following ports:
   a. RPC port for App Masters to launch containers (this is a random port).
   b. RPC port for the localization service (port 8040).
   c. Web port for the NM's UI (port 8042).

2. Ports (a) and (c) are relayed to the RM when the NM registers with the RM. Port (b) is passed to a local container executor process via command line args.

3. As you rightly reckon, we need a mechanism at launch of the NM to pass the Mesos-allocated ports to the NM for the above interfaces. We can try to use the variable expansion mechanism Hadoop's Configuration has (http://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/conf/Configuration.html) to achieve this.

Thanks,
Santosh

On Thu, May 7, 2015 at 3:51 AM, John Omernik j...@omernik.com wrote:

I've implemented mesos-dns and use Marathon to launch my Myriad framework. It shows up as myriad.marathon.mesos and makes it easy to find which node the framework launched the ResourceManager on.

What if we made Myriad mesos-dns aware, and prior to launching the YARN RM, it could register in mesos-dns? This would cover both the IP addresses and the ports (we need to figure out multiple ports in mesos-dns). Then it could write out ports and host names in the NM configs by checking mesos-dns for which ports the ResourceManager is using.

Side question: when a NodeManager registers with the ResourceManager, are the ports the NM is running on completely up to the NM? I.e., I can run my NM web server on any port, and YARN just