John, thanks a ton for your valuable feedback! We're glad to have your perspective as a user of the project, and I'm ready and willing to give you edit access to the wiki if you want to update it with your learnings, elaborate on anything that's unclear, or add a new "John's tips" page. Just sign up for a wiki account, send me your accountId, and I'll grant you edit access. (I'll let others answer your specific questions.)
On Wed, Aug 19, 2015 at 6:28 AM, John Omernik <[email protected]> wrote:
> Today, I will be playing the role of the fool/jester trying to get Myriad
> running. Basically, since getting Myriad running with Santosh quite a while
> ago, and now trying again with new versions of Hadoop, MapR, and Myriad, I
> wanted to hit up the wiki
> (https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home) and
> outline points that, as a non-dev living the code, are unclear to someone
> trying to utilize Myriad or understand its operation.
>
> Obviously, some of my points can be answered with "look here in the code"
> or "look at this page", but I will try to outline my thought processes as I
> reviewed the current docs. Sometimes the way I approached the problem led
> me down a path to a certain page, missing the answer in a different
> page, and thus some cross linking could be helpful.
>
> Please do not let my points be taken as anything other than a desire to
> improve how accessible Myriad is to the community; this is not a critique
> of the hard work everyone has done on the project. I also understand that,
> given the workload and other issues, fixing these issues in
> documentation may not be a priority. I am listing them out here so that
> those folks who are SMEs on various points may be able to quickly add stuff
> and we'll organize it later.
>
>
> *Remote Distribution:*
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Remote+Distribution
>
> This whole section could use some work from a standpoint of what runs where
> and where that component gets its files. For example, I think it would
> help people to understand that the whole tarball created in step 6 has all
> the files for node managers and resource managers. Basically, everything
> runs from there.
> Here is a small example I am currently working with:
>
>
> Starting Myriad:
> Option 1: Use Marathon (provide example json, here is mine)
> {
>   "cmd": "env && export YARN_RESOURCEMANAGER_OPTS=-Dyarn.resourcemanager.hostname=myriad.marathon.mesos && hadoop-2.7.0/bin/yarn resourcemanager",
>   "uris": ["maprfs:///mesos/myriad/hadoop-2.7.0.tar.gz"],
>   "cpus": 1.0,
>   "mem": 1024,
>   "id": "myriad",
>   "instances": 1,
>   "user": "mapr"
> }
>
> In this case, Marathon grabs the hadoop tarball and pulls it down; this
> tarball also has the Myriad yml file. When it executes the resource
> manager, it is brought up in Myriad and ready to run node managers by
> pulling the tarball to the slave nodes and executing the nodemanager. (I
> would imagine the work with history server etc. would also use this
> tarball?)
>
> From here it will use NMInstances to launch a node manager. (Note, this is
> different from when I originally set things up... before, I could run the
> resource manager/myriad without a nodemanager, now it seems it's required
> based on the config in the src... could we expound on this in the docs
> somewhere?)
>
>
> Option 2: ???? (Are there other ways to launch the resource manager?)
>
> Step 6: So something that is unclear to me is the handling of the
> hadoop/yarn config files. In Step 6 on this page, there is
> "sudo rm hadoop-2.5.0/etc/hadoop/*.xml". This doesn't make sense to me. I
> actually ignored this step. For me, if I remove these xml files, then there
> is no place to get my files... I think? Since I am running the RM and NM
> from the same tarball, and Myriad config is here, and my goal is to not
> have anything installed on a node, where would I set yarn settings? This
> could be much clearer to me, and probably others.
>
> Step 2: Should we just be copying the Myriad files to the
> /share/hadoop/yarn/lib folder? Do we worry about potential overwrites of
> jars or version conflicts?
>
> *Configuring Cgroups*
> https://cwiki.apache.org/confluence/display/MYRIAD/Configuring+Cgroups
> At some point, it would help to add a little bit more about why one would
> want CGroups and the issues that could occur with them. While many folks
> using Mesos/Myriad may understand this, others may not, and it's a good way
> to help people think positively about our project if we help educate them
> along the way.
>
> Minor point on enabling CGroups. This is confusing given my questions in
> remote distribution. Here it says I need to edit my yarn-site.xml, but
> remote distribution says to delete my hadoop xml files. We need to
> address this conflict because it can be confusing for a user coming onboard.
>
> Nitpick: Enabling cgroups for mess-slave - should be - Enabling cgroups for
> mesos-slave
>
> *Myriad Configuration Properties*:
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Configuration+Properties
> Based on the conversation on list with Yuliya,
> "Currently, this file is built into Myriad Scheduler jar. So, if you need
> to modify some of the properties in this file, modify them before building
> Myriad Scheduler."
> isn't accurate any more, and we should address that.
>
> The configuration file in the wiki is an old one; nmInstances isn't in
> it (and see my question about that above).
>
> Frameworks and usernames. I think the users that the framework runs as,
> the actual node and resource managers, etc. is confusing to a user (I am
> very confused!). When I first got Myriad up I set my user under the
> executor to be mapr, and then it appeared to work with impersonation from
> queries etc. Now, I am trying the remote distribution and I have users set
> in the config, potentially a user in my marathon json, and I am getting
> errors on permissions of files when a node manager tries to start (a
> separate issue I will post later).
> Basically, this is complex, and a page describing what needs to run where,
> with which permissions, and how that interacts will be huge for people
> looking to put this into play.
>
> *Example Yarn Site:*
> https://cwiki.apache.org/confluence/display/MYRIAD/Example%3A+yarn-site.xml
>
> This is helpful, but where does it go? Remember, the remote distribution
> had us delete the yarn-site in the hadoop etc folder.
>
> *Myriad Webapp*
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Webapp
>
> This should be fleshed out a bit more. Also, it's in the
> /myriad-scheduler/src/main/resources/webapp based on my git clone, but in
> the wiki that's not listed. I had to dig for it.
>
> Some questions here: could the webapp be built during the myriad building
> process? Could it then be packaged as a tarball for execution either
> manually via marathon or automatically in a container on mesos? I
> understand this is a fresh piece of the puzzle; I am just thinking about
> and verbalizing the "where" on this for the future.
>
>
>
> Those are the items that come to mind thus far. I hope the tone of my
> email is correct; this is a great project, and I want others to try it as I
> have.
>
> John Omernik
