John, thanks a ton for your valuable feedback! We're glad to have your
perspective as a user of the project, and I'm ready+willing to give you
edit access to the wiki if you want to update it with your learnings,
elaborate anything that's unclear, or add a new "John's tips" page. Just
sign up for a wiki account, send me your accountId, and I'll grant you edit
access.
(I'll let others answer your specific questions)

On Wed, Aug 19, 2015 at 6:28 AM, John Omernik <[email protected]> wrote:

> Today, I will be playing the role of the fool/jester trying to get Myriad
> running. Basically, since getting Myriad running with Santosh quite a while
> ago, and now trying again with new versions of Hadoop, MapR, and Myriad, I
> wanted to hit up the wiki (
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home) and outline
> points that, as a non-dev living the code, are unclear to someone trying to
> utilize Myriad or understand its operation.
>
> Obviously, some of my points can be answered with "look here in the code"
> or look at this page, but I will try to outline my thought processes as I
> reviewed the current docs.  Sometimes the way I approached the problem led
> me down a path to a certain page, missing the answer in a different
> page, and thus some cross linking could be helpful.
>
> Please do not let my points be taken as anything other than a desire to
> improve how accessible Myriad is to the community; this is not a critique
> of the hard work everyone has done on the project.  I also understand that,
> given the workload and other issues, fixing these issues in
> documentation may not be a priority.  I am listing them out here, so that
> those folks who are SMEs on various points may be able to quickly add stuff
> and we'll organize it later.
>
>
> *Remote Distribution: *
>
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Remote+Distribution
>
> This whole section could use some work from a standpoint of what runs where
> and where that component gets its files.  For example, I think it would
> help people to understand that the whole tarball created in step 6 has all
> the files for node managers and resource managers.  Basically, everything
> runs from there. Here is a small example I am currently working with:
>
>
> Starting Myriad:
> Option 1: Use Marathon (provide example json, here is mine)
> {
> "cmd": "env && export YARN_RESOURCEMANAGER_OPTS=-Dyarn.resourcemanager.hostname=myriad.marathon.mesos && hadoop-2.7.0/bin/yarn resourcemanager",
> "uris": ["maprfs:///mesos/myriad/hadoop-2.7.0.tar.gz"],
> "cpus": 1.0,
> "mem": 1024,
> "id": "myriad",
> "instances": 1,
> "user": "mapr"
> }
>
> In this case, Marathon grabs the hadoop tarball and pulls it down; this
> tarball also has the Myriad yml file. When it executes the resource
> manager, it is brought up in Myriad and ready to run node managers by
> pulling the tarball to the slave nodes and executing the nodemanager.  (I
> would imagine the work with history server etc would also use this
> tarball?).
>
> From here it will use NMInstances to launch a node manager.  (Note, this is
> different from when I originally set things up... before, I could run the
> resource manager/myriad without a nodemanager, now it seems it's required
> based on the config in the src... could we expound on this in the docs
> somewhere?)
>
>
> Option 2: ???? (Are there other ways to launch the resource manager?)
>
> Step 6: So something that is unclear to me is the handling of the
> hadoop/yarn config files.  In Step 6 on this page, there is "sudo rm
> hadoop-2.5.0/etc/hadoop/*.xml".  This doesn't make sense to me. I actually ignored
> this step.  For me, if I remove these xml files, then there is no place to
> get my files... I think? Since I am running the RM and NM from the same
> tarball, and Myriad config is here, and my goal is to not have anything
> installed on a node, where would I set  yarn settings? This could be much
> clearer to me, and probably others.
>
> Step 2:  Should we just be copying the Myriad files to
> /share/hadoop/yarn/lib folder? Do we worry about potential overwrites of
> jars or version conflicts?
>
> *Configuring Cgroups*
> https://cwiki.apache.org/confluence/display/MYRIAD/Configuring+Cgroups
> At some point, it would help to add a little more about why one would want
> CGroups and the issues that could occur with them. While many folks using
> Mesos/Myriad may
> understand this, others may not, and it's a good way to help people think
> positively about our project if we help educate them along the way.
>
> Minor point on enabling CGroups. This is confusing given my questions in
> remote distribution. In this it says I need to edit my yarn-site.xml, but
> in remote distribution it says delete my hadoop xml files. We need to
> address this conflict because it can be confusing for a user coming onboard.
>
> Nitpick: Enabling cgroups for mess-slave - should be - Enabling cgroups for
> mesos-slave
>
> *Myriad Configuration Properties*:
>
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Configuration+Properties
> Based on the conversation on list with Yuliya,
> "Currently, this file is built into Myriad Scheduler jar. So, if you need
> to modify some of the properties in this file, modify them before building
> Myriad Scheduler."
> isn't accurate any more, and we should address that.
>
> The configuration file in the wiki is an old one; nmInstances isn't in
> it (see my question about that above).
>
> Frameworks and usernames.   I think the users that the framework runs as,
> the actual node and resource managers, etc is confusing to a user (I am
> very confused!)  When I first got Myriad up I set my user under the
> executor to be mapr, and then it appeared to work with impersonation from
> queries etc.  Now, I am trying the remote distribution and I have users set
> in the config, potentially a user in my marathon json, and I am getting
> errors on permissions of files when a node manager tries to start (a
> separate issue I will post later). Basically, this is complex, and a page
> describing what needs to run where with which permissions and how that
> interacts will be huge for people looking to put this into play.
>
> *Example Yarn Site:*
> https://cwiki.apache.org/confluence/display/MYRIAD/Example%3A+yarn-site.xml
>
> This is helpful, but where does it go?  Remember, the remote distribution
> had us delete the yarn-site in the hadoop etc folder.
>
> *Myriad Webapp *
>  https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Webapp
>
> This should be fleshed out a bit more.  Also, it's in the
> /myriad-scheduler/src/main/resources/webapp based on my git clone, but in
> the wiki that's not listed.  I had to dig for it.
>
> Some questions here: could the webapp be built during the Myriad building
> process? Could it then be packaged as a tarball for execution either
> manually via marathon or automatically in a container on mesos?  I
> understand this is a fresh piece of the puzzle, I am just thinking about
> and verbalizing the "where" on this for the future.
>
>
>
> Those are the items that come to mind thus far.  I hope the tone of my
> email is correct, this is a great project, and I want others to try it as I
> have.
>
> John Omernik
>
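
For anyone adapting the Marathon app definition quoted above, here is a minimal sketch (not part of John's original message) that builds the same definition in Python. It keeps the "cmd" value as one unbroken string, which the email line-wrapping obscured. The function name build_marathon_app and its parameters are hypothetical, chosen only for illustration; the field values are copied from the quoted JSON.

```python
import json


def build_marathon_app(hadoop_version="2.7.0",
                       rm_hostname="myriad.marathon.mesos",
                       tarball_uri="maprfs:///mesos/myriad/hadoop-2.7.0.tar.gz"):
    """Return a Marathon app definition (as a dict) for the resource manager.

    Hypothetical helper: values mirror the JSON quoted in the email above.
    """
    # Join the shell steps with " && " so the command stays on one line.
    cmd = " && ".join([
        "env",
        "export YARN_RESOURCEMANAGER_OPTS="
        f"-Dyarn.resourcemanager.hostname={rm_hostname}",
        f"hadoop-{hadoop_version}/bin/yarn resourcemanager",
    ])
    return {
        "cmd": cmd,
        "uris": [tarball_uri],  # Marathon fetches this tarball before running cmd
        "cpus": 1.0,
        "mem": 1024,
        "id": "myriad",
        "instances": 1,
        "user": "mapr",
    }


if __name__ == "__main__":
    # Print JSON that could be saved to a file and submitted to Marathon.
    print(json.dumps(build_marathon_app(), indent=2))
```

The printed JSON could then be submitted to Marathon, for example by POSTing it to the /v2/apps endpoint; the exact host and port depend on your cluster and are not assumed here.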
