from:"Darin Johnson"

Re: [RESULT] [DISCUSS] Leave Apache and the Incubator

2020-01-20 Thread Darin Johnson

Need to inform the incubator list as well.

Cheers,
Darin

On Mon, Jan 20, 2020, 1:51 AM Javi Roman wrote:

> The community has voted to retire the Myriad project from the Apache Incubator:
>
> +1 6 votes
> -1 0 votes.
>
> We will start the retirement process in accordance with IPMC guidelines.
> --
> Javi Roman
>
> Twitter: @javiromanrh
> GitHub: github.com/javiroman
> Linkedin: es.linkedin.com/in/javiroman
> Big Data Blog: dataintensive.info
> Apache Id: javiroman
>

Re: [DISCUSS] Leave Apache and the Incubator

2020-01-03 Thread Darin Johnson

+1

On Fri, Jan 3, 2020 at 11:14 AM John Yost wrote:

> +1
>
> > On Jan 2, 2020, at 11:51 PM, Javi Roman wrote:
> >
> > Please vote to retire Myriad from the ASF Incubator. My take is that
> > Myriad has failed to get adequate community traction in the ASF
> > Incubator and the administrative work of the ASF is more of a burden
> > than a benefit to the project. Exiting incubation right now seems like
> > the obvious benefit to everyone -- the decision to return can always be
> > made later and should be easier to execute the second time around.
> >
> > [ ] +1 to retire
> > [ ] -1 Not retiring because..
> >
> > Here's my +1
> >
> > This will be open for at least 72 hours and is a Majority vote.
> > --
> > Javi Roman
> >
> > Twitter: @javiromanrh
> > GitHub: github.com/javiroman
> > Linkedin: es.linkedin.com/in/javiroman
> > Big Data Blog: dataintensive.info
> > Apache Id: javiroman
>
>

Re: Myriad 0.3.0 Release Planning

2018-11-20 Thread Darin Johnson

I think the big issue here is going to be getting committers to vote on the
release.  Suggest giving them plenty of time.

On Sun, Nov 18, 2018, 7:09 AM Javi Roman wrote:

> I would like to offer myself as release manager for Myriad 0.3.0.
> I guess we have to try to release a new version of Apache Myriad, and
> try to follow a cadence of one release per quarter. This is the first
> serious attempt at rebooting the project, and it is a way to gauge the
> community's reaction to this project.
>
> I am hoping to cut the first release candidate for 0.3.0 in a few
> days, so we are going to tag the included JIRAs in the "Myriad 0.3.0"
> version here [1]. Additionally we are going to follow the release
> process in this document [2], in order to document the guidelines and
> actions for this release.
>
> [1] https://issues.apache.org/jira/projects/MYRIAD/versions/12335763
> [2]
> https://docs.google.com/document/d/1ZZdBFVKyBtQlKRK_O1d_obi5wjRAfmoHFPo_sye7wQY/edit#heading=h.ouya5fmdjqg1
> --
> Javi Roman
>
> Twitter: @javiromanrh
> GitHub: github.com/javiroman
> Linkedin: es.linkedin.com/in/javiroman
> Big Data Blog: dataintensive.info
> Apache Id: javiroman
>

Re: Myriad running at Mesos 1.5.x

2018-11-20 Thread Darin Johnson

Cool thanks for working this.

On Mon, Nov 19, 2018, 4:16 AM JP Gilaberte wrote:

> Hi Javi, I'm happy too. Thank you all very much for the effort. I think
> it's a very important first step. This helps us align with Mesosphere,
> gives us greater visibility, and makes it more likely that the community
> will grow.
>
> On Sun, 18 Nov 2018 at 12:58, Javi Roman  wrote:
>
> > I am happy to announce the availability of Apache Myriad updated to
> > Apache Mesos 1.5.x.
> > The update is tracked here [1].
> >
> > We still have to integrate Myriad with the new features (states) of
> > Mesos 1.5.x, but this is a good starting point.
> >
> > I would like to thank Juan Pedro for this effort. He worked on
> > different approaches, and the final patch, based on one simple idea
> > from Yuliya Feldman, is already merged. So special thanks to Yuliya.
> >
> > [1] https://issues.apache.org/jira/browse/MYRIAD-264
> > --
> > Javi Roman
> >
> > Twitter: @javiromanrh
> > GitHub: github.com/javiroman
> > Linkedin: es.linkedin.com/in/javiroman
> > Big Data Blog: dataintensive.info
> > Apache Id: javiroman
> >
>

Re: Podling Report Reminder - March 2018

2018-03-05 Thread Darin Johnson

That means we probably need to get Ben or Ted's attention.

On Mon, Mar 5, 2018 at 2:40 PM, Zachary Jaffee <z...@case.edu> wrote:

> It wasn't automatic, I added it for you. Also be aware that the report will
> not be accepted without a mentor sign off.
>
> On Mon, Mar 5, 2018 at 7:31 AM, Javi Roman <jroman.espi...@gmail.com>
> wrote:
>
> > Darin,
> >
> > The report was finally added, probably automatically by the e-mail
> > reply. Anyway I'm going to ask for access at this address, many
> > thanks.
> > --
> > Javi Roman
> >
> > Twitter: @javiromanrh
> > GitHub: github.com/javiroman
> > Linkedin: es.linkedin.com/in/javiroman
> > Big Data Blog: dataintensive.info
> >
> >
> > On Mon, Mar 5, 2018 at 4:05 PM, Darin Johnson <dbjohnson1...@gmail.com>
> > wrote:
> > > Javi please email
> > > gene...@incubator.apache.org
> > >
> > > > And ask for write access to the wiki.  You can then edit it.
> > >
> > > Thanks,
> > > Darin
> > >
> > >
> > > On Sun, Mar 4, 2018 at 7:15 AM, Javi Roman <jroman.espi...@gmail.com>
> > wrote:
> > >
> > >> Project Name:
> > >> Myriad
> > >>
> > >> Project Description:
> > >> Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos
> > >> together on the same cluster and allows dynamic resource allocations
> > >> across both Hadoop and other applications running on the same physical
> > >> data center infrastructure. Myriad has been incubating since
> > >> 2015-03-01.
> > >>
> > >> Three most important issues to address in the move towards graduation:
> > >> In this project state the most important action to move towards
> > >> graduation is to create traction on the project again. So from my
> > >> point of view the list of actions is the following:
> > >>
> > >> [X] Promote the project in social networks, blog entries and so forth
> > >> for getting interest again.
> > >> [X] Get a few active new users.
> > > >> [X] Refresh the state of unattended issues reported by interested
> > > >> users in the past.
> > >>
> > >> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> > >> aware of?
> > > >> Only that new users are willing to give the project a boost
> > > >> again.
> > >>
> > >> How has the community developed since the last report
> > >> The most relevant thing is the activation of mailing list again and a
> > >> few new users willing to participate.
> > >>
> > >> How has the project developed since the last report.
> > >> The project has not evolved since the last report.
> > >>
> > >> How does the podling rate their own maturity.
> > >> [ ] Initial setup
> > >> [ ] Working towards first release
> > >> [X] Community building
> > >> [ ] Nearing graduation
> > >>
> > >> Date of last release:
> > >> 2016-05-29
> > >>
> > >>
> > > >> Any help adding this report to the Incubator Wiki would be
> > > >> appreciated. Thanks!
> > >> --
> > >> Javi Roman
> > >>
> > >> Twitter: @javiromanrh
> > >> GitHub: github.com/javiroman
> > >> Linkedin: es.linkedin.com/in/javiroman
> > >> Big Data Blog: dataintensive.info
> > >>
> > >>
> > >> On Sat, Mar 3, 2018 at 3:55 PM,  <johndam...@apache.org> wrote:
> > >> > Dear podling,
> > >> >
> > >> > This email was sent by an automated system on behalf of the Apache
> > > >> > Incubator PMC. It is an initial reminder to give you plenty of time to
> > > >> > prepare your quarterly board report.
> > > >> >
> > > >> > The board meeting is scheduled for Wed, 21 March 2018, 10:30 am PDT.
> > > >> > The report for your podling will form a part of the Incubator PMC
> > > >> > report. The Incubator PMC requires your report to be submitted 2 weeks
> > > >> > before the board meeting, to allow sufficient time for review and
> > > >> > submission (Wed, March 07).
> > > >> >
> > > >> > Please submit your report with sufficient time to allow the Incubator
> > > >> > PMC, and subsequently board members to review and digest. Again, the
>

Re: Podling Report Reminder - March 2018

2018-03-05 Thread Darin Johnson

Javi please email
gene...@incubator.apache.org

And ask for write access to the wiki.  You can then edit it.

Thanks,
Darin


On Sun, Mar 4, 2018 at 7:15 AM, Javi Roman wrote:

> Project Name:
> Myriad
>
> Project Description:
> Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos
> together on the same cluster and allows dynamic resource allocations
> across both Hadoop and other applications running on the same physical
> data center infrastructure. Myriad has been incubating since
> 2015-03-01.
>
> Three most important issues to address in the move towards graduation:
> In this project state the most important action to move towards
> graduation is to create traction on the project again. So from my
> point of view the list of actions is the following:
>
> [X] Promote the project in social networks, blog entries and so forth
> for getting interest again.
> [X] Get a few active new users.
> [X] Refresh the state of unattended issues reported by interested
> users in the past.
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> aware of?
> Only that new users are willing to give the project a boost
> again.
>
> How has the community developed since the last report
> The most relevant thing is the activation of mailing list again and a
> few new users willing to participate.
>
> How has the project developed since the last report.
> The project has not evolved since the last report.
>
> How does the podling rate their own maturity.
> [ ] Initial setup
> [ ] Working towards first release
> [X] Community building
> [ ] Nearing graduation
>
> Date of last release:
> 2016-05-29
>
>
> Any help adding this report to the Incubator Wiki would be appreciated. Thanks!
> --
> Javi Roman
>
> Twitter: @javiromanrh
> GitHub: github.com/javiroman
> Linkedin: es.linkedin.com/in/javiroman
> Big Data Blog: dataintensive.info
>
>
> On Sat, Mar 3, 2018 at 3:55 PM, <johndam...@apache.org> wrote:
> > Dear podling,
> >
> > This email was sent by an automated system on behalf of the Apache
> > Incubator PMC. It is an initial reminder to give you plenty of time to
> > prepare your quarterly board report.
> >
> > The board meeting is scheduled for Wed, 21 March 2018, 10:30 am PDT.
> > The report for your podling will form a part of the Incubator PMC
> > report. The Incubator PMC requires your report to be submitted 2 weeks
> > before the board meeting, to allow sufficient time for review and
> > submission (Wed, March 07).
> >
> > Please submit your report with sufficient time to allow the Incubator
> > PMC, and subsequently board members to review and digest. Again, the
> > very latest you should submit your report is 2 weeks prior to the board
> > meeting.
> >
> > Thanks,
> >
> > The Apache Incubator PMC
> >
> > Submitting your Report
> >
> > --
> >
> > Your report should contain the following:
> >
> > *   Your project name
> > *   A brief description of your project, which assumes no knowledge of
> > the project or necessarily of its field
> > *   A list of the three most important issues to address in the move
> > towards graduation.
> > *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> > aware of
> > *   How has the community developed since the last report
> > *   How has the project developed since the last report.
> > *   How does the podling rate their own maturity.
> >
> > This should be appended to the Incubator Wiki page at:
> >
> > https://wiki.apache.org/incubator/March2018
> >
> > Note: This is manually populated. You may need to wait a little before
> > this page is created from a template.
> >
> > Mentors
> > ---
> >
> > Mentors should review reports for their project(s) and sign them off on
> > the Incubator wiki page. Signing off reports shows that you are
> > following the project - projects that are not signed may raise alarms
> > for the Incubator PMC.
> >
> > Incubator PMC
>

Re: [DISCUSS] Retire Myriad?

2018-01-31 Thread Darin Johnson

+1

On Mon, Jan 29, 2018 at 11:10 PM, John D. Ament wrote:

> Hi All,
>
> It seems that Myriad has more or less stopped activity.  Recent user
> question has led to a recommendation to avoid using Myriad in production
> [1], git commits seem to have stopped a long time ago [2], and potential
> community contributions [3] have gone unanswered.  Your reports have been
> missing for some time.
>
> I believe based on this, it's time for the Myriad podling
>
> [1]:
> https://lists.apache.org/thread.html/4d5db8709e48a6de732726ab68e6b6b09768a2f026a5ad1b44a208a1@%3Cdev.myriad.apache.org%3E
> [2]:
> https://github.com/apache/incubator-myriad/commit/382eb7bc77cec1ac7a13d027d8c28cfb1a379a6b
> [3]: https://github.com/apache/incubator-myriad/pulls
>

Re: Please do not retire Myriad

2018-01-31 Thread Darin Johnson

If people are interested in working on it, it's possible to fork the
project and continue.  If anyone wants to fork and continue I'd be happy to
work with them (I'll even create the fork).  However, the community is so
small and distracted it's not ideal to continue developing within Apache. I
think Myriad might have gone into the Apache Incubator too quickly in the first
place.

On Wed, Jan 31, 2018 at 11:57 AM, Javi Roman wrote:

> Hi Juan P.
>
> If there are no technical reasons for rejecting the project, I
> would like to contribute and give it a boost.
> Javi Roman
>
> Twitter: @javiromanrh
> Linkedin: es.linkedin.com/in/javiroman
> Big Data Blog: dataintensive.info
>
>
> On Wed, Jan 31, 2018 at 2:41 PM, Juan P wrote:
> > Hello, in relation to this mail:
> > https://www.mail-archive.com/dev@myriad.incubator.apache.org/msg02319.html
> >
> > I totally agree that it is a pity that the project is being closed, and I
> > wanted to express my intention to actively collaborate on it. I would need
> > to update the Mesos version to be compatible with the latest versions.
> >
> > Thank you
> >
> > Juan Pedro Gilaberte
> > http://github.com/jpgilaberte
>

Re: Regarding Big Data Mesos Frameworks and Myriad

2018-01-31 Thread Darin Johnson

Javi,

I can't speak for Kafka definitively (does Kafka have a YARN runner
nowadays? I use the Mesos one); however, I did run Map/Reduce, Spark and
Flink via YARN using Myriad.

Darin

On Wed, Jan 31, 2018 at 1:51 PM, Swapnil Daingade <
swapnil.daing...@gmail.com> wrote:

> yes, you don't need a new Mesos framework for your YARN apps with Myriad.
>
> Regards
> Swapnil
>
>
> On Tue, Jan 30, 2018 at 10:25 PM, Javi Roman wrote:
>
> > Hi there!
> >
> > I have seen that this project is probably at the end of its life; it's
> > a real shame.
> >
> > I would like to contribute to reviving the project. Obviously I will need
> > some kind of mentoring or initial guidance.
> >
> > Before that I would like to ask an important question (the focus of my
> > interest):
> >
> > With Myriad (YARN converted to a Mesos framework), could I use YARN-aware
> > projects (such as Hadoop, Hive, HBase, Kafka and so on) without
> > the need to create a Mesos framework version of every tool?
> >
> > I mean, can I use, for example, plain Apache Kafka with Myriad,
> > without the effort of maintaining an Apache Kafka Mesos framework?
> >
> > If this is correct, for me it is a huge advantage and I will be really
> > interested in contributing to the Myriad project.
> >
> >
> >
> > Javi Roman
> >
> > Twitter: @javiromanrh
> > Linkedin: es.linkedin.com/in/javiroman
> > Big Data Blog: dataintensive.info
> >
>

Re: Myriad status?

2017-12-13 Thread Darin Johnson

Nicolas, we're currently in a bit of a lull.  I'd strongly discourage you
from using Myriad in a production environment.  More than happy to field
questions getting it up and running though.

Darin

On Mon, Dec 11, 2017 at 3:04 PM, Nicolas Tilmans wrote:

> Hi!
>
> We were looking at using Myriad, but were noticing that there’s been very
> little development on the GitHub page since Oct. 2016. Is this project
> still live? It certainly looks interesting!
>
> Nicolas
>
>
>
> Nicolas Tilmans
> Sr. Director Data Engineering
> ntilm...@lumiata.com 
> 240.441.3429
> www.lumiata.com 
> Confidentiality Notice | This e-mail message, including any attachments,
> is for the sole use of the intended recipient(s) and may contain
> confidential or proprietary information. Any unauthorized review, use,
> disclosure or distribution is prohibited. If you are not the intended
> recipient, immediately contact the sender by reply e-mail and destroy all
> copies of the original message.
>
>
>

Re: Myriad Sync 11/29

2017-11-29 Thread Darin Johnson

I can't make it today, sorry.  Would be interested in notes.  Has anybody
considered a reboot using Marathon instead of a pure Mesos framework?  We
could do a lot just by curling the ResourceManager's REST API and responding
by curling Marathon's REST API.  Happy to elaborate at a different time.
Darin
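
A rough sketch of the idea above, under stated assumptions. It reads the YARN ResourceManager's cluster metrics endpoint (`/ws/v1/cluster/metrics`) and asks Marathon to scale an app that runs NodeManagers via `PUT /v2/apps/{id}`. The host names, the app id `myriad-nodemanager`, and the sizing rule (active nodes plus one per pending application, clamped to a range) are hypothetical placeholders; nothing in this thread or in Myriad specifies them.

```python
import json
import urllib.request

RM_URL = "http://resourcemanager.example:8088"  # hypothetical host
MARATHON_URL = "http://marathon.example:8080"   # hypothetical host
APP_ID = "myriad-nodemanager"                   # hypothetical Marathon app id


def desired_instances(metrics, min_nodes=1, max_nodes=10):
    """Sizing rule (an assumption): active nodes plus one per pending app, clamped."""
    cm = metrics["clusterMetrics"]
    wanted = cm["activeNodes"] + cm["appsPending"]
    return max(min_nodes, min(max_nodes, wanted))


def fetch_metrics(rm_url=RM_URL):
    """GET the ResourceManager cluster metrics and parse them into a dict."""
    with urllib.request.urlopen(rm_url + "/ws/v1/cluster/metrics") as resp:
        return json.load(resp)


def build_scale_request(instances, marathon_url=MARATHON_URL, app_id=APP_ID):
    """Build (but do not send) the Marathon request that sets the instance count."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        marathon_url + "/v2/apps/" + app_id,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def reconcile_once():
    """One polling step: read YARN metrics, then ask Marathon to scale."""
    req = build_scale_request(desired_instances(fetch_metrics()))
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Keeping the sizing rule in a pure function separates the policy from the two HTTP calls, so the rule can be changed or tested without a running cluster.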

On Tue, Nov 28, 2017 at 4:27 PM, mohit soni <mohitsoni1...@gmail.com> wrote:

> No worries, it's the one that's scheduled from 10AM-10:30AM PST.
>
> If we are doing this more often I can send a new calendar invite.
>
> On Nov 28, 2017 1:24 PM, "Adam Bordelon" <a...@mesosphere.io> wrote:
>
> > What time (and time zone) is this again? Slipped off my calendar.
> >
> > On Tue, Nov 28, 2017 at 7:04 AM, mohit soni <mohitsoni1...@gmail.com>
> > wrote:
> >
> > > Here you go:
> > > https://hangouts.google.com/hangouts/_/calendar/
> > > bW9oaXRzb25pMTk4OUBnbWFpbC5jb20.ajan3qfvb8c48egfc8rpc92b10
> > >
> > > > On Tue, Nov 28, 2017 at 8:51 AM, Darin Johnson <dbjohnson1...@gmail.com>
> > > > wrote:
> > > >
> > > > > Can you put a link to the Hangouts in the thread?
> > > > >
> > > > > On Nov 28, 2017 8:47 AM, "mohit soni" <mohitsoni1...@gmail.com> wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I'm planning to attend the Myriad sync tomorrow. Please join if you
> > > > > > would like to discuss the upcoming changes to the project.
> > > > >
> > > > > Best
> > > > > Mohit
> > > > >
> > > >
> > >
> >
>

Re: Myriad Sync 11/29

2017-11-28 Thread Darin Johnson

Can you put a link to the Hangouts in the thread?

On Nov 28, 2017 8:47 AM, "mohit soni" wrote:

> Hi All,
>
> I'm planning to attend the Myriad sync tomorrow. Please join if you would
> like to discuss the upcoming changes to the project.
>
> Best
> Mohit
>

Re: Myriad Sync - 11/15/17

2017-11-16 Thread Darin Johnson

+1 for advance notice.

On Nov 15, 2017 1:30 PM, "yuliya Feldman" wrote:

>  Could we get a bit more advance notice? Or did I miss that notice?
> On Wednesday, November 15, 2017, 10:03:37 AM PST, mohit soni <
> mohitsoni1...@gmail.com> wrote:
>
>  Hi
>
> Please join:
> https://hangouts.google.com/hangouts/_/uiq2idcclngybc3hlifznfdxi4e for
> today's Myriad Sync.
>
> Best
> Mohit
>

Re: [jira] [Commented] (MYRIAD-256) Cannot launch with vagrant

2017-11-08 Thread Darin Johnson

Appears to be a Java version error.  Will need to upgrade that I guess.

On Nov 7, 2017 3:19 AM, "Kwang-in (Dennis) JUNG (JIRA)" wrote:

>
> [ https://issues.apache.org/jira/browse/MYRIAD-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241658#comment-16241658 ]
>
> Kwang-in (Dennis) JUNG commented on MYRIAD-256:
> ---
>
> I just changed the Hadoop version to 2.7.4, and it causes another error,
> shown below.
>
> ==
> default: Running: /var/folders/hk/1t3k4z1d6jn71tzy08znymjrgn/T/vagrant-shell20171107-19945-cz51p0.sh
> ==> default: stdin: is not a tty
> ==> default: #!/bin/bash -v
> ==> default: #
> ==> default: # Licensed to the Apache Software Foundation (ASF) under one
> ==> default: # or more contributor license agreements.  See the NOTICE file
> ==> default: # distributed with this work for additional information
> ==> default: # regarding copyright ownership.  The ASF licenses this file
> ==> default: # to you under the Apache License, Version 2.0 (the
> ==> default: # "License"); you may not use this file except in compliance
> ==> default: # with the License.  You may obtain a copy of the License at
> ==> default: #
> ==> default: # http://www.apache.org/licenses/LICENSE-2.0
> ==> default: #
> ==> default: # Unless required by applicable law or agreed to in writing,
> ==> default: # software distributed under the License is distributed on an
> ==> default: # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> ==> default: # KIND, either express or implied.  See the License for the
> ==> default: # specific language governing permissions and limitations
> ==> default: # under the License.
> ==> default: #
> ==> default: set -e
> ==> default:
> ==> default: # Format NameNode
> ==> default: sudo -u hduser sh -c 'yes Y | /usr/local/hadoop/bin/hdfs namenode -format'
> ==> default: 17/11/07 08:05:40 INFO namenode.NameNode: STARTUP_MSG:
> ==> default: /
> ==> default: STARTUP_MSG: Starting NameNode
> ==> default: STARTUP_MSG:   host = vagrant-ubuntu-trusty-64/10.0.2.15
> ==> default: STARTUP_MSG:   args = [-format]
> ==> default: STARTUP_MSG:   version = 2.7.4
> ==> default: STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop:/
> usr/local/hadoop/share/hadoop/common/lib/htrace-core-3.1.0-
> incubating.jar:/usr/local/hadoop/share/hadoop/common/
> lib/avro-1.7.4.jar:/usr/local/hadoop/share/hadoop/common/
> lib/commons-codec-1.4.jar:/usr/local/hadoop/share/hadoop/
> common/lib/jetty-6.1.26.jar:/usr/local/hadoop/share/hadoop/
> common/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/share/
> hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/local/
> hadoop/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/
> local/hadoop/share/hadoop/common/lib/jetty-sslengine-6.
> 1.26.jar:/usr/local/hadoop/share/hadoop/common/lib/netty-
> 3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/common/
> lib/hamcrest-core-1.3.jar:/usr/local/hadoop/share/hadoop/
> common/lib/jsp-api-2.1.jar:/usr/local/hadoop/share/hadoop/
> common/lib/curator-recipes-2.7.1.jar:/usr/local/hadoop/
> share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/local/
> hadoop/share/hadoop/common/lib/commons-io-2.4.jar:/usr/
> local/hadoop/share/hadoop/common/lib/jets3t-0.9.0.jar:/
> usr/local/hadoop/share/hadoop/common/lib/commons-httpclient-
> 3.1.jar:/usr/local/hadoop/share/hadoop/common/lib/
> zookeeper-3.4.6.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-
> configuration-1.6.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-
> collections-3.2.2.jar:/usr/local/hadoop/share/hadoop/
> common/lib/servlet-api-2.5.jar:/usr/local/hadoop/share/
> hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/local/
> hadoop/share/hadoop/common/lib/paranamer-2.3.jar:/usr/
> local/hadoop/share/hadoop/common/lib/jackson-xc-1.9.13.
> jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-
> annotations-2.7.4.jar:/usr/local/hadoop/share/hadoop/
> common/lib/commons-beanutils-1.7.0.jar:/usr/local/hadoop/
> share/hadoop/common/lib/jersey-json-1.9.jar:/usr/
> local/hadoop/share/hadoop/common/lib/httpcore-4.2.5.jar:
> /usr/local/hadoop/share/hadoop/common/lib/commons-
> digester-1.8.jar:/usr/local/hadoop/share/hadoop/common/
> lib/jsch-0.1.54.jar:/usr/local/hadoop/share/hadoop/
> common/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/
> share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/
> local/hadoop/share/hadoop/common/lib/commons-beanutils-
> core-1.8.0.jar:/usr/local/hadoop/share/hadoop/common/
> lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop/share/
> hadoop/common/lib/curator-framework-2.7.1.jar:/usr/
> local/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/
> usr/local/hadoop/share/hadoop/common/lib/commons-lang-2.6.
> jar:/usr/local/hadoop/share/hadoop/common/lib/junit-4.11.
> jar:/usr/local/hadoop/share/hadoop/common/lib/gson-2.2.4.
>

Re: [DISCUSS] Retire Myriad

2017-11-06 Thread Darin Johnson

I certainly think the project is relevant but I'm concerned if it belongs
in Apache.  When I was active it was difficult to get people to review pull
requests and when releasing v0.2.0 it was hard to even get people to vote
for a release.  If only a few people are going to work on it, maybe it's
easier to develop outside the incubator until there's something stable?

I'd be interested in potentially working on it again provided there's at
least two other people actively working on it.

Darin

On Mon, Nov 6, 2017 at 1:13 PM, mohit soni wrote:

> The project's core idea of dynamic creation and management of YARN clusters
> without creating static partitions is still relevant. This project and its
> community were one of the first movers in the still developing space of
> stateful containerized distributed systems.
>
> I believe the project is currently lagging behind the fast paced
> development of Apache Mesos ecosystem project and is currently not designed
> to take advantage of the fast growing Kubernetes (Apache License 2.0)
> ecosystem.
>
> I also believe the project is not just about the code, which gets old once
> it's written unless tended and cared for regularly; it's also about the
> community and the mindshare.
>
> Here are the things that we can do to bring life to the project again:
> 1. Refactor the project codebase to take advantage of the advances in the
> Apache Mesos Ecosystem, which from my direct experience, will simplify the
> project codebase a lot, making it easier once again to maintain it, while
> making it more relevant to the Apache Mesos community again.
> 2. Extend the project to take advantage of the Kubernetes project and enable
> dynamic creation and management of YARN clusters on Kubernetes. Kubernetes
> project now has great support for building stateful containerized
> applications. Refactoring work in Step #1 will lay the groundwork that will
> make it easier to extend the project for the Kubernetes ecosystem.
>
> I'm happy to take a lead to drive both above initiatives, and can post a
> detailed plan and design for accomplishing above in coming months. I am
> also happy to take a lead on submitting monthly incubator reports.
>
> I truly believe that we should give the project one more chance to
> recuperate and thrive.
>
> Best
> Mohit
>
>
> On Mon, Nov 6, 2017 at 5:06 AM, John D. Ament wrote:
>
> > Just to be clear, this is a discussion thread.  Not a vote thread.  While
> > I believe many people vote +1 because they feel it's clear why, when you
> > vote against the idea you should explain why.
> >
> > So please, treat this more like a discussion rather than a vote and
> > explain why.
> >
> > On 2017-11-05 13:50, mohit soni wrote:
> > > -1 vote to not retire the project (committer, PMC)
> > >
> > > I believe the project is still relevant.
> > >
> > > On Sun, Nov 5, 2017 at 8:48 AM, Niels Basjes wrote:
> > >
> > > > +1 vote to retire (non-committer)
> > > >
> > > > On Sun, Nov 5, 2017 at 3:16 PM, Brandon Gulla <gulla.bran...@gmail.com>
> > > > wrote:
> > > >
> > > > > +1 vote to retire (non-committer)
> > > > >
> > > > > On Sun, Nov 5, 2017 at 9:03 AM, John D. Ament <johndam...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > All,
> > > > > >
> > > > > > Based on the current state of Myriad and prior discussions I'd like to
> > > > > > start the discussions around retiring the project.
> > > > > >
> > > > > > John
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Brandon
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards / Met vriendelijke groeten,
> > > >
> > > > Niels Basjes
> > > >
> > >
> >
>

Re: Report

2017-10-10 Thread Darin Johnson

I'll second that sentiment.

On Tue, Oct 10, 2017 at 11:45 AM, yuliya Feldman <
yufeld...@yahoo.com.invalid> wrote:

>  Feels like the project is dead. I wonder if anybody at MapR and/or
> Mesosphere is interested in continuing the project?
> On Tuesday, October 10, 2017, 4:42:56 AM PDT, Ted Dunning <
> ted.dunn...@gmail.com> wrote:
>
>  The incubator report for myriad seems to be missing.
>
> Is anybody up for filing it? (probably too late for this month's report)
>
>

Re: Podling Report Reminder - September 2017

2017-09-11 Thread Darin Johnson

+1 retire

On Sun, Sep 10, 2017 at 8:41 PM, John D. Ament wrote:

> All,
>
> A discussion back in July indicated the project had enough people to stay
> alive.  There has been no report submitted and no on-list activity since
> then.
>
> Should we plan to retire Myriad after all?
>
> https://lists.apache.org/thread.html/7ef01bd386dea6ed651b6aea6783e4e8e1d3cc5ace9b73bc7fae225f@%3Cdev.myriad.apache.org%3E
>
> John
>
> On 2017-09-05 21:40, johndam...@apache.org wrote:
> > Dear podling,
> >
> > This email was sent by an automated system on behalf of the Apache
> > Incubator PMC. It is an initial reminder to give you plenty of time to
> > prepare your quarterly board report.
> >
> > The board meeting is scheduled for Wed, 20 September 2017, 10:30 am PDT.
> > The report for your podling will form a part of the Incubator PMC
> > report. The Incubator PMC requires your report to be submitted 2 weeks
> > before the board meeting, to allow sufficient time for review and
> > submission (Wed, September 06).
> >
> > Please submit your report with sufficient time to allow the Incubator
> > PMC, and subsequently board members to review and digest. Again, the
> > very latest you should submit your report is 2 weeks prior to the board
> > meeting.
> >
> > Thanks,
> >
> > The Apache Incubator PMC
> >
> > Submitting your Report
> >
> > --
> >
> > Your report should contain the following:
> >
> > *   Your project name
> > *   A brief description of your project, which assumes no knowledge of
> > the project or necessarily of its field
> > *   A list of the three most important issues to address in the move
> > towards graduation.
> > *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> > aware of
> > *   How has the community developed since the last report
> > *   How has the project developed since the last report.
> > *   How does the podling rate their own maturity.
> >
> > This should be appended to the Incubator Wiki page at:
> >
> > https://wiki.apache.org/incubator/September2017
> >
> > Note: This is manually populated. You may need to wait a little before
> > this page is created from a template.
> >
> > Mentors
> > ---
> >
> > Mentors should review reports for their project(s) and sign them off on
> > the Incubator wiki page. Signing off reports shows that you are
> > following the project - projects that are not signed may raise alarms
> > for the Incubator PMC.
> >
> > Incubator PMC
> >
>

Re: Podling Report Reminder - July 2017

2017-07-07 Thread Darin Johnson

Swapnil, generally a request on incubator does it.

On Jul 7, 2017 5:23 AM, "Swapnil Daingade" wrote:

> I created an account on the wiki but was not able to edit the page.
> Do I need permissions to edit it? I'll try to figure it out but any
> pointers would be helpful. My account name is sdaingade
>
> Regards
> Swapnil
>
>
> On 07/06/2017 08:20 PM, John D. Ament wrote:
>
>> Ping.  Based on the active discussion around retirement, is anyone
>> available to write a report?
>>
>> On 2017-06-27 20:10 (-0400), Adam Bordelon  wrote:
>>
>>> I will be out of town during this period. Can somebody else fill out the
>>> podling report in the wiki, with particular attention to the retirement
>>> vote/discussion?
>>>
>>> On Tue, Jun 27, 2017 at 4:54 PM,  wrote:
>>>
>>>> Dear podling,
>>>>
>>>> This email was sent by an automated system on behalf of the Apache
>>>> Incubator PMC. It is an initial reminder to give you plenty of time to
>>>> prepare your quarterly board report.
>>>>
>>>> The board meeting is scheduled for Wed, 19 July 2017, 10:30 am PDT.
>>>> The report for your podling will form a part of the Incubator PMC
>>>> report. The Incubator PMC requires your report to be submitted 2 weeks
>>>> before the board meeting, to allow sufficient time for review and
>>>> submission (Wed, July 05).
>>>>
>>>> Please submit your report with sufficient time to allow the Incubator
>>>> PMC, and subsequently board members to review and digest. Again, the
>>>> very latest you should submit your report is 2 weeks prior to the board
>>>> meeting.
>>>>
>>>> Thanks,
>>>>
>>>> The Apache Incubator PMC
>>>>
>>>> Submitting your Report
>>>>
>>>> --
>>>>
>>>> Your report should contain the following:
>>>>
>>>> *   Your project name
>>>> *   A brief description of your project, which assumes no knowledge of
>>>> the project or necessarily of its field
>>>> *   A list of the three most important issues to address in the move
>>>> towards graduation.
>>>> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
>>>> aware of
>>>> *   How has the community developed since the last report
>>>> *   How has the project developed since the last report.
>>>> *   How does the podling rate their own maturity.
>>>>
>>>> This should be appended to the Incubator Wiki page at:
>>>>
>>>> https://wiki.apache.org/incubator/July2017
>>>>
>>>> Note: This is manually populated. You may need to wait a little before
>>>> this page is created from a template.
>>>>
>>>> Mentors
>>>> ---
>>>>
>>>> Mentors should review reports for their project(s) and sign them off on
>>>> the Incubator wiki page. Signing off reports shows that you are
>>>> following the project - projects that are not signed may raise alarms
>>>> for the Incubator PMC.
>>>>
>>>> Incubator PMC

>

Re: [Vote] Retire Myriad

2017-06-24 Thread Darin Johnson

Swapnil, I'm happy to count you as a -1.  I'd still like to see if anyone
else is a -1, though, as I've seen lots of +1's from contributors (but no
explicit votes from committers).


On Jun 22, 2017 6:09 PM, "Swapnil Daingade" <swapnil.daing...@gmail.com>
wrote:

> yes, for more discussion.
>
> Unfortunately my situation too is no different, at least for the next 3-4
> months.
>
> I am happy to vote -1 to buy us more time or we can cancel the vote.
>
> Regards
> Swapnil
>
>
> On Thu, Jun 22, 2017 at 9:41 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > Sounds like lots of committers are interested in vetting/merging, but not
> > leading new releases or building community.  Is it worth putting a
> discuss
> > up about that?
> >
> > Darin
> >
> > On Wed, Jun 21, 2017 at 6:34 PM, Mohit Soni <mo...@apache.org> wrote:
> >
> > > I'm coming in pretty late to the conversation, but I am disengaged for
> a
> > > long time now. I won't be able to actively contribute new features.
> But,
> > I
> > > can certainly help with refactoring the project to use
> > > https://github.com/mesosphere/dcos-commons, if somebody is willing to
> > lead
> > > that effort. I personally think that will help us reduce the overall
> > > complexity. And, will make it easier for us to keep the project alive
> > > moving forward.
> > >
> > > On Tue, Jun 20, 2017 at 11:54 AM, Swapnil Daingade <
> > > swapnil.daing...@gmail.com> wrote:
> > >
> > > > @Darin Initially it was you, Adam, me & Santosh who replied.
> > > >
> > > > I think the confusion may have been due to me replying directly to
> Ted
> > > > (instead of reply all)
> > > > and later forwarding the reply to dev@myriad. My bad
> > > >
> > > > Now with Yuliya and Ken, we have 6 committers who are willing to vet
> > and
> > > > merge commits!
> > > >
> > > > Regards
> > > > Swapnil
> > > >
> > > >
> > > > On Tue, Jun 20, 2017 at 11:30 AM, Ken Sipe <k...@mesosphere.io>
> wrote:
> > > >
> > > > > I’ve been busy keeping up with a number of things and disengaged
> for
> > > some
> > > > > time.. I would like to come back to active status as a committer
> and
> > > I’m
> > > > > willing to commit to:
> > > > > 1. Review and merge of code.
> > > > > 2. Vet releases
> > > > >
> > > > > Ken
> > > > >
> > > > > > On Jun 20, 2017, at 10:26 AM, yuliya Feldman
> > > > <yufeld...@yahoo.com.INVALID>
> > > > > wrote:
> > > > > >
> > > > > > Sorry for chiming in late. I was out of town.
> > > > > > I think we should keep the project going if there is an activity
> on
> > > the
> > > > > project.
> > > > > > I can definitely contribute time to vet the release, review and
> > > merge.
> > > > I
> > > > > may not be able to actively work on the project as it is not inline
> > > with
> > > > my
> > > > > current work schedule.
> > > > > > Thanks,Yuliya
> > > > > >
> > > > > >  From: Darin Johnson <dbjohnson1...@gmail.com>
> > > > > > To: Dev <dev@myriad.incubator.apache.org>
> > > > > > Sent: Tuesday, June 20, 2017 11:19 AM
> > > > > > Subject: Re: [Vote] Retire Myriad
> > > > > >
> > > > > > Swapnil: I only counted myself and Adam, if there's two
> additional
> > > > > > committers who are willing to vet the release AND merge commits
> > I'll
> > > > > > consider changing my vote.
> > > > > >
> > > > > > Darin
> > > > > >
> > > > > > On Tue, Jun 20, 2017 at 12:54 PM, Swapnil Daingade <
> > > > > > swapnil.daing...@gmail.com> wrote:
> > > > > >
> > > > > >> I am trying to understand what changed since the last
> discussion.
> > > > > >>
> > > > > >> If I remember correctly, Ted asked for 3-5 committers to vet the
> > > next
> > > > > >> release.
> > > > > >> 4 committers said they were willing. Did I miss something ?
> > > > > >>
> > >

Re: [Vote] Retire Myriad

2017-06-22 Thread Darin Johnson

Sounds like lots of committers are interested in vetting/merging, but not
in leading new releases or building community.  Is it worth putting up a
[DISCUSS] thread about that?

Darin

On Wed, Jun 21, 2017 at 6:34 PM, Mohit Soni <mo...@apache.org> wrote:

> I'm coming in pretty late to the conversation, but I have been disengaged
> for a long time now. I won't be able to actively contribute new features,
> but I can certainly help with refactoring the project to use
> https://github.com/mesosphere/dcos-commons, if somebody is willing to lead
> that effort. I personally think that will help us reduce the overall
> complexity and make it easier for us to keep the project alive moving
> forward.
>
> On Tue, Jun 20, 2017 at 11:54 AM, Swapnil Daingade <
> swapnil.daing...@gmail.com> wrote:
>
> > @Darin Initially it was you, Adam, me & Santosh who replied.
> >
> > I think the confusion may have been due to me replying directly to Ted
> > (instead of reply all)
> > and later forwarding the reply to dev@myriad. My bad
> >
> > Now with Yuliya and Ken, we have 6 committers who are willing to vet and
> > merge commits!
> >
> > Regards
> > Swapnil
> >
> >
> > On Tue, Jun 20, 2017 at 11:30 AM, Ken Sipe <k...@mesosphere.io> wrote:
> >
> > > I’ve been busy keeping up with a number of things and disengaged for
> some
> > > time.. I would like to come back to active status as a committer and
> I’m
> > > willing to commit to:
> > > 1. Review and merge of code.
> > > 2. Vet releases
> > >
> > > Ken
> > >
> > > > On Jun 20, 2017, at 10:26 AM, yuliya Feldman
> > <yufeld...@yahoo.com.INVALID>
> > > wrote:
> > > >
> > > > Sorry for chiming in late. I was out of town.
> > > > I think we should keep the project going if there is an activity on
> the
> > > project.
> > > > I can definitely contribute time to vet the release, review and
> merge.
> > I
> > > may not be able to actively work on the project as it is not inline
> with
> > my
> > > current work schedule.
> > > > Thanks,Yuliya
> > > >
> > > >  From: Darin Johnson <dbjohnson1...@gmail.com>
> > > > To: Dev <dev@myriad.incubator.apache.org>
> > > > Sent: Tuesday, June 20, 2017 11:19 AM
> > > > Subject: Re: [Vote] Retire Myriad
> > > >
> > > > Swapnil: I only counted myself and Adam, if there's two additional
> > > > committers who are willing to vet the release AND merge commits I'll
> > > > consider changing my vote.
> > > >
> > > > Darin
> > > >
> > > > On Tue, Jun 20, 2017 at 12:54 PM, Swapnil Daingade <
> > > > swapnil.daing...@gmail.com> wrote:
> > > >
> > > >> I am trying to understand what changed since the last discussion.
> > > >>
> > > >> If I remember correctly, Ted asked for 3-5 committers to vet the
> next
> > > >> release.
> > > >> 4 committers said they were willing. Did I miss something ?
> > > >>
> > > >> Regards
> > > >> Swapnil
> > > >>
> > > >>
> > > >>
> > > >> On Tue, Jun 20, 2017 at 7:35 AM, John Yost <hokiege...@gmail.com>
> > > wrote:
> > > >>
> > > >>> +1
> > > >>>
> > > >>> On Tue, Jun 20, 2017 at 10:22 AM, Klaus Ma <klaus1982...@gmail.com
> >
> > > >> wrote:
> > > >>>
> > > >>>> +1
> > > >>>>
> > > >>>> On Tue, Jun 20, 2017 at 10:13 PM Brandon Gulla <
> > > >> gulla.bran...@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > > >>>>> +1 (if the vote is open to non-committers)
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On Tue, Jun 20, 2017 at 9:22 AM, Darin Johnson <
> dar...@apache.org>
> > > >>>> wrote:
> > > >>>>>
> > > >>>>>> Based on previous discussions it seems the best course of
> action.
> > > >>> I'm
> > > >>>>>> holding the vote open for 3 business days.
> > > >>>>>>
> > > >>>>>> I'm +1 binding.
> > > >>>>>>
> > > >>>>>> Darin
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Brandon
> > > >>>>>
> > > >>>> --
> > > >>>>
> > > >>>> Regards,
> > > >>>> 
> > > >>>> Da (Klaus), Ma (马达), PMP® | Software Architect
> > > >>>> IBM Platform Development & Support, STG, IBM GCG
> > > >>>> +86-10-8245 4084 | mad...@cn.ibm.com | http://k82.me
> > > >>>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> >
>

Re: [Vote] Retire Myriad

2017-06-20 Thread Darin Johnson

Swapnil: I only counted myself and Adam. If there are two additional
committers who are willing to vet the release AND merge commits, I'll
consider changing my vote.

Darin

On Tue, Jun 20, 2017 at 12:54 PM, Swapnil Daingade <
swapnil.daing...@gmail.com> wrote:

> I am trying to understand what changed since the last discussion.
>
> If I remember correctly, Ted asked for 3-5 committers to vet the next
> release.
> 4 committers said they were willing. Did I miss something ?
>
> Regards
> Swapnil
>
>
>
> On Tue, Jun 20, 2017 at 7:35 AM, John Yost <hokiege...@gmail.com> wrote:
>
> > +1
> >
> > On Tue, Jun 20, 2017 at 10:22 AM, Klaus Ma <klaus1982...@gmail.com>
> wrote:
> >
> > > +1
> > >
> > > On Tue, Jun 20, 2017 at 10:13 PM Brandon Gulla <
> gulla.bran...@gmail.com>
> > > wrote:
> > >
> > > > +1 (if the vote is open to non-committers)
> > > >
> > > >
> > > >
> > > > On Tue, Jun 20, 2017 at 9:22 AM, Darin Johnson <dar...@apache.org>
> > > wrote:
> > > >
> > > > > Based on previous discussions it seems the best course of action.
> > I'm
> > > > > holding the vote open for 3 business days.
> > > > >
> > > > > I'm +1 binding.
> > > > >
> > > > > Darin
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Brandon
> > > >
> > > --
> > >
> > > Regards,
> > > 
> > > Da (Klaus), Ma (马达), PMP® | Software Architect
> > > IBM Platform Development & Support, STG, IBM GCG
> > > +86-10-8245 4084 | mad...@cn.ibm.com | http://k82.me
> > >
> >
>

[Vote] Retire Myriad

2017-06-20 Thread Darin Johnson

Based on previous discussions, retiring seems the best course of action.
I'm holding the vote open for 3 business days.

I'm +1 binding.

Darin

Re: Is Apache Myriad dead?

2017-06-07 Thread Darin Johnson

Can you name the 3-5 active PMC members who will vet the next release?

I'm willing to vet the next release and contribute the additional work we
did on Myriad, but only if I get a solid commitment from others.  Otherwise
I'm happy to retire and let MapR host their fork.

On Jun 6, 2017 2:29 AM, "Ted Dunning"  wrote:

>
>
> On Tue, Jun 6, 2017 at 12:56 AM, Swapnil Daingade <
> swapnil.daing...@gmail.com> wrote:
>
>> >> The problem is that there is essentially no real community that is
>> happening.
>>
>> retiring doesn't help that
>>
>
> The core problem here is lack of a viable PMC. A PMC has to have 3 active
> members at any given point. Typically this requires about 8 live members.
> Myriad is wildly short of that and thus will have serious problems doing
> any releases.
>
>
>>
>> >> None of the engineers previously working on this will be working on
>> this now. And that sort of situation isn't going to change.
>>
>> Events at MapR contributed to this situation. MapR scaled back its
>> involvement in Myriad and all its committers left.
>>
>
> Well, that is one way to look at it.
>
> On the other hand, if you actually were involved in the situations, you
> would know that none of the committers left because they didn't get to work
> on Myriad as part of their day jobs, nor did any of them feel enough
> attachment to work after hours (as I do on my projects), nor did any of
> them continue with the project after leaving for a new startup.
>
>
>> MapR is of course free to take its own decisions. But it sounds like
>> there is interest in working on Myriad, just not under the ASF umbrella.
>> I feel without ASF, one company will have too much control on Myriad.
>>
>
> The ASF is moving to retire Myriad because it can't make the cut as a
> viable project. No company will have control over the Apache version of the
> project at that point because the project is nothing to control.
>
> The desire to try to reboot the project outside of Apache has almost
> everything to do with the fact that Apache processes and the lack of active
> contributors means that nothing can happen. It isn't an end run around
> Apache constraints for the purpose of control, it is an attempt to keep the
> project alive at all.
>
>
>> Ted, you yourself warned us against this
>> http://www.zdnet.com/article/hadoop-veteran-ted-dunning-when
>> -open-source-is-anything-but-open/
>>
>
> Read the article. I warned about projects like Ambari. One company has all
> of the PMC.
>
> At this point, the situation with Myriad is almost the opposite.
>
>
>>
>> >>That means that it will always be a distraction to get committers
>> qualified as PMC so that they can approve releases and it will never really
>> be possible to exit from incubation.
>>
>> I suggest we start with the contributions first.
>>
>
> Can you name the 3-5 active PMC members who will vet the next release?
>
>
>
>>
>>
>> On Mon, Jun 5, 2017 at 2:53 PM, Ted Dunning 
>> wrote:
>>
>>>
>>> On Mon, Jun 5, 2017 at 10:15 PM, Swapnil Daingade <
>>> swapnil.daing...@gmail.com> wrote:
>>>
>>>> In that case I suggest we not retire
>>>>
>>>> >> "Darin - yes we've done more planning internally, and we do plan on
>>>> having some engineers spend some time on this project, doing some (minor)
>>>> maintenance for our customers."
>>>>
>>>
>>> The problem is that there is essentially no real community that is
>>> happening.
>>>
>>> None of the engineers previously working on this will be working on this
>>> now. And that sort of situation isn't going to change.
>>>
>>> That means that it will always be a distraction to get committers
>>> qualified as PMC so that they can approve releases and it will never really
>>> be possible to exit from incubation.
>>>
>>> Outside of the Apache limits, we can have a much more flexible structure
>>> of who can commit. We don't plan to limit who can commit. In fact, we will
>>> probably make it more open than an Apache project normally is.
>>>
>>>
>>
>

Re: Is Apache Myriad dead?

2017-06-05 Thread Darin Johnson

Swapnil, the reasons Ted mentioned are precisely the reasons I've stopped
committing to Myriad (we're running a fork).  Apache is more overhead than
this project needs and actually hinders the project from developing to a
maturity level where a community can form.

Darin

On Jun 5, 2017 5:53 PM, "Ted Dunning"  wrote:

On Mon, Jun 5, 2017 at 10:15 PM, Swapnil Daingade <
swapnil.daing...@gmail.com> wrote:

> In that case I suggest we not retire
>
> >> "Darin - yes we've done more planning internally, and we do plan on
> having some engineers spend some time on this project, doing some (minor)
> maintenance for our customers."
>

The problem is that there is essentially no real community that is
happening.

None of the engineers previously working on this will be working on this
now. And that sort of situation isn't going to change.

That means that it will always be a distraction to get committers qualified
as PMC so that they can approve releases and it will never really be
possible to exit from incubation.

Outside of the Apache limits, we can have a much more flexible structure of
who can commit. We don't plan to limit who can commit. In fact, we will
probably make it more open than an Apache project normally is.

Re: Podling Report Reminder - June 2017

2017-06-01 Thread Darin Johnson

I'm willing to write this, but I think it'll be with the recommendation to
retire.  Does anyone else wish to volunteer?

On Jun 1, 2017 7:43 AM,  wrote:

> Dear podling,
>
> This email was sent by an automated system on behalf of the Apache
> Incubator PMC. It is an initial reminder to give you plenty of time to
> prepare your quarterly board report.
>
> The board meeting is scheduled for Wed, 21 June 2017, 10:30 am PDT.
> The report for your podling will form a part of the Incubator PMC
> report. The Incubator PMC requires your report to be submitted 2 weeks
> before the board meeting, to allow sufficient time for review and
> submission (Wed, June 07).
>
> Please submit your report with sufficient time to allow the Incubator
> PMC, and subsequently board members to review and digest. Again, the
> very latest you should submit your report is 2 weeks prior to the board
> meeting.
>
> Thanks,
>
> The Apache Incubator PMC
>
> Submitting your Report
>
> --
>
> Your report should contain the following:
>
> *   Your project name
> *   A brief description of your project, which assumes no knowledge of
> the project or necessarily of its field
> *   A list of the three most important issues to address in the move
> towards graduation.
> *   Any issues that the Incubator PMC or ASF Board might wish/need to be
> aware of
> *   How has the community developed since the last report
> *   How has the project developed since the last report.
> *   How does the podling rate their own maturity.
>
> This should be appended to the Incubator Wiki page at:
>
> https://wiki.apache.org/incubator/June2017
>
> Note: This is manually populated. You may need to wait a little before
> this page is created from a template.
>
> Mentors
> ---
>
> Mentors should review reports for their project(s) and sign them off on
> the Incubator wiki page. Signing off reports shows that you are
> following the project - projects that are not signed may raise alarms
> for the Incubator PMC.
>
> Incubator PMC
>

Re: Is Apache Myriad dead?

2017-05-05 Thread Darin Johnson

That sounds OK to me.  I'd like to see Myriad continue, but realistically I
can't support it on my own.

Darin

On Wed, May 3, 2017 at 10:19 AM, Will Ochandarena <wochandar...@mapr.com>
wrote:

> All - sorry for the delay in commenting.  We (MapR) are in the midst of
> roadmap planning for Myriad and other projects.
>
>
> Please give us a couple of weeks to plan our resourcing.  I'll come back
> soon with a proposal for how we take the project forward.
>
>
> Will Ochandarena
>
> MapR Product Management
>
> 
> From: Adam Bordelon <a...@mesosphere.io>
> Sent: Friday, April 28, 2017 8:40:30 PM
> To: dev@myriad.incubator.apache.org
> Cc: dan...@apache.org; tdunn...@apache.org; lrese...@apache.org;
> b...@apache.org
> Subject: Re: Is Apache Myriad dead?
>
> Maybe not dead, but it's in a coma, and I'm not sure if/when it'll wake up
> again.
> I'm not opposed to retiring, except that moving off of Apache infra sounds
> like work.
>
> On Fri, Apr 28, 2017 at 5:46 AM, Klaus Ma <k8...@icloud.com> wrote:
>
> > +1 on retire
> >
> > > On 28 Apr 2017, at 20:33, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
> > >
> > > I think that's an accurate assessment, as much as I'd like to say
> > otherwise.
> > >
> > > I'd suggest we start a vote to retire.
> > >
> > > Darin
> > >
> > > On Apr 28, 2017 5:52 AM, "Niels Basjes" <ni...@basjes.nl> wrote:
> > >
> > > Hi,
> > >
> > > A few weeks ago at the Dataworks/Hadoop Summit in Munich I discussed
> the
> > > upcoming docker support in Yarn (Hadoop 3.0) and I mentioned Apache
> > Myriad
> > > as a seemingly related project.
> > > Someone then stated that Myriad is a dead project and I should avoid
> it.
> > >
> > > Out of curiosity to check the validity of that statement I had a look
> at
> > > the project today and I found that
> > > - In 2017 only 2 jira tickets were touched (actually 3, but 1 is a
> > > duplicate)
> > > - The last commit to any branch I could find was about 7 months ago.
> The
> > > last JIRA ticket was 'Fixed' around the same time.
> > > - The dev mailing (when ignoring these jira issues and ASF generic
> > > messages) is also almost silent.
> > >
> > > To me this looks like just about everyone involved lost interest in the
> > > project about 6 months ago.
> > >
> > > So can I conclude this project is actually dead?
> > >
> > > --
> > > Best regards
> > >
> > > Niels Basjes
> > > nielsbas...@apache.org
> >
> >
>

Sync today?

2017-03-08 Thread Darin Johnson

Tried joining.

Re: will miss standup today - doc appt.

2017-01-13 Thread Darin Johnson

The main things I think we need to address are the PRs; I've reviewed them
and most can be merged.  Also, a few bugs were filed; I've got a fix for
two, which I'll try to push soon.

On Wed, Jan 11, 2017 at 11:40 AM, Adam Bordelon <a...@mesosphere.io> wrote:

> Nothing urgent today. I've got the dcos universe package for Myriad nearly
> complete.
> https://github.com/mesosphere/universe/pull/841
> Just need to write up an example usage doc to go with it.
> Let's do any updates over email this week.
>
> On Wed, Jan 11, 2017 at 8:34 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > Adam anything to discuss?  Otherwise I'm in favor of canceling.  Maybe
> just
> > do an email conversation about anything anyone wants to see done or
> working
> > on?
> >
> > On Wed, Jan 11, 2017 at 10:37 AM, yuliya Feldman <
> > yufeld...@yahoo.com.invalid> wrote:
> >
> > >
> > >
> >
>

Unable to join dev sync

2016-12-14 Thread Darin Johnson

Eom

Hangout today

2016-11-16 Thread Darin Johnson

Missed due to another meeting going long.  Anyone else attend?

Re: [DISCUSS] handling roles in Myriad code

2016-11-04 Thread Darin Johnson

Alright, I think the best approach is for me to write some unit tests that
show Yuliya's bugs and then correct the code accordingly.  I'll try to get
that out next week.



On Wed, Nov 2, 2016 at 12:31 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> If you just add both sets of resources as is to the launchTask call with
> the proper offerId(s), you should be able to launch a single task that uses
> both sets of resources. I wouldn't advise modifying the roles fields on the
> Resources, in case Mesos looks for an exact match with known sets of
> resources. Just copy the resources from the offer and reduce the cpu/mem
> values or select ports from the range if necessary.
>
> On Wed, Nov 2, 2016 at 4:44 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > Adam, to clarify with an example: an offer may contain reserved
> > resources (role hadoop) of 2 GB mem, 1 CPU and ports 31000-31005, and *
> > resources of 2 GB mem, 3 CPU and ports 31500-31999.
> > If one wants to create a task using 4 CPU, 3 GB of memory and ports 31000
> > and 35001, does one have to add resources of 2 GB mem, 1 CPU and port
> > 35001 with role hadoop, and 3 CPU, 2 GB and port 31000 with role * (or
> > empty)?
> > We had issues with Storm not being able to accept both reserved and
> > unreserved resources, leading to not being able to use all memory.  We
> > worked with Brenden to correct the issue, and the solution was to add the
> > two resources to the task.  Failure to add both resources, and simply
> > using one resource of 4 CPU and 3 GB with ports 31000 and 35001, resulted
> > in TaskFailed.
> >
> > Darin
> >
> > On Nov 2, 2016 5:54 AM, "Adam Bordelon" <a...@mesosphere.io> wrote:
> >
> > > Sorry for the delayed response. If you haven't already, I'd recommend
> > > reading https://mesos.apache.org/documentation/latest/roles/
> > > Beyond that, let me try to clear up a few things:
> > >
> > > ==FrameworkInfo.role
> > > 1. Every framework registers with Mesos with a role. If you don't
> specify
> > > one, Mesos defaults the framework to '*'. Reservations and quota cannot
> > be
> > > assigned to "*", but it is in the list (with weight=1) when DRF
> > calculates
> > > which "role" is furthest below its fair share, to decide which role's
> > > framework(s) should get the next offer.
> > > 2. At offer time, every resource offered to a framework is allocated
> (in
> > > the Mesos allocator) to the framework's role, regardless of
> reservations.
> > > This is how Mesos determines a role's current usage vs. its "fair
> share".
> > > If the framework declines the offer, then the resource is "recovered"
> or
> > > deallocated from the framework's role. If the framework launches a task
> > > with the resource, it is not recovered until the task exits. Mesos
> > > considers offered resources when calculating current "usage" so that
> > > frameworks cannot hoard offers to take over the cluster.
> > >
> > > ==Resource.role
> > > 3.  Straight from the protobuf: "The role that this resource is
> reserved
> > > for. If "*", this indicates that the resource is unreserved. Otherwise,
> > the
> > > resource will only be offered to frameworks that belong to this role."
> > > 4. Resources offered to a framework may be reserved for that
> framework's
> > > role, so that they are never offered to other frameworks in different
> > > roles. These reserved resources will have role="foo" (for some value of
> > foo
> > > other than "*") set in their offers.
> > > 5. Unreserved resources may be offered to any framework, regardless of
> > what
> > > role the framework registered with. Mesos uses weighted DRF to select
> the
> > > role that is furthest below its fair share and offer the unreserved
> > > resource to a framework registered with that role (using DRF between
> > > frameworks in the same role).
> > > 6.  When unreserved resources are offered, even if Mesos didn't set the
> > > role field, the protobuf parser scheduler-side should set it to the
> > default
> > > "*". Either is equivalent to "unreserved".
> > >
> > > ==Implications for Myriad
> > > 7. If there is code (like Yuliya references) checking if offered
> > resources
> > > are unreserved, we should be checking if role == "*", not role.isEmpty.
> > Or
> > > check for both if you like.
> > > 8. There shouldn't be any r
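A minimal sketch of the approach Adam describes earlier in this message: copy the resources from the offer, keep each copy's role unchanged, and only reduce the amounts. It uses the CPU numbers from Darin's example (1 CPU reserved for role hadoop plus 3 unreserved CPUs, with a task that needs 4). ScalarResource and OfferSplitter are hypothetical stand-ins written for illustration; they are not Mesos or Myriad classes, and port ranges are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a scalar Mesos resource (not the real protobuf type).
final class ScalarResource {
    final String name;  // e.g. "cpus" or "mem"
    final String role;  // "*" means unreserved
    final double value;

    ScalarResource(String name, String role, double value) {
        this.name = name;
        this.role = role;
        this.value = value;
    }
}

final class OfferSplitter {
    // Takes `needed` units of `name` from the offer, using entries reserved for
    // frameworkRole first and unreserved ("*") entries second. Every returned
    // entry keeps the role of the offered entry it was carved from.
    static List<ScalarResource> take(List<ScalarResource> offered, String name,
                                     String frameworkRole, double needed) {
        List<String> roleOrder = new ArrayList<>();
        if (!"*".equals(frameworkRole)) {
            roleOrder.add(frameworkRole);
        }
        roleOrder.add("*");

        List<ScalarResource> taken = new ArrayList<>();
        double remaining = needed;
        for (String role : roleOrder) {
            for (ScalarResource r : offered) {
                if (remaining <= 0) {
                    break;
                }
                if (r.value > 0 && r.name.equals(name) && r.role.equals(role)) {
                    double used = Math.min(r.value, remaining);
                    taken.add(new ScalarResource(name, r.role, used));
                    remaining -= used;
                }
            }
        }
        if (remaining > 0) {
            throw new IllegalArgumentException("offer cannot cover " + needed + " " + name);
        }
        return taken;
    }
}
```

With an offer of cpus(hadoop):1 and cpus(*):3, take(offer, "cpus", "hadoop", 4) yields two entries, 1 CPU with role hadoop and 3 CPUs with role *, and both would be attached to the task rather than merged into a single 4 CPU resource.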

Re: [DISCUSS] handling roles in Myriad code

2016-11-02 Thread Darin Johnson

in
> design, but will take months to implement.
>
> Hope this helps. I've numbered my points in case you have questions about
> any of it.
> Cheers,
> -Adam-
>
>
> On Fri, Oct 28, 2016 at 4:10 PM, yuliya Feldman
> <yufeld...@yahoo.com.invalid
> > wrote:
>
> > We clearly need a word from Adam, Mohit, Ken
> > My impression is that Myriad will not get any resources that are not
> > specific to the role it has (or entitled to), so we may not need much
> roles
> > manipulation in Myriad code.
> > Just my 2c of gut feelings :)
> > Thanks,Yuliya
> >
> >   From: Darin Johnson <dbjohnson1...@gmail.com>
> >  To: Dev <dev@myriad.incubator.apache.org>
> >  Sent: Friday, October 28, 2016 2:54 PM
> >  Subject: Re: [DISCUSS] handling roles in Myriad code
> >
> > Any word from Adam or Mohit?
> >
> > On Oct 20, 2016 12:36 AM, "Klaus Ma" <klaus1982...@gmail.com> wrote:
> >
> > > I can help with this discussion; I was a Mesos contributor for a
> > > year :).
> > >
> > > Mesos allocates regular resources by role using DRF; roles are also
> > > used for reservations & quotas. So a framework (like Myriad) may get
> > > two kinds of resources: "*" or those of Myriad's role. When Myriad
> > > launches tasks, it cannot overuse either kind of resource: for
> > > example, if Myriad got the offers cpu(*):1;cpu(myriad):1, it cannot
> > > launch a task with cpu(*):2, which would be rejected by the Mesos
> > > master.
> > >
> > > Thanks
> > > Klaus
> > >
> > >
> > > On Thu, Oct 20, 2016 at 12:10 PM Yuliya <yufeld...@yahoo.com.invalid>
> > > wrote:
> > >
> > > > I really would like the Mesosphere guys to comment here. I had a
> > > > chat with Adam this morning and I did not get the same impression
> > > >
> > > > Thanks,
> > > > Yuliya
> > > >
> > > > > On Oct 19, 2016, at 8:50 PM, Darin Johnson <
> dbjohnson1...@gmail.com>
> > > > wrote:
> > > > >
> > > > > We use roles extensively to ensure different frameworks can (or
> > > > > can't) get resources via mechanisms such as reserved resources
> > > > > and quotas.  Also, if you don't pay attention you can miss a lot
> > > > > of the resources you're given.  I wish we didn't have to do all
> > > > > the bookkeeping ourselves, but I suppose there are good reasons
> > > > > for delegating it to the framework; for instance, we can choose
> > > > > when to favor a reserved vs a default resource.
> > > > >
> > > > > On Wed, Oct 19, 2016 at 11:30 PM, yuliya Feldman <
> > > > > yufeld...@yahoo.com.invalid> wrote:
> > > > >
> > > > >> I am not sure we should care about role being set or not; what
> > > > >> if in the future we have multiple roles? Not even sure if
> > > > >> presence/absence of role should play a role (no pun intended :) ).
> > > > >>
> > > > >>  From: Darin Johnson <dbjohnson1...@gmail.com>
> > > > >> To: Dev <dev@myriad.incubator.apache.org>; yuliya Feldman <
> > > > >> yufeld...@yahoo.com>
> > > > >> Sent: Wednesday, October 19, 2016 7:17 PM
> > > > >> Subject: Re: [DISCUSS] handling roles in Myriad code
> > > > >>
> > > > >> Ah, so if I understand correctly, if frameworkRole='*' is
> > > > >> present in the config, it's handled as though it's the framework
> > > > >> role.  I believe when I was testing I was using
> > > > >> frameworkRole="test" or commenting out frameworkRole="test".
> > > > >> It looks as though in MyriadConfiguration, getFrameworkRole now
> > > > >> returns "*" even if not set.
> > > > >>
> > > > >> Seems like we should be able to add a check like r.hasRole() &&
> > > > >> r.getRole().equals(role) && !role.equals("*") in a few places.
> > > > >> Though it may be better to think about a better approach here.
> > > > >>
> > > > >> Darin
> > >
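A minimal sketch of the kind of check proposed above, assuming (as noted elsewhere in this thread) that a missing role and "*" both mean unreserved. RoleCheck and its method names are hypothetical, and plain strings stand in for the role field of the Mesos Resource protobuf.

```java
final class RoleCheck {
    static final String UNRESERVED = "*";

    // A missing or empty role is treated the same as "*", i.e. unreserved.
    static boolean isUnreserved(String resourceRole) {
        return resourceRole == null
                || resourceRole.isEmpty()
                || UNRESERVED.equals(resourceRole);
    }

    // True only when the resource is reserved for the framework's own
    // non-default role; a framework role of "*" never counts as a reservation.
    static boolean isReservedFor(String resourceRole, String frameworkRole) {
        return !isUnreserved(resourceRole)
                && !isUnreserved(frameworkRole)
                && resourceRole.equals(frameworkRole);
    }
}
```

Keeping the "unreserved" test in one place means a default of "*" returned by the configuration cannot be mistaken for a real reservation, which is the situation described in the quoted messages.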

Re: [DISCUSS] handling roles in Myriad code

2016-10-28 Thread Darin Johnson

Any word from Adam or Mohit?

On Oct 20, 2016 12:36 AM, "Klaus Ma" <klaus1982...@gmail.com> wrote:

> I can help with this discussion; I was a Mesos contributor for a year
> :).
>
> Mesos allocates regular resources by role using DRF; roles are also
> used for reservations & quotas. So a framework (like Myriad) may get two
> kinds of resources: "*" or those of Myriad's role. When Myriad launches
> tasks, it cannot overuse either kind of resource: for example, if Myriad
> got the offers cpu(*):1;cpu(myriad):1, it cannot launch a task with
> cpu(*):2, which would be rejected by the Mesos master.
>
> Thanks
> Klaus
>
>
> On Thu, Oct 20, 2016 at 12:10 PM Yuliya <yufeld...@yahoo.com.invalid>
> wrote:
>
> > I really would like the Mesosphere guys to comment here. I had a chat
> > with Adam this morning and I did not get the same impression
> >
> > Thanks,
> > Yuliya
> >
> > > On Oct 19, 2016, at 8:50 PM, Darin Johnson <dbjohnson1...@gmail.com>
> > wrote:
> > >
> > > We use roles extensively to ensure different frameworks can (or can't)
> > > get resources via mechanisms such as reserved resources and quotas.
> > > Also, if you don't pay attention you can miss a lot of the resources
> > > you're given.  I wish we didn't have to do all the bookkeeping
> > > ourselves, but I suppose there are good reasons for delegating it to
> > > the framework; for instance, we can choose when to favor a reserved vs
> > > a default resource.
> > >
> > > On Wed, Oct 19, 2016 at 11:30 PM, yuliya Feldman <
> > > yufeld...@yahoo.com.invalid> wrote:
> > >
> > >> I am not sure we should care about role being set or not; what if in
> > >> the future we have multiple roles? Not even sure if presence/absence
> > >> of role should play a role (no pun intended :) ).
> > >>
> > >>  From: Darin Johnson <dbjohnson1...@gmail.com>
> > >> To: Dev <dev@myriad.incubator.apache.org>; yuliya Feldman <
> > >> yufeld...@yahoo.com>
> > >> Sent: Wednesday, October 19, 2016 7:17 PM
> > >> Subject: Re: [DISCUSS] handling roles in Myriad code
> > >>
> > >> Ah, so if I understand correctly, if frameworkRole='*' is present in
> > >> the config, it's handled as though it's the framework role.  I believe
> > >> when I was testing I was using frameworkRole="test" or commenting out
> > >> frameworkRole="test".  It looks as though in MyriadConfiguration,
> > >> getFrameworkRole now returns "*" even if not set.
> > >>
> > >> Seems like we should be able to add a check like r.hasRole() &&
> > >> r.getRole().equals(role) && !role.equals("*") in a few places. Though
> > >> it may be better to think about a better approach here.
> > >>
> > >> Darin
> > >>
> > >> On Wed, Oct 19, 2016 at 9:28 PM, yuliya Feldman
> > >> <yufeld...@yahoo.com.invalid
> > >>> wrote:
> > >>
> > >>> Hello Darrin,
> > >>> I kind of see the point regarding JHS ports. May be there is truth to
> > it.
> > >>> Regarding my issues with role/no role.
> > >>> I had this issue for NMs with random ports (not hardcoded), as it has
> > >>> different code path when role is present and when it is not. My
> > >> impression
> > >>> those are bugs.
> > >>> I am happy to point you to the places in the code that caused issues
> on
> > >>> master (at least for me).[1] does not increment numDefaultValues if
> > role
> > >> is
> > >>> set (which is always set), subsequently [2] has issues[3] same thing
> -
> > >>> fills out list only if there is no role, but again it is always
> there,
> > >> just
> > >>> set to "*"
> > >>>
> > >>>
> > >>> Regarding:>>> To handle nodemanager persistence I think we should
> work
> > >>> with Klaus's PR's to get thecorrect ports, though we'll need to use
> > some
> > >>> disk persistence as well to
> > >>> keep the NM state.
> > >>> Disk persistence won't help here (not even sure NM has much state to
> > >>> persist - even if it does it should be taken care by YARN), as
> > containers
> >

Re: [DISCUSS] handling roles in Myriad code

2016-10-19 Thread Darin Johnson

We use roles extensively to ensure different frameworks can (or can't) get
resources via mechanisms such as reserved resources and quotas.  Also, if you
don't pay attention you can miss a lot of the resources you're given.  I
wish we didn't have to do all the bookkeeping ourselves, but I
suppose there are good reasons for delegating it to the framework; for
instance, we can choose when to favor a reserved vs. a default resource.
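
To make the bookkeeping concrete, here is a minimal sketch of splitting a
CPU request across a reserved role and the default "*" role, reserved first.
RoleLedger and its methods are hypothetical names used only for illustration;
they are not Myriad or Mesos classes, and treating an unset role the same as
"*" is an assumption based on the discussion in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not Myriad code: tracks offered CPUs per Mesos role. */
final class RoleLedger {
    static final String DEFAULT_ROLE = "*";

    private final Map<String, Double> offered = new LinkedHashMap<>();

    /** Assumption: an unset role is treated the same as the default role "*". */
    static String normalize(String role) {
        return (role == null || role.isEmpty()) ? DEFAULT_ROLE : role;
    }

    /** Records CPUs offered under the given role. */
    void offer(String role, double cpus) {
        offered.merge(normalize(role), cpus, Double::sum);
    }

    /**
     * Splits a request across roles, reserved role first, then "*".
     * Returns null when the recorded offers cannot cover the request.
     */
    Map<String, Double> take(String frameworkRole, double cpus) {
        String role = normalize(frameworkRole);
        double reserved = role.equals(DEFAULT_ROLE) ? 0.0 : offered.getOrDefault(role, 0.0);
        double unreserved = offered.getOrDefault(DEFAULT_ROLE, 0.0);
        if (reserved + unreserved < cpus) {
            return null;
        }
        Map<String, Double> plan = new LinkedHashMap<>();
        double fromReserved = Math.min(reserved, cpus);
        if (fromReserved > 0) {
            plan.put(role, fromReserved);
            offered.put(role, reserved - fromReserved);
        }
        double fromDefault = cpus - fromReserved;
        if (fromDefault > 0) {
            plan.put(DEFAULT_ROLE, fromDefault);
            offered.put(DEFAULT_ROLE, unreserved - fromDefault);
        }
        return plan;
    }
}
```

With offers cpu(*):1;cpu(myriad):1, a 2-CPU request in this sketch is planned
as one CPU from each role rather than cpu(*):2.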

On Wed, Oct 19, 2016 at 11:30 PM, yuliya Feldman <
yufeld...@yahoo.com.invalid> wrote:

> I am not sure we should care about the role being set or not; what if in the
> future we will have multiple roles? Not even sure if presence/absence of a role
> should play a role (no pun intended :) ).
>
>   From: Darin Johnson <dbjohnson1...@gmail.com>
>  To: Dev <dev@myriad.incubator.apache.org>; yuliya Feldman <
> yufeld...@yahoo.com>
>  Sent: Wednesday, October 19, 2016 7:17 PM
>  Subject: Re: [DISCUSS] handling roles in Myriad code
>
> Ah, so if I understand correctly, if frameworkRole='*' is present in the
> config, it's handled as though it's the framework role.  I believe when I
> was testing I was using frameworkRole="test" or commenting out
> frameworkRole="test".  It looks as though in MyriadConfiguration,
> getFrameworkRole now returns "*" even if not set.
>
> Seems like we should be able to add a check like
> r.hasRole() && r.getRole().equals(role) && !role.equals("*")
> in a few places. Though it may be better to think about a better
> approach here.
>
> Darin
>
> On Wed, Oct 19, 2016 at 9:28 PM, yuliya Feldman
> <yufeld...@yahoo.com.invalid
> > wrote:
>
> > Hello Darin,
> > I kind of see the point regarding JHS ports. Maybe there is truth to it.
> > Regarding my issues with role/no role:
> > I had this issue for NMs with random ports (not hardcoded), as there are
> > different code paths when a role is present and when it is not. My
> > impression is that those are bugs.
> > I am happy to point you to the places in the code that caused issues on
> > master (at least for me). [1] does not increment numDefaultValues if a role
> > is set (which is always the case); subsequently [2] has issues. [3] is the
> > same thing: it fills out the list only if there is no role, but again it is
> > always there, just set to "*".
> >
> >
> > Regarding: >>> To handle nodemanager persistence I think we should work
> > with Klaus's PR's to get the correct ports, though we'll need to use some
> > disk persistence as well to
> > keep the NM state.
> > Disk persistence won't help here (not even sure the NM has much state to
> > persist - even if it does, it should be taken care of by YARN), as containers
> > have to reconnect to the NM after it restarts, so they have to know the RPC port.
> > Thanks,
> > Yuliya
> > [1] https://github.com/apache/incubator-myriad/blob/master/
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/resource/
> > RangeResource.java#L85
> > [2] https://github.com/apache/incubator-myriad/blob/master/
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/resource/
> > RangeResource.java#L128
> >
> > [3] https://github.com/apache/incubator-myriad/blob/master/
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/resource/
> > RangeResource.java#L140
> >
> >
> >  From: Darin Johnson <dbjohnson1...@gmail.com>
> >  To: Dev <dev@myriad.incubator.apache.org>; yuliya Feldman <
> > yufeld...@yahoo.com>
> >  Sent: Wednesday, October 19, 2016 6:04 PM
> >  Subject: Re: [DISCUSS] handling roles in Myriad code
> >
> > Yuyiya,
> >
> > Yes on master a lot of refactoring was done, in particular you specify
> > ports other than 0 in the myriad-default.yaml, it will only return those
> > ports (not random ones).  This was done in part because the we were
> > attempting the use the JHS on a port like 32001, but it the port was
> > already in use by another app and hence the port wasn't offered myriad
> was
> > still launching the JHS only to have it crash.
> >
> > If you want to use static ports you can just not put anything in the
> > myriad-default.yaml and configure the yarn-site.xml and mapred-site.xml
> as
> > usual (they should be outside the range mesos offers).  To handle
> > nodemanager persistence I think we should work with Klaus's PR's to get
> the
> > correct ports, though we'll need to use some disk persistance as well to
> > keep the NM state.
> >
> > As for a bug in NM's getting zero ports could you send a copy of your
> > configuration and

Re: [DISCUSS] handling roles in Myriad code

2016-10-19 Thread Darin Johnson

I'd really like to see if Mohit could answer how the dcos-commons library
might be able to help here.  Also can dcos commons work with vanilla mesos
or just dcos?

On Wed, Oct 19, 2016 at 9:51 PM, Klaus Ma <klaus1982...@gmail.com> wrote:

> And for the role, we also need to handle the principal with it. For
> example, the principal is needed to use reserved resources if necessary.
>
> 
> Da (Klaus), Ma (马达) | PMP® | Software Architect
> Platform OpenSource Technology, STG, IBM GCG
> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
>
> On Thu, Oct 20, 2016 at 9:28 AM, yuliya Feldman
> <yufeld...@yahoo.com.invalid
> > wrote:
>
> > Hello Darrin,
> > I kind of see the point regarding JHS ports. May be there is truth to it.
> > Regarding my issues with role/no role.
> > I had this issue for NMs with random ports (not hardcoded), as it has
> > different code path when role is present and when it is not. My
> impression
> > those are bugs.
> > I am happy to point you to the places in the code that caused issues on
> > master (at least for me).[1] does not increment numDefaultValues if role
> is
> > set (which is always set), subsequently [2] has issues[3] same thing -
> > fills out list only if there is no role, but again it is always there,
> just
> > set to "*"
> >
> >
> > Regarding:>>> To handle nodemanager persistence I think we should work
> > with Klaus's PR's to get thecorrect ports, though we'll need to use some
> > disk persistence as well to
> > keep the NM state.
> > Disk persistence won't help here (not even sure NM has much state to
> > persist - even if it does it should be taken care by YARN), as containers
> > have to reconnect to NM after it restarts, so they have to know RPC port.
> > Thanks,Yuliya
> > [1] https://github.com/apache/incubator-myriad/blob/master/
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/resource/
> > RangeResource.java#L85
> > [2] https://github.com/apache/incubator-myriad/blob/master/
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/resource/
> > RangeResource.java#L128
> >
> > [3] https://github.com/apache/incubator-myriad/blob/master/
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/resource/
> > RangeResource.java#L140
> >
> >
> >   From: Darin Johnson <dbjohnson1...@gmail.com>
> >  To: Dev <dev@myriad.incubator.apache.org>; yuliya Feldman <
> > yufeld...@yahoo.com>
> >  Sent: Wednesday, October 19, 2016 6:04 PM
> >  Subject: Re: [DISCUSS] handling roles in Myriad code
> >
> > Yuyiya,
> >
> > Yes on master a lot of refactoring was done, in particular you specify
> > ports other than 0 in the myriad-default.yaml, it will only return those
> > ports (not random ones).  This was done in part because the we were
> > attempting the use the JHS on a port like 32001, but it the port was
> > already in use by another app and hence the port wasn't offered myriad
> was
> > still launching the JHS only to have it crash.
> >
> > If you want to use static ports you can just not put anything in the
> > myriad-default.yaml and configure the yarn-site.xml and mapred-site.xml
> as
> > usual (they should be outside the range mesos offers).  To handle
> > nodemanager persistence I think we should work with Klaus's PR's to get
> the
> > correct ports, though we'll need to use some disk persistance as well to
> > keep the NM state.
> >
> > As for a bug in NM's getting zero ports could you send a copy of your
> > configuration and I'll try to recreate the problem?
> >
> > On Wed, Oct 19, 2016 at 3:29 PM, yuliya Feldman
> > <yufeld...@yahoo.com.invalid
> > > wrote:
> >
> > > Hello there,
> > > I wanted to discuss current handling of roles in Myriad code.
> > Specifically
> > > on "master" branch. Most likely due to heavy refactoring.
> > > As far as I can see we try to handle presence or absence of a role on a
> > > resource(s) based on the fact that framework may or may not have a
> > role.On
> > > the other hand we always set framework role to "*" - which means it
> will
> > > always have a role, just that role will be "default".
> > > So far I encountered couple of bugs related to roles in RangeResource
> > > related to ports and inability to spin up NodeManagers, as no ports
> were
> > > assigned because of the fact how we handle roles.
> > > I would like @Adam and other Mesosphere folks to comment on how should
> we
> > > handle relationship between frameworkRole and resource role(s)
> > > Thanks,Yuliya
> >
> >
> >
> >
>

Re: [DISCUSS] handling roles in Myriad code

2016-10-19 Thread Darin Johnson

Yuliya,

Yes, on master a lot of refactoring was done. In particular, if you specify
ports other than 0 in the myriad-default.yaml, it will only return those
ports (not random ones).  This was done in part because we were
attempting to use the JHS on a port like 32001, but the port was
already in use by another app and hence wasn't offered; Myriad was
still launching the JHS only to have it crash.

If you want to use static ports you can just not put anything in the
myriad-default.yaml and configure the yarn-site.xml and mapred-site.xml as
usual (they should be outside the range Mesos offers).  To handle
nodemanager persistence I think we should work with Klaus's PRs to get the
correct ports, though we'll need to use some disk persistence as well to
keep the NM state.

As for the bug with NMs getting zero ports, could you send a copy of your
configuration and I'll try to recreate the problem?
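
As an illustration of the port behaviour described above, here is a minimal
sketch of checking a configured port against the port ranges in an offer
before launching anything.  PortOffer, Range and canSatisfy are hypothetical
names, not Myriad or Mesos APIs, and the treatment of port 0 as "any offered
port" is an assumption taken from this thread.

```java
import java.util.List;

/** Hypothetical helper, not Myriad code: checks a port against offered port ranges. */
final class PortOffer {
    /** Inclusive port range, in the style of a Mesos "ports" resource such as [31000-32000]. */
    static final class Range {
        final long begin;
        final long end;

        Range(long begin, long end) {
            this.begin = begin;
            this.end = end;
        }

        boolean contains(long port) {
            return begin <= port && port <= end;
        }
    }

    /**
     * Port 0 is read as "any port", so it is satisfiable whenever some range was offered.
     * A fixed port is satisfiable only if one of the offered ranges contains it.
     */
    static boolean canSatisfy(long requestedPort, List<Range> offered) {
        if (requestedPort == 0) {
            return !offered.isEmpty();
        }
        for (Range r : offered) {
            if (r.contains(requestedPort)) {
                return true;
            }
        }
        return false;
    }
}
```

In the JHS case above, a check like this would decline the offer when 32001
is missing from the offered ranges instead of launching a task that crashes.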

On Wed, Oct 19, 2016 at 3:29 PM, yuliya Feldman  wrote:

> Hello there,
> I wanted to discuss the current handling of roles in the Myriad code,
> specifically on the "master" branch; the issues are most likely due to heavy
> refactoring.
> As far as I can see, we try to handle the presence or absence of a role on a
> resource based on the fact that the framework may or may not have a role. On
> the other hand, we always set the framework role to "*", which means it will
> always have a role, just that the role will be "default".
> So far I have encountered a couple of bugs related to roles in RangeResource,
> affecting ports and the ability to spin up NodeManagers: no ports were
> assigned because of how we handle roles.
> I would like @Adam and other Mesosphere folks to comment on how we should
> handle the relationship between frameworkRole and resource role(s).
> Thanks,
> Yuliya

Re: Do we have sync up today, or I am too late?

2016-08-24 Thread Darin Johnson

Adam and I showed up.  I'm willing to hop back on a chat if you want.

On Wed, Aug 24, 2016 at 12:21 PM, yuliya Feldman <
yufeld...@yahoo.com.invalid> wrote:

>
>

Re: Resource manager error

2016-08-17 Thread Darin Johnson

Take a look at your myriad configuration under yarnEnvironment.  You can
set JAVA_HOME there, should solve the issue. See below.
yarnEnvironment:
  YARN_HOME: /usr/local/hadoop
  #HADOOP_CONF_DIR=config
  #HADOOP_TMP_DIR=$MESOS_SANDBOX
  #YARN_HOME: hadoop-2.7.0 #this should be relative if nodeManagerUri is set
  #JAVA_HOME: /usr/lib/jvm/java-default #System dependent, but sometimes necessary
  #JAVA_HOME: jre1.7.0_76 # Path to JRE distribution, relative to sandbox directory
  #JAVA_LIBRARY_PATH: /opt/mycompany/lib

On Wed, Aug 17, 2016 at 3:13 PM, Matthew J. Loppatto <mloppa...@keywcorp.com
> wrote:

> I'm running the resource manager as the root user.  Checking a few of my
> nodes, JAVA_HOME is set on all of them for the root env.  Am I ok to be
> using openjdk1.7 or do I have to use Oracle jdk?
>
> Matt
>
> -Original Message-
> From: John Yost [mailto:hokiege...@gmail.com]
> Sent: Wednesday, August 17, 2016 3:01 PM
> To: dev@myriad.incubator.apache.org
> Subject: Re: Resource manager error
>
> Progress is nice! What user are you running myriad as? root? yarn? If it
> is the former and you are running via sudo, I've seen this type of error.
> If so, sudo to the root user and then launch. Otherwise, please type in env
> if you are on linux box and confirm you see JAVA_HOME for the user you are
> launching myriad as.
>
> --John
>
> On Wed, Aug 17, 2016 at 2:56 PM, Matthew J. Loppatto <
> mloppa...@keywcorp.com
> > wrote:
>
> > Hey John,
> >
> > I set up a role for myriad, restarted mesos-master, and now I'm seeing
> > RMs starting on the Mesos UI, but they fail with the message "lost
> > with exit
> > status: 256".  The executor log says "Error: JAVA_HOME is not set and
> > could not be found."  $JAVA_HOME is set on all my slaves as far as I'm
> aware.
> > Running `java -version` confirms openjdk 1.7.0_111.  Looks like its
> > close to a working state.  Am I missing something?
> >
> > Thanks!
> > Matt
> >
> > -Original Message-
> > From: John Yost [mailto:hokiege...@gmail.com]
> > Sent: Wednesday, August 17, 2016 2:38 PM
> > To: dev@myriad.incubator.apache.org
> > Subject: Re: Resource manager error
> >
> > Please uncomment frameworkRole and then add the name of whatever Mesos
> > role you have configured that is not *. Note: at the risk of telling
> > you something you already know, you define roles in
> /etc/mesos-master/roles.
> >
> > In the meantime, I opened up a JIRA ticket and gonna fix this ASAP
> > starting now! :)
> >
> > --John
> >
> > On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <
> > mloppa...@keywcorp.com
> > > wrote:
> >
> > > Hey Darin,
> > >
> > > Commenting out myriadFrameworkRole got rid of the log message about
> > > the missing role, but I'm still seeing the "n must be positive"
> > exception.
> > >
> > > The only other thing of interest I see in the log is WARN fair.
> > AllocationFileLoaderService:
> > > fair-scheduler.xml not found on the classpath.  Not sure if that is
> > > causing any issue though.
> > >
> > > Matt
> > >
> > > -Original Message-
> > > From: Darin Johnson [mailto:dbjohnson1...@gmail.com]
> > > Sent: Wednesday, August 17, 2016 1:26 PM
> > > To: Dev
> > > Subject: Re: Resource manager error
> > >
> > > Hey Matt,
> > >
> > > Looking through the code, I think setting myriadFrameworkRole to "*"
> > > might be the problem.  Can you try commenting out that line in your
> > > config?  I'll double check this in a little while too.  If that
> > > works I'll submit a patch that checks that.
> > >
> > > Sorry - Myriad is still a pretty young project!  Thanks for checking
> > > it out though!
> > >
> > > Darin
> > >
> > > On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <
> > > mloppa...@keywcorp.com> wrote:
> > >
> > > > Hey Darin,
> > > >
> > > > Pulling from master got rid of the errors I was seeing, however
> > > > I'm running into a new issue.  After starting the resource
> > > > manager, I see this in the logs:
> > > >
> > > > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1
> > > > NM(s) with profile medium
> > > > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.
> > > MyriadOperations:
> > > > Adding 1 NM instances to cluster
> > &

Re: Resource manager error

2016-08-16 Thread Darin Johnson

Hey Matthew, my coworker found the same issue recently. I fixed it in my
last pull request, if you'd like to pull from master.

Alternatively, you could comment out the appendCgroups line in
myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
and rebuild.

Sorry that this got past my QA; unfortunately I'm always using cgroups and
didn't test without them.  We may do a 0.2.1 release, but I can't say when.

Darin
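
For anyone hitting the same chown error, the guard one would want is roughly
"only add the cgroup setup when the Mesos cgroup directory actually exists".
Below is a minimal sketch of that idea; CgroupGuard and shouldAppendCgroups
are hypothetical names, and this is not the actual NMExecutorCLGenImpl logic
or the fix that went into master.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch, not the actual NMExecutorCLGenImpl logic. */
final class CgroupGuard {
    /**
     * Returns true only when cgroups are enabled in the configuration and the
     * expected cgroup directory (for example /sys/fs/cgroup/cpu/mesos) exists.
     */
    static boolean shouldAppendCgroups(boolean cgroupsEnabled, Path cgroupRoot) {
        return cgroupsEnabled && Files.isDirectory(cgroupRoot);
    }
}
```

A command-line generator could call such a check before emitting any
cgroup-related commands, so hosts without the mesos cgroup hierarchy skip
that step.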

On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto" 
wrote:

> Hi,
>
>
>
> I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> https://cwiki.apache.org/confluence/display/MYRIAD/
> Installing+for+Developers
>
>
>
> And I get the following error in the resource manager executor log in
> mesos after starting it with `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
>
>
>
> chown: cannot access ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’: No such file or directory
>
> env: /bin/yarn: No such file or directory
>
>
>
> It appears the ‘mesos’ directory doesn’t exist under /sys/fs/cgroup/cpu.
> Any ideas what the issue could be?
>
>
>
> This is my yarn-site.xml:
>
> <configuration>
>   <property>
>     <name>yarn.nodemanager.aux-services</name>
>     <value>mapreduce_shuffle,myriad_executor</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
>     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
>     <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
>   </property>
>   <property>
>     <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
>     <value>2000</value>
>   </property>
>   <property>
>     <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
>     <value>1</value>
>   </property>
>   <property>
>     <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
>     <value>1000</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.minimum-allocation-vcores</name>
>     <value>0</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.minimum-allocation-mb</name>
>     <value>0</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.resource.cpu-vcores</name>
>     <value>${nodemanager.resource.cpu-vcores}</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.resource.memory-mb</name>
>     <value>${nodemanager.resource.memory-mb}</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.address</name>
>     <value>${myriad.yarn.nodemanager.address}</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.webapp.address</name>
>     <value>${myriad.yarn.nodemanager.webapp.address}</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.webapp.https.address</name>
>     <value>${myriad.yarn.nodemanager.webapp.address}</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.localizer.address</name>
>     <value>${myriad.yarn.nodemanager.localizer.address}</value>
>   </property>
>   <property>
>     <name>yarn.resourcemanager.scheduler.class</name>
>     <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
>     <description>One can configure other schedulers as well from the following
>       list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
>       org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
>   </property>
>   <property>
>     <name>yarn.nodemanager.pmem-check-enabled</name>
>     <value>false</value>
>   </property>
>   <property>
>     <name>yarn.nodemanager.vmem-check-enabled</name>
>     <value>false</value>
>   </property>
> </configuration>
>
> My myriad-config-default.yml:
>
> mesosMaster: zk://myip:2181/mesos
> checkpoint: false
> frameworkFailoverTimeout: 4320
> frameworkName: MyriadAlpha
> frameworkRole:
> frameworkUser: root # User the Node Manager runs as, required if nodeManagerURI set, otherwise defaults to the user
>                     # running the resource manager.
> frameworkSuperUser: root  # To be deprecated, currently permissions need to be set by a superuser due to Mesos-1790.  Must be
>                           # root or have passwordless sudo. Required if nodeManagerURI set, ignored otherwise.
> nativeLibrary: /usr/local/lib/libmesos.so
> zkServers: myip:2181
> zkTimeout: 2
> restApiPort: 8192
> servedConfigPath: dist/config.tgz
> servedBinaryPath: dist/binary.tgz
> profiles:
>   zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
>     cpu: 0
>     mem: 0
>   small:
>     cpu: 2
>     mem: 2048
>   medium:
>     cpu: 4
>     mem: 4096
>   large:
>     cpu: 10
>     mem: 12288
> nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
>   medium: 1
> rebalancer: false
> haEnabled: false
> nodemanager:

Sync tomorrow?

2016-08-09 Thread Darin Johnson

Trying to plan my day tomorrow.

Re: Sync tomorrow?

2016-07-27 Thread Darin Johnson

Ended up in another meeting taking longer than I thought. Sorry.

On Tue, Jul 26, 2016 at 2:27 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> Yes, I'll be there. Sorry for not making it last time.
>
> On Tue, Jul 26, 2016 at 11:20 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > Is there going to be a sync tomorrow?
> >
> > Darin
> >
>

Re: vagrant install doesn't show new framework registering

2016-07-26 Thread Darin Johnson

Hey David,

Thanks for the info.  I haven't used the vagrant install for a while, so it
may be good for me to start a fresh instance to check it.  In the meantime
though any notes you have would be great!  We'd be happy to update the
documentation.

On Jul 26, 2016 6:24 PM, "Reno, David"  wrote:

As a follow-up, the registration failure seems to have been caused by using
the wrong mesosMaster IP address or format. I changed it to
"zk://10.0.2.15:2181/mesos", following syntax from this list archive:
https://mail-archives.apache.org/mod_mbox/myriad-dev/201602.mbox/%3c1519159574.1366234.1456242921704.javamail.ya...@mail.yahoo.com%3e

The mesos master now shows MyriadAlpha as an active framework. Still, the
myriad tasks list shows the default medium as a pending task, so there
still seems to be a problem. The Mesos slave does not show any frameworks
or completed frameworks.

Again, just trying to test-drive the vanilla vagrant install. Happy to
provide notes of what works and doesn’t if anyone wants to update the
vagrant install docs to Myriad 0.2.0:
https://cwiki.apache.org/confluence/display/MYRIAD/Installing+using+Vagrant

Further detail on users:
step 1 seems best to complete as the vagrant user
remaining steps seem to need to be completed as the hadoop user (i.e.
hduser)

Regards,
David

> On Jul 26, 2016, at 9:27 AM, Reno, David  wrote:
>
> Hi Myriaders,
>
> Sorry if I’m reaching out to the wrong alias or help, this is all I see.
I’m getting stuck with the myriad install with vagrant. The wiki seem to
assume 0.1.0 though I’ve cloned the latest 0.2.0 release from github.
>
> I’m following these instructions:
https://cwiki.apache.org/confluence/display/MYRIAD/Installing+using+Vagrant
>
> Step 1 seems to go fine and I can open the HDFS name node and mesos
master http ports and see the pages showing active/started. Step 2 starts
go to a little sideways as it references “myriad-executor-0.1.0.jar” which
seems to be replaced by “myriad-executor-0.2.0.jar” which I use instead.
Step 3 asks for minimum configuration changes which seem to already be
completed. However, I change the line:
>   path:
file://localhost/usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> to:
>   path:
file:///usr/local/hadoop/share/hadoop/yarn/lib/myriad-executor-0.2.0.jar
>
> For step 4, I add all properties listed to the yarn-site.xml file. I then
launch the resource manager using the “yarn-daemon.sh start
resourcemanager” command.
>
> At this point, I can load the http://10.141.141.20:8192 port and see the
myriad about and API page but the http://10.141.141.20:5050/#/frameworks
page does not show myriad or hadoop as an active framework. I use the
myriad flex tab to “flex up” a small server, it appears as a pending task,
but stays pending and mesos frameworks don’t change.
>
> Interesting lines from
/usr/local/hadoop/logs/yarn-hduser-resourcemanager-vagrant-ubuntu-trusty-64.out
include the following:
> I0726 13:01:41.358747 15817 sched.cpp:164] Version: 0.24.1
> I0726 13:01:41.361140 15847 sched.cpp:262] New master detected at
master@10.0.2.15:5050
> I0726 13:01:41.361538 15847 sched.cpp:272] No credentials provided.
Attempting to register without authentication
> E0726 13:01:41.362741 15852 socket.hpp:174] Shutdown failed on fd=231:
Transport endpoint is not connected [107]
> E0726 13:01:41.363302 15852 socket.hpp:174] Shutdown failed on fd=231:
Transport endpoint is not connected [107]
> E0726 13:01:41.396867 15852 socket.hpp:174] Shutdown failed on fd=231:
Transport endpoint is not connected [107]
> Jul 26, 2016 1:01:41 PM com.google.inject.servlet.GuiceFilter setPipeline
> WARNING: Multiple Servlet injectors detected. This is a warning
indicating that you have more than one GuiceFilter running in your web
application. If this is deliberate, you may safely ignore this message. If
this is NOT deliberate however, your application may not work as expected.
> E0726 13:01:44.780588 15852 socket.hpp:174] Shutdown failed on fd=275:
Transport endpoint is not connected [107]
> E0726 13:01:51.604310 15852 socket.hpp:174] Shutdown failed on fd=275:
Transport endpoint is not connected [107]
> E0726 13:02:01.226771 15852 socket.hpp:174] Shutdown failed on fd=275:
Transport endpoint is not connected [107]
> E0726 13:02:11.525804 15852 socket.hpp:174] Shutdown failed on fd=277:
Transport endpoint is not connected [107]
> Jul 26, 2016 1:02:15 PM
com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8
resolve
> SEVERE: null
> java.lang.IllegalAccessException: Class
com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator$8
can not access a member of class javax.ws.rs.core.Response with modifiers
"protected"
>
> Any help or suggestions are much appreciated,
> David Reno
> Systems Architect, Comcast

Sync tomorrow?

2016-07-26 Thread Darin Johnson

Is there going to be a sync tomorrow?

Darin

Sync today?

2016-07-13 Thread Darin Johnson

Couldn't connect

Re: NPE in removing container

2016-07-12 Thread Darin Johnson

Hey Stephen,

I was on vacation last week; I'm looking over the logs this week.  I've got
a few ideas for a first attempt, but it may take me a while as I get back into work.

Darin

On Fri, Jul 1, 2016 at 2:43 AM, Stephen Gran <stephen.g...@piksel.com>
wrote:

> Hi,
>
> It's not a problem at all.  Anything I can do to help.
>
> I've attached the log file for the relevant time period.  This is hadoop
> 2.7.2 - you have a good memory :)
>
> Cheers,
>
> On 30/06/16 22:56, Darin Johnson wrote:
> > Hey Stephen,
> >
> > Looks like this might be slightly different than what I was originally
> > expecting.  Sorry to keep asking for more info, but it will help me
> > recreate the issue.  Could you possibly get me more of the ResourceManager
> > logs?  In particular, I'm trying to figure out where upgradeNodeCapacity is
> > getting called from and any transitions of slave2.  Also, what version of
> > Hadoop are you running? I think I recall it being 2.7.2 but should verify.
> >
> > Thanks for taking the time to work with me on this.
> >
> > Darin
> >
> > On Thu, Jun 30, 2016 at 5:10 PM, Stephen Gran <stephen.g...@piksel.com>
> > wrote:
> >
> >> Hi,
> >>
> >> Yes - the imaginatively named slave2 was a zero-sized nm at that point -
> >> I am looking at how small a pool of reserved resource I can get away
> >> with, and use FGS for burst activity.
> >>
> >>
> >> Here are all the logs related to that host:port combination around that
> >> time:
> >>
> >> 2016-06-30 19:47:43,756 INFO
> >> org.apache.hadoop.yarn.util.AbstractLivelinessMonitor:
> >> Expired:slave2:24679 Timed out after 2 secs
> >> 2016-06-30 19:47:43,771 INFO
> >> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl:
> >> Deactivating Node slave2:24679 as it is now LOST
> >> 2016-06-30 19:47:43,771 INFO
> >> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl:
> >> slave2:24679 Node Transitioned from RUNNING to LOST
> >> 2016-06-30 19:47:43,909 INFO
> >> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Removed task
> >> yarn_Container: [ContainerId: container_1467314892573_0009_01_05,
> >> NodeId: slave2:24679, NodeHttpAddress: slave2:23177, Resource:
> >> <memory:2048, vCores:1>, Priority: 20, Token: Token { kind:
> >> ContainerToken, service: 10.0.5.5:24679 }, ] with exit status freeing 0
> >> cpu and 1 mem.
> >> 2016-06-30 19:47:43,909 INFO
> >> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> >> Released container container_1467314892573_0009_01_05 of capacity
> >> <memory:2048, vCores:1> on host slave2:24679, which currently has 1
> >> containers, <memory:2048, vCores:1> used and <memory:2048, vCores:1>
> >> available, release resources=true
> >> 2016-06-30 19:47:43,909 INFO
> >>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
> >> Application attempt appattempt_1467314892573_0009_01 released
> >> container container_1467314892573_0009_01_05 on node: host:
> >> slave2:24679 #containers=1 available=<memory:2048, vCores:1>
> >> used=<memory:2048, vCores:1> with event: KILL
> >> 2016-06-30 19:47:43,909 INFO
> >> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService:
> >> Node not found resyncing slave2:24679
> >> 2016-06-30 19:47:43,952 INFO
> >> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Removed task
> >> yarn_Container: [ContainerId: container_1467314892573_0009_01_06,
> >> NodeId: slave2:24679, NodeHttpAddress: slave2:23177, Resource:
> >> <memory:2048, vCores:1>, Priority: 20, Token: Token { kind:
> >> ContainerToken, service: 10.0.5.5:24679 }, ] with exit status freeing 0
> >> cpu and 1 mem.
> >> 2016-06-30 19:47:43,952 INFO
> >> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> >> Released container container_1467314892573_0009_01_06 of capacity
> >> <memory:2048, vCores:1> on host slave2:24679, which currently has 0
> >> containers, <memory:0, vCores:0> used and <memory:4096, vCores:2>
> >> available, release resources=true
> >> 2016-06-30 19:47:43,952 INFO
> >>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
> >> Application attempt appattempt_1467314892573_0009_01 released
> >> container container_1467314892573_0009_01_06 on n

Re: NPE in removing container

2016-06-30 Thread Darin Johnson

Stephen, thanks.  I thought I had fixed that, but perhaps a regression was
introduced in another merge.  I'll look into it; can you answer a few questions?
Was the node (slave2) a zero-sized NodeManager (for FGS)?  In the NodeManager
logs, had it recently become unhealthy?  I'm pretty concerned about
this and will try to get a patch out soon.

Thanks,

Darin
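
One plausible reading of the reported trace is that a NODE_RESOURCE_UPDATE
event reached the scheduler after slave2 had already been removed.  The
sketch below shows the general shape of a guard for that race: drop capacity
updates for nodes that are no longer tracked.  NodeCapacityTable is a
hypothetical stand-in, not YARN or Myriad code, and this is not a confirmed
diagnosis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not YARN or Myriad code: ignore capacity updates for removed nodes. */
final class NodeCapacityTable {
    private final Map<String, Integer> memoryMbByNode = new ConcurrentHashMap<>();

    void addNode(String nodeId, int memoryMb) {
        memoryMbByNode.put(nodeId, memoryMb);
    }

    void removeNode(String nodeId) {
        memoryMbByNode.remove(nodeId);
    }

    /** Returns true if the update was applied, false if the node is no longer tracked. */
    boolean updateCapacity(String nodeId, int memoryMb) {
        // computeIfPresent only runs the update when the node still exists.
        return memoryMbByNode.computeIfPresent(nodeId, (id, old) -> memoryMb) != null;
    }

    /** Returns the tracked capacity, or null for an unknown node. */
    Integer capacityOf(String nodeId) {
        return memoryMbByNode.get(nodeId);
    }
}
```

The point of the design is that a late update for a LOST node becomes a
no-op that can be logged, rather than a null dereference.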
On Jun 30, 2016 3:53 PM, "Stephen Gran"  wrote:

> Hi,
>
> Just playing with the 0.2.0 release (congratulations, by the way!)
>
> I have seen this twice now, although it is by no means consistent - I
> will have a dozen successful runs, and then one of these.  This exits
> the RM, which makes it rather noticable.
>
> 2016-06-30 19:47:43,952 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
> Removed node slave2:24679 cluster capacity:  s:4>
> 2016-06-30 19:47:43,953 FATAL
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in
> handling event type NODE_RESOURCE_UPDATE to the scheduler
> java.lang.NullPointerException
>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:563)
>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.updateNodeResource(FairScheduler.java:1652)
>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1222)
>  at org.apache.myriad.scheduler.yarn.MyriadFairScheduler.handle(MyriadFairScheduler.java:102)
>  at org.apache.myriad.scheduler.yarn.MyriadFairScheduler.handle(MyriadFairScheduler.java:42)
>  at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:671)
>  at java.lang.Thread.run(Thread.java:745)
> 2016-06-30 19:47:43,972 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting,
> bbye..
>
> --
> Stephen Gran
> Senior Technical Architect
>
> picture the possibilities | piksel.com
> This message is private and confidential. If you have received this
> message in error, please notify the sender or serviced...@piksel.com and
> remove it from your system.
>
> Piksel Inc is a company registered in the United States New York City,
> 1250 Broadway, Suite 1902, New York, NY 10001. F No. = 2931986
>

Website is updated, 0.2.0 is official!

2016-06-29 Thread Darin Johnson

http://myriad.apache.org/

Tell your friends!

Re: Myriad Slack

2016-06-29 Thread Darin Johnson

Still no luck

On Wed, Jun 29, 2016 at 12:17 PM, Ken Sipe <k...@mesosphere.io> wrote:

> Darin, try one more time… I think we had a misconfiguration.
> > On Jun 29, 2016, at 11:15 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
> >
> > Still no luck off wifi or cell.
> > On Jun 29, 2016 12:11 PM, "Ken Sipe" <k...@mesosphere.io> wrote:
> >
> >> https://plus.google.com/hangouts/_/mesosphere.io/myriad
> >>
> >>
> >>> On Jun 29, 2016, at 11:08 AM, yuliya Feldman
> <yufeld...@yahoo.com.INVALID>
> >> wrote:
> >>>
> >>> no luck joining so far
> >>>
> >>> From: Ken Sipe <k...@mesosphere.io>
> >>> To: dev@myriad.incubator.apache.org
> >>> Sent: Wednesday, June 29, 2016 9:04 AM
> >>> Subject: Re: Myriad Slack
> >>>
> >>> I am on
> >>>> On Jun 29, 2016, at 11:04 AM, Darin Johnson <dbjohnson1...@gmail.com>
> >> wrote:
> >>>>
> >>>> Having issues getting on, is anybody else able to connect?
> >>>> On Jun 28, 2016 10:34 PM, "Adam Bordelon" <a...@mesosphere.io> wrote:
> >>>>
> >>>>> (Next dev sync is tomorrow, 9am Pacific time)
> >>>>>
> >>>>> On Tue, Jun 28, 2016 at 12:32 PM, Darin Johnson <
> >> dbjohnson1...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> We also have a dev sync every other Wednesday via Google Hangouts:
> >>>>>> https://plus.google.com/hangouts/_/mesosphere.io/myriad
> >>>>>>
> >>>>>> Darin
> >>>>>>
> >>>>>> On Thu, Jun 23, 2016 at 3:01 AM, Swapnil Daingade <
> >>>>>> swapnil.daing...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Sam,
> >>>>>>>
> >>>>>>> Myriad is a fairly new project. The IPMC vote for Myriad 0.2 just
> >>>>>>> passed this week. Given we are early in the incubation stage, it's
> >>>>>>> not uncommon for one or two vendors to back the project.
> >>>>>>>
> >>>>>>> I'll let other community members talk about their experiences
> >>>>>>> deploying Myriad, but it's really great that you are considering
> >>>>>>> deploying Myriad in production. Your feedback will definitely help
> >>>>>>> shape the road map for Myriad going forward.
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Swapnil
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 06/22/2016 11:24 PM, Sam Chen wrote:
> >>>>>>>
> >>>>>>>> Hi Swapnil,
> >>>>>>>> MapR is one company to give Myriad support, right?  Any reference
> ?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Sam
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Sent from my iPhone
> >>>>>>>>
> >>>>>>>> On Jun 23, 2016, at 10:39 AM, Swapnil Daingade <
> >>>>>>>>> swapnil.daing...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> MapR supports Myriad 0.1 currently
> >>>>>>>>>
> >>>>>>>>> https://www.mapr.com/products/whats-included
> >>>>>>>>> https://www.mapr.com/products/product-overview/apache-myriad
> >>>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>> Swapnil
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 22, 2016 at 6:51 PM, Sam Chen <
> >> rc...@linkernetworks.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Darin,
> >>>>>>>>>> Thanks for your reply. Makes sense to use Slack. Btw, we are
> >>>>>>>>>> going to use Myriad in production. Does any company have the
> >>>>>>>>>> capability to support this? And is there any reference in
> >>>>>>>>>> production?
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Sam
> >>>>>>>>>>
> >>>>>>>>>> Sent from my iPhone
> >>>>>>>>>>
> >>>>>>>>>> On Jun 23, 2016, at 2:30 AM, Darin Johnson <
> >> dbjohnson1...@gmail.com
> >>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Sam,
> >>>>>>>>>>>
> >>>>>>>>>>> I don't believe so.  But we do have an IRC channel #myriad on
> >>>>>> FreeNode.
> >>>>>>>>>>>
> >>>>>>>>>> I
> >>>>>>>>>>
> >>>>>>>>>>> know the mesosphere guys set up slackbots to interact with it.
> >> I'm
> >>>>>>>>>>> only
> >>>>>>>>>>> there occasionally or by appointment. I did notice Kudu now
> uses
> >>>>>> slack,
> >>>>>>>>>>>
> >>>>>>>>>> so
> >>>>>>>>>>
> >>>>>>>>>>> maybe slack makes more sense than IRC these days, or Gitter
> Chat.
> >>>>>>>>>>>
> >>>>>>>>>>> Darin
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jun 22, 2016 at 1:55 AM, Sam Chen <
> >>>>> rc...@linkernetworks.com>
> >>>>>>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Guys,
> >>>>>>>>>>>> Do we have Slack for Myriad?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards ,
> >>>>>>>>>>>> Sam
> >>>>>>>>>>>>
> >>>>>>>>>>>> Sent from my iPhone
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>>
> >>
> >>
>
>

Re: Myriad Slack

2016-06-29 Thread Darin Johnson

Still no luck off wifi or cell.
On Jun 29, 2016 12:11 PM, "Ken Sipe" <k...@mesosphere.io> wrote:

> https://plus.google.com/hangouts/_/mesosphere.io/myriad
>
>
> > On Jun 29, 2016, at 11:08 AM, yuliya Feldman <yufeld...@yahoo.com.INVALID>
> wrote:
> >
> > no luck joining so far
> >
> >  From: Ken Sipe <k...@mesosphere.io>
> > To: dev@myriad.incubator.apache.org
> > Sent: Wednesday, June 29, 2016 9:04 AM
> > Subject: Re: Myriad Slack
> >
> > I am on
> >> On Jun 29, 2016, at 11:04 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
> >>
> >> Having issues getting on, is anybody else able to connect?
> >> On Jun 28, 2016 10:34 PM, "Adam Bordelon" <a...@mesosphere.io> wrote:
> >>
> >>> (Next dev sync is tomorrow, 9am Pacific time)
> >>>
> >>> On Tue, Jun 28, 2016 at 12:32 PM, Darin Johnson <
> dbjohnson1...@gmail.com>
> >>> wrote:
> >>>
> >>>> We also have a dev sync every other Wednesday via Google Hangouts:
> >>>> https://plus.google.com/hangouts/_/mesosphere.io/myriad
> >>>>
> >>>> Darin
> >>>>
> >>>> On Thu, Jun 23, 2016 at 3:01 AM, Swapnil Daingade <
> >>>> swapnil.daing...@gmail.com> wrote:
> >>>>
> >>>>> Hi Sam,
> >>>>>
> >>>>> Myriad is a fairly new project. The IPMC vote for Myriad 0.2 just
> >>>>> passed this week. Given we are early in the incubation stage, it's
> >>>>> not uncommon for one or two vendors to back the project.
> >>>>>
> >>>>> I'll let other community members talk about their experiences
> >>>>> deploying Myriad, but it's really great that you are considering
> >>>>> deploying Myriad in production. Your feedback will definitely help
> >>>>> shape the road map for Myriad going forward.
> >>>>>
> >>>>> Regards
> >>>>> Swapnil
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 06/22/2016 11:24 PM, Sam Chen wrote:
> >>>>>
> >>>>>> Hi Swapnil,
> >>>>>> MapR is one company to give Myriad support, right?  Any reference ?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Sam
> >>>>>>
> >>>>>>
> >>>>>> Sent from my iPhone
> >>>>>>
> >>>>>> On Jun 23, 2016, at 10:39 AM, Swapnil Daingade <
> >>>>>>> swapnil.daing...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> MapR supports Myriad 0.1 currently
> >>>>>>>
> >>>>>>> https://www.mapr.com/products/whats-included
> >>>>>>> https://www.mapr.com/products/product-overview/apache-myriad
> >>>>>>>
> >>>>>>> Regards
> >>>>>>> Swapnil
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Jun 22, 2016 at 6:51 PM, Sam Chen <
> rc...@linkernetworks.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Darin,
> >>>>>>>> Thanks for your reply. Makes sense to use Slack. Btw, we are going
> >>>>>>>> to use Myriad in production. Does any company have the capability
> >>>>>>>> to support this? And is there any reference in production?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Sam
> >>>>>>>>
> >>>>>>>> Sent from my iPhone
> >>>>>>>>
> >>>>>>>> On Jun 23, 2016, at 2:30 AM, Darin Johnson <
> dbjohnson1...@gmail.com
> >>>>
> >>>>>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Sam,
> >>>>>>>>>
> >>>>>>>>> I don't believe so.  But we do have an IRC channel #myriad on
> >>>> FreeNode.
> >>>>>>>>>
> >>>>>>>> I
> >>>>>>>>
> >>>>>>>>> know the mesosphere guys set up slackbots to interact with it.
> I'm
> >>>>>>>>> only
> >>>>>>>>> there occasionally or by appointment. I did notice Kudu now uses
> >>>> slack,
> >>>>>>>>>
> >>>>>>>> so
> >>>>>>>>
> >>>>>>>>> maybe slack makes more sense than IRC these days, or Gitter Chat.
> >>>>>>>>>
> >>>>>>>>> Darin
> >>>>>>>>>
> >>>>>>>>> On Wed, Jun 22, 2016 at 1:55 AM, Sam Chen <
> >>> rc...@linkernetworks.com>
> >>>>>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Guys,
> >>>>>>>>>> Do we have Slack for Myriad?
> >>>>>>>>>>
> >>>>>>>>>> Regards ,
> >>>>>>>>>> Sam
> >>>>>>>>>>
> >>>>>>>>>> Sent from my iPhone
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >
> >
>
>

Re: Myriad Slack

2016-06-29 Thread Darin Johnson

Maybe you can post the link?  Maybe mine is old.
On Jun 29, 2016 12:05 PM, "Ken Sipe" <k...@mesosphere.io> wrote:

> I am on
> > On Jun 29, 2016, at 11:04 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
> >
> > Having issues getting on, is anybody else able to connect?
> > On Jun 28, 2016 10:34 PM, "Adam Bordelon" <a...@mesosphere.io> wrote:
> >
> >> (Next dev sync is tomorrow, 9am Pacific time)
> >>
> >> On Tue, Jun 28, 2016 at 12:32 PM, Darin Johnson <
> dbjohnson1...@gmail.com>
> >> wrote:
> >>
> >>> We also have a dev sync every other Wednesday via Google Hangouts:
> >>> https://plus.google.com/hangouts/_/mesosphere.io/myriad
> >>>
> >>> Darin
> >>>
> >>> On Thu, Jun 23, 2016 at 3:01 AM, Swapnil Daingade <
> >>> swapnil.daing...@gmail.com> wrote:
> >>>
> >>>> Hi Sam,
> >>>>
> >>>> Myriad is a fairly new project. The IPMC vote for Myriad 0.2 just
> >>>> passed this week. Given we are early in the incubation stage, it's
> >>>> not uncommon for one or two vendors to back the project.
> >>>>
> >>>> I'll let other community members talk about their experiences
> >>>> deploying Myriad, but it's really great that you are considering
> >>>> deploying Myriad in production. Your feedback will definitely help
> >>>> shape the road map for Myriad going forward.
> >>>>
> >>>> Regards
> >>>> Swapnil
> >>>>
> >>>>
> >>>>
> >>>> On 06/22/2016 11:24 PM, Sam Chen wrote:
> >>>>
> >>>>> Hi Swapnil,
> >>>>> MapR is one company to give Myriad support, right?  Any reference ?
> >>>>>
> >>>>> Regards,
> >>>>> Sam
> >>>>>
> >>>>>
> >>>>> Sent from my iPhone
> >>>>>
> >>>>> On Jun 23, 2016, at 10:39 AM, Swapnil Daingade <
> >>>>>> swapnil.daing...@gmail.com> wrote:
> >>>>>>
> >>>>>> MapR supports Myriad 0.1 currently
> >>>>>>
> >>>>>> https://www.mapr.com/products/whats-included
> >>>>>> https://www.mapr.com/products/product-overview/apache-myriad
> >>>>>>
> >>>>>> Regards
> >>>>>> Swapnil
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Jun 22, 2016 at 6:51 PM, Sam Chen <rc...@linkernetworks.com
> >
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> Hi Darin,
> >>>>>>> Thanks for your reply. Makes sense to use Slack. Btw, we are going
> >>>>>>> to use Myriad in production. Does any company have the capability
> >>>>>>> to support this? And is there any reference in production?
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Sam
> >>>>>>>
> >>>>>>> Sent from my iPhone
> >>>>>>>
> >>>>>>> On Jun 23, 2016, at 2:30 AM, Darin Johnson <
> dbjohnson1...@gmail.com
> >>>
> >>>>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Sam,
> >>>>>>>>
> >>>>>>>> I don't believe so.  But we do have an IRC channel #myriad on
> >>> FreeNode.
> >>>>>>>>
> >>>>>>> I
> >>>>>>>
> >>>>>>>> know the mesosphere guys set up slackbots to interact with it.
> I'm
> >>>>>>>> only
> >>>>>>>> there occasionally or by appointment. I did notice Kudu now uses
> >>> slack,
> >>>>>>>>
> >>>>>>> so
> >>>>>>>
> >>>>>>>> maybe slack makes more sense than IRC these days, or Gitter Chat.
> >>>>>>>>
> >>>>>>>> Darin
> >>>>>>>>
> >>>>>>>> On Wed, Jun 22, 2016 at 1:55 AM, Sam Chen <
> >> rc...@linkernetworks.com>
> >>>>>>>>>
> >>>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Guys,
> >>>>>>>>> Do we have Slack for Myriad?
> >>>>>>>>>
> >>>>>>>>> Regards ,
> >>>>>>>>> Sam
> >>>>>>>>>
> >>>>>>>>> Sent from my iPhone
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>

Re: Myriad Slack

2016-06-29 Thread Darin Johnson

Having issues getting on, is anybody else able to connect?
On Jun 28, 2016 10:34 PM, "Adam Bordelon" <a...@mesosphere.io> wrote:

> (Next dev sync is tomorrow, 9am Pacific time)
>
> On Tue, Jun 28, 2016 at 12:32 PM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > We also have a dev sync every other Wednesday via Google Hangouts:
> > https://plus.google.com/hangouts/_/mesosphere.io/myriad
> >
> > Darin
> >
> > On Thu, Jun 23, 2016 at 3:01 AM, Swapnil Daingade <
> > swapnil.daing...@gmail.com> wrote:
> >
> > > Hi Sam,
> > >
> > > Myriad is a fairly new project. The IPMC vote for Myriad 0.2 just
> > > passed this week. Given we are early in the incubation stage, it's
> > > not uncommon for one or two vendors to back the project.
> > >
> > > I'll let other community members talk about their experiences
> > > deploying Myriad, but it's really great that you are considering
> > > deploying Myriad in production. Your feedback will definitely help
> > > shape the road map for Myriad going forward.
> > >
> > > Regards
> > > Swapnil
> > >
> > >
> > >
> > > On 06/22/2016 11:24 PM, Sam Chen wrote:
> > >
> > >> Hi Swapnil,
> > >> MapR is one company to give Myriad support, right?  Any reference ?
> > >>
> > >> Regards,
> > >> Sam
> > >>
> > >>
> > >> Sent from my iPhone
> > >>
> > >> On Jun 23, 2016, at 10:39 AM, Swapnil Daingade <
> > >>> swapnil.daing...@gmail.com> wrote:
> > >>>
> > >>> MapR supports Myriad 0.1 currently
> > >>>
> > >>> https://www.mapr.com/products/whats-included
> > >>> https://www.mapr.com/products/product-overview/apache-myriad
> > >>>
> > >>> Regards
> > >>> Swapnil
> > >>>
> > >>>
> > >>> On Wed, Jun 22, 2016 at 6:51 PM, Sam Chen <rc...@linkernetworks.com>
> > >>>> wrote:
> > >>>>
> > >>>> Hi Darin,
> > >>>> Thanks for your reply. Makes sense to use Slack. Btw, we are going to
> > >>>> use Myriad in production. Does any company have the capability to
> > >>>> support this? And is there any reference in production?
> > >>>>
> > >>>> Regards,
> > >>>> Sam
> > >>>>
> > >>>> Sent from my iPhone
> > >>>>
> > >>>> On Jun 23, 2016, at 2:30 AM, Darin Johnson <dbjohnson1...@gmail.com
> >
> > >>>>>>
> > >>>>> wrote:
> > >>>>>
> > >>>>> Sam,
> > >>>>>
> > >>>>> I don't believe so.  But we do have an IRC channel #myriad on
> > FreeNode.
> > >>>>>
> > >>>> I
> > >>>>
> > >>>>> know the mesosphere guys set up slackbots to interact with it.  I'm
> > >>>>> only
> > >>>>> there occasionally or by appointment. I did notice Kudu now uses
> > slack,
> > >>>>>
> > >>>> so
> > >>>>
> > >>>>> maybe slack makes more sense than IRC these days, or Gitter Chat.
> > >>>>>
> > >>>>> Darin
> > >>>>>
> > >>>>> On Wed, Jun 22, 2016 at 1:55 AM, Sam Chen <
> rc...@linkernetworks.com>
> > >>>>>>
> > >>>>> wrote:
> > >>>>
> > >>>>> Guys,
> > >>>>>> Do we have Slack for Myriad?
> > >>>>>>
> > >>>>>> Regards ,
> > >>>>>> Sam
> > >>>>>>
> > >>>>>> Sent from my iPhone
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>
> > >>
> > >
> >
>

Myriad is 0.2.0!

2016-06-28 Thread Darin Johnson

I've used the release script to publish the source tarball which is
available here:

https://www.apache.org/dist/incubator/myriad/myriad-0.2.0-incubating/

In addition, I've written a short release note and updated the downloads
page, in the following PR:

https://github.com/apache/incubator-myriad/pull/81

I'll leave that open until Friday at 5pm or until I get 3 +1's.  Once
that's done I'll update the svn for the website.

Darin

Re: Myriad Slack

2016-06-28 Thread Darin Johnson

We also have a dev sync every other Wednesday via Google Hangouts:
https://plus.google.com/hangouts/_/mesosphere.io/myriad

Darin

On Thu, Jun 23, 2016 at 3:01 AM, Swapnil Daingade <
swapnil.daing...@gmail.com> wrote:

> Hi Sam,
>
> Myriad is a fairly new project. The IPMC vote for Myriad 0.2 just passed
> this week. Given we are early in the incubation stage, it's not uncommon
> for one or two vendors to back the project.
>
> I'll let other community members talk about their experiences deploying
> Myriad, but it's really great that you are considering deploying Myriad in
> production. Your feedback will definitely help shape the road map for
> Myriad going forward.
>
> Regards
> Swapnil
>
>
>
> On 06/22/2016 11:24 PM, Sam Chen wrote:
>
>> Hi Swapnil,
>> MapR is one company to give Myriad support, right?  Any reference ?
>>
>> Regards,
>> Sam
>>
>>
>> Sent from my iPhone
>>
>> On Jun 23, 2016, at 10:39 AM, Swapnil Daingade <
>>> swapnil.daing...@gmail.com> wrote:
>>>
>>> MapR supports Myriad 0.1 currently
>>>
>>> https://www.mapr.com/products/whats-included
>>> https://www.mapr.com/products/product-overview/apache-myriad
>>>
>>> Regards
>>> Swapnil
>>>
>>>
>>> On Wed, Jun 22, 2016 at 6:51 PM, Sam Chen <rc...@linkernetworks.com>
>>>> wrote:
>>>>
>>>> Hi Darin,
>>>> Thanks for your reply. Makes sense to use Slack. Btw, we are going to use
>>>> Myriad in production. Does any company have the capability to support
>>>> this? And is there any reference in production?
>>>>
>>>> Regards,
>>>> Sam
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Jun 23, 2016, at 2:30 AM, Darin Johnson <dbjohnson1...@gmail.com>
>>>>>>
>>>>> wrote:
>>>>>
>>>>> Sam,
>>>>>
>>>>> I don't believe so.  But we do have an IRC channel #myriad on FreeNode.
>>>>>
>>>> I
>>>>
>>>>> know the mesosphere guys set up slackbots to interact with it.  I'm
>>>>> only
>>>>> there occasionally or by appointment. I did notice Kudu now uses slack,
>>>>>
>>>> so
>>>>
>>>>> maybe slack makes more sense than IRC these days, or Gitter Chat.
>>>>>
>>>>> Darin
>>>>>
>>>>> On Wed, Jun 22, 2016 at 1:55 AM, Sam Chen <rc...@linkernetworks.com>
>>>>>>
>>>>> wrote:
>>>>
>>>>> Guys,
>>>>>> Do we have Slack for Myriad?
>>>>>>
>>>>>> Regards ,
>>>>>> Sam
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>
>>>>
>>>>
>>
>>
>

Re: Myriad Slack

2016-06-22 Thread Darin Johnson

Sam,

I don't believe so.  But we do have an IRC channel #myriad on FreeNode.  I
know the mesosphere guys set up slackbots to interact with it.  I'm only
there occasionally or by appointment. I did notice Kudu now uses slack, so
maybe slack makes more sense than IRC these days, or Gitter Chat.

Darin

On Wed, Jun 22, 2016 at 1:55 AM, Sam Chen  wrote:

> Guys,
> Do we have Slack for Myriad?
>
> Regards ,
> Sam
>
> Sent from my iPhone
>
>

[RESULT] [VOTE] Release Apache Myriad 0.2.0 (incubating)

2016-06-20 Thread Darin Johnson

The vote passed with 3 +1 binding votes from IPMC members and no -1s.

+1 binding votes:
Justin Mclean
Drew Farris
John Ament

We will proceed with the post release activities:
  - Make the release artifacts available from [1] and [2]
  - github tag with "myriad-0.2.0-incubating"
  - Close the "myriad-0.2.0" release in JIRA.
  - Announce the release on Myriad's website with a blog post.

1. https://dist.apache.org/repos/dist/release/incubator/myriad/
2. http://myriad.incubator.apache.org/downloads/

Re: Myriad hangout tomorrow?

2016-06-14 Thread Darin Johnson

I'm planning on calling in.

Darin

On Tue, Jun 14, 2016 at 4:25 PM, Swapnil Daingade <
swapnil.daing...@gmail.com> wrote:

> Hi All,
>
> Was wondering if we have a Myriad hangout tomorrow.
>
> Regards
> Swapnil
>

Re: [Vote] Release apache-myriad-0.2.0-incubating (release candidate 4)

2016-06-09 Thread Darin Johnson

The vote for 0.2.0 RC4 has concluded and passed.  Thanks to everyone who
verified the release and voted!

Binding +1's
Darin Johnson
Santosh Marella
Mohit Soni

Non-Binding +1's
John Yost
Sarjeet Singh
Brandon Gulla

I'll submit the release to the IPMC to vote.

Darin
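
For anyone repeating the hash check that voters report in the thread below, here is a minimal sketch in Python.  It assumes you have downloaded the source tarball and copied the published hex digest by hand; the function names are illustrative only and are not part of the Myriad release tooling.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks so that a
    large tarball never has to be held in memory all at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, published, algorithm="sha512"):
    """Compare a file against a published hex digest.

    `published` is assumed to be the hex string alone (no file name);
    case and embedded whitespace are ignored.
    """
    normalized = "".join(published.split()).lower()
    return file_digest(path, algorithm) == normalized
```

This only covers the checksum step; verifying the GPG signature against the release key is a separate check.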

On Thu, Jun 9, 2016 at 8:49 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> Looks like we got our 3rd binding vote! Let's announce the result and ask
> Incubator PMC to begin their vote. Darin, let me/Santosh know if you need
> advice on this part of the process.
>
>
> On Thu, Jun 9, 2016 at 5:44 PM, Brandon Gulla <gulla.bran...@gmail.com>
> wrote:
>
> > +1
> >
> > built and tested on a test cluster. great work guys.
> >
> > On Thu, Jun 9, 2016 at 6:39 PM, mohit soni <mohitsoni1...@gmail.com>
> > wrote:
> >
> > > +1 (Binding)
> > >
> > > - Verified signature
> > > - Verified MD5 and SHA512 hashes
> > > - Builds from source tar ball.
> > > - Installed Myriad on a Mesos cluster and ran sanity tests.
> > >
> > > Thanks
> > > Mohit
> > >
> > > On Thu, Jun 2, 2016 at 7:25 AM, John Yost <hokiege...@gmail.com>
> wrote:
> > >
> > > > I'm voting +1
> > > >
> > > > --John
> > > >
> > > > On Tue, May 24, 2016 at 10:46 PM, Darin Johnson <
> > dbjohnson1...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > I'm voting +1 (Binding)
> > > > >
> > > > > > Verified md5/sha hashes.  Compiled with gradle build, gradle
> > > > > > buildRMDocker (on OSX with docker-machine).
> > > > > >
> > > > > > Ran remote distribution (with cgroups) on a 4 node cluster (Ubuntu,
> > > > > > hadoop-2.6.0, hadoop 2.7.0) with one CGS NM and 3 FGS NM.  Ran 8
> > > > > > simultaneous jobs.  Shut down Framework.  Restarted NodeManager, ran
> > > > > > an additional 3 jobs.
> > > > >
> > > > > Ran the same with docker (minus cgroups).
> > > > >
> > > > > Darin
> > > > >
> > > > > On Tue, May 24, 2016 at 10:40 PM, Darin Johnson <
> > > dbjohnson1...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > > I have created a source tarball for Apache Myriad 0.2.0-incubating,
> > > > > > > release candidate 4, based on the feedback received from release
> > > > > > > candidates 1, 2 & 3.  Thanks Sarjeet for a very thorough review!
> > > > > >
> > > > > > Here’s the release notes:
> > > > > > https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes
> > > > > >
> > > > > > > The commit to be voted upon is tagged with
> > > > > > > "myriad-0.2.0-incubating-rc4" and is available here:
> > > > > > >
> > > > > > > https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc4
> > > > > >
> > > > > > > The artifacts to be voted upon are located below. Please note that
> > > > > > > this is a source release:
> > > > > > >
> > > > > > > https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc4/
> > > > > >
> > > > > > Release artifacts are signed with the following key:
> > > > > > > https://home.apache.org/~darinj/gpg/2AAE9E3F.asc
> > > > > >
> > > > > > > **Please note that the release tarball does not include the gradlew
> > > > > > > script to build. You need to install gradle in order to build.**
> > > > > > >
> > > > > > > Please try out the release candidate and vote. The vote is open for a
> > > > > > > minimum of 3 business days (Friday May 27) or until the necessary number
> > > > > > > of votes (3 binding +1s) is reached.
> > > > > > >
> > > > > > > If/when this vote succeeds, I will call for a vote with IPMC seeking
> > > > > > > permission to release RC4 as Apache Myriad 0.2.0 (incubating).
> > > > > > >
> > > > > > > [ ] +1 Release this package as Apache Myriad 0.2.0-incubating
> > > > > > > [ ]  0 I don't feel strongly about it, but I'm okay with the release
> > > > > > > [ ] -1 Do not release this package because...
> > > > > >
> > > > > > Thanks,
> > > > > > Darin
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Brandon
> >
>

Re: problem getting fine grained scaling working

2016-06-08 Thread Darin Johnson

Will do today.  If you'd like to help with the documentation, I could give
you access.
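
For readers of the docs page, the condition behind the exception quoted below can be sketched roughly as follows.  This is an illustration in Python that mirrors only the wording of the error message; it is not YARN's actual code, and the function name is made up for the example.

```python
def validate_memory_request(requested_mb, max_mb):
    """Reject a request the way the quoted error message describes.

    Illustrative only: this mimics the condition in the message
    ("requested memory < 0, or requested memory > max configured"),
    not the real SchedulerUtils implementation.
    """
    if requested_mb < 0 or requested_mb > max_mb:
        raise ValueError(
            "Invalid resource request, requested memory < 0, or requested "
            "memory > max configured, requestedMemory=%d, maxMemory=%d"
            % (requested_mb, max_mb)
        )
    return requested_mb


if __name__ == "__main__":
    # With only zero-profile NodeManagers the maximum is 0, so the
    # 1536 MB ApplicationMaster request from the trace is rejected.
    try:
        validate_memory_request(1536, 0)
    except ValueError as err:
        print(err)
```

With at least one NodeManager of non-zero capacity, the maximum is large enough for the ApplicationMaster request and the same call passes.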

On Wed, Jun 8, 2016 at 3:14 AM, Stephen Gran <stephen.g...@piksel.com>
wrote:

> Hi,
>
> Can someone with access please correct the screenshot here:
> https://cwiki.apache.org/confluence/display/MYRIAD/Fine-grained+Scaling
>
> This gives the strong impression that you don't need an NM with non-zero
> resources.  I think this is what initially steered me down the wrong path.
>
> Cheers,
>
> On 03/06/16 16:38, Darin Johnson wrote:
> > That is correct: you need at least one node manager with the minimum
> > requirements to launch an ApplicationMaster.  Otherwise YARN will throw
> > an exception.
> >
> > On Fri, Jun 3, 2016 at 10:52 AM, yuliya Feldman
> <yufeld...@yahoo.com.invalid
> >> wrote:
> >
> >> I believe you need at least one NM that is not subject to fine-grained
> >> scaling.
> >> As long as the total resources on the cluster are less than what a single
> >> AM container needs, you won't be able to submit any app, as the exception
> >> below tells you:
> >> (Invalid resource request, requested memory < 0, or requested memory >
> >> max configured, requestedMemory=1536, maxMemory=0)
> >> I believe that when starting a Myriad cluster, one NM with non-zero
> >> capacity should start by default.
> >> In addition, check in the RM log whether offers with resources are coming
> >> to the RM - this info should be in the log.
> >>From: Stephen Gran <stephen.g...@piksel.com>
> >>   To: "dev@myriad.incubator.apache.org" <
> dev@myriad.incubator.apache.org>
> >>   Sent: Friday, June 3, 2016 1:29 AM
> >>   Subject: problem getting fine grained scaling working
> >>
> >> Hi,
> >>
> >> I'm trying to get fine grained scaling going on a test mesos cluster.  I
> >> have a single master and 2 agents.  I am running 2 node managers with
> >> the zero profile, one per agent.  I can see both of them in the RM UI
> >> reporting correctly as having 0 resources.
> >>
> >> I'm getting stack traces when I try to launch a sample application,
> >> though.  I feel like I'm just missing something obvious somewhere - can
> >> anyone shed any light?
> >>
> >> This is on a build of yesterday's git head.
> >>
> >> Cheers,
> >>
> >> root@master:/srv/apps/hadoop# bin/yarn jar
> >> share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1
> >> /outDir
> >> 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
> >> master.testing.local/10.0.5.3:8032
> >> 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 1 using 2
> >> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
> >> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
> >> job: job_1464902078156_0001
> >> 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
> >> area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
> >> java.io.IOException:
> >> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
> >> Invalid resource request, requested memory < 0, or requested memory >
> >> max configured, requestedMemory=1536, maxMemory=0
> >>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
> >>  at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
> >>  at
> >>
> >>
> org.apa

Re: problem getting fine grained scaling working

2016-06-06 Thread Darin Johnson

No worries, keep me posted.  I think we did a good proof of concept, and
we're trying to make it solid now, so if you find any issues let us know.

Darin
On Jun 5, 2016 2:57 PM, "Stephen Gran" <stephen.g...@piksel.com> wrote:

> Hi,
>
> Brilliant!  Working now.
>
> Thank you very much,
>
> On 05/06/16 18:09, Darin Johnson wrote:
> > Stephen,
> >
> > I was able to recreate the problem (specific to 2.7.2; they changed the
> > defaults of the following two properties to true).  Setting them to false
> > allowed me to run MapReduce jobs again.  I'll try to update the
> > documentation later today.
> >
> >
> >
> >  <property>
> >    <name>yarn.nodemanager.pmem-check-enabled</name>
> >    <value>false</value>
> >  </property>
> >
> >  <property>
> >    <name>yarn.nodemanager.vmem-check-enabled</name>
> >    <value>false</value>
> >  </property>
> >
> >
> >
> > Darin
> >
> > On Sun, Jun 5, 2016 at 10:30 AM, Stephen Gran <stephen.g...@piksel.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I think those are the properties I added when I started getting this
> >> error.  Removing them doesn't seem to make any difference, sadly.
> >>
> >> This is hadoop 2.7.2
> >>
> >> Cheers,
> >>
> >> On 05/06/16 14:45, Darin Johnson wrote:
> >>> Hey Stephen,
> >>>
> >>> I think you're pretty close.
> >>>
> >>> Looking at the config I'd suggest removing these properties:
> >>>
> >>>  
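> >>>  <property>
> >>>    <name>yarn.nodemanager.resource.memory-mb</name>
> >>>    <value>4096</value>
> >>>  </property>
> >>>  <property>
> >>>    <name>yarn.scheduler.maximum-allocation-vcores</name>
> >>>    <value>12</value>
> >>>  </property>
> >>>  <property>
> >>>    <name>yarn.scheduler.maximum-allocation-mb</name>
> >>>    <value>8192</value>
> >>>  </property>
> >>>  <property>
> >>>    <name>yarn.nodemanager.vmem-check-enabled</name>
> >>>    <value>false</value>
> >>>    <description>Whether virtual memory limits will be enforced for containers</description>
> >>>  </property>
> >>>  <property>
> >>>    <name>yarn.nodemanager.vmem-pmem-ratio</name>
> >>>    <value>4</value>
> >>>    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
> >>>  </property>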
> >>> 
> >>>
> >>> I'll try them out on my test cluster later today/tonight and see if I
> can
> >>> recreate the problem.  What version of hadoop are you running?  I'll
> make
> >>> sure I'm consistent with that as well.
> >>>
> >>> Thanks,
> >>>
> >>> Darin
> >>> On Jun 5, 2016 8:15 AM, "Stephen Gran" <stephen.g...@piksel.com>
> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Attached.  Thanks very much for looking.
> >>>>
> >>>> Cheers,
> >>>>
> >>>> On 05/06/16 12:51, Darin Johnson wrote:
> >>>>> Hey Stephen, can you please send your yarn-site.xml?  I'm guessing
> >>>>> you're on the right track.
> >>>>>
> >>>>> Darin
> >>>>> Hi,
> >>>>>
> >>>>> OK.  That helps, thank you.  I think I just misunderstood the docs
> (or
> >>>>> they never said explicitly that you did need at least some static
> >>>>> resource), and I scaled down the initial nm.medium that got
> started.  I
> >>>>> get a bit further now, and jobs start but are killed with:
> >>>>>
> >>>>> Diagnostics: Container
> >>>>> [pid=3865,containerID=container_1465112239753_0001_03_01] is
> >> running
> >>>>> beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> >>>>> memory used; 2.6 GB of 0B virtual memory used. Killing container
> >>>>>
> >>>>> When I've seen this in the past with yarn but without myriad, it was
> >>>>> usually about ratios of vmem to mem and things like that - I've tried
> >>>>> some of those knobs, but I didn't expect much result and didn't get
> >> any.
> >>>>>
> >>>>> What strikes me about the error message is that the vmem and mem
> >>>>> allocations are for 0.
> >>>>>
> >>>>> I'm sorry for asking what are probably naive questions here, I
> c

Re: problem getting fine grained scaling working

2016-06-05 Thread Darin Johnson

Stephen,

I was able to recreate the problem (it is specific to 2.7.2; they changed the
defaults on the following two properties to true).  Setting them to false
allowed me to run map reduce jobs again.  I'll try to update the
documentation later today.

  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>

  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>

Darin
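
For anyone wanting to double-check their own configuration, here is a minimal
sketch (not part of Myriad or Hadoop) that reads a yarn-site.xml and reports
whether the two check properties above are explicitly set to false.  The
function name check_disabled and the inline sample config are made up for
illustration, and it assumes the usual Hadoop layout of configuration,
property, name and value elements.

```python
import xml.etree.ElementTree as ET

# The two NodeManager checks discussed in this thread.
CHECKS = (
    "yarn.nodemanager.pmem-check-enabled",
    "yarn.nodemanager.vmem-check-enabled",
)


def check_disabled(yarn_site_xml):
    """Return {property name: True if explicitly set to false} for CHECKS."""
    root = ET.fromstring(yarn_site_xml)
    values = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip().lower()
        values[name] = value
    # A missing property falls back to whatever the Hadoop default is,
    # so it is reported as "not disabled" here.
    return {name: values.get(name) == "false" for name in CHECKS}


# Hypothetical sample config, for illustration only.
SAMPLE = """
<configuration>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>true</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    for name, disabled in check_disabled(SAMPLE).items():
        print(name, "disabled" if disabled else "NOT disabled")
```

To run it against a real file, read the file contents into a string and pass
that to check_disabled instead of SAMPLE.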

On Sun, Jun 5, 2016 at 10:30 AM, Stephen Gran <stephen.g...@piksel.com>
wrote:

> Hi,
>
> I think those are the properties I added when I started getting this
> error.  Removing them doesn't seem to make any difference, sadly.
>
> This is hadoop 2.7.2
>
> Cheers,
>
> On 05/06/16 14:45, Darin Johnson wrote:
> > Hey Stephen,
> >
> > I think you're pretty close.
> >
> > Looking at the config I'd suggest removing these properties:
> >
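> >  <property>
> >    <name>yarn.nodemanager.resource.memory-mb</name>
> >    <value>4096</value>
> >  </property>
> >  <property>
> >    <name>yarn.scheduler.maximum-allocation-vcores</name>
> >    <value>12</value>
> >  </property>
> >  <property>
> >    <name>yarn.scheduler.maximum-allocation-mb</name>
> >    <value>8192</value>
> >  </property>
> >  <property>
> >    <name>yarn.nodemanager.vmem-check-enabled</name>
> >    <value>false</value>
> >    <description>Whether virtual memory limits will be enforced for
> >    containers</description>
> >  </property>
> >  <property>
> >    <name>yarn.nodemanager.vmem-pmem-ratio</name>
> >    <value>4</value>
> >    <description>Ratio between virtual memory to physical memory when
> >    setting memory limits for containers</description>
> >  </property>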
> >
> > I'll try them out on my test cluster later today/tonight and see if I can
> > recreate the problem.  What version of hadoop are you running?  I'll make
> > sure I'm consistent with that as well.
> >
> > Thanks,
> >
> > Darin
> > On Jun 5, 2016 8:15 AM, "Stephen Gran" <stephen.g...@piksel.com> wrote:
> >
> >> Hi,
> >>
> >> Attached.  Thanks very much for looking.
> >>
> >> Cheers,
> >>
> >> On 05/06/16 12:51, Darin Johnson wrote:
> >>> Hey Stephen, can you please send your yarn-site.xml?  I'm guessing
> >>> you're on the right track.
> >>>
> >>> Darin
> >>> Hi,
> >>>
> >>> OK.  That helps, thank you.  I think I just misunderstood the docs (or
> >>> they never said explicitly that you did need at least some static
> >>> resource), and I scaled down the initial nm.medium that got started.  I
> >>> get a bit further now, and jobs start but are killed with:
> >>>
> >>> Diagnostics: Container
> >>> [pid=3865,containerID=container_1465112239753_0001_03_01] is
> running
> >>> beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> >>> memory used; 2.6 GB of 0B virtual memory used. Killing container
> >>>
> >>> When I've seen this in the past with yarn but without myriad, it was
> >>> usually about ratios of vmem to mem and things like that - I've tried
> >>> some of those knobs, but I didn't expect much result and didn't get
> any.
> >>>
> >>> What strikes me about the error message is that the vmem and mem
> >>> allocations are for 0.
> >>>
> >>> I'm sorry for asking what are probably naive questions here, I couldn't
> >>> find a different forum.  If there is one, please point me there so I
> >>> don't disrupt the dev flow here.
> >>>
> >>> I can see this in the logs:
> >>>
> >>>
> >>> 2016-06-05 07:39:25,687 INFO
> >>>
> >>
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> >>> container_1465112239753_0001_03_01 Container Transitioned from NEW
> >>> to ALLOCATED
> >>> 2016-06-05 07:39:25,688 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> >>>   OPERATION=AM Allocated ContainerTARGET=SchedulerApp
> >>> RESULT=SUCCESS  APPID=application_1465112239753_0001
> >>> CONTAINERID=container_1465112239753_0001_03_01
> >>> 2016-06-05 07:39:25,688 INFO
> >>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> >>> Assigned container container_1465112239753_0001_03_01 of capacity
> >>> <memory:0, vCores:0> on host slave2.testing.local:26688, which has 1
> >>> containers, <memory:0, vCores:0> used and <memory:4096, vCores:1>
> >>> available after allocation
> >>> 20

Re: problem getting fine grained scaling working

2016-06-05 Thread Darin Johnson

Hey Stephen,

I think you're pretty close.

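Looking at the config I'd suggest removing these properties:

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>12</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for
    containers</description>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when
    setting memory limits for containers</description>
  </property>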

I'll try them out on my test cluster later today/tonight and see if I can
recreate the problem.  What version of hadoop are you running?  I'll make
sure I'm consistent with that as well.

Thanks,

Darin
On Jun 5, 2016 8:15 AM, "Stephen Gran" <stephen.g...@piksel.com> wrote:

> Hi,
>
> Attached.  Thanks very much for looking.
>
> Cheers,
>
> On 05/06/16 12:51, Darin Johnson wrote:
> > Hey Stephen, can you please send your yarn-site.xml?  I'm guessing you're
> > on the right track.
> >
> > Darin
> > Hi,
> >
> > OK.  That helps, thank you.  I think I just misunderstood the docs (or
> > they never said explicitly that you did need at least some static
> > resource), and I scaled down the initial nm.medium that got started.  I
> > get a bit further now, and jobs start but are killed with:
> >
> > Diagnostics: Container
> > [pid=3865,containerID=container_1465112239753_0001_03_01] is running
> > beyond virtual memory limits. Current usage: 50.7 MB of 0B physical
> > memory used; 2.6 GB of 0B virtual memory used. Killing container
> >
> > When I've seen this in the past with yarn but without myriad, it was
> > usually about ratios of vmem to mem and things like that - I've tried
> > some of those knobs, but I didn't expect much result and didn't get any.
> >
> > What strikes me about the error message is that the vmem and mem
> > allocations are for 0.
> >
> > I'm sorry for asking what are probably naive questions here, I couldn't
> > find a different forum.  If there is one, please point me there so I
> > don't disrupt the dev flow here.
> >
> > I can see this in the logs:
> >
> >
> > 2016-06-05 07:39:25,687 INFO
> >
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> > container_1465112239753_0001_03_01 Container Transitioned from NEW
> > to ALLOCATED
> > 2016-06-05 07:39:25,688 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> >  OPERATION=AM Allocated ContainerTARGET=SchedulerApp
> > RESULT=SUCCESS  APPID=application_1465112239753_0001
> > CONTAINERID=container_1465112239753_0001_03_01
> > 2016-06-05 07:39:25,688 INFO
> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode:
> > Assigned container container_1465112239753_0001_03_01 of capacity
> > <memory:0, vCores:0> on host slave2.testing.local:26688, which has 1
> > containers, <memory:0, vCores:0> used and <memory:4096, vCores:1>
> > available after allocation
> > 2016-06-05 07:39:25,689 INFO
> >
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> > Sending NMToken for nodeId : slave2.testing.local:26688 for container :
> > container_1465112239753_0001_03_01
> > 2016-06-05 07:39:25,696 INFO
> >
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl:
> > container_1465112239753_0001_03_01 Container Transitioned from
> > ALLOCATED to ACQUIRED
> > 2016-06-05 07:39:25,696 INFO
> >
> org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM:
> > Clear node set for appattempt_1465112239753_0001_03
> > 2016-06-05 07:39:25,696 INFO
> >
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> > Storing attempt: AppId: application_1465112239753_0001 AttemptId:
> > appattempt_1465112239753_0001_03 MasterContainer: Container:
> > [ContainerId: container_1465112239753_0001_03_01, NodeId:
> > slave2.testing.local:26688, NodeHttpAddress: slave2.testing.local:24387,
> > Resource: <memory:0, vCores:0>, Priority: 0, Token: Token { kind:
> > ContainerToken, service: 10.0.5.5:26688 }, ]
> > 2016-06-05 07:39:25,697 INFO
> >
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> > appattempt_1465112239753_0001_03 State change from SCHEDULED to
> > ALLOCATED_SAVING
> > 2016-06-05 07:39:25,698 INFO
> >
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttem

Re: problem getting fine grained scaling working

2016-06-03 Thread Darin Johnson

That is correct: you need at least one node manager with the minimum
resources required to launch an ApplicationMaster.  Otherwise YARN will throw
an exception.
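
To make the failure mode concrete, here is a rough sketch of the check behind
the exception in this thread.  It is a simplified model written for
illustration, not the actual Hadoop SchedulerUtils code; the names
InvalidResourceRequest, validate_request and can_submit are made up.  The
1536 MB request and maxMemory=0 come from the stack trace; the 4096 MB
maximum is an assumed value for comparison.

```python
class InvalidResourceRequest(Exception):
    """Stand-in for YARN's InvalidResourceRequestException (illustrative)."""


def validate_request(requested_mb, max_mb):
    # Mirrors the condition quoted in the exception message:
    # requested memory < 0, or requested memory > max configured.
    if requested_mb < 0 or requested_mb > max_mb:
        raise InvalidResourceRequest(
            "Invalid resource request, requested memory < 0, or requested "
            "memory > max configured, requestedMemory=%d, maxMemory=%d"
            % (requested_mb, max_mb)
        )
    return requested_mb


def can_submit(am_request_mb, max_mb):
    """Return True if an AM request of this size would pass validation."""
    try:
        validate_request(am_request_mb, max_mb)
        return True
    except InvalidResourceRequest:
        return False


if __name__ == "__main__":
    # The case from the stack trace: maxMemory=0, so the AM request is rejected.
    print(can_submit(1536, 0))
    # An assumed non-zero maximum (4096 MB), for comparison.
    print(can_submit(1536, 4096))
```

The point of the sketch is only that a maximum of 0 rejects every positive
request, which is why the job never gets as far as scheduling.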

On Fri, Jun 3, 2016 at 10:52 AM, yuliya Feldman  wrote:

> I believe you need at least one NM that is not subject to fine grained
> scaling.
> So far, if the total resources on the cluster are less than a single
> container needs for the AM, you won't be able to submit any app, as the
> exception below tells you:
> (Invalid resource request, requested memory < 0, or requested memory > max
> configured, requestedMemory=1536, maxMemory=0)
> I believe that when starting a Myriad cluster, one NM with non-zero
> capacity should start by default.
> In addition, check the RM log to see whether offers with resources are
> coming to the RM - this info should be in the log.
>
>   From: Stephen Gran 
>  To: "dev@myriad.incubator.apache.org" 
>  Sent: Friday, June 3, 2016 1:29 AM
>  Subject: problem getting fine grained scaling working
>
> Hi,
>
> I'm trying to get fine grained scaling going on a test mesos cluster.  I
> have a single master and 2 agents.  I am running 2 node managers with
> the zero profile, one per agent.  I can see both of them in the RM UI
> reporting correctly as having 0 resources.
>
> I'm getting stack traces when I try to launch a sample application,
> though.  I feel like I'm just missing something obvious somewhere - can
> anyone shed any light?
>
> This is on a build of yesterday's git head.
>
> Cheers,
>
> root@master:/srv/apps/hadoop# bin/yarn jar
> share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1
> /outDir
> 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
> master.testing.local/10.0.5.3:8032
> 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 1 using 2
> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
> 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
> job: job_1464902078156_0001
> 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
> area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
> java.io.IOException:
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
> Invalid resource request, requested memory < 0, or requested memory >
> max configured, requestedMemory=1536, maxMemory=0
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
> at
>
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
> at
>
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
> at
>
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
> at
>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
> at
> org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
> at
>
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
>

Re: problem getting fine grained scaling working

2016-06-03 Thread Darin Johnson

That is normal behavior.  Myriad holds the resources for a few seconds so it
can flex up a node manager in case a job comes in, and then releases them.
The info statement is arguably chatty and will probably go to debug in a few
more releases.


On Fri, Jun 3, 2016 at 9:18 AM, Stephen Gran 
wrote:

> Hi,
>
> Not sure if this is relevant, but I see this in the RM logs:
>
> 2016-06-03 13:06:55,466 INFO
> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Setting
> capacity for node slave1.testing.local to 
> 2016-06-03 13:06:55,467 INFO
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler:
> Update resource on node: slave1.testing.local from:  vCores:0>, to: 
> 2016-06-03 13:06:55,467 INFO
> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Setting
> capacity for node slave1.testing.local to 
> 2016-06-03 13:06:55,470 INFO
>
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler:
> Update resource on node: slave1.testing.local from:  vCores:6>, to: 
>
>
> This is happening for each nodemanager, repeating every 5 or 6 seconds.
>   I'm assuming this will be the NM sending the actual capacity report to
> the RM, for use in updating YARN's view of available resource.  I don't
> know if it should be going back and forth like it is, though?
>
> Cheers,
>
> On 03/06/16 09:29, Stephen Gran wrote:
> > Hi,
> >
> > I'm trying to get fine grained scaling going on a test mesos cluster.  I
> > have a single master and 2 agents.  I am running 2 node managers with
> > the zero profile, one per agent.  I can see both of them in the RM UI
> > reporting correctly as having 0 resources.
> >
> > I'm getting stack traces when I try to launch a sample application,
> > though.  I feel like I'm just missing something obvious somewhere - can
> > anyone shed any light?
> >
> > This is on a build of yesterday's git head.
> >
> > Cheers,
> >
> > root@master:/srv/apps/hadoop# bin/yarn jar
> > share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar teragen 1
> > /outDir
> > 16/06/03 08:23:33 INFO client.RMProxy: Connecting to ResourceManager at
> > master.testing.local/10.0.5.3:8032
> > 16/06/03 08:23:34 INFO terasort.TeraSort: Generating 1 using 2
> > 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: number of splits:2
> > 16/06/03 08:23:34 INFO mapreduce.JobSubmitter: Submitting tokens for
> > job: job_1464902078156_0001
> > 16/06/03 08:23:35 INFO mapreduce.JobSubmitter: Cleaning up the staging
> > area /tmp/hadoop-yarn/staging/root/.staging/job_1464902078156_0001
> > java.io.IOException:
> > org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException:
> > Invalid resource request, requested memory < 0, or requested memory >
> > max configured, requestedMemory=1536, maxMemory=0
> >  at
> >
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:268)
> >  at
> >
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:228)
> >  at
> >
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:236)
> >  at
> >
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
> >  at
> >
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:329)
> >  at
> >
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
> >  at
> >
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
> >  at
> >
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
> >  at
> >
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
> >  at
> >
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> >  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> >  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> >  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> >  at java.security.AccessController.doPrivileged(Native Method)
> >  at javax.security.auth.Subject.doAs(Subject.java:422)
> >  at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> >  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> >
> >  at
> org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
> >  at
> >
>

Re: [Vote] Release apache-myriad-0.2.0-incubating (release candidate 4)

2016-06-02 Thread Darin Johnson

Hey all, I need one more committer vote for RC4.  John and I have a lot of
other improvements we want to start working on, but we are waiting to cut a
stable release first.

Darin

On Mon, May 30, 2016 at 4:33 PM, sarjeet singh <sarje...@usc.edu> wrote:

> +1 (Non-binding)
>
> Verified md5 and sha512 checksums.
> D/L myriad-0.2.0-incubating-rc4.tar.gz, Compiled & deployed it on a 1 node
> MapR cluster.
> Tried FGS/CGS flex up/down, and ran long/short running M/R jobs.
> Tried framework shutdown from UI/API, and tried re-launching myriad again.
> Tried Cgroups and able to launch NMs w/ cgroups enabled successfully.
>
> - Sarjeet Singh
>
> On Fri, May 27, 2016 at 3:36 PM, Santosh Marella <smare...@maprtech.com>
> wrote:
>
> > +1 (Binding).
> >
> > - Verified signature
> > - Verified MD5 and SHA512 hashes
> > - Builds from source tar ball.
> > - Ran Apache RAT. Verified that all the sources have license headers.
> > - Verified CGS/FGS behaviors with MapReduce jobs on a 4 node Mesos/Yarn
> > cluster.
> >
> > Thanks,
> > Santosh
> >
> > On Tue, May 24, 2016 at 7:46 PM, Darin Johnson <dbjohnson1...@gmail.com>
> > wrote:
> >
> > > I'm voting +1 (Binding)
> > >
> > > Verified md5/sha hashes.  Compiled with gradle build, gradle
> > buildRMDocker
> > > (on OSX with docker-machine).
> > >
> > > Ran remote distribution (with cgroups) on a 4 node cluster (Ubuntu,
> > > hadoop-2.6.0, hadoop 2.7.0) with one CGS NM and 3 FGS NM.  Ran 8
> > > simultaneous jobs.  Shut down Framework.  Restarted NodeManager, ran an
> > > additional 3 jobs.
> > >
> > > Ran the same with docker (minus cgroups).
> > >
> > > Darin
> > >
> > > On Tue, May 24, 2016 at 10:40 PM, Darin Johnson <
> dbjohnson1...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > I have created a source tar ball for Apache Myriad 0.2.0-incubating,
> > > > release candidate 4, based on the feedback received from release
> > > > candidates 1, 2 & 3.  Thanks Sarjeet for a very thorough review!
> > > >
> > > > Here’s the release notes:
> > > > https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes
> > > >
> > > > The commit to be voted upon is tagged with "myriad-0.2.0-incubating-rc4"
> > > > and is available here:
> > > > https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc4
> > > >
> > > > The artifacts to be voted upon are located below. Please note that this
> > > > is a source release:
> > > > https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc4/
> > > >
> > > > Release artifacts are signed with the following key:
> > > > https://home.apache.org/~darinj/gpg/2AAE9E3F.asc
> > > >
> > > > **Please note that the release tar ball does not include the gradlew
> > > > script to build. You need to install gradle in order to build.**
> > > >
> > > > Please try out the release candidate and vote. The vote is open for a
> > > > minimum of 3 business days (Friday May 27) or until the necessary
> > > > number of votes (3 binding +1s) is reached.
> > > >
> > > > If/when this vote succeeds, I will call for a vote with IPMC seeking
> > > > permission to release RC4 as Apache Myriad 0.2.0 (incubating).
> > > >
> > > > [ ] +1 Release this package as Apache Myriad 0.2.0-incubating
> > > > [ ]  0 I don't feel strongly about it, but I'm okay with the release
> > > > [ ] -1 Do not release this package because...
> > > >
> > > > Thanks,
> > > > Darin
> > > >
> > >
> >
>

Re: Podling Report Reminder - June 2016

2016-06-01 Thread Darin Johnson

Thanks Adam, I also was unable to edit the wiki (tried to add Santosh's
report, before I saw Adam did).

On Wed, Jun 1, 2016 at 8:07 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> Updated the wiki. Looks great. Thanks Santosh!
> https://wiki.apache.org/incubator/June2016
>
> On Wed, Jun 1, 2016 at 2:50 PM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > Santosh looks good thanks!
> > On Jun 1, 2016 5:35 PM, "Santosh Marella" <smare...@maprtech.com> wrote:
> >
> > > Hi Adam,
> > >
> > >   I have put together the following report. Can you please review it
> and
> > > add it to incubator wiki (I don't have permissions)?
> > >
> > > Thanks,
> > > Santosh.
> > >
> > >
> > >
> >
> 
> > > Myriad has been incubating since 2015-03-01.
> > >
> > > Three most important issues to address in the move towards graduation:
> > >
> > >   1. Develop project roadmap for longer term community/user engagement.
> > >   2. Release frequently - 0.2.0 is underway, but has taken ~6 months
> > since
> > > last release.
> > >   3. Expand community - users/contributors/committers.
> > >
> > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> > > aware of?
> > >
> > >   None.
> > >
> > > How has the community developed since the last report?
> > >
> > >   - dev@ mailing list experienced a low in March, but picked up
> traffic
> > > leading up to 0.2.0 release. 141 messages since the last report.
> > >   - 5 new members on the dev@ mailing list. 2 new contributors.
> > >   - Myriad was presented at ApacheCon Vancouver and at couple of other
> > > meetups. Talks submitted at various conferences.
> > >   - Bi-weekly dev syncs happening steadily. Approx. 4-7 members
> > > participate. Minutes at http://s.apache.org/8kF
> > >
> > > How has the project developed since the last report?
> > >
> > >   - Myriad 0.2.0 out for PPMC voting. DarinJ is driving the release.
> > >   - 12 commits since 4/1.
> > >   - 13 JIRAs fixed/resolved.
> > >
> > > Date of last release:
> > >
> > >   2015-12-09 myriad-0.1.0-incubating released
> > >
> > > When were the last committers or PMC members elected?
> > >
> > >   2015-10-05 Darin J
> > >   2015-10-14 Swapnil Daingade
> > >
> > >
> > > Signed-off-by:
> > >
> > >   [ ](myriad) Benjamin Hindman
> > >   [ ](myriad) Danese Cooper
> > >   [ ](myriad) Ted Dunning
> > >   [ ](myriad) Luciano Resende
> > >
> > >
> > >
> >
> --
> > >
> > > On Fri, May 27, 2016 at 11:55 PM, Adam Bordelon <a...@mesosphere.io>
> > > wrote:
> > >
> > > > I'll be pretty busy too, but if I'm not too delirious after my
> MesosCon
> > > > presentation on June 4th, I should be able to spend 30min putting
> > > something
> > > > together before EoD. If anybody else wants to draft a response, I'd
> be
> > > > happy to review it and add it to the Incubator wiki (if you don't
> have
> > > > permissions yourself).
> > > > The (real) link for our June report is:
> > > > http://wiki.apache.org/incubator/June2016
> > > > For previous Myriad reports, see:
> > > > http://wiki.apache.org/incubator/March2016
> > > > http://wiki.apache.org/incubator/December2015
> > > >
> > > >
> > > > On Fri, May 27, 2016 at 4:32 PM, Darin Johnson <
> > dbjohnson1...@gmail.com>
> > > > wrote:
> > > >
> > > > > I just saw this, I'm unfortunately going to be super busy until
> June
> > 4
> > > > and
> > > > > don't have the experience.  If someone else can handle this it'd be
> > > > great,
> > > > > if I get a copy I'll take a stab at the next one.
> > > > > On May 26, 2016 8:40 PM, <johndam...@apache.org> wrote:
> > > > >
> > > > > Dear podling,
> > > > >
> > > > > This email was sent by an automated system on behalf of the Apache
> > > > > Incubator PMC. It is an initial reminder to give you plenty of time
> > to
> > > >

[Vote] Release apache-myriad-0.2.0-incubating (release candidate 4)

2016-05-24 Thread Darin Johnson

Hi All,

I have created a source tar ball for Apache Myriad 0.2.0-incubating,
release candidate 4, based on the feedback received from release candidates
1, 2 & 3.  Thanks Sarjeet for a very thorough review!

Here’s the release notes:
https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes

The commit to be voted upon is tagged with "myriad-0.2.0-incubating-rc4"
and is available here:
https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc4

The artifacts to be voted upon are located below. Please note that this is
a source release:
https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc4/

Release artifacts are signed with the following key:
https://home.apache.org/~darinj/gpg/2AAE9E3F.asc

**Please note that the release tar ball does not include the gradlew script
to build. You need to install gradle in order to build.**
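
As a starting point for trying out the candidate, here is a small sketch for
checking a downloaded artifact against its published SHA-512 digest.  The
helper names sha512_of and matches_digest are hypothetical, and the exact
layout of the digest file in the release directory is an assumption, so the
expected hex string is passed in directly.  Checking the GPG signature is a
separate step that is not shown here.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's digest with the published one.

    Whitespace and letter case in the published value are ignored, since
    digest files are often wrapped or upper-cased.
    """
    cleaned = "".join(expected_hex.split()).lower()
    return sha512_of(path) == cleaned
```

For example, matches_digest("myriad-0.2.0-incubating-rc4.tar.gz", published)
should return True when published holds the hex digest copied from the
release directory.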

Please try out the release candidate and vote. The vote is open for a
minimum of 3 business days (Friday May 27) or until the necessary number of
votes (3 binding +1s) is reached.

If/when this vote succeeds, I will call for a vote with IPMC seeking
permission to release RC4 as Apache Myriad 0.2.0 (incubating).

[ ] +1 Release this package as Apache Myriad 0.2.0-incubating
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Thanks,
Darin

Re: [Vote] Release apache-myriad-0.2.0-incubating (release candidate 3)

2016-05-24 Thread Darin Johnson

That was my fault: I pushed the PR to master but not to 0.2.x before I ran
the release script (off 0.2.x).  New release coming momentarily.

On Tue, May 24, 2016 at 9:21 PM, Sarjeet Singh <sarjeetsi...@maprtech.com>
wrote:

> >> Specifically, this corrected some documentation and a minor typo
>
> Darin, RC3 is missing the PR#75 changes. I D/L'ed the tar and manually
> checked for the changes, and they weren't there.
>
> -Sarjeet
>
> On Mon, May 23, 2016 at 9:15 PM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > Hi All,
> >
> > I have created a source tar ball for Apache Myriad 0.2.0-incubating,
> > release candidate 3, based on the feedback received from release
> > candidates 1 & 2.  Specifically, this corrected some documentation and a
> > minor typo.
> >
> > Here’s the release notes:
> > https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes
> >
> > The commit to be voted upon is tagged with "myriad-0.2.0-incubating-rc3"
> > and is available here:
> > https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc3
> >
> > The artifacts to be voted upon are located below. Please note that this
> > is a source release:
> > https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc3/
> >
> > Release artifacts are signed with the following key:
> > https://home.apache.org/~darinj/gpg/2AAE9E3F.asc
> >
> > **Please note that the release tar ball does not include the gradlew
> > script to build. You need to install gradle in order to build.**
> >
> > Please try out the release candidate and vote. The vote is open for a
> > minimum of 3 business days (Friday May 27) or until the necessary number
> > of votes (3 binding +1s) is reached.
> >
> > If/when this vote succeeds, I will call for a vote with IPMC seeking
> > permission to release RC3 as Apache Myriad 0.2.0 (incubating).
> >
> > [ ] +1 Release this package as Apache Myriad 0.2.0-incubating
> > [ ]  0 I don't feel strongly about it, but I'm okay with the release
> > [ ] -1 Do not release this package because...
> >
> > Thanks,
> > Darin
> >
>

[Vote] Release apache-myriad-0.2.0-incubating (release candidate 3)

2016-05-23 Thread Darin Johnson

Hi All,

I have created a source tar ball for Apache Myriad 0.2.0-incubating,
release candidate 3, based on the feedback received from release candidates
1 & 2.  Specifically, this corrected some documentation and a minor typo.

Here’s the release notes:
https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes

The commit to be voted upon is tagged with "myriad-0.2.0-incubating-rc3"
and is available here:
https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc3

The artifacts to be voted upon are located below. Please note that this is
a source release:
https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc3/

Release artifacts are signed with the following key:
https://home.apache.org/~darinj/gpg/2AAE9E3F.asc

**Please note that the release tar ball does not include the gradlew script
to build. You need to install gradle in order to build.**

Please try out the release candidate and vote. The vote is open for a
minimum of 3 business days (Friday May 27) or until the necessary number of
votes (3 binding +1s) is reached.

If/when this vote succeeds, I will call for a vote with IPMC seeking
permission to release RC3 as Apache Myriad 0.2.0 (incubating).

[ ] +1 Release this package as Apache Myriad 0.2.0-incubating
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Thanks,
Darin

Re: gradle Issue when building RM docker on MacOSX

2016-05-22 Thread Darin Johnson

I've seen that error when using a terminal whose environment wasn't set up
by docker-machine.  I think you can also solve it by running
eval "$(docker-machine env)" in that terminal.
On May 22, 2016 8:37 PM, "sarjeet singh"  wrote:

I observed the following issue when I tried to build the RM docker image on
a Mac (locally):

ssingh-mbpro:docker ssingh$ ./gradlew -P dockerTag=sarjeet/myriad
buildRMDocker

   [***output formatted***]

Building image using context
'/Users/ssingh/Myriad/myriad-0.2.0/myriad-0.2.0-incubating-rc2/docker'.

Using tag 'sarjeet/myriad' for image.

java.lang.UnsatisfiedLinkError: Could not find library in classpath, tried:
[libjunixsocket-macosx-1.8-x86_64.dylib,
libjunixsocket-macosx-1.5-x86_64.dylib]

at org.newsclub.net.unix.NativeUnixSocket.load(NativeUnixSocket.java:81)

at
org.newsclub.net.unix.NativeUnixSocket.<clinit>(NativeUnixSocket.java:112)

at org.newsclub.net.unix.AFUNIXSocket.<init>(AFUNIXSocket.java:36)

at org.newsclub.net.unix.AFUNIXSocket.newInstance(AFUNIXSocket.java:50)

at
com.github.dockerjava.jaxrs.ApacheUnixSocket.<init>(ApacheUnixSocket.java:53)

at
com.github.dockerjava.jaxrs.UnixConnectionSocketFactory.createSocket(UnixConnectionSocketFactory.java:65)

at
org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:108)

at
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)

at
org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:357)

at
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:218)

at
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:194)

at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)

at
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)

at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)

at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)

at
com.github.dockerjava.jaxrs.connector.ApacheConnector.apply(ApacheConnector.java:443)

at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:246)

at
org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:683)

at org.glassfish.jersey.internal.Errors.process(Errors.java:315)

at org.glassfish.jersey.internal.Errors.process(Errors.java:297)

at org.glassfish.jersey.internal.Errors.process(Errors.java:228)

at
org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:424)

at
org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:679)

at
org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:435)

at
org.glassfish.jersey.client.JerseyInvocation$Builder.post(JerseyInvocation.java:338)

at
com.github.dockerjava.jaxrs.async.POSTCallbackNotifier.response(POSTCallbackNotifier.java:29)

at
com.github.dockerjava.jaxrs.async.AbstractCallbackNotifier.call(AbstractCallbackNotifier.java:45)

at
com.github.dockerjava.jaxrs.async.AbstractCallbackNotifier.call(AbstractCallbackNotifier.java:22)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

[pool-1-thread-1] ERROR
com.github.dockerjava.core.async.ResultCallbackTemplate - Error during
callback

java.lang.NoClassDefFoundError: Could not initialize class
org.newsclub.net.unix.NativeUnixSocket

at org.newsclub.net.unix.AFUNIXSocketImpl.connect(AFUNIXSocketImpl.java:134)

at org.newsclub.net.unix.AFUNIXSocket.connect(AFUNIXSocket.java:97)

at
com.github.dockerjava.jaxrs.ApacheUnixSocket.connect(ApacheUnixSocket.java:64)

at
com.github.dockerjava.jaxrs.UnixConnectionSocketFactory.connectSocket(UnixConnectionSocketFactory.java:73)

at
org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:118)

at
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:314)

at
org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:357)

at
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:218)

at
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:194)

at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:85)

at
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)

at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:186)

at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)

at
com.github.dockerjava.jaxrs.connector.ApacheConnector.apply(ApacheConnector.java:443)

at org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:246)

at
org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:683)

at

Re: Need help with cgroup troubleshooting or setup issue with NM launch.

2016-05-21 Thread Darin Johnson

Sarjeet:

Can you try adding this to your yarn-site.xml:

<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>${yarn.nodemanager.linux-container-executor.cgroups.hierarchy}</value>
</property>

This should change the hierarchy to
/sys/fs/cgroup/cpu/mesos/XXX-TASK-ID-XXX, which will be writable.

It also explains the error:


Caused by: java.io.IOException: Not able to enforce cpu weights; cannot
write to cgroup at: /sys/fs/cgroup/cpu

The node manager will now add tasks to:

/sys/fs/cgroup/cpu/mesos/XXX-TASK-ID-XXX

I'll go check that to ensure that's in the documentation.
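
A minimal sketch of the check behind that error, assuming the node manager joins the cgroup mount path, the controller name and the configured hierarchy, then tests whether the result is writable. The helper names are hypothetical; this is not the YARN implementation.

```python
import os


def task_cgroup_dir(mount_path, controller, hierarchy):
    """Join mount path, controller and hierarchy into one directory path."""
    return os.path.join(mount_path, controller, hierarchy.strip("/"))


def can_write(path):
    """True if path is an existing directory the current user may write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


if __name__ == "__main__":
    # Placeholder task id, as in the message above.
    path = task_cgroup_dir("/sys/fs/cgroup", "cpu", "mesos/XXX-TASK-ID-XXX")
    print(path, "writable" if can_write(path) else "NOT writable")
```

Running it as the user that launches the node manager shows quickly whether the per-task directory, rather than /sys/fs/cgroup/cpu itself, is the one being checked and whether it can be written.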

Thanks,

Darin



On Sat, May 21, 2016 at 4:56 AM, Sarjeet Singh 
wrote:

> When trying cgroups on myriad-0.2 RC on a single node mapr cluster, I am
> getting the following issue:
>
> 1. The errors below occur when launching NodeManager with cgroups enabled:
>
> *stdout*:
>
> export TASK_DIR=afe954c5-79dc-4238-af84-14855090df34&& sudo chown mapr
> /sys/fs/cgroup/cpu/mesos/afe954c5-79dc-4238-af84-14855090df34 && export
> YARN_HOME=/opt/mapr/hadoop/hadoop-2.7.0; env
> YARN_NODEMANAGER_OPTS=-Dcluster.name.prefix=/cluster1
> -Dnodemanager.resource.io-spindles=4.0
> -Dyarn.nodemanager.linux-container-executor.cgroups.hierarchy=mesos/
> afe954c5-79dc-4238-af84-14855090df34
> -Dyarn.home=/opt/mapr/hadoop/hadoop-2.7.0
> -Dnodemanager.resource.cpu-vcores=4 -Dnodemanager.resource.memory-mb=4096
> -Dmyriad.yarn.nodemanager.address=0.0.0.0:31847
> -Dmyriad.yarn.nodemanager.localizer.address=0.0.0.0:31132
> -Dmyriad.yarn.nodemanager.webapp.address=0.0.0.0:31181
> -Dmyriad.mapreduce.shuffle.port=31166
> YARN_HOME=/opt/mapr/hadoop/hadoop-2.7.0
> /opt/mapr/hadoop/hadoop-2.7.0/bin/yarn nodemanager
>
>
> *stderr*:
>
> 16/05/21 01:43:13 INFO service.AbstractService: Service NodeManager failed
> in state INITED; cause:
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to
> initialize container executor
>
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to
> initialize container executor
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:214)
>
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:476)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:524)
>
> Caused by: java.io.IOException: Not able to enforce cpu weights; cannot
> write to cgroup at: /sys/fs/cgroup/cpu
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.initializeControllerPaths(CgroupsLCEResourcesHandler.java:493)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.init(CgroupsLCEResourcesHandler.java:152)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.init(CgroupsLCEResourcesHandler.java:135)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:192)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:212)
>
> ... 3 more
>
> 16/05/21 01:43:13 WARN service.AbstractService: When stopping the service
> NodeManager : java.lang.NullPointerException
>
> java.lang.NullPointerException
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.stopRecoveryStore(NodeManager.java:164)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStop(NodeManager.java:276)
>
> at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
>
> at
> org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
>
> at
>
> org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
>
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:171)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:476)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:524)
>
> 16/05/21 01:43:13 FATAL nodemanager.NodeManager: Error starting NodeManager
>
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to
> initialize container executor
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:214)
>
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:476)
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:524)
>
> Caused by: java.io.IOException: Not able to enforce cpu weights; cannot
> write to cgroup at: /sys/fs/cgroup/cpu
>
> at
>
> org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler.initializeControllerPaths(CgroupsLCEResourcesHandler.java:493)
>
> at
>
>

Re: [Vote] Release apache-myriad-0.2.0-incubating (release candidate 2)

2016-05-19 Thread Darin Johnson

I'm voting +1.
Built it, and ran multiple map/reduce jobs plus a few spark and flink jobs.

Darin

On Tue, May 17, 2016 at 9:24 PM, Darin Johnson <dbjohnson1...@gmail.com>
wrote:

> Hi All,
>
> I have created a source tar ball for Apache Myriad 0.2.0-incubating,
> release candidate 2 based off the feed back received from release
> candidate 1.  Specifically, the NOTICE file has been updated to 2016 and
> the framework properly shuts down when using the web ui.
>
> Here’s the release notes:
> https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes
>
> The commit to be voted upon is tagged with "myriad-0.2.0-incubating-rc2"
> and is available here:
>
> https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc2
>
> The artifacts to be voted upon are located below. Please note that this is
> a source release:
>
> https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc2/
>
> Release artifacts are signed with the following key:
> *https://home.apache.org/~darinj/gpg/2AAE9E3F.asc
> <https://home.apache.org/~darinj/gpg/2AAE9E3F.asc>*
>
> **Please note that the release tar ball does not include the gradlew script
> to build. You need to install gradle in order to build.**
>
> Please try out the release candidate and vote. The vote is open for a
> minimum of 3 business days (Friday May 20) or until the necessary number
> of votes (3 binding +1s)
> is reached.
>
> If/when this vote succeeds, I will call for a vote with IPMC seeking
> permission to release RC2 as Apache Myriad 0.2.0 (incubating).
>
> [ ] +1 Release this package as Apache Myriad 0.2.0-incubating
> [ ]  0 I don't feel strongly about it, but I'm okay with the release
> [ ] -1 Do not release this package because...
>
> Thanks,
> Darin
>

Spark and Flink

2016-05-19 Thread Darin Johnson

Just wanted to let people know I tried running a Spark and a Flink job
today on Myriad with zero sized node managers.  It just worked!

This shouldn't be interpreted to mean there won't be issues.  However,
it is some initial progress.

Darin

[Vote] Release apache-myriad-0.2.0-incubating (release candidate 2)

2016-05-17 Thread Darin Johnson

Hi All,

I have created a source tar ball for Apache Myriad 0.2.0-incubating,
release candidate 2, based on the feedback received on release candidate
1.  Specifically, the NOTICE file has been updated to 2016 and the
framework properly shuts down when using the web ui.

Here’s the release notes:
https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes

The commit to be voted upon is tagged with "myriad-0.2.0-incubating-rc2"
and is available here:
https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc2

The artifacts to be voted upon are located below. Please note that this is
a source release:
https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc2/

Release artifacts are signed with the following key:
https://home.apache.org/~darinj/gpg/2AAE9E3F.asc

**Please note that the release tar ball does not include the gradlew script
to build. You need to install gradle in order to build.**

Please try out the release candidate and vote. The vote is open for a
minimum of 3 business days (Friday May 20) or until the necessary number of
votes (3 binding +1s)
is reached.

If/when this vote succeeds, I will call for a vote with IPMC seeking
permission to release RC2 as Apache Myriad 0.2.0 (incubating).

[ ] +1 Release this package as Apache Myriad 0.2.0-incubating
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Thanks,
Darin

[Vote] Release apache-myriad-0.2.0-incubating (release candidate 1)

2016-05-13 Thread Darin Johnson

Hi All,

Firstly, thanks everyone for the valuable contributions to the project and
for holding on tight as we move along the release process.

I have created a source tar ball for Apache Myriad 0.2.0-incubating,
release candidate 1.

Here’s the release notes:
https://cwiki.apache.org/confluence/display/MYRIAD/Release+Notes

The commit to be voted upon is tagged with "myriad-0.2.0-incubating-rc1"
and is available here:
https://git-wip-us.apache.org/repos/asf?p=incubator-myriad.git;a=shortlog;h=refs/tags/myriad-0.2.0-incubating-rc1

The artifacts to be voted upon are located below. Please note that this is
a source release:
https://dist.apache.org/repos/dist/dev/incubator/myriad/myriad-0.2.0-incubating-rc1/

Release artifacts are signed with the following key:
https://home.apache.org/~darinj/gpg/2AAE9E3F.asc

**Please note that the release tar ball does not include the gradlew script
to build. You need to install gradle in order to build.**

Please try out the release candidate and vote. The vote is open for a
minimum of 3 business days (Wednesday May 18) or until the necessary number
of votes (3 binding +1s)
is reached.

If/when this vote succeeds, I will call for a vote with IPMC seeking
permission to release RC1 as Apache Myriad 0.2.0 (incubating).

[ ] +1 Release this package as Apache Myriad 0.2.0-incubating
[ ]  0 I don't feel strongly about it, but I'm okay with the release
[ ] -1 Do not release this package because...

Thanks,
Darin

Myriad Release update

2016-05-11 Thread Darin Johnson

OK, I got the updates merged.  I'm going to be updating documentation and
testing tomorrow.  If all goes well we should have a release tomorrow
evening.

Darin

Myriad PR's

2016-05-09 Thread Darin Johnson

In preparation for the 0.2.0 release I'm going to start merging PRs.  I've
looked over Mohit's.  If anyone wants to look over any of the PRs that'd
be great; otherwise I'm going to assume anything that's been up for over 3
business days with no negative comments is OK, unless someone has
objections.

Darin

Re: NM does not start with cgroups enabled

2016-05-05 Thread Darin Johnson

Bjorn, I don't know if you're still experimenting with Myriad, but I
believe I've got a fix for your issue.  I'm going to try to get it in our
next release, so if you have any feedback it would be great.  I verified it
on a couple small systems.

https://github.com/apache/incubator-myriad/pull/69

On Wed, Mar 23, 2016 at 8:17 AM, Darin Johnson <dbjohnson1...@gmail.com>
wrote:

> Hey, Bjorn sorry for the delay, looking at the difference between the
> exceptions and my own experience I believe you left some cgroup configs in
> yarn-site.xml of the node manager.
> On Mar 18, 2016 2:58 AM, "Björn Hagemeier" <b.hageme...@fz-juelich.de>
> wrote:
>
>> Hi Darin,
>>
>> thanks a lot for this. But what about the other case below, when cgroups
>> is disabled?
>>
>>
>> Björn
>>
>> Am 18.03.2016 um 00:25 schrieb Darin Johnson:
>> > Hey Bjorn,
>> >
>> > I think I figured out the issue.  Some of the values for cgroups are
>> still
>> > hardcoded in myriad.  I'll add a JIRA Ticket hopefully we can get an
>> update
>> > for 0.2.0.  I'll also respond to this thread after a pull request is
>> > submitted in case you'd like to test it.
>> >
>> > Darin
>> > Hi all,
>> >
>> > I have trouble starting the NM on the slave nodes. Apparently, it does
>> > not find its configuration or something is wrong with the configuration.
>> >
>> > With cgroups enabled, the NM does not start; the logs contain the output
>> > below, indicating that something is wrong in the configuration. However,
>> > yarn.nodemanager.linux-container-executor.group is set (to "yarn"). The
>> > value used to be "${yarn.nodemanager.linux-container-executor.group}" as
>> > indicated by the installation documentation, however I'm uncertain
>> > whether this recursion is the correct approach.
>> >
>> >
>> > ==
>> > 16/03/14 09:32:45 FATAL nodemanager.NodeManager: Error starting
>> NodeManager
>> > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to
>> > initialize container executor
>> > at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:213)
>> > at
>> > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>> > at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
>> > at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
>> > Caused by: java.io.IOException: Linux container executor not configured
>> > properly (error=24)
>> > at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
>> > at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
>> > ... 3 more
>> > Caused by: ExitCodeException exitCode=24: Can't get configured value for
>> > yarn.nodemanager.linux-container-executor.group.
>> >
>> > at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>> > at org.apache.hadoop.util.Shell.run(Shell.java:460)
>> > at
>> >
>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>> > at
>> >
>> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
>> > ... 4 more
>> > ==
>> >
>> >
>> > I have given it another try with cgroups disabled (in
>> > myriad-config-default.yml), I seem to get a little further, but still
>> > stuck at running Yarn jobs:
>> >
>> > ==
>> > 16/03/14 10:56:34 INFO container.Container: Container
>> > container_1457949199710_0001_01_01 transitioned from LOCALIZED to
>> > RUNNING
>> > 16/03/14 10:56:34 INFO nodemanager.DefaultContainerExecutor:
>> > launchContainer: [bash,
>> >
>> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/application_1457949199710_0001/container_1457949199710_0001_01_01/default_container_executor.sh]
>> > 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exit code
>> > from container cont

Re: cgroups suggestions

2016-05-05 Thread Darin Johnson

It turns out everything works if you set the permissions of
$CGROUP_ROOT/mesos/$TASKID/ appropriately so the yarn user can write to the
hierarchy.  Then all works exactly as expected.

I spent a while running through the container-executor code.  When it
mounts a cgroup subsystem it changes the ownership of the hierarchy to the
yarn user.  The original cgroups code of myriad attempted to do something
similar by chmoding the directory, but assumed the yarn user would be a
member of group root.  Also, when the code was written the chmod happened as
root; currently that is ineffective, as the standard framework user does not
necessarily have permission to modify $CGROUP_ROOT/mesos/$TASKID.  However,
we have a mechanism for using a frameworksuperuser which can do this (my
current hack).

The current code also sets
yarn.nodemanager.linux-container-executor.cgroups.mount-path=/sys/fs/cgroup
and yarn.nodemanager.linux-container-executor.cgroups.mount=true; the
documentation then requires edits to yarn-site.xml to get these passed
through.
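
A minimal sketch of what those two settings look like as yarn-site.xml entries, assuming the usual Hadoop property layout (a name and a value inside a property element). The helper is hypothetical, and whether the file should carry literal values or pass-through placeholders is for the documentation to settle.

```python
import xml.etree.ElementTree as ET


def yarn_property(name, value):
    """Build one <property> element in the Hadoop configuration layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


PREFIX = "yarn.nodemanager.linux-container-executor.cgroups."

# The two values the current code sets, as quoted above.
CGROUP_SETTINGS = {
    PREFIX + "mount-path": "/sys/fs/cgroup",
    PREFIX + "mount": "true",
}

if __name__ == "__main__":
    for key, val in CGROUP_SETTINGS.items():
        print(ET.tostring(yarn_property(key, val), encoding="unicode"))
```

Each printed element can be pasted inside the configuration element of yarn-site.xml.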

Now that I've got things working, I'll start cleaning up the original code
to provide a cleaner setup and adjust the documentation as necessary.  I
should have a PR soon.

Re: cgroups suggestions

2016-05-04 Thread Darin Johnson

Santosh, that is the behavior I'm seeing.
On May 4, 2016 6:13 PM, "Santosh Marella" <smare...@maprtech.com> wrote:

> > The second involves the cgroup hierarchy and the cgroup mount point.
> Here
> > the code attempts to create a hierarchy in $CGROUP_DIR/mesos/$TASK_ID.
> > This is problematic as mesos will not unmount the hierarchy when the task
> > finished (in this case the node manager)
>
> IIRC, when a task is launched by mesos, the agent creates
> $CGROUP_DIR/mesos/$TASK_ID mount point to enforce cpu/mem for that task.
> Once the task finishes, the agent should unmount the $TASK_ID. Are you
> saying
> that's not happening for NMs ?
>
> Santosh
>
> On Wed, May 4, 2016 at 10:30 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > I've been digging into groups support, there's a few things that are easy
> > fixes but a few things become problematic so I'd like to discuss.
> >
> > First the code makes certain options dictated that can be placed in the
> > yarn-site.xml - this should be done to remove code and provide
> > flexibility.  That's easy.
> >
> > The second involves the cgroup hierarchy and the cgroup mount point.
> Here
> > the code attempts to create a hierarchy in $CGROUP_DIR/mesos/$TASK_ID.
> > This is problematic as mesos will not unmount the hierarchy when the task
> > finished (in this case the node manager), it is also therefore unable to
> > unmount it's own task hierarchy (This also creates the need to chmod a
> > number of directories as a superuser).  This leads to issues.  An
> > alternative approach would be to use the container-executor program
> > (already suid w/ yarn's group) to create the hierarchy as
> > $CGROUP_DIR/frameworkname if it doesn't exist, this may open another can
> of
> > worms as I haven't tested fully.
> >
> > Any thoughts or suggestions would be appreciated.
> >
> > Darin
> >
>

cgroups suggestions

2016-05-04 Thread Darin Johnson

I've been digging into cgroups support.  There are a few things that are
easy fixes, but a few things become problematic, so I'd like to discuss.

First, the code dictates certain options that could instead be placed in
yarn-site.xml - moving them there would remove code and provide
flexibility.  That's easy.

The second involves the cgroup hierarchy and the cgroup mount point.  Here
the code attempts to create a hierarchy in $CGROUP_DIR/mesos/$TASK_ID.
This is problematic as mesos will not unmount the hierarchy when the task
finishes (in this case the node manager); it is therefore also unable to
unmount its own task hierarchy (this also creates the need to chmod a
number of directories as a superuser).  This leads to issues.  An
alternative approach would be to use the container-executor program
(already suid w/ yarn's group) to create the hierarchy as
$CGROUP_DIR/frameworkname if it doesn't exist.  This may open another can of
worms, as I haven't tested it fully.

Any thoughts or suggestions would be appreciated.

Darin

Re: 答复: Hello Guys

2016-04-27 Thread Darin Johnson

Yongyu,

Sounds like you've made some progress! I'm not sure you'll be able to get
the cluster completely under CM management, as the node managers launched
by Myriad aren't running the Cloudera agent (I don't think we'll ever
officially support that, but you could tweak the source to run it in the
background; if you could make it a general process it might be a feature
we'd add).  Though if you're launching the Resource Manager via CM I'd
expect it to show up.  I haven't attempted to put my resource manager under
the CM though.

Darin

On Tue, Apr 26, 2016 at 4:45 AM, 陈泳宇 <yc...@linkernetworks.com> wrote:

> Hi Darin, This is Yongyu from Linkernetworks, I have proved that myriad
> can work well with CDH. Currently, I am also trying to myriad our hadoop
> cluster under the management of CM.
>
> I noticed that you have already complete the mission, but I still got some
> problem here: The status shown on CM dashboard will not change…
>
> Here are my steps:
>
> 1.   Add YARN service on CM dashboard..
>
> 2.   Stop the YARN service
>
> 3.   Cd to /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45
>
> 4.   Copy native libs, myriad jars as well as the config files.
>
> 5.   Tar this package and upload it to hdfs.
>
> 6.   Start the resourcemanager.
>
> Myriad service will be launched successfully, but nothing changes on the
> dashboard, which means the cluster is not under the management of CM..
>
>
>
>
>
> Regards,
>
> Yongyu
>
>
> Begin forwarded message:
>
> *From:* Darin Johnson <dbjohnson1...@gmail.com>
> *Date:* April 21, 2016 at 8:50:22 PM GMT+8
> *To:* Dev <dev@myriad.incubator.apache.org>
> *Subject:* *Re: Hello Guys*
> *Reply-To:* dev@myriad.incubator.apache.org
>
> Hey Sam,
>
> I'm already running myriad alongside a CM managed hadoop cluster.  It's a
> little hacky right now, I'm working on streamlining this, it may involve CM
> and/or some docker integration.  Here are my current steps:
>
> 0. I strongly recommend pulling off master and building from source - it has
> some really useful patches.  We're working on another release now (couple
> weeks out though).
>
> 1. Let Cloudera Manager configure hdfs - it does a good job of this.
>
> 2. Grab a cloudera tar ball from here:
>
> http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_eom.html#topic_3
> (I've also just used apache-hadoop tarballs).
>
> 3. Extract the tar ball, copy the native libraries from the cdh install on
> your system into hadoop-*/lib/native
>
> 4. cp myriad/myriad-*/build/libs/* hadoop-*/share/hadoop/yarn/lib
>
> 5. copy your hadoop configs to hadoop-*/etc/hadoop/
>
> 6. create a myriad-config-default.yml in hadoop-*/etc/hadoop; follow the
> instructions on the wiki for remote distribution for edits to
> myriad-config-default.yml.  NB: don't enable yarn cgroups yet - I'm fixing
> a bug.
>
> 7. chown -R root:root hadoop-*
>
> 8. chown root:yarn hadoop-*/bin/container-executor ; chmod g+s
> hadoop*/bin/container-executor
>
> 9. mv hadoop-<..> hadoop-myriad
>
> 10. tar -zcvf hadoop-myriad.tgz hadoop-myriad ; hadoop fs -put
> hadoop-myriad.tgz /dist/
>
> 11. cd hadoop-myriad/ && sudo -u yarn bin/yarn resourcemanager
>
> 12. hit the web-ui at host:8192 and flex up some node managers
>
> This has been pretty stable.
>
> Alternatively, if you're running mesos and docker you could look at PR
> https://github.com/apache/incubator-myriad/pull/64, it's still WIP but
> avoids a lot of setup - would be happy to work with you as I document and
> harden this feature for the 0.2.0 release.  Currently that runs off generic
> hadoop but we could certainly create distribution specific dockerfiles.
>
> Let me know if you need help.  Keep in mind this is still an alpha project,
> so expect some issues. Would love to get feedback, use cases and feature
> ideas.
>
> Also a mesos tip: if you're running services like hdfs outside of mesos you
> should adjust your mesos resources appropriately or you'll end up
> oversubscribed and processes will slow or die.
>
> Thanks for trying myriad!
>
> Darin
>
>
> On Apr 21, 2016 4:22 AM, "rchen" <rc...@linkernetworks.com> wrote:
>
> Hi Guys,
> Currently, we are working on Myriad with the Cloudera distribution. And we
> have proved that Mesos can work with Yarn.
> However talking about CM, how to integrate ? Any comments is welcome.
> appreciated.
>
>
> Regards
> Sam
>
>

Re: Myriad Releases

2016-04-23 Thread Darin Johnson

Great I'm calling it.

Also, if anyone wants to test or provide feedback on
https://github.com/apache/incubator-myriad/pull/64 that would be awesome.
It still needs work, but if you have hdfs up it takes about a half hour to
get a Myriad up, I'm working on streamlining it a bit.  Would also like to
get the base docker images hosted on something other than my personal
dockerhub.
On Thu, Apr 21, 2016 at 12:16 AM, Santosh Marella <smare...@maprtech.com>
wrote:

> Agreed. Let's just do a 0.2.0 rather than a 0.1.1.
>
> Santosh
>
> On Wed, Apr 20, 2016 at 6:28 PM, Swapnil Daingade <sdaing...@maprtech.com>
> wrote:
>
> > +1
> >
> > Another change to the roadmap was to move the security work to 0.3
> release.
> >
> > Regards
> > Swapnil
> >
> >
> >
> > On Wed, Apr 20, 2016 at 6:04 PM, Adam Bordelon <a...@mesosphere.io>
> wrote:
> >
> > > +1 to skipping 0.1.1 if 0.2.0 is coming soon enough
> > > I don't think we have any production users eagerly awaiting the 0.1.1
> > fixes
> > >
> > > On Wed, Apr 20, 2016 at 5:52 PM, Darin Johnson <
> dbjohnson1...@gmail.com>
> > > wrote:
> > >
> > > > Hey Zachary! Thanks, for the upvote.
> > > >
> > > > Also if you're looking for projects ping me! I'm going to be adding
> > more
> > > > tickets in the next few days.
> > > >
> > > > On Wed, Apr 20, 2016 at 8:38 PM, Zachary Jaffee <z...@case.edu>
> wrote:
> > > >
> > > > > If I recall the original reason for the 0.1.1 release was that it
> > would
> > > > be
> > > > > able to get it out earlier than the 0.2.0 release. Since it looks
> > like
> > > > they
> > > > > will be released at the same time essentially, the reason to
> release
> > > > 0.1.1
> > > > > over just waiting to release 0.2.0 goes away.
> > > > >
> > > > > On Wed, Apr 20, 2016 at 5:29 PM, Darin Johnson <
> > > dbjohnson1...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > During the dev sync today we discussed the upcoming 0.1.1 and
> 0.2.0
> > > > > > releases.  Currently, the only out standing issue for 0.1.1 is
> > > > MYRIAD-192
> > > > > > (Cgroups), for Myriad 0.2.0 the outstanding issues are Myriad-36
> > and
> > > > > Myriad
> > > > > > 192 (Configuration and Docker/Appc support).  Currently I have a
> > WIP
> > > PR
> > > > > for
> > > > > > Docker Support which I'd like some feedback on (it's should be
> > super
> > > > easy
> > > > > > to test), I'll probably complete Myriad 192 and part of that PR
> as
> > > > it's a
> > > > > > natural fit.  I estimate I can get all patches done by early may
> > and
> > > > > > hopefully get a release or release candidate out by May 11
> > > (ApacheCon).
> > > > > >
> > > > > > Due to the Alpha nature of Myriad and the significant value of
> > Docker
> > > > and
> > > > > > Configuration support, I think most people would opt for 0.2.0
> over
> > > > 0.1.1
> > > > > > and don't feel it's worth the effort to provide both releases at
> > this
> > > > > > time.  I suggest simply doing a 0.2.0 release.  Are there any
> > > > objections?
> > > > > >
> > > > > > Darin
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Zach Jaffee
> > > > > B.S. Computer Science
> > > > > Case Western Reserve University Class of 2017
> > > > > Operations Director | WRUW FM 91.1 Cleveland
> > > > > (917) 881-0646
> > > > > zjaffee.com
> > > > > linkedin.com/in/zjaffee
> > > > > github.com/ZJaffee
> > > > >
> > > >
> > >
> >
>

Re: Hello Guys

2016-04-21 Thread Darin Johnson

Hey Sam,

I'm already running myriad alongside a CM managed hadoop cluster. It's a
little hacky right now, I'm working on streamlining this, it may involve CM
and/or some docker integration. Here are my current steps:

0. I strongly recommend pulling off master and building from source - it has
some really useful patches. We're working on another release now (couple
weeks out though).

1. Let Cloudera Manager configure hdfs - it does a good job of this.

2. Grab a Cloudera tarball from here:
http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_eom.html#topic_3
(I've also just used apache-hadoop tarballs).

3. Extract the tarball, then copy the native libraries from the CDH install
on your system into hadoop-*/lib/native

4. cp myriad/myriad-*/build/libs/* hadoop-*/share/hadoop/yarn/lib

5. Copy your Hadoop configs to hadoop-*/etc/hadoop/

6. Create a myriad-config-default.yml in hadoop-*/etc/hadoop; follow the
instructions on the wiki for remote distribution for edits to
myriad-config-default.yml. NB: don't enable YARN cgroups yet - I'm fixing
a bug.

7. chown -R root:root hadoop-*

8. chown root:yarn hadoop-*/bin/container-executor ; chmod g+s
hadoop-*/bin/container-executor

9. mv hadoop-<..> hadoop-myriad

10. tar -zcvf hadoop-myriad.tgz hadoop-myriad ; hadoop fs -put
hadoop-myriad.tgz /dist/

11. cd hadoop-myriad/ && sudo -u yarn bin/yarn resourcemanager

12. Hit the web UI at host:8192 and flex up some node managers

This has been pretty stable.
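
For anyone scripting this, the rename-and-package part (steps 9 and 10) can be sketched in Python. This is only an illustration under assumptions: the function name package_hadoop_dir and the directory names are hypothetical, nothing here is part of Myriad, and the upload to HDFS is still the `hadoop fs -put` command from step 10.

```python
import tarfile
from pathlib import Path


def package_hadoop_dir(src_dir: str, archive_path: str,
                       top_level: str = "hadoop-myriad") -> list[str]:
    """Create a gzip-compressed tarball of src_dir, rooted at top_level.

    Returns the archive member names so the caller can sanity-check the layout.
    """
    src = Path(src_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname renames the root entry, which mirrors the 'mv' in step 9.
        tar.add(str(src), arcname=top_level)
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Checking the returned names for hadoop-myriad/bin/yarn before uploading is a cheap way to catch an archive that was rooted at the wrong directory.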

Alternatively, if you're running Mesos and Docker you could look at PR
https://github.com/apache/incubator-myriad/pull/64. It's still WIP but
avoids a lot of setup - I would be happy to work with you as I document and
harden this feature for the 0.2.0 release. Currently that runs off generic
Hadoop, but we could certainly create distribution-specific Dockerfiles.

Let me know if you need help. Keep in mind this is still an alpha project,
so expect some issues. Would love to get feedback, use cases and feature
ideas.

Also a Mesos tip: if you're running services like HDFS outside of Mesos, you
should adjust your Mesos resources appropriately or you'll end up
oversubscribed and processes will slow or die.
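
To make the Mesos tip concrete, here is a small sketch of the arithmetic. It assumes you set the agent's --resources flag by hand in the usual cpus:N;mem:M form; the helper name and the node sizes are made-up numbers for illustration, not a recommendation.

```python
def mesos_resources(total_cpus: int, total_mem_mb: int,
                    reserved_cpus: int, reserved_mem_mb: int) -> str:
    """Return a value for the Mesos agent --resources flag that leaves
    reserved_* untouched for services running outside Mesos (e.g. HDFS)."""
    cpus = total_cpus - reserved_cpus
    mem = total_mem_mb - reserved_mem_mb
    if cpus <= 0 or mem <= 0:
        raise ValueError("reservation leaves nothing to offer to Mesos")
    return f"cpus:{cpus};mem:{mem}"


# Hypothetical node: 16 cores / 64 GB, with 2 cores / 8 GB held back for
# the HDFS DataNode and the OS.
print(mesos_resources(16, 65536, 2, 8192))  # prints cpus:14;mem:57344
```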

Thanks for trying myriad!

Darin

On Apr 21, 2016 4:22 AM, "rchen" wrote:

Hi Guys,
Currently, we are working on Myriad with the Cloudera distribution, and we
have proved that Mesos can work with YARN.
However, regarding CM, how do we integrate? Any comments are welcome and
appreciated.

Regards
Sam

Re: Myriad Releases

2016-04-20 Thread Darin Johnson

Hey Zachary! Thanks for the upvote.

Also, if you're looking for projects, ping me! I'm going to be adding more
tickets in the next few days.

On Wed, Apr 20, 2016 at 8:38 PM, Zachary Jaffee <z...@case.edu> wrote:

> If I recall, the original reason for the 0.1.1 release was that we would be
> able to get it out earlier than the 0.2.0 release. Since it looks like they
> will essentially be released at the same time, the reason to release 0.1.1
> rather than just waiting for 0.2.0 goes away.
>
> On Wed, Apr 20, 2016 at 5:29 PM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > During the dev sync today we discussed the upcoming 0.1.1 and 0.2.0
> > releases.  Currently, the only outstanding issue for 0.1.1 is MYRIAD-192
> > (Cgroups); for Myriad 0.2.0 the outstanding issues are MYRIAD-36 and
> > MYRIAD-192 (Configuration and Docker/Appc support).  Currently I have a
> > WIP PR for Docker support which I'd like some feedback on (it should be
> > super easy to test); I'll probably complete MYRIAD-192 as part of that PR
> > as it's a natural fit.  I estimate I can get all patches done by early May
> > and hopefully get a release or release candidate out by May 11 (ApacheCon).
> >
> > Due to the alpha nature of Myriad and the significant value of Docker and
> > Configuration support, I think most people would opt for 0.2.0 over 0.1.1
> > and don't feel it's worth the effort to provide both releases at this
> > time.  I suggest simply doing a 0.2.0 release.  Are there any objections?
> >
> > Darin
> >
>
>
>
> --
> Zach Jaffee
> B.S. Computer Science
> Case Western Reserve University Class of 2017
> Operations Director | WRUW FM 91.1 Cleveland
> (917) 881-0646
> zjaffee.com
> linkedin.com/in/zjaffee
> github.com/ZJaffee
>

Myriad Releases

2016-04-20 Thread Darin Johnson

During the dev sync today we discussed the upcoming 0.1.1 and 0.2.0
releases.  Currently, the only outstanding issue for 0.1.1 is MYRIAD-192
(Cgroups); for Myriad 0.2.0 the outstanding issues are MYRIAD-36 and
MYRIAD-192 (Configuration and Docker/Appc support).  Currently I have a WIP
PR for Docker support which I'd like some feedback on (it should be super
easy to test); I'll probably complete MYRIAD-192 as part of that PR as it's
a natural fit.  I estimate I can get all patches done by early May and
hopefully get a release or release candidate out by May 11 (ApacheCon).

Due to the alpha nature of Myriad and the significant value of Docker and
Configuration support, I think most people would opt for 0.2.0 over 0.1.1
and don't feel it's worth the effort to provide both releases at this
time.  I suggest simply doing a 0.2.0 release.  Are there any objections?

Darin

Re: Observations on Fine Grained Scaling

2016-04-13 Thread Darin Johnson

Santosh, I get a lot of 2-3 containers.  But I can only get 9-12 containers
(I topped off the CPU resources at 12 cores) if a task runs for more than 30
secs (preferably 60 secs). That's generally not an issue, but I thought it
was worth putting on the list for general knowledge.  It's also less of a
deal when more jobs are running.

I have a few ideas on how to improve it and data locality, but think that
it'll likely involve refactoring YarnNodeCapacityManager and
OfferLifeCycleManager to interfaces that can be extended to handle
different strategies, which can be configured at startup.  I'd love to
start that discussion once we finish getting the basic mechanics working.
Maybe a 0.3.0 or 0.4.0 release?

Darin

On Wed, Apr 13, 2016 at 2:14 PM, Santosh Marella <smare...@maprtech.com>
wrote:

> > After the patches it seems stable, I'm able to run multiple terasort/pi
> > jobs and a few scalding jobs without difficulty.
> Great work, Darin. Glad to see FGS is now stable.
>
> >Noticed with jobs with short map tasks (8-12 secs), I rarely got more
> > than two containers per node, I'm curious if I'm not consuming resources
> > fast enough.
> Yes. Perhaps we need to tune the rate at which Mesos sends out resource
> offers
> to frameworks. The default that we observe in Myriad is 5 seconds. However,
> if your
> job has many map tasks and Mesos offer is big enough to accommodate several
> of them,
> then you should ideally see lot more than 2-3 containers per node.
>
> Isn't that happening? How many map tasks does your job have?
>
> Thanks,
> Santosh
>
> On Wed, Apr 13, 2016 at 8:34 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > I've been running a number of tests on the Fine Grained scaling aspect on
> > Myriad.  Here's a few notes:
> >
> > 1. After the patches it seems stable, I'm able to run multiple
> terasort/pi
> > jobs and a few scalding jobs without difficulty.
> > 2. Noticed with jobs with short map tasks (8-12 secs), I rarely got more
> > than two containers per node, I'm curious if I'm not consuming resources
> > fast enough.  The issue goes away on the reduce side (able to get far
> > better utilization of offers).  The issue can be lessened by increasing
> > mapred.splits.min.size and mapred.splits.max.size.  This may be an issue
> > for things like Hive.
> >
> > Darin
> >
>

Observations on Fine Grained Scaling

2016-04-13 Thread Darin Johnson

I've been running a number of tests on the Fine Grained Scaling (FGS) aspect
of Myriad.  Here are a few notes:

1. After the patches it seems stable; I'm able to run multiple terasort/pi
jobs and a few Scalding jobs without difficulty.
2. With jobs that have short map tasks (8-12 secs), I rarely got more
than two containers per node; I'm curious if I'm not consuming resources
fast enough.  The issue goes away on the reduce side (able to get far
better utilization of offers).  The issue can be lessened by increasing
mapred.splits.min.size and mapred.splits.max.size.  This may be an issue
for things like Hive.
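
As a rough illustration of why larger splits help here: assuming the two settings above correspond to FileInputFormat's minimum and maximum split size, the split size is max(min, min(max, block size)), so raising the minimum gives fewer, longer-running map tasks. The input size and block size below are hypothetical, not numbers from my tests.

```python
import math


def split_size(block_size: int, min_size: int, max_size: int) -> int:
    """Split size as computed by Hadoop's FileInputFormat:
    max(min_size, min(max_size, block_size))."""
    return max(min_size, min(max_size, block_size))


def estimated_map_tasks(input_bytes: int, block_size: int,
                        min_size: int, max_size: int) -> int:
    """Rough map-task count for one large splittable file (ignores the
    small slop Hadoop allows on the final split)."""
    return math.ceil(input_bytes / split_size(block_size, min_size, max_size))


MB = 1024 * 1024
GB = 1024 * MB
NO_MAX = 2**63 - 1  # effectively "no maximum split size"

# Hypothetical 100 GB input with 128 MB blocks.
default_tasks = estimated_map_tasks(100 * GB, 128 * MB, 1, NO_MAX)
bigger_tasks = estimated_map_tasks(100 * GB, 128 * MB, 512 * MB, NO_MAX)
print(default_tasks, bigger_tasks)  # prints 800 200
```

With a quarter as many map tasks, each one reads roughly four times the data, so it should run long enough to overlap with more offer cycles; how much that helps in practice is something to measure rather than assume.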

Darin

Re: Myriad 0.1.1 Release

2016-04-08 Thread Darin Johnson

Thanks Adam. I think 194 was a typo; I believe I meant 195 (node manager
dying).  Currently I haven't noticed the behavior, so it will depend on
root cause.
On Apr 8, 2016 4:51 AM, "Adam Bordelon" <a...@mesosphere.io> wrote:

Thanks for heading this up Darin!

Here's a link to the JIRA Issues page for Myriad 0.1.1:
https://issues.apache.org/jira/browse/MYRIAD/fixforversion/12335455/?selectedTab=com.atlassian.jira.jira-projects-plugin:version-issues-panel

Looking at MYRIAD-194, it seems to be a new feature request, and only
bugfixes should go into a patch release like 0.1.1, so I vote against
its inclusion. MYRIAD-191 could be worth including, depending on the
root-cause.

On Wed, Apr 6, 2016 at 11:06 AM, Darin Johnson <dbjohnson1...@gmail.com>
wrote:
> Hi,
>
> I'm the release manager for Myriad 0.1.1, which we're hoping to get out in
> the next couple of weeks. Here's the list of PRs and JIRAs that I think
> should go into the release since 0.1.0:
>
> Complete:
> #56  MYRIAD-181 Build failure due to dependency on zookeeper test jar
> <https://github.com/apache/incubator-myriad/pull/56>
> #57 MYRIAD-153: tasks not finishing when FGS is enabled
> #60 MYRIAD-186 Clean up the build
> <https://github.com/apache/incubator-myriad/pull/60>
> #62 Myriad 188 - NodeManager switch to UNHEALTHY causes NPE
> <https://github.com/apache/incubator-myriad/pull/62>
> #63 MYRIAD-171, MYRIAD-190
> <https://github.com/apache/incubator-myriad/pull/63>, compatibility issues
> with 2.7.1+ and 2.6.2+
>
> Todo:
> Myriad-192
>
> Possibly (Pending determination of issue):
> Myriad-194
> Myriad-191
>
> This is certainly open for discussion so if you think something should be
> added or removed please respond.  This is a fix release so no new features
> are to be added.  However, we plan to release 0.2.0 shortly after so new
> features can be added then.
>
> Darin

Myriad 0.1.1 Release

2016-04-06 Thread Darin Johnson

Hi,

I'm the release manager for Myriad 0.1.1, which we're hoping to get out in
the next couple of weeks. Here's the list of PRs and JIRAs that I think
should go into the release since 0.1.0:

Complete:
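#56 MYRIAD-181 Build failure due to dependency on zookeeper test jar
<https://github.com/apache/incubator-myriad/pull/56>
#57 MYRIAD-153: tasks not finishing when FGS is enabled
#60 MYRIAD-186 Clean up the build
<https://github.com/apache/incubator-myriad/pull/60>
#62 MYRIAD-188 NodeManager switch to UNHEALTHY causes NPE
<https://github.com/apache/incubator-myriad/pull/62>
#63 MYRIAD-171, MYRIAD-190
<https://github.com/apache/incubator-myriad/pull/63>, compatibility issues
with 2.7.1+ and 2.6.2+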

Todo:
Myriad-192

Possibly (Pending determination of issue):
Myriad-194
Myriad-191

This is certainly open for discussion so if you think something should be
added or removed please respond.  This is a fix release so no new features
are to be added.  However, we plan to release 0.2.0 shortly after so new
features can be added then.

Darin

Re: Challenges after MapR 5.1 Upgrade.

2016-04-04 Thread Darin Johnson

Hey John,

I noticed these lines in your yarn-site.xml:


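<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>512</value>
</property>

<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
</property>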


If you're attempting to launch a zero-resource NodeManager for FGS, that will
result in the first stack trace.  Both should be explicitly 0 for that
feature to work (the defaults are 1024 and 1 respectively, which will fail).
You do have them set to 0 further down; however, I'm uncertain which would
take precedence.
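
One way to take the guesswork out of this is to list every property that is defined more than once. The sketch below is a hypothetical helper, not something shipped with Myriad or Hadoop, and the sample document is made up. My understanding is that Hadoop's Configuration keeps the last definition it loads unless an earlier one is marked final, but treat that as an assumption and verify it on your version.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def duplicate_properties(xml_text: str) -> dict[str, list[str]]:
    """Map each property name that appears more than once in a Hadoop-style
    configuration document to its values, in file order."""
    values = defaultdict(list)
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            values[name].append(value)
    return {n: v for n, v in values.items() if len(v) > 1}


# Made-up sample with the same shape as the snippet quoted above.
SAMPLE = """<configuration>
  <property><name>yarn.scheduler.minimum-allocation-mb</name><value>512</value></property>
  <property><name>yarn.scheduler.minimum-allocation-vcores</name><value>1</value></property>
  <property><name>yarn.scheduler.minimum-allocation-vcores</name><value>0</value></property>
</configuration>"""

for name, vals in duplicate_properties(SAMPLE).items():
    print(f"{name}: defined {len(vals)} times, values in order: {vals}")
```

Removing the earlier non-zero definitions is the safest fix either way, since it leaves only one value to reason about.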
On Apr 4, 2016 5:19 PM, "John Omernik"  wrote:

> This was an upgrade from 5.0.  I will post here. Note: I have removed the
> mapr_shuffle to get node managers to work; however, I am seeing other odd
> things, so any help would be appreciated.
>
> 
> 
>
> 
> 
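> <property>
> <name>yarn.nodemanager.aux-services</name>
> <value>mapreduce_shuffle,myriad_executor</value>
> </property>
> <property>
> <name>yarn.resourcemanager.hostname</name>
> <value>myriadprod.marathonprod.mesos</value>
> </property>
> <property>
> <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> </property>
> <property>
> <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> </property>
> <property>
> <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> <value>2000</value>
> </property>
> <property>
> <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> <value>1</value>
> </property>
> <property>
> <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> <value>1000</value>
> </property>
> <property>
> <name>yarn.nodemanager.resource.cpu-vcores</name>
> <value>${nodemanager.resource.cpu-vcores}</value>
> </property>
> <property>
> <name>yarn.nodemanager.resource.memory-mb</name>
> <value>${nodemanager.resource.memory-mb}</value>
> </property>
>
> <property>
> <name>yarn.scheduler.minimum-allocation-mb</name>
> <value>512</value>
> </property>
>
> <property>
> <name>yarn.scheduler.minimum-allocation-vcores</name>
> <value>1</value>
> </property>
>
> <property>
> <name>yarn.nodemanager.address</name>
> <value>${myriad.yarn.nodemanager.address}</value>
> </property>
> <property>
> <name>yarn.nodemanager.webapp.address</name>
> <value>${myriad.yarn.nodemanager.webapp.address}</value>
> </property>
> <property>
> <name>yarn.nodemanager.webapp.https.address</name>
> <value>${myriad.yarn.nodemanager.webapp.address}</value>
> </property>
> <property>
> <name>yarn.nodemanager.localizer.address</name>
> <value>${myriad.yarn.nodemanager.localizer.address}</value>
> </property>
>
> <property>
> <name>yarn.resourcemanager.scheduler.class</name>
> <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> </property>
>
> <property>
> <name>yarn.scheduler.minimum-allocation-vcores</name>
> <value>0</value>
> </property>
> <property>
> <name>yarn.scheduler.minimum-allocation-vcores</name>
> <value>0</value>
> </property>
> <property>
> <description>Who will execute(launch) the containers.</description>
> <name>yarn.nodemanager.container-executor.class</name>
> <value>${yarn.nodemanager.container-executor.class}</value>
> </property>
> <property>
> <description>The class which should help the LCE handle resources.</description>
> <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
> <value>${yarn.nodemanager.linux-container-executor.resources-handler.class}</value>
> </property>
> <property>
> <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
> <value>${yarn.nodemanager.linux-container-executor.cgroups.hierarchy}</value>
> </property>
> <property>
> <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
> <value>${yarn.nodemanager.linux-container-executor.cgroups.mount}</value>
> </property>
> <property>
> <name>yarn.nodemanager.linux-container-executor.cgroups.mount-path</name>
> <value>${yarn.nodemanager.linux-container-executor.cgroups.mount-path}</value>
> </property>
> <property>
> <name>yarn.nodemanager.linux-container-executor.group</name>
> <value>${yarn.nodemanager.linux-container-executor.group}</value>
> </property>
> <property>
> <name>yarn.nodemanager.linux-container-executor.path</name>
> <value>${yarn.home}/bin/container-executor</value>
> </property>
> <property>
> <name>yarn.http.policy</name>
> <value>HTTP_ONLY</value>
> </property>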
> 
>
> On Mon, Apr 4, 2016 at 3:53 PM, yuliya Feldman  >
> wrote:
>
> > YarnDefaultProperties.java, which defines the class for mapr_direct_shuffle,
> > should be there even in 5.0, so nothing new there even if the maprfs jar is
> > outdated - could you also check that?
> > Also, could you paste the content of your yarn-site.xml here?
> > Thanks, Yuliya
> >
> >   From: yuliya Feldman 
> >  To: "dev@myriad.incubator.apache.org" 
> >  Sent: Monday, April 4, 2016 1:43 PM
> >  Subject: Re: Challenges after MapR 5.1 Upgrade.
> >
> > Hello John,
> > Did you upgrade to 5.1 or install a new one?
> > It feels like the MapR default properties were not loaded - I need to poke
> > around and then I will ask you for additional info.
> > Thanks, Yuliya
> >
> >   From: John Omernik 
> >  To: dev@myriad.incubator.apache.org
> >  Sent: Monday, April 4, 2016 12:29 PM
> >  Subject: Challenges after MapR 5.1 Upgrade.
> >
> > I had at one point Myriad working fine in MapR 5.0.  I updated to 5.1
> > and repackaged my hadoop tgz for remote distribution, and now I have two
> > problems occurring.
> >
> > 1. At first when I had the mapr direct shuffle enabled per the
> > yarn-site.xml on

Re: Myriad talk link for MesosCon?

2016-03-23 Thread Darin Johnson

Yeah I didn't see one either.

Darin

On Wed, Mar 23, 2016 at 1:10 PM, Sarjeet Singh 
wrote:

> I couldn't find any associated link of myriad talk for MesosCon voting.
> Anyone?
>
> Though, I found these proposal docs:
>
> Developers: http://bit.ly/1RpZPvj
> Users: http://bit.ly/1Mspaxp
>
>
> *It seems the deadline for the proposal voting is today, March 23 2016.*
>
> -Sarjeet
>

Re: 0.2.0 release

2016-03-23 Thread Darin Johnson

Swapnil,

I concur and want to keep both options for Mesos and Docker networking
available, and putting in the configuration for both should be a priority.
However, one has to be careful with this, as the NMs register with the RM
via heartbeats with their container port (not the host port). This isn't an
issue if the NM and RM are in the same Docker network, via Weave or
Kubernetes, but it is with simple bridged networking. We also have to be
careful as Myriad currently doesn't run HDFS itself, so we'd lose data
locality.  My idea was to start with host networking so we could make Myriad
easier to deploy but leave room to add additional networking options:
basically exposing all the protobuf options for Docker Parameters (used to
configure Docker networking) and NetworkInfo (used to configure Mesos
networking).

Darin

On Tue, Mar 22, 2016 at 2:48 PM, Swapnil Daingade <sdaing...@maprtech.com>
wrote:

> Hi Darin,
>
> I feel Docker networking is something we should spend time thinking
> through.
> A user should be able to use multiple options provided by Mesos, Docker,
> 3rd parties, etc.
>
> It would be great if we can abstract the specific implementation to provide
> container ip addresses behind interfaces. User should be able to switch
> implementations by making simple changes in configuration files.
>
> Regards
> Swapnil
>
>
> On Tue, Mar 22, 2016 at 8:20 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > Swapnil,
> >
> > Any help would be appreciated.  I'll try to write up what I'm working on
> > tomorrow.  But essentially the ideas are:
> > 1. Ability to launch the resource manager and node managers in docker
> > containers
> > 2. Use host networking for now (Ports configured to be pulled from mesos
> -
> > ability to use ports reserved by role), but leave hooks to easily add IP
> > per container.
> > 3. Ability to get configuration files for a URI
> > 4. Ability to mount local volumes for local directories in the shuffle
> > phase etc (though will require more config).
> >
> > Darin
> >
>

Re: NM does not start with cgroups enabled

2016-03-23 Thread Darin Johnson

Hey Bjorn, sorry for the delay. Looking at the difference between the
exceptions and my own experience, I believe you left some cgroup configs in
the yarn-site.xml of the node manager.
On Mar 18, 2016 2:58 AM, "Björn Hagemeier" <b.hageme...@fz-juelich.de>
wrote:

> Hi Darin,
>
> thanks a lot for this. But what about the other case below, when cgroups
> is disabled?
>
>
> Björn
>
> Am 18.03.2016 um 00:25 schrieb Darin Johnson:
> > Hey Bjorn,
> >
> > I think I figured out the issue.  Some of the values for cgroups are
> still
> > hardcoded in myriad.  I'll add a JIRA Ticket hopefully we can get an
> update
> > for 0.2.0.  I'll also respond to this thread after a pull request is
> > submitted in case you'd like to test it.
> >
> > Darin
> > Hi all,
> >
> > I have trouble starting the NM on the slave nodes. Apparently, it does
> > not find its configuration, or something is wrong with the configuration.
> >
> > With cgroups enabled, the NM does not start; the logs contain the following,
> > indicating that there is something wrong in the configuration. However,
> > yarn.nodemanager.linux-container-executor.group is set (to "yarn"). The
> > value used to be "${yarn.nodemanager.linux-container-executor.group}" as
> > indicated by the installation documentation; however, I'm uncertain
> > whether this recursion is the correct approach.
> >
> >
> > ==
> > 16/03/14 09:32:45 FATAL nodemanager.NodeManager: Error starting
> NodeManager
> > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to
> > initialize container executor
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:213)
> > at
> > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
> > Caused by: java.io.IOException: Linux container executor not configured
> > properly (error=24)
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
> > ... 3 more
> > Caused by: ExitCodeException exitCode=24: Can't get configured value for
> > yarn.nodemanager.linux-container-executor.group.
> >
> > at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
> > at org.apache.hadoop.util.Shell.run(Shell.java:460)
> > at
> > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
> > ... 4 more
> > ==
> >
> >
> > I have given it another try with cgroups disabled (in
> > myriad-config-default.yml), I seem to get a little further, but still
> > stuck at running Yarn jobs:
> >
> > ==
> > 16/03/14 10:56:34 INFO container.Container: Container
> > container_1457949199710_0001_01_01 transitioned from LOCALIZED to
> > RUNNING
> > 16/03/14 10:56:34 INFO nodemanager.DefaultContainerExecutor:
> > launchContainer: [bash,
> >
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/application_1457949199710_0001/container_1457949199710_0001_01_01/default_container_executor.sh]
> > 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exit code
> > from container container_1457949199710_0001_01_01 is : 1
> > 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exception
> > from container-launch with container ID:
> > container_1457949199710_0001_01_01 and exit code: 1
> > ExitCodeException exitCode=1:
> > at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
> > at org.apache.hadoop.util.Shell.run(Shell.java:460)
> > at
> > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
> > at
> >
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
> >

Re: 0.2.0 release

2016-03-22 Thread Darin Johnson

Swapnil,

Any help would be appreciated.  I'll try to write up what I'm working on
tomorrow.  But essentially the ideas are:
1. Ability to launch the resource manager and node managers in Docker
containers
2. Use host networking for now (ports configured to be pulled from Mesos -
ability to use ports reserved by role), but leave hooks to easily add IP
per container.
3. Ability to get configuration files from a URI
4. Ability to mount local volumes for local directories in the shuffle
phase etc. (though this will require more config).

Darin

Re: NM does not start with cgroups enabled

2016-03-20 Thread Darin Johnson

Hey Bjorn,

I think I figured out the issue.  Some of the values for cgroups are still
hardcoded in Myriad.  I'll add a JIRA ticket; hopefully we can get an update
in for 0.2.0.  I'll also respond to this thread after a pull request is
submitted in case you'd like to test it.

Darin
Hi all,

I have trouble starting the NM on the slave nodes. Apparently, it does
not find its configuration, or something is wrong with the configuration.

With cgroups enabled, the NM does not start; the logs contain the following,
indicating that there is something wrong in the configuration. However,
yarn.nodemanager.linux-container-executor.group is set (to "yarn"). The
value used to be "${yarn.nodemanager.linux-container-executor.group}" as
indicated by the installation documentation; however, I'm uncertain
whether this recursion is the correct approach.


==
16/03/14 09:32:45 FATAL nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:213)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
... 3 more
Caused by: ExitCodeException exitCode=24: Can't get configured value for yarn.nodemanager.linux-container-executor.group.

at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
at org.apache.hadoop.util.Shell.run(Shell.java:460)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
... 4 more
==


I have given it another try with cgroups disabled (in
myriad-config-default.yml), I seem to get a little further, but still
stuck at running Yarn jobs:

==
16/03/14 10:56:34 INFO container.Container: Container container_1457949199710_0001_01_01 transitioned from LOCALIZED to RUNNING
16/03/14 10:56:34 INFO nodemanager.DefaultContainerExecutor: launchContainer: [bash, /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/application_1457949199710_0001/container_1457949199710_0001_01_01/default_container_executor.sh]
16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exit code from container container_1457949199710_0001_01_01 is : 1
16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1457949199710_0001_01_01 and exit code: 1
ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
at org.apache.hadoop.util.Shell.run(Shell.java:460)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exception from container-launch.
16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Container id: container_1457949199710_0001_01_01
16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exit code: 1
==

Unfortunately, directory
/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/
is empty, the log indicates that it is being deleted after the failed
attempt.

Again, any hint would be useful. Also regarding the activation of cgroups.


Best regards,
Björn

--
Dipl.-Inform. Björn Hagemeier
Federated Systems and Data
Juelich Supercomputing Centre
Institute for Advanced Simulation

Phone: +49 2461 61 1584
Fax  : +49 2461 61 6656
Email: b.hageme...@fz-juelich.de
Skype: bhagemeier
WWW  : http://www.fz-juelich.de/jsc

JSC is the coordinator of the
John von Neumann Institute

Re: 0.2.0 release

2016-03-19 Thread Darin Johnson

Happy to report that as of the last two PRs, FGS is usable: no memory leaks
or crashes. It could likely be improved with fancier schedulers, but that's
for the future.  I'm currently looking at running some terasort benchmarks
with FGS and reserved resources vs statically sized NMs to figure out the
performance hit.  Might be worth a blog post in the near future.

Adam, I've been looking through the cgroups code for Myriad recently;
apparently we need to modify the path YARN uses for its hierarchy.  Does
that change at all within a Docker container or is it the same?

Darin

On Mar 16, 2016 8:48 PM, "Adam Bordelon" <a...@mesosphere.io> wrote:

> +1 on Darin as release manager
>
> I'd like to see 0.2 have:
> - Usable FGS
> - Dockerized NM (for multitenancy)
>
> On Tue, Mar 15, 2016 at 9:46 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > We've talked about a 0.2.0 release slated for mid April at the dev sync.
> > I'd like to nail down any features people would like and have time to
> work
> > on.
> >
> > I've been spending some time fixing major bugs in the FGS feature and
> > plan to address MYRIAD-136 and MYRIAD-189.
> >
> > I'd also be willing to be the release manager on this release if
> necessary.
> >
> > Darin
> >
>

Re: NM does not start with cgroups enabled

2016-03-16 Thread Darin Johnson

What does your container-executor.cfg look like?  It seems like
yarn.nodemanager.linux-container-executor.group isn't set, or possibly
banned.users= hasn't been set (some distros).
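
If it helps, a quick check of the file could look like the sketch below. It is a hypothetical helper, not part of Hadoop or Myriad: it assumes the file is plain key=value lines, that the second key I mean is banned.users, and that min.user.id is also worth checking. It treats an empty value as missing, which is a choice of this sketch rather than a rule taken from the YARN docs.

```python
REQUIRED_KEYS = ("yarn.nodemanager.linux-container-executor.group",)
RECOMMENDED_KEYS = ("banned.users", "min.user.id")


def parse_cfg(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blanks and '#' comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def missing_keys(text: str, keys=REQUIRED_KEYS + RECOMMENDED_KEYS) -> list[str]:
    """Return the keys that are absent or set to an empty value."""
    entries = parse_cfg(text)
    return [k for k in keys if not entries.get(k)]


# Made-up file contents for illustration.
SAMPLE = """\
yarn.nodemanager.linux-container-executor.group=yarn
min.user.id=1000
"""
print(missing_keys(SAMPLE))  # prints ['banned.users']
```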

On Tue, Mar 15, 2016 at 12:52 PM, Darin Johnson <dbjohnson1...@gmail.com>
wrote:

> Bjorn,
>
> Your isolation configuration is correct; I was going from memory.  I'll
> take a look at your configs a little later on my test environment and see
> what I can come up with.
>
> Darin
>
> On Tue, Mar 15, 2016 at 12:07 PM, Björn Hagemeier <
> b.hageme...@fz-juelich.de> wrote:
>
>> Dear Darin,
>>
>> thanks for your response.
>>
>> The precise content of /etc/mesos-slave/isolation is:
>>
>> ==
>> cgroups/cpu,cgroups/mem
>> ==
>>
>> Which I took from some documentation, it may have been that of the
>> Puppet module I'm using [1]. Should the values be different? Your string
>> looks a bit different: "cpu/cgroups,memory/cgroups".
>>
>> Please find my yarn-site.xml and myriad-config-default.yml attached. I
>> don't think they contain any sensitive information.
>>
>>
>> Best regards,
>> Björn
>>
>> [1] https://github.com/deric/puppet-mesos
>>
>> Am 15.03.2016 um 16:46 schrieb Darin Johnson:
>> > Hey Bjorn,
>> >
>> > Can you copy paste the relevant part of the Myriad and yarn-site.xml?
>> > Also, can you ensure you are running the mesos-slave with
>> > --isolation="cpu/cgroups,memory/cgroups?.
>> >
>> > I'll try to recreate the problem and/or tell you what's missing in the
>> > config.
>> >
>> > Darin
>> >
>> > On Mon, Mar 14, 2016 at 6:19 AM, Björn Hagemeier <
>> b.hageme...@fz-juelich.de>
>> > wrote:
>> >
>> >> Hi all,
>> >>
>> >> I have trouble starting the NM on the slave nodes. Apparently, it does
>> >> not find its configuration, or something is wrong with the configuration.
>> >>
>> >> With cgroups enabled, the NM does not start; the logs contain the following,
>> >> indicating that there is something wrong in the configuration. However,
>> >> yarn.nodemanager.linux-container-executor.group is set (to "yarn"). The
>> >> value used to be "${yarn.nodemanager.linux-container-executor.group}" as
>> >> indicated by the installation documentation; however, I'm uncertain
>> >> whether this recursion is the correct approach.
>> >>
>> >>
>> >> ==
>> >> 16/03/14 09:32:45 FATAL nodemanager.NodeManager: Error starting NodeManager
>> >> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
>> >> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:213)
>> >> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>> >> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
>> >> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
>> >> Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
>> >> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
>> >> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
>> >> ... 3 more
>> >> Caused by: ExitCodeException exitCode=24: Can't get configured value for yarn.nodemanager.linux-container-executor.group.
>> >>
>> >> at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>> >> at org.apache.hadoop.util.Shell.run(Shell.java:460)
>> >> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>> >> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
>> >> ... 4 more

0.2.0 release

2016-03-15 Thread Darin Johnson

We've talked about a 0.2.0 release slated for mid April at the dev sync.
I'd like to nail down any features people would like and have time to work
on.

I've been spending some time fixing major bugs in the FGS feature and plan to
address MYRIAD-136 and MYRIAD-189.

I'd also be willing to be the release manager on this release if necessary.

Darin

Re: NM does not start with cgroups enabled

2016-03-15 Thread Darin Johnson

Hey Bjorn,

Can you copy and paste the relevant parts of the Myriad config and
yarn-site.xml? Also, can you ensure you are running the mesos-slave with
--isolation="cgroups/cpu,cgroups/mem"?

I'll try to recreate the problem and/or tell you what's missing in the
config.

Darin
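
For anyone hitting the same exitCode=24 error, a quick sanity check of yarn-site.xml may help narrow things down. The sketch below is only an illustration, not part of Myriad or Hadoop: it parses a Hadoop-style XML config and flags the case described in this thread, where yarn.nodemanager.linux-container-executor.group is missing, empty, or still holds an unexpanded "${...}" placeholder. The function names and the SAMPLE document are made up for the example. Note that it only inspects yarn-site.xml; the native container-executor may read its own configuration file as well, which this sketch does not cover.

```python
import re
import xml.etree.ElementTree as ET

GROUP_KEY = "yarn.nodemanager.linux-container-executor.group"


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_executor_group(props):
    """Return a list of problems found with the executor group setting."""
    problems = []
    value = props.get(GROUP_KEY)
    if not value:
        problems.append(f"{GROUP_KEY} is missing or empty")
    elif re.search(r"\$\{[^}]*\}", value):
        # A leftover ${...} means the placeholder was never substituted.
        problems.append(f"{GROUP_KEY} still contains a placeholder: {value}")
    return problems


# Hypothetical yarn-site.xml fragment reproducing the self-referencing value.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.linux-container-executor.group</name>
    <value>${yarn.nodemanager.linux-container-executor.group}</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for problem in check_executor_group(read_properties(SAMPLE)):
        print(problem)
```

Running it against the real file would mean replacing SAMPLE with the contents of the yarn-site.xml that the NodeManager actually sees on the slave node.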

On Mon, Mar 14, 2016 at 6:19 AM, Björn Hagemeier 
wrote:

> Hi all,
>
> I have trouble starting the NM on the slave nodes. Apparently, it does
> not find its configuration or something is wrong with the configuration.
>
> With cgroups enabled, the NM does not start; the logs contain the error
> below, indicating that something is wrong in the configuration. However,
> yarn.nodemanager.linux-container-executor.group is set (to "yarn"). The
> value used to be "${yarn.nodemanager.linux-container-executor.group}" as
> indicated by the installation documentation, but I'm uncertain
> whether this recursion is the correct approach.
>
>
> ==
> 16/03/14 09:32:45 FATAL nodemanager.NodeManager: Error starting NodeManager
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:213)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
> Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
> ... 3 more
> Caused by: ExitCodeException exitCode=24: Can't get configured value for yarn.nodemanager.linux-container-executor.group.
>
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
> at org.apache.hadoop.util.Shell.run(Shell.java:460)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
> ... 4 more
> ==
>
>
> I gave it another try with cgroups disabled (in
> myriad-config-default.yml). I seem to get a little further, but I am
> still stuck at running YARN jobs:
>
> ==
> 16/03/14 10:56:34 INFO container.Container: Container container_1457949199710_0001_01_01 transitioned from LOCALIZED to RUNNING
> 16/03/14 10:56:34 INFO nodemanager.DefaultContainerExecutor: launchContainer: [bash, /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/application_1457949199710_0001/container_1457949199710_0001_01_01/default_container_executor.sh]
> 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exit code from container container_1457949199710_0001_01_01 is : 1
> 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1457949199710_0001_01_01 and exit code: 1
> ExitCodeException exitCode=1:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
> at org.apache.hadoop.util.Shell.run(Shell.java:460)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exception from container-launch.
> 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Container id: container_1457949199710_0001_01_01
> 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exit code: 1
> ==
>
> Unfortunately, the directory
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/
> is empty; the log indicates that it is deleted after the failed
> attempt.
>
> Again, any hint would be useful, also regarding the activation of cgroups.
>
>
> Best regards,
> Björn
>
> --
> Dipl.-Inform. Björn Hagemeier
>

Re: Myriad Vagrant Setup Issue

2016-01-15 Thread Darin Johnson

Hey Matt, if you look at the Mesos UI, is there any information in the
stderr or stdout of the slave host it's staging on?

Darin

On Fri, Jan 15, 2016 at 10:36 AM, Matthew J. Loppatto <
mloppa...@keywcorp.com> wrote:

> I've gotten a little further on this issue by increasing the mesos slave
> memory from 2 GB to 4 GB.  The node manager task gets launched and sits in
> the STAGING state for a minute and then the mesos-slave.INFO log shows:
>
> I0115 15:19:12.114537 30903 slave.cpp:3841] Terminating executor
> myriad_executor20160115-145750-344821002-5050-30838-20160115-145750-344821002-5050-30838-O18020160115-145750-344821002-5050-30838-S0
> of framework 20160115-145750-344821002-5050-30838- because it did not
> register within 1mins
>
> I then increased the mesos slave's executor_registration_timeout setting
> from 1mins to 5mins to see if that would make a difference but still get
> the following in the log:
>
> I0115 15:19:12.114537 30903 slave.cpp:3841] Terminating executor
> myriad_executor20160115-145750-344821002-5050-30838-20160115-145750-344821002-5050-30838-O18020160115-145750-344821002-5050-30838-S0
> of framework 20160115-145750-344821002-5050-30838- because it did not
> register within 5mins
>
> Is there any guidance on why the Myriad executor fails to register with
> the Mesos slave?
>
> Thanks,
> Matt
>
> -Original Message-
> From: Matthew J. Loppatto
> Sent: Thursday, January 14, 2016 2:25 PM
> To: 'dev@myriad.incubator.apache.org'
> Subject: RE: Myriad Vagrant Setup Issue
>
> Sarjeet,
>
> Thanks for the reply.  I modified the medium profile in my
> myriad-config-default.yml file to use 1 cpu and 1024 MB mem and am seeing a
> similar issue in the YARN resource manager log:
>
> Offer not sufficient for task with, cpu: 1.4, memory: 2432.0, ports: 1001
>
> If I try lowering the medium profile memory below 1024 I get the following
> message in the log:
>
> NodeManager from vagrant-ubuntu-trusty-64 doesn’t satisfy minimum
> allocations, Sending SHUTDOWN signal to NodeManager.
>
> Increasing the memory of the VM to 6 GB also didn't solve the issue.  Are
> there any other measures I can take to resolve the insufficient resource
> messages?
>
> Thanks,
> Matt
>
> -Original Message-
> From: sarjeet singh [mailto:sarje...@usc.edu]
> Sent: Thursday, January 14, 2016 12:41 PM
> To: dev@myriad.incubator.apache.org
> Subject: Re: Myriad Vagrant Setup Issue
>
> Matthew,
>
> You can modify profile configurations for Nodemanagers in
> myriad-config-default.yml and reduce medium (default) NM configuration to
> match with your VM capacity so a default NM (medium profile) could launch
> without any issue.
>
> - Sarjeet Singh
>
> On Thu, Jan 14, 2016 at 10:56 PM, Matthew J. Loppatto <
> mloppa...@keywcorp.com> wrote:
>
> > Hi,
> >
> > I'm trying to set up Myriad for an R project at my company but I'm
> > having some trouble even getting the Vagrant VM working properly.  I
> > followed the instructions here:
> >
> > https://github.com/apache/incubator-myriad/blob/master/docs/vagrant.md
> >
> > with some minor corrections but the Node Manager fails to start.  It
> > looks like a resource issue based on the log output.  The Mesos UI
> > shows a slave process with 2 cpu and 2 GB mem, but the log states the
> > task requires 4 cpu and 5.5 GB mem.
> >
> > I've detailed my configuration and log output in this public Gist:
> >
> > https://gist.github.com/FearTheParrot/626259c23a854645fcbf
> >
> > Would it be possible to provision the Mesos slave with more resources
> > while also reducing the profile size of the Node Manager?  The Vagrant
> > VM only has 4 GB ram and 2 cpu.
> >
> > Any help would be appreciated.
> >
> > Thanks!
> > Matt
> >
>
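
The numbers quoted in this thread can be lined up with a little arithmetic. With the medium profile set to 1 cpu and 1024 MB, the resource manager log asked for cpu 1.4 and memory 2432.0, which suggests roughly 0.4 cpu and 1408 MB are added on top of the profile. That overhead is inferred from the two figures in the messages above, not taken from the Myriad source, and the class and function names below are invented for illustration. Under that assumption, a 2 GB slave could never satisfy the request while a 4 GB slave could, which matches the task only launching after the memory increase.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    cpus: float
    mem_mb: float


# Assumed overhead, inferred from the log line in this thread:
# 1.4 - 1 cpu and 2432 - 1024 MB. Not a value read from Myriad itself.
ASSUMED_OVERHEAD = Resources(cpus=0.4, mem_mb=1408.0)


def task_demand(profile, overhead=ASSUMED_OVERHEAD):
    """Total resources requested for a NodeManager task with the given profile."""
    return Resources(profile.cpus + overhead.cpus, profile.mem_mb + overhead.mem_mb)


def offer_is_sufficient(offer, demand):
    """True if the offer covers both the cpu and the memory demand."""
    return offer.cpus >= demand.cpus and offer.mem_mb >= demand.mem_mb


if __name__ == "__main__":
    demand = task_demand(Resources(cpus=1, mem_mb=1024))
    for offer in (Resources(2, 2048), Resources(2, 4096)):
        print(offer, "sufficient" if offer_is_sufficient(offer, demand) else "not sufficient")
```

The offers here treat the whole slave allocation as available; in practice the amount Mesos actually offers may be lower than the configured slave resources.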
