Re: how to change mesos resources

2016-04-08 Thread Vinod Kone
Have you tried the remedy steps included in the error message?



To remedy this do as follows:

Step 1: rm -f /tmp/mesos/meta/slaves/latest

This ensures slave doesn't recover old live executors.

Step 2: Restart the slave.
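
The first remedy step above can be sketched in Python as follows. This is a minimal sketch, assuming the slave's `--work_dir` is `/tmp/mesos` (as in the flags logged in the quoted message below); the helper name `clear_latest_slave_symlink` is hypothetical, and restarting the slave (step 2) is left to whatever service manager you use.

```python
import os


def clear_latest_slave_symlink(work_dir="/tmp/mesos"):
    """Remove <work_dir>/meta/slaves/latest, mirroring `rm -f` (step 1).

    Returns True if something was removed, False if nothing was there.
    """
    latest = os.path.join(work_dir, "meta", "slaves", "latest")
    try:
        os.remove(latest)  # removes the symlink itself, not its target
        return True
    except FileNotFoundError:
        return False  # like `rm -f`, a missing path is not an error
```

Calling `clear_latest_slave_symlink()` with no arguments targets the same path as the `rm -f` command above; pass a different `work_dir` if your slave was started with one.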



On Fri, Apr 8, 2016 at 11:29 AM, Stefano Bianchi 
wrote:

> Hi, I would like to join this mailing list.
> I'm currently doing my Master's thesis on Mesos and Calico.
> I'm working at INFN, the institute of nuclear physics. The goal of the thesis
> is to build a PaaS where Mesos is the scheduler and Calico provides the
> interconnection between multiple datacenters linked to CERN.
>
> I'm using an IaaS based on OpenStack, where I have created 6 virtual
> machines: 3 masters and 3 slaves. Mesos-DNS is running on one slave,
> launched from Marathon.
> Everything works: since I am on another network, I changed the hostnames
> so that they are resolvable by Mesos, and I ran a simple HTTP server from
> Marathon that scales across all my machines.
> So all is fine and working.
>
> The only thing I don't like is that each of the 3 slaves has 1 CPU, 10 GB
> of disk and 2 GB of RAM, but Mesos currently shows only 5 GB of disk and
> 900 MB of RAM for each one.
> Checking the documentation, I found the flag to set the resources.
> I stopped Slave1, for instance, and ran this command:
>
> mesos-slave --master=MASTER_ADDRESS:5050
> --resources='cpu:1;mem:2000;disk:9000'
>
> where I want to set 2000 MB of RAM and 9000 MB of disk.
>  The output is the following:
>
> I0408 15:11:00.915324  7892 main.cpp:215] Build: 2016-03-10 20:32:58 by root
>
> I0408 15:11:00.915436  7892 main.cpp:217] Version: 0.27.2
>
> I0408 15:11:00.915448  7892 main.cpp:220] Git tag: 0.27.2
>
> I0408 15:11:00.915459  7892 main.cpp:224] Git SHA: 
> 3c9ec4a0f34420b7803848af597de00fedefe0e2
>
> I0408 15:11:00.923334  7892 systemd.cpp:236] systemd version `219` detected
>
> I0408 15:11:00.923384  7892 main.cpp:232] Inializing systemd state
>
> I0408 15:11:00.950050  7892 systemd.cpp:324] Started systemd slice 
> `mesos_executors.slice`
>
> I0408 15:11:00.951529  7892 containerizer.cpp:143] Using isolation: 
> posix/cpu,posix/mem,filesystem/posix
>
> I0408 15:11:00.963232  7892 linux_launcher.cpp:101] Using 
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
>
> I0408 15:11:00.965541  7892 main.cpp:320] Starting Mesos slave
>
> I0408 15:11:00.966008  7892 slave.cpp:192] Slave started on 
> 1)@192.168.100.56:5051
>
> I0408 15:11:00.966023  7892 slave.cpp:193] Flags at startup: 
> --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5" 
> --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" 
> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" 
> --cgroups_root="mesos" --container_disk_watch_interval="15secs" 
> --containerizers="mesos" --default_role="*" --disk_watch_interval="1mins" 
> --docker="docker" --docker_auth_server="https://auth.docker.io" 
> --docker_kill_orphans="true" --docker_puller_timeout="60" 
> --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" 
> --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" 
> --docker_store_dir="/tmp/mesos/store/docker" 
> --enforce_container_disk_quota="false" 
> --executor_registration_timeout="1mins" 
> --executor_shutdown_grace_period="5secs" 
> --fetcher_cache_dir="/tmp/mesos/fetch" --fetcher_cache_size="2GB" 
> --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" 
> --hadoop_home="" --help="false" --hostname_lookup="true" 
> --image_provisioner_backend="copy" --initialize_driver_logging="true" 
> --isolation="posix/cpu,posix/mem" --launcher_dir="/usr/libexec/mesos" 
> --logbufsecs="0" --logging_level="INFO" --master="192.168.100.55:5050" 
> --oversubscribed_resources_interval="15secs" --perf_duration="10secs" 
> --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" 
> --quiet="false" --recover="reconnect" --recovery_timeout="15mins" 
> --registration_backoff_factor="1secs" --resources="cpu:1;mem:2000;disk:9000" 
> --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox" 
> --strict="true" --switch_user="true" --systemd_enable_support="true" 
> --systemd_runtime_directory="/run/systemd/system" --version="false" 
> --work_dir="/tmp/mesos"
>
> I0408 15:11:00.967485  7892 slave.cpp:463] Slave resources: cpu(*):1; 
> mem(*):2000; disk(*):9000; cpus(*):1; ports(*):[31000-32000]
>
> I0408 15:11:00.967547  7892 slave.cpp:471] Slave attributes: [  ]
>
> I0408 15:11:00.967560  7892 slave.cpp:476] Slave hostname: 
> slave1.openstacklocal
>
> I0408 15:11:00.971304  7893 state.cpp:58] Recovering state from 
> '/tmp/mesos/meta'
>
> *Failed to perform recovery: Incompatible slave info detected*.
>
> 
>
> Old slave info:
>
> hostname: 

Re: [VOTE] Release Apache Mesos 0.26.1 (rc2)

2016-03-19 Thread Vinod Kone
+1 (binding)

Tested on ASF CI.

On Sun, Mar 13, 2016 at 4:33 PM, Michael Park  wrote:

> +1 (binding)
>
> Internal CI results with the corresponding JIRA tickets for the failed
> tests:
>
> CentOS 6 (non-SSL):
>   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
> (MESOS-3049)
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039)
>   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
> (MESOS-4035)
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215)
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047, MESOS-4053)
>
> CentOS 6 (SSL):
>   - MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward
> (MESOS-3049)
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039)
>   - UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup
> (MESOS-4035)
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215)
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047, MESOS-4053)
>
> CentOS 7 (non-SSL):
>   - LimitedCpuIsolatorTest.ROOT_CGROUPS_Pids_and_Tids
> (MESOS-4677)
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039)
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215)
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047, MESOS-4053)
>
> CentOS 7 (SSL):
>   - FetcherCacheTest.RemoveLRUCacheEntries
> (MESOS-4156)
>   - PerfEventIsolatorTest.ROOT_CGROUPS_Sample
> (MESOS-4039)
>   - CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> (MESOS-3215)
>   - MemoryPressureMesosTest.CGROUPS_ROOT_Statistics
>   - MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery
> (MESOS-4047, MESOS-4053)
>
> Debian 8 (non-SSL): Success!
> Debian 8 (SSL): Failed with MESOS-2017
>
> Ubuntu 12 (non-SSL):
> Ubuntu 12 (SSL):
> Ubuntu 14 (non-SSL):
> Ubuntu 14 (SSL):
>   - UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup
>   - UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup
> (MESOS-4035)
>
> Ubuntu 15 (non-SSL): Success!
> Ubuntu 15 (SSL): Success!
>
> On 13 March 2016 at 18:43, Michael Park  wrote:
>
> > While the vote for this release was open until Fri Mar 11 23:59:59 EST
> > 2016, I'm going to give it another 3 days since there have not been any
> > -1 votes.
> >
> > The vote is extended until Wed Mar 16 23:59:59 EST 2016.
> >
> > On 10 March 2016 at 12:40, Michael Park  wrote:
> >
> >> Thanks Greg!
> >>
> >> On 10 March 2016 at 12:32, Greg Mann  wrote:
> >>
> >>> +1 (non-binding)
> >>>
> >>> Ran `sudo make check` on CentOS 7, using gcc with libevent and SSL
> >>> enabled. All tests pass.
> >>>
> >>> I was also able to successfully test a simple upgrade scenario from
> >>> 0.25.1-rc2 to 0.26.1-rc2 using the script found here:
> >>> https://reviews.apache.org/r/44229/
> >>>
> >>> Cheers,
> >>> Greg
> >>>
> >>>
> >>> On Tue, Mar 8, 2016 at 7:48 PM, Michael Park  wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> Please vote on releasing the following candidate as Apache Mesos 0.26.1.
> >>>>
> >>>> 0.26.1 includes the following:
> >>>>
> >>>> The only diff with RC1 is the following: Fix CGROUPS_ROOT_* tests on
> >>>> systemd platforms.
> >>>> <https://github.com/apache/mesos/commit/a896cda4aa8bb9c9bbfba20dda4b68df8dbdf569>
> >>>> This patch is necessary in order to make the `systemd` integration work
> >>>> correctly.
> >>>> It was part of 

Re: How to make full version available in /version endpoint

2016-03-23 Thread Vinod Kone
Not currently, no. What's your use case?

On Wed, Mar 23, 2016 at 3:50 PM, Zhitao Li  wrote:

> Hi,
>
> Has anyone brought up the possibility of making the full version
> (i.e. 0.28.0-2.0.16.debian81a) show up in the /version endpoint?
>
> For example, when we are using the Mesosphere community package, we want the
> '0.27.1-2.0.226.debian81' string to show up, but we only get the following
> right now:
>
> {
>   "build_date": "2016-02-23 00:39:17",
>   "build_time": 1456187957,
>   "build_user": "root",
>   "git_sha": "864fe8eabd4a83b78ce9140c501908ee3cb90beb",
>   "git_tag": "0.27.1",
>   "version": "0.27.1"
> }
>
> Is there an environment variable or something which we could tweak at
> build/package time to get it? Thanks!
>
> --
> Cheers,
>
> Zhitao Li
>
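
For reference, here is a minimal Python sketch of reading the fields that the /version endpoint returns today, using the response body quoted above as sample data. The names `SAMPLE_VERSION_BODY` and `summarize_version` are hypothetical; the sketch only parses the JSON shown in the message and makes no request to a real master. Note that the package revision asked about (e.g. `0.27.1-2.0.226.debian81`) is not among these fields.

```python
import json

# Sample body copied from the quoted message above.
SAMPLE_VERSION_BODY = """
{
  "build_date": "2016-02-23 00:39:17",
  "build_time": 1456187957,
  "build_user": "root",
  "git_sha": "864fe8eabd4a83b78ce9140c501908ee3cb90beb",
  "git_tag": "0.27.1",
  "version": "0.27.1"
}
"""


def summarize_version(body):
    """Return 'version (git short-sha)' from a /version response body."""
    info = json.loads(body)
    return "{} (git {})".format(info["version"], info["git_sha"][:7])
```

With the sample body, `summarize_version(SAMPLE_VERSION_BODY)` yields `0.27.1 (git 864fe8e)`, which is as specific as the endpoint gets without the packaging suffix.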


[RFC] Mesos Releases and Support

2016-03-25 Thread Vinod Kone
Hi folks,

There has been some interest recently in Mesos releases and support
policy. As promised, I spent some time thinking about this and have written
my thoughts down in a doc.

https://docs.google.com/document/d/1A8MglUWST6pWan3cVw98v8uxTPew8RMKxxrRqiSENM0/edit?usp=sharing

Please take a look and provide feedback. I'm especially interested in your
opinion on the proposals.

Thanks,
Vinod


Re: HTTP API

2016-03-19 Thread Vinod Kone
Thanks for the interest!

We are actively working to make the Framework v1 API stable. We've made
quite a few improvements/fixes to the Scheduler v1 API since 0.24.0. We've
also introduced Executor v1 API in 0.28.0. Both are in *experimental* state.

There are still things left to do to make the Framework v1 API production
ready. Please refer to MESOS-3302 and MESOS-4855 for specifics. Can you
help by contributing to any of these?

Other than the issues listed above, we would like frameworks to start testing
this API in their staging/testing clusters. This would give us the most
confidence to call it production ready. Can you help?

I'm very optimistically hoping to get this ready by MesosCon Denver, but we
need more help for it to be a realistic deadline. If anyone is willing to
help, please reach out to me. I promise to give you my time and shepherd
your contributions.

Thanks,

On Wed, Mar 16, 2016 at 1:38 PM, Zameer Manji  wrote:

> +1
>
> I am also interested in knowing the state of the HTTP API. I have heard
> that stabilizing the API might be tied to Mesos 1.0, but I don't have a
> source for that. Can a PMC member comment on what the plan is?
>
> On Mon, Mar 14, 2016 at 2:30 PM, Dario Rexin  wrote:
>
>> Hi all,
>>
>> since the introduction of the HTTP API in 0.24 around 7.5 months have
>> passed. What are the plans to make this API stable? There are already
>> features (inverse offers) that are exclusively available through this API,
>> so it would be great to have a timeline, as I think for most people it’s
>> impossible to use experimental features in production.
>>
>> Thanks,
>> Dario
>>
>> --
>> Zameer Manji
>>
>>


Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-08 Thread Vinod Kone
+kevin klues

OK. I'm cancelling this vote since there are some show stopper issues that
we need to cherry-pick. I'll cut another RC on Thursday.

@shepherds: can you please make sure the blocker tickets are marked with
fix version and that they land today or tomorrow?

@kevin: since you have volunteered to help with the release, can you make
sure we have a list of commits to cherry pick for rc2?

Thanks,


On Tue, Mar 8, 2016 at 12:05 AM, Shuai Lin <linshuai2...@gmail.com> wrote:

> Maybe also https://issues.apache.org/jira/browse/MESOS-4877 and
> https://issues.apache.org/jira/browse/MESOS-4878 ?
>
>
> On Tue, Mar 8, 2016 at 9:13 AM, Jie Yu <yujie@gmail.com> wrote:
>
>> I'd like to fix https://issues.apache.org/jira/browse/MESOS-4888 as well
>> if you guys plan to cut another RC
>>
>> On Mon, Mar 7, 2016 at 10:16 AM, Daniel Osborne <
>> daniel.osbo...@metaswitch.com> wrote:
>>
>>> -1
>>>
>>> If it doesn’t cause too much pain, I'm hoping we can squeeze in a
>>> relatively small patch which restores Mesos' ability to extract Docker
>>> assigned IPs. This has been broken since Docker 1.10's release over a month
>>> ago, and prevents service discovery and DNS from working.
>>>
>>> Mesos-4370: https://issues.apache.org/jira/browse/MESOS-4370
>>> RB# 43093: https://reviews.apache.org/r/43093/
>>>
>>> I've built 0.28.0-rc1 with this patch and can confirm that it fixes it
>>> as expected.
>>>
>>> Apologies for not bringing this to attention earlier.
>>>
>>> Thanks all,
>>> Dan
>>>
>>> -Original Message-
>>> From: Vinod Kone [mailto:vinodk...@apache.org]
>>> Sent: Thursday, March 3, 2016 5:44 PM
>>> To: dev <d...@mesos.apache.org>; user <user@mesos.apache.org>
>>> Subject: [VOTE] Release Apache Mesos 0.28.0 (rc1)
>>>
>>> Hi all,
>>>
>>>
>>> Please vote on releasing the following candidate as Apache Mesos 0.28.0.
>>>
>>>
>>> 0.28.0 includes the following:
>>>
>>>
>>> 
>>>
>>>   * [MESOS-4343] - A new cgroups isolator for enabling the net_cls
>>>     subsystem in Linux. The cgroups/net_cls isolator allows operators to
>>>     provide network performance isolation and network segmentation for
>>>     containers within a Mesos cluster. To enable the cgroups/net_cls
>>>     isolator, append `cgroups/net_cls` to the `--isolation` flag when
>>>     starting the slave. Please refer to docs/mesos-containerizer.md for
>>>     more details.
>>>
>>>   * [MESOS-4687] - The implementation of scalar resource values (e.g.,
>>>     "2.5 CPUs") has changed. Mesos now reliably supports resources with up
>>>     to three decimal digits of precision (e.g., "2.501 CPUs"); resources
>>>     with more than three decimal digits of precision will be rounded.
>>>     Internally, resource math is now done using a fixed-point format that
>>>     supports three decimal digits of precision, and then converted to/from
>>>     floating point for input and output, respectively. Frameworks that do
>>>     their own resource math and manipulate fractional resources may observe
>>>     differences in roundoff error and numerical precision.
>>>
>>>   * [MESOS-4479] - Reserved resources can now optionally include "labels".
>>>     Labels are a set of key-value pairs that can be used to associate
>>>     metadata with a reserved resource. For example, frameworks can use this
>>>     feature to distinguish between two reservations for the same role at
>>>     the same agent that are intended for different purposes.
>>>
>>>   * [MESOS-2840] - **Experimental** support for container images in Mesos
>>>     containerizer (a.k.a. Unified Containerizer). This allows framewo

Re: 0.28.0 release

2016-03-03 Thread Vinod Kone
Release vote sent. The soft lock is released as well. Commit away!

On Thu, Mar 3, 2016 at 4:58 PM, Timothy Chen <tnac...@gmail.com> wrote:

> Sorry I pushed a quick typo fix before seeing this email.
>
> Tim
>
> On Thu, Mar 3, 2016 at 4:15 PM, Vinod Kone <vinodk...@apache.org> wrote:
> > Alright, all the blockers are resolved. I'll be cutting the RC shortly.
> >
> > I'm also taking a soft lock on the 'master' branch. *Committers:* *Please
> > do not push any commits upstream until I release the lock.*
> >
> > Thanks,
> >
> > On Mon, Feb 29, 2016 at 1:36 PM, Vinod Kone <vinodk...@apache.org> wrote:
> >
> >> Hi folks,
> >>
> >> I'm volunteering to be the Release Manager for 0.28.0. Joris and Kevin
> >> Klues have kindly agreed to help me out. The plan is to cut an RC tomorrow
> >> 03/01.
> >>
> >> The dashboard for the release is here:
> >> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12327751
> >>
> >> *If you have a ticket marked with "Fix Version 0.28.0" that is not in
> >> "Resolved" state, verify if it's a blocker for 0.28.0. If not, please unset
> >> the Fix Version.*
> >>
> >>
> >> Thanks,
> >> Vinod
> >>
> >>
>


Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-03 Thread Vinod Kone
On Thu, Mar 3, 2016 at 5:43 PM, Vinod Kone <vinodk...@apache.org> wrote:

> Tue Mar  10 17:00:00 PST 2016


Sorry. This should be Mar 8th not 10th.


Re: [VOTE] Release Apache Mesos 0.26.1 (rc1)

2016-03-04 Thread Vinod Kone
+1 (binding)

On Tue, Mar 1, 2016 at 5:03 PM, Kevin Klues <klue...@gmail.com> wrote:

> I committed a fix for this in:
>
> https://github.com/apache/mesos/commit/42f746937233349660c687ea7a66cc0a78871663
>
> Looks like that's post 0.26 though, so maybe it should be included in the
> .1 rc
>
> On Mon, Feb 29, 2016 at 2:27 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> Looks like the ASF CI builds for CentOS7 are failing because they are
>> unable to find JAVA_HOME. Couldn't tell if it's an issue with the docker
>> build script or something in the configure script.
>>
>>
>> checking for svn_txdelta in -lsvn_delta-1... yes
>> checking for sasl_done in -lsasl2... yes
>> checking SASL CRAM-MD5 support... yes
>> checking for javac... /usr/bin/javac
>> checking for java... /usr/bin/java
>> checking value of Java system property 'java.home'... 
>> /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.71-2.b15.el7_2.x86_64/jre
>> configure: error: could not guess JAVA_HOME
>>
>>
>>
>> *Revision*: a05261dbed1c2577676b11235380de95d586aeeb
>>
>>- refs/tags/0.26.1-rc1
>>
>> Configuration Matrix                                    gcc      clang
>> centos:7      --verbose --enable-libevent --enable-ssl  Failed   Not run
>> centos:7      --verbose                                 Failed   Not run
>> ubuntu:14.04  --verbose --enable-libevent --enable-ssl  Success  Success
>> ubuntu:14.04  --verbose                                 Success  Success
>>
>> On Mon, Feb 29, 2016 at 11:21 AM, Kapil Arya <ka...@mesosphere.io> wrote:
>>
>>> +1 (binding)
>>>
>>> Successful CI builds for the following distros:
>>>
>>> amd64/centos/6
>>> amd64/centos/7
>>> amd64/debian/jessie
>>> amd64/ubuntu/precise
>>> amd64/ubuntu/trusty
>>> amd64/ubuntu/vivid
>>>
>>> Kapil
>>>
>>> On Sat, Feb 27, 2016 at 12:26 AM, Michael Park <mp...@apache.org> wrote:
>>>
>>> > Hi all,
>>> >
>>> > Please vote on releasing the following candidate as Apache Mesos
>>> 0.26.1.
>>> >
>>> >
>>> > 0.26.1 includes the following:
>>> >
>>> >
>>> 
>>> >
>>> >- Improvements
>>> >   - `/state` endpoint performance
>>> >   - systemd integration
>>> >   - GLOG performance
>>> >   - Configurable task/framework history
>>> >   - Offer filter timeout fix for backlogged allocator
>>> >
>>> >
>>> >- Bugs
>>> >   - SSL
>>> >   - Libevent
>>> >   - Fixed point resources math
>>> >   - HDFS
>>> >   - Agent upgrade compatibility
>>> >
>>> > The CHANGELOG for the release is available at:
>>> >
>>> >
>>> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.26.1-rc1
>>> >
>>> >
>>> 
>>> >
>>> > The candidate for Mesos 0.26.1 release is available at:
>>> >
>>> https://dist.apache.org/repos/dist/dev/mesos/0.26.1-rc1/mesos-0.26.1.tar.gz
>>> >
>>> > Th

Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-04 Thread Vinod Kone
Bad copy paste into the email, sorry. It looks fine in the CHANGELOG.

On Fri, Mar 4, 2016 at 12:52 AM, Shuai Lin <linshuai2...@gmail.com> wrote:

>   * [MESOS-4712] - Remove 'force' field from the Subscribe Call in v1
>> Scheduler API.
>>   * [MESOS-4591] - Change the object of ReserveResources and CreateVolume
>> ACLs to `roles`.
>>   * [MESOS-4712] - Remove 'force' field from the Subscribe Call in v1
>> Scheduler API.
>
>
> MESOS-4712 is included twice.
>
> On Fri, Mar 4, 2016 at 1:25 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> On Thu, Mar 3, 2016 at 5:43 PM, Vinod Kone <vinodk...@apache.org> wrote:
>>
>> > Tue Mar  10 17:00:00 PST 2016
>>
>>
>> Sorry. This should be Mar 8th not 10th.
>>
>
>


Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-04 Thread Vinod Kone
I think this was supposed to refer to
https://github.com/apache/mesos/blob/master/docs/mesos-containerizer.md

@Jie ^^ ?

On Fri, Mar 4, 2016 at 10:29 AM, Steven Schlansker <
sschlans...@opentable.com> wrote:

>
> > On Mar 3, 2016, at 5:43 PM, Vinod Kone <vinodk...@apache.org> wrote:
> >
> > Hi all,
> > Please vote on releasing the following candidate as Apache Mesos 0.28.0.
> > 0.28.0 includes the following:
> > ...
> >   * [MESOS-2840] - **Experimental** support for container images in Mesos
> >     containerizer (a.k.a. Unified Containerizer). This allows frameworks to
> >     launch Docker/Appc containers using Mesos containerizer without relying
> >     on docker daemon (engine) or rkt. The isolation of the containers is
> >     done using isolators. Please refer to docs/container-image.md for
> >     currently supported features and limitations.
>
> As of commit 46dcae5, there doesn't seem to be any such documentation?
> I'm excited to try this feature out :)
>
> https://github.com/apache/mesos/blob/master/docs/container-image.md (404)
>
>


Need CHANGELOG updates

2016-03-03 Thread Vinod Kone
Hi guys,

The 0.28.0 release is currently blocked on the updates to CHANGELOG.
Basically I'm looking for shepherds/owners of feature tickets to add a
blurb in the CHANGELOG for their tickets.

The big ticket items that went into 0.28.0 that I know of:

--> net_cls_isolator
--> fixed-point math for resources
--> unified containerizer
--> executor api v1

If there are other things that need to be called out specifically in the
CHANGELOG, please reach out to me.

Thanks,


Re: Making 'curl' a prerequisite for installing Mesos

2016-03-03 Thread Vinod Kone
sgtm

On Thu, Mar 3, 2016 at 10:01 AM, Jie Yu  wrote:

> Neil, thanks for the comments and the pointer!
>
> Just looked at the curl_multi_xxx() API. Yeah, I think we should be able
> to use that API in our async environment.  But we need to hook this with
> our underlying libev/libevent based runtime, which might take a while to
> finish. I'll create a ticket to track.
>
> In the meantime, I want to unblock people from using some of the new
> features built on top of the 'curl' based fetcher. Since this is a pretty
> simple dependency to add, I would suggest that we still proceed adding this
> dependency.
>
> - Jie
>
> On Thu, Mar 3, 2016 at 9:37 AM, Neil Conway  wrote:
>
>> No objection to the additional dependency, but using 'curl'
>> instead of 'libcurl' seems unfortunate. Can you share some more
>> detailed information about the problems that have been encountered
>> using libcurl? e.g., was using the curl_multi_xxx() APIs explored?
>>
>> Neil
>>
>> On Thu, Mar 3, 2016 at 9:10 AM, Jie Yu  wrote:
>> > Hi,
>> >
>> > I am proposing making 'curl' a prerequisite when installing Mesos.
>> > Currently, we require 'libcurl' to be present when installing Mesos
>> > (http://mesos.apache.org/gettingstarted/). However, we found that it does
>> > not compose well with our asynchronous runtime environment (i.e., it'll
>> > block the current worker thread).
>> >
>> > Recent work on the URI fetcher uses 'curl' directly, instead of using
>> > 'libcurl', to fetch artifacts, because it composes well with our async
>> > runtime env. 'curl' is installed by default on most systems (e.g., OSX,
>> > CentOS, RHEL).
>> >
>> > So I am proposing adding 'curl' to our prerequisite list. Let me know if
>> > you have any concerns about this. I'll update the Getting Started doc if
>> > you are OK with this change.
>> >
>> > Thanks,
>> > - Jie
>> >
>>
>
>
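
The point made in this thread, that launching an external command composes with an asynchronous runtime where a blocking library call does not, can be illustrated with a small Python sketch. This is only an analogy under stated assumptions: Mesos is written in C++ on top of its own async runtime, and nothing below is Mesos code. The helper names `run_command` and `main` are hypothetical, and the Python interpreter stands in for `curl` so the sketch runs without network access.

```python
import asyncio
import sys


async def run_command(argv):
    """Run an external command without blocking the event loop.

    Returns (exit_code, stdout_bytes).
    """
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL,
    )
    stdout, _ = await proc.communicate()
    return proc.returncode, stdout


async def main():
    # Stand-in for a fetch command such as ["curl", "-sSL", url]; the
    # Python interpreter is used so the sketch needs no network access.
    argv = [sys.executable, "-c", "print('fetched')"]
    # Other coroutines keep running while the child process does its work.
    (code, out), _ = await asyncio.gather(run_command(argv), asyncio.sleep(0))
    return code, out.decode().strip()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The event loop only waits on the child's pipe, so a slow download in the child process does not stall other work; a synchronous in-process transfer would hold the calling thread for its whole duration.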


Re: [VOTE] Release Apache Mesos 0.28.0 (rc1)

2016-03-07 Thread Vinod Kone
Looks like there is a bunch of discussion around MESOS-4370 and RB# 43093.
Kapil, as the shepherd, what is your take on the priority and readiness of
this?

On Mon, Mar 7, 2016 at 1:16 PM, Daniel Osborne <
daniel.osbo...@metaswitch.com> wrote:

> -1
>
> If it doesn’t cause too much pain, I'm hoping we can squeeze in a relatively
> small patch which restores Mesos' ability to extract Docker assigned IPs.
> This has been broken since Docker 1.10's release over a month ago, and
> prevents service discovery and DNS from working.
>
> Mesos-4370: https://issues.apache.org/jira/browse/MESOS-4370
> RB# 43093: https://reviews.apache.org/r/43093/
>
> I've built 0.28.0-rc1 with this patch and can confirm that it fixes it as
> expected.
>
> Apologies for not bringing this to attention earlier.
>
> Thanks all,
> Dan
>
> -Original Message-
> From: Vinod Kone [mailto:vinodk...@apache.org]
> Sent: Thursday, March 3, 2016 5:44 PM
> To: dev <d...@mesos.apache.org>; user <user@mesos.apache.org>
> Subject: [VOTE] Release Apache Mesos 0.28.0 (rc1)
>
> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 0.28.0.
>
>
> 0.28.0 includes the following:
>
>
> 
>
>   * [MESOS-4343] - A new cgroups isolator for enabling the net_cls subsystem
>     in Linux. The cgroups/net_cls isolator allows operators to provide
>     network performance isolation and network segmentation for containers
>     within a Mesos cluster. To enable the cgroups/net_cls isolator, append
>     `cgroups/net_cls` to the `--isolation` flag when starting the slave.
>     Please refer to docs/mesos-containerizer.md for more details.
>
>   * [MESOS-4687] - The implementation of scalar resource values (e.g., "2.5
>     CPUs") has changed. Mesos now reliably supports resources with up to
>     three decimal digits of precision (e.g., "2.501 CPUs"); resources with
>     more than three decimal digits of precision will be rounded. Internally,
>     resource math is now done using a fixed-point format that supports three
>     decimal digits of precision, and then converted to/from floating point
>     for input and output, respectively. Frameworks that do their own
>     resource math and manipulate fractional resources may observe
>     differences in roundoff error and numerical precision.
>
>   * [MESOS-4479] - Reserved resources can now optionally include "labels".
>     Labels are a set of key-value pairs that can be used to associate
>     metadata with a reserved resource. For example, frameworks can use this
>     feature to distinguish between two reservations for the same role at the
>     same agent that are intended for different purposes.
>
>   * [MESOS-2840] - **Experimental** support for container images in Mesos
>     containerizer (a.k.a. Unified Containerizer). This allows frameworks to
>     launch Docker/Appc containers using Mesos containerizer without relying
>     on docker daemon (engine) or rkt. The isolation of the containers is
>     done using isolators. Please refer to docs/container-image.md for
>     currently supported features and limitations.
>
>   * [MESOS-4793] - **Experimental** support for v1 Executor HTTP API. This
>     allows executors to send HTTP requests to the /api/v1/executor agent
>     endpoint without the need for an executor driver. Please refer to
>     docs/executor-http-api.md for more details.
>
> Additional API Changes:
>
>   * [MESOS-4066] - Agent should not return partial state when a request is
>     made to /state endpoint during recovery.
>   * [MESOS-4547] - Introduce TASK_KILLING state.
>   * [MESOS-4712] - Remove 'force' field from the Subscribe Call in v1
>     Scheduler API.
>   * [MESOS-4591] - Change the object of ReserveResources and CreateVolume
>     ACLs to `roles`.
>   * [MESOS-4712] - Remove 'force' field from the Subscribe Call in v1
>     Scheduler API.
>   * [MESOS-4591] - Change the object of ReserveResources and CreateVolume
>     ACLs to `roles`.
>   * [MESOS-3583] - Add stream IDs for HTTP schedulers.
>
>
> The CHANGELOG for the release is available at:
>
>
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=0.28.0-rc1
>
>
> --
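
The MESOS-4687 note in the quoted announcement describes doing resource math in a fixed-point format with three decimal digits of precision. The idea can be sketched in Python as below. This is a minimal illustration of the technique, not the Mesos implementation (which is C++); the names `SCALE`, `to_fixed`, `from_fixed` and `add_resources` are hypothetical.

```python
SCALE = 1000  # three decimal digits of precision


def to_fixed(value):
    """Convert a float to an integer count of thousandths (rounded)."""
    return int(round(value * SCALE))


def from_fixed(units):
    """Convert an integer count of thousandths back to a float."""
    return units / SCALE


def add_resources(a, b):
    """Add two scalar resource values using integer arithmetic."""
    return from_fixed(to_fixed(a) + to_fixed(b))
```

Plain floating-point addition gives `0.1 + 0.2 != 0.3`, whereas `add_resources(0.1, 0.2) == 0.3` holds because the sum is computed on the integers 100 and 200. Values with more than three decimal digits are rounded on the way in, which is the roundoff difference the note warns frameworks about.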

NYC Meetup on Mar 9th

2016-03-04 Thread Vinod Kone
Hi,

If you are in the NYC area and are a Mesos enthusiast, you might be
interested in the upcoming meetup in NYC.

Please visit http://www.meetup.com/Apache-Mesos-NYC-Meetup/events/229086077/
for more details.

Looking forward to the meetup,
Vinod

P.S: @MPark, can you please add this to our events calendar?


Re: [RFC] Mesos Releases and Support

2016-03-31 Thread Vinod Kone
Thanks to all those who have commented on the doc so far. The feedback was
great.

I'm planning to finalize the doc by end of this week, so please provide
feedback if you haven't and wanted to.

Regarding the proposals themselves, looks like most people are in favor of
proposal 1. We can probably punt on LTS until we get some experience with
the new release and patch policies.

On Fri, Mar 25, 2016 at 12:21 PM, Vinod Kone <vi...@mesosphere.io> wrote:

> Hi folks,
>
> There has been some interest recently in Mesos releases and support
> policy. As promised, I spent some time thinking about this and have written
> my thoughts down in a doc.
>
>
> https://docs.google.com/document/d/1A8MglUWST6pWan3cVw98v8uxTPew8RMKxxrRqiSENM0/edit?usp=sharing
>
> Please take a look and provide feedback. I'm especially interested in your
> opinion on the proposals.
>
> Thanks,
> Vinod
>
>
>


Re: Unstability on Mesos 0.27

2016-03-19 Thread Vinod Kone
Hey Gabriel,

Could you share more details on what the crashes are and what your setup is
(docker containerizer?). Any logs (master, agent, application) that can
shed light would be useful to diagnose.

On Wed, Mar 16, 2016 at 5:12 PM, Alfredo Carneiro <
alfr...@simbioseventures.com> wrote:

> Hello guys,
>
> I am using Mesos 0.27 with different kinds of applications, such as
> crawlers, databases and websites. However, I have faced many crashes and I
> couldn't find out what the matter is.
>
> We have 14 machines with 8 GB of RAM and 4 CPUs each. Usually, we run about
> 40 instances of our crawler, which start stopping out of nowhere (but the
> containers keep running). The day before yesterday we decided to test
> our entire infrastructure and we scaled our crawler up to 110 instances.
> Unfortunately, today we faced a big crash that mainly affected our
> crawler and our databases.
>
> So, I am wondering if anyone else has the same problem, such as apps
> that crash out of nowhere, or something else that could be related to
> instability in Mesos.
>
> --
> Alfredo Miranda
>


Re: Mesos agents across a WAN?

2016-03-31 Thread Vinod Kone
This is great info, Evan, especially coming from production experience.
Thanks for sharing it!

On Thu, Mar 31, 2016 at 1:49 PM, Evan Krall  wrote:

> On Wed, Mar 30, 2016 at 6:56 PM, Jeff Schroeder <
> jeffschroe...@computer.org> wrote:
>
>> Given regional bare metal Mesos clusters on multiple continents, are
>> there any known issues running some of the agents over the WAN? Is anyone
>> else doing it, or is this a terrible idea that I should tell management no
>> on?
>>
>> A few specifics:
>>
>> 1. Are there any known limitations or configuration gotchas I might
>> encounter?
>>
>
> One thing to keep in mind is that the masters maintain a distributed log
> through a consensus protocol, so there needs to be a quorum of masters that
> can talk to each other in order to operate. Consensus protocols tend to be
> very latency-sensitive, so you probably want to keep masters near each
> other.
>
> Some of our clusters span semi-wide geographical regions (in production,
> up to about 5 milliseconds RTT between master and some slaves). So far, we
> haven't seen any issues caused by that amount of latency, and I believe we
> have clusters in non-production environments which have even higher round
> trip between slaves and masters, and work fine. I haven't benchmarked task
> launch time or anything like that, so I can't say how much it affects the
> speed of operations.
>
> Mesos generally does the right thing around network partitions (changes
> won't propagate, but it won't kill your tasks), but if you're running
> things in Marathon and using TCP or HTTP healthchecks, be aware that
> Marathon does not rate limit itself on issuing task kills for healthcheck
> failures. This means during a network partition, your applications will be
> fine, but once the network partition heals (or if you're experiencing
> packet loss but not total failure), Marathon will suddenly kill all of the
> tasks on the far side of the partition. A workaround for that is to use
> command health checks, which are run by the mesos slave.
>
>
>> 2. Does setting up ZK observers in each non-primary dc and pointing the
>> agents at them exclusively make sense?
>>
>
> My understanding of ZK observers is that they proxy writes to the actual
> ZK quorum members, so this would probably be fine. mesos-slave uses ZK to
> discover masters, and mesos-master uses ZK to do leader election; only
> mesos-master is doing any writes to ZK.
>
> I'm not sure how often mesos-slave reads from ZK to get the list of
> masters; I assume it doesn't bother if it has a live connection to a master.
>
>
>> 4. Any suggestions on how best to do agent attributes / constraints for
>> something like this? I was planning on having the config management add a
>> "data_center" agent attribute to match on.
>>
>
> If you're running services on Marathon or similar, I'd definitely
> recommend exposing the location of the slaves as an attribute, and having
> constraints to keep different instances of your application spread across
> the different locations. The "correct" constraints to apply depends on your
> application and latency / failure sensitivity.
>
> Evan
>
>
>> Thanks!
>>
>> [1]
>> https://github.com/kubernetes/kubernetes/blob/8813c955182e3c9daae68a8257365e02cd871c65/release-0.19.0/docs/proposals/federation.md#kubernetes-cluster-federation
>>
>> --
>> Jeff Schroeder
>>
>> Don't drink and derive, alcohol and analysis don't mix.
>> http://www.digitalprognosis.com
>>
>
>
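
A rough illustration of the workaround Evan describes (command health checks instead of TCP/HTTP ones), written as a small Python script that only builds and prints a Marathon app definition. The app id, commands and timing values are made-up placeholders; the `healthChecks` field names follow Marathon's app definition format as I understand it, so please check them against the docs for your Marathon version.

```python
import json


def command_health_check(command, grace_period_s=300, interval_s=60,
                         timeout_s=20, max_failures=3):
    """Build a health check that runs `command` next to the task,
    rather than having Marathon probe the task over the network."""
    return {
        "protocol": "COMMAND",
        "command": {"value": command},
        "gracePeriodSeconds": grace_period_s,
        "intervalSeconds": interval_s,
        "timeoutSeconds": timeout_s,
        "maxConsecutiveFailures": max_failures,
    }


# Hypothetical app: every value here is a placeholder for illustration.
app = {
    "id": "/example-service",
    "cmd": "python3 -m http.server $PORT0",
    "cpus": 0.1,
    "mem": 64,
    "instances": 2,
    "healthChecks": [command_health_check("curl -f http://localhost:$PORT0/")],
}

print(json.dumps(app, indent=2))
```

The printed JSON is what one would submit to Marathon. Since the check runs on the agent side, a partition between Marathon and the agent should not by itself turn into a failed check.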


Re: Run Mesos without master being able to open connections to slaves

2016-04-25 Thread Vinod Kone
On Mon, Apr 25, 2016 at 8:35 AM, Elouan Keryell-Even <
elouan.kery...@gmail.com> wrote:

> So I'd be glad to have some insight from you guys about if it is possible,
> in one way or another, to make Mesos work without the Master being able to
> initiate connections to slaves. I just need to be 100% sure there isn't any
> workaround before going back to my boss :)
>
>
Master and agent/slave are still required to be able to open connections to
each other. There is no workaround that I'm aware of.

We had a similar restriction with scheduler (driver based) to master
communication. The new scheduler HTTP API no longer has this restriction
for master to scheduler communication.

For master to agent communication, the plan is to come up with a new HTTP
API similar to the scheduler HTTP API. Neither the design nor the
implementation has started yet.


Re: Reconnected slaves not sending resource offers?

2016-04-25 Thread Vinod Kone
On Mon, Apr 25, 2016 at 8:40 AM, Thomas Petr  wrote:

> The only thing that ended up fixing the situation was bouncing our
> scheduler (~10 minutes after the restarted slaves joined the cluster) --
> the act of failing over the framework appeared to "recover" the missing
> resources:
>

What do the master logs say when the slave is registered with a new id?


Re: Mesos Web UI

2016-04-29 Thread Vinod Kone
Adam, since you committed this, feel free to backport it to the relevant
stable branches (0.26.x, 0.27.x, 0.28.x). It will be included in the next
patch releases.

On Fri, Apr 29, 2016 at 9:31 AM, Gilbert Song  wrote:

> Julian, since the fix was after 0.28, could you try it again by applying
> the patch @haosdent provided, or try with mesos master head?
>
> On Fri, Apr 29, 2016 at 8:52 AM, Julian Gonzalez Llorente <
> jgonza...@medallia.com> wrote:
>
>> Hello list,
>>
>> I found one small error in the Mesos web UI.
>> In the "Slaves" tab there is a table with the columns: ID, Host, CPUs,
>> Mem, Disk, Registered and Re-Registered.
>> But in the "Offers" tab, Mem appears twice: ID, Framework, Host,
>> Cpus, Mem, Mem.
>> The second Mem should be Disk instead.
>>
>> I suppose that should be easy to fix.
>>
>> Regards,
>> Julian
>>
>
>


Re: Change the role of a framework

2016-04-28 Thread Vinod Kone
I think what you did seems correct.

On Thu, Apr 28, 2016 at 6:31 PM, Shuai Lin  wrote:

> Hi list,
>
> For some reason I need to change the role of an existing framework
> (marathon) from the default role "*" to a specific role, say "services". I
> couldn't find any existing documentation on this, so here are the steps
> that I took on a staging cluster:
>
> - stop all HA marathon instances, leaving only one running
>
> - set the marathon role (/etc/marathon/conf/mesos_role), and restart
> marathon
>   - at this moment marathon is still using "*" role because master won't
> update the role of a framework when it re-registers
>   - for that to happen we need to do a mesos master fail over
>
> - stop the current active mesos-master, so marathon would use the new role
> after the master failover
>
> - now: marathon is using "services" role, which means it would accept
> resources from both slaves with default '*' role and slaves with "services"
> role
>
> - for each slave:
>   - stop the slave
>   - change the role (/etc/mesos-slave/default_role) to "services"
>   - remove /tmp/mesos/meta/slaves
>   - restart docker (otherwise the old running executors/tasks won't be
> killed)
>   - restart the slave
>
> During the process all running tasks are killed and restarted, but that's
> acceptable to me.
>
> Now all slaves are running with role "services" and marathon is running
> with role "services". So far the cluster seems to be working fine, but I'm
> not sure if the steps I took have any unnoticed impacts, since this is a
> somewhat undocumented procedure.
>
> Any comments?
>
> Regards,
> Shuai
>
>
>
>
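
The per-slave steps above can be written down as a dry-run checklist. This sketch only builds and prints the commands and executes nothing; the `service ...` invocations and service names are assumptions about the host setup, while the two paths are the ones Shuai mentions.

```python
import shlex


def agent_role_change_plan(role, meta_dir="/tmp/mesos/meta/slaves"):
    """Return the ordered shell commands for one agent (nothing is run)."""
    return [
        "service mesos-slave stop",  # assumed service name
        f"echo {shlex.quote(role)} > /etc/mesos-slave/default_role",
        f"rm -rf {shlex.quote(meta_dir)}",  # drop the old agent metadata
        "service docker restart",  # assumed; clears leftover containers
        "service mesos-slave start",  # assumed service name
    ]


plan = agent_role_change_plan("services")
for step, command in enumerate(plan, start=1):
    print(f"{step}. {command}")
```

Printing the plan first and running it by hand on one agent at a time keeps the blast radius small, since every task on that agent is killed along the way.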


Design doc for v1 Operator API

2016-05-20 Thread Vinod Kone
Hi folks,

Here is the design doc for the v1 Operator API. Please take a look and give
us feedback.

https://docs.google.com/document/d/1XfgF4jDXZDVIEWQPx6Y4glgeTTswAAxw6j8dPDAtoeI/edit?pref=2=1#

Thanks,
Vinod


Re: Setting constraints

2016-05-21 Thread Vinod Kone
There is no flexible/dynamic way to ensure that a particular task runs on
every agent in the cluster.

If you are ok with static configuration, you can set aside resources for a
role on every agent by using the resources flag (e.g.,
--resources="cpus(system):2;cpus(*):10" on an agent with 12 cpus). You can
then start marathon with "system" role and ensure that only system tasks
are submitted with that role in their app config.
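
For what it's worth, here is a small sketch of how the static split above could be generated per agent. The helper names are hypothetical and only cpus are handled; the output format mirrors the flag value quoted above.

```python
def split_cpus(total, reserved, role):
    """Split an agent's cpus into a reserved share for `role`
    and the remainder for the default role '*'."""
    if not 0 <= reserved <= total:
        raise ValueError("reserved cpus must be between 0 and total")
    return [("cpus", role, reserved), ("cpus", "*", total - reserved)]


def resources_flag(resources):
    """Render (name, role, amount) triples as a --resources flag."""
    rendered = ";".join(f"{name}({role}):{amount}"
                        for name, role, amount in resources)
    return f"--resources={rendered}"


flag = resources_flag(split_cpus(total=12, reserved=2, role="system"))
print(flag)
```

For the 12-cpu agent in the example this reproduces the 2/10 split; the value would still need shell quoting when passed on a command line because of the parentheses and the `*`.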

On Sat, May 21, 2016 at 6:26 PM, Scott Kinney  wrote:

> Hi Guangya,
>
> That doesn't deploy the apps automatically to every slave. Only one slave
> will get one of each.
>
>
>
> --
> Scott Kinney | DevOps
> stem    |   *m*  510.282.1299
> 100 Rollins Road, Millbrae, California 94030
>
> This e-mail and/or any attachments contain Stem, Inc. confidential and
> proprietary information and material for the sole use of the intended
> recipient(s). Any review, use or distribution that has not been expressly
> authorized by Stem, Inc. is strictly prohibited. If you are not the
> intended recipient, please contact the sender and delete all copies. Thank
> you.
> --
> *From:* Guangya Liu 
> *Sent:* Saturday, May 21, 2016 6:21 PM
> *To:* user@mesos.apache.org
> *Subject:* Re: Setting constraints
>
> Hi Scott,
>
> I think only setting "constraints": [["hostname", "UNIQUE"]] is good
> enough, please refer to
> https://github.com/mesosphere/marathon/blob/master/docs/docs/constraints.md#unique-operator
>
> Thanks,
>
> Guangya
>
> On Sun, May 22, 2016 at 12:59 AM, Scott Kinney 
> wrote:
>
>> I would like to have one instance of consul and registrator running on
>> every slave. I thought that by setting an attribute on every node then
>> setting constraints like..
>>
>>
>> "constraints": [
>> ["exec_env", "GROUP_BY"],
>> ["hostname", "UNIQUE"]
>> ]
>> If every node has the same 'exec_env' then they would all get one
>> instance of this task (I am not setting a value for 'instances', btw).
>>
>>
>> What is a good way to be sure every slave has one of a particular task?
>>
>> Thanks!
>>
>> --
>> Scott Kinney | DevOps
>> stem    |   *m*  510.282.1299
>> 100 Rollins Road, Millbrae, California 94030
>>
>>
>
>
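
To make the behaviour in this thread concrete: UNIQUE on hostname only caps placement at one instance per host, and Marathon still starts only as many instances as the app asks for (one if 'instances' is left out). A sketch of the static variant, assuming you know the agent count and keep 'instances' in sync by hand; the app id and command are placeholders.

```python
import json


def one_per_agent_app(app_id, cmd, agent_count):
    """Build a Marathon app that asks for `agent_count` instances,
    with at most one instance allowed per hostname."""
    return {
        "id": app_id,
        "cmd": cmd,
        "cpus": 0.1,
        "mem": 64,
        "instances": agent_count,
        "constraints": [["hostname", "UNIQUE"]],
    }


# Hypothetical values: a 3-agent cluster and a placeholder start script.
app = one_per_agent_app("/consul", "./run-consul.sh", agent_count=3)
print(json.dumps(app, indent=2))
```

This is not dynamic: when agents are added, 'instances' has to be raised to match, which is the limitation Vinod points out at the top of the thread.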


Re: mesos website workgroup

2016-05-17 Thread Vinod Kone
Great to see a lot of interest!

I've created a work group for "website" here
https://cwiki.apache.org/confluence/display/MESOS/Apache+Mesos+Working+Groups
.

Look out for email threads with [WEBSITE] prefix in the subject going
forward.

On Tue, May 17, 2016 at 6:23 AM, Radoslaw Gruchalski <ra...@gruchalski.com>
wrote:

> I can help as well.
>
> –
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com
> de.linkedin.com/in/radgruchalski
>
>
> *Confidentiality:*This communication is intended for the above-named
> person and may be confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
> On May 17, 2016 at 3:23:13 PM, Timothy Anderegg (
> timothy.ander...@gmail.com) wrote:
>
> I'm happy to help,
>
> Tim
>
> On Tue, May 17, 2016, 6:01 AM Freddy Ayuso-Henson <faay...@gmail.com>
> wrote:
>
>> I’m interested in contributing.
>>
>>
>>
>> From: Vinod Kone <vinodk...@apache.org>
>> Reply: user@mesos.apache.org <user@mesos.apache.org>
>> Date: May 16, 2016 at 10:02:58 PM
>>
>> To: dev <d...@mesos.apache.org>, user <user@mesos.apache.org>
>> Subject:  mesos website workgroup
>>
>> Hi guys,
>>
>> Mesos website needs some love. It hasn't seen major changes for a while
>> now and there is no real maintainer for it.
>>
>> I'm proposing we start a work group for the folks who are interested in
>> contributing to the website. Especially folks who have interest and
>> experience in frontend development or at least have access to folks with
>> that experience (maybe colleagues at your company).
>>
>> Since we are gearing up for a 1.0 release, it would be nice to do a
>> website refresh.
>>
>> Please reply to this email if you are interested.
>>
>> Thanks,
>> Vinod
>>
>>


Re: Error on Teardown attempt: Framework is not connected via HTTP

2016-04-15 Thread Vinod Kone
That's not the endpoint you want (that's for frameworks to use). You want
/teardown endpoint (that's for operators).
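
A minimal sketch of what the operator-side call could look like, built with Python's standard library. It only constructs the request and does not send it; the master address and framework id are placeholders, and the '/master/teardown' path with a 'frameworkId' form field is how I understand the endpoint to be exposed on the master.

```python
from urllib.parse import urlencode
from urllib.request import Request


def teardown_request(master, framework_id):
    """Build (but do not send) a POST to the master's teardown endpoint."""
    body = urlencode({"frameworkId": framework_id}).encode()
    return Request(f"http://{master}/master/teardown", data=body,
                   method="POST")


# Placeholder master address and framework id.
req = teardown_request("master.example.com:5050", "example-framework-id")
print(req.get_method(), req.full_url, req.data.decode())
```

Passing `req` to `urllib.request.urlopen` would actually issue the call, so double-check the framework id first; teardown shuts the framework and its tasks down.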


Re: kafka-mesos still refusing to launch brokers on one cluster

2016-04-18 Thread Vinod Kone
Is there nothing in the mesos master logs about this scheduler trying to
connect/register? If not, the registration message from the scheduler is
not making it to the master. Are there any firewall rules between the
scheduler host and master host?


Re: kafka-mesos still refusing to launch brokers on one cluster

2016-04-19 Thread Vinod Kone
On Tue, Apr 19, 2016 at 11:24 AM, Justin Ryan  wrote:

> Marathon has no trouble registering a framework and launching jobs on this
> cluster, only kafka-mesos. :/
>
>
Are both these frameworks binding to the same IP/interface? Does either of
them use the LIBPROCESS_IP or LIBPROCESS_PORT env variables?
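
One quick way to answer that on each scheduler host is to dump the two variables from the environment the scheduler is started with. A tiny sketch (the helper name is made up):

```python
import os


def libprocess_binding(env=None):
    """Return the LIBPROCESS_* binding settings found in `env`
    (defaults to this process's environment); None means unset."""
    env = os.environ if env is None else env
    return {name: env.get(name)
            for name in ("LIBPROCESS_IP", "LIBPROCESS_PORT")}


print(libprocess_binding())
```

Run it from the same wrapper script or init environment that launches each framework; comparing the output for Marathon and kafka-mesos shows whether they are set differently.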


Re: kafka-mesos still refusing to launch brokers on one cluster

2016-04-18 Thread Vinod Kone
On Mon, Apr 18, 2016 at 12:32 PM, Justin Ryan  wrote:

> So, test is working again, but prod is still and has consistently been dead


What do the master/agent/scheduler logs say regarding kafka tasks?


Re: 0.28.2

2016-05-25 Thread Vinod Kone
Patch releases are for bug fixes only. So you have to wait for the next
stable release.

@vinodkone

> On May 25, 2016, at 5:52 AM, June Taylor  wrote:
> 
> We recently upgraded to 0.28.1 and were looking for the feature where a role 
> can be specified to mesos-execute. This didn't seem to make it in. Can you 
> get that into 0.28.2?
> 
> 
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
> 
>> On Tue, May 24, 2016 at 11:04 AM, Jie Yu  wrote:
>> Hi folks,
>> 
>> According to our release schedule, we should be cutting a point release for
>> 0.28.x. I volunteer to be the release manager. Vinod and Ben already started 
>> a branch (0.28.x) in the repo, so it's just a matter of cutting it. If you 
>> have any patch that you want to backport into 0.28.2, please let me know 
>> asap. I plan to cut it this Thursday.
>> 
>> Thanks,
>> - Jie
> 


1.0 Release Candidate

2016-05-25 Thread Vinod Kone
Hi folks,

As discussed in the previous community sync, we plan to cut a release
candidate for our next release (1.0) early next week.

1.0 is mainly centered around new APIs for Mesos. Please take a look at
MESOS-338 for blocking issues. We got some great design and testing feedback
for the v1 scheduler and executor APIs. Please do the same for the
in-progress v1 operator API.

Since this is a 1.0, we would like to do the release a little differently.

First, the voting period for vetting the release candidate would be a few
weeks (2-3 weeks) instead of the typical 3 days.

Second, we are willing to make major changes (scalability fixes, API fixes)
if there are any issues reported by the community.

We are doing these because we really want the community to thoroughly test
the 1.0 release and give feedback.

Thanks,


Re: [VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-26 Thread Vinod Kone
We have the ASF press wire and other community blog posts lined up to be
posted tomorrow, so it will be really hard to tell all those folks to
postpone it this late. I have a couple of options that I want to propose:

1) Fix the webui bug in 1.0.1 which we will cut as soon as we fix this bug.

2) Try to fix the bug in the next couple hours, cut rc5, and vote it in
tonight without doing the typical 72 hour voting period.


I'm personally leaning towards 1) given the timing and the nature of the
bug. What do others think? PMC?

On Tue, Jul 26, 2016 at 4:08 PM, Yan Xu <xuj...@apple.com> wrote:

> I don't mind if it's shepherded by folks with more front-end expertise.
> Actually my original suggested solution on
> https://issues.apache.org/jira/browse/MESOS-5911 seemed incorrect. Let's
> discuss the actual fix on the ticket, I feel that a short term fix
> shouldn't be more than a few lines to unblock the release.
>
> On Jul 26, 2016, at 3:26 PM, Jie Yu <yujie@gmail.com> wrote:
>
> Yan, are you going to shepherd the fix for this one? If yes, when do you
> think it can be done?
>
> - Jie
>
> On Tue, Jul 26, 2016 at 3:05 PM, Yan Xu <xuj...@apple.com> wrote:
>
> -1
>
> We tested it in our testing environment but webUI redirect didn't work. We
> filed: https://issues.apache.org/jira/browse/MESOS-5911
>
> Given that webUI is the portal for Mesos clusters I feel that we should at
> least have a basic fix (more context in the JIRA) before release 1.0.
> Thoughts?
>
> On Jul 26, 2016, at 2:52 PM, Kapil Arya <ka...@mesosphere.io> wrote:
>
> +1 (binding)
>
> OpenSUSE Tumbleweed:
>./configure --disable-java --disable-python && make check
>
> On Tue, Jul 26, 2016 at 4:44 PM, Zhitao Li <zhitaoli...@gmail.com> wrote:
>
> Also tested:
>
> make check passes on OS X
>
> One thing I found when testing RC4 debian with the Aurora integration test
> suite (on its master) is that a scheduler that previously expected GPU
> resources will not receive offers without the new `GPU_RESOURCES`
> capability, even if it's the only scheduler.
>
> Given that GPU support is not technically released until 1.0, I don't
> consider this a blocker, but it might be surprising to people
> already testing GPU support.
>
> On Tue, Jul 26, 2016 at 12:45 PM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> +1 (binding)
>
> OS X 10.11.6
> ./configure --disable-python --disable-java
> make check
>
> On Tue, Jul 26, 2016 at 10:24 AM, Greg Mann <g...@mesosphere.io> wrote:
>
> +1 (non-binding)
>
> * Ran `sudo make distcheck` successfully on CentOS 7.1 with only one test
> failure: ExamplesTest.PythonFramework fails for me the first time it's
> executed as part of the whole test suite, and then succeeds on subsequent
> executions. I'm investigating further, and will file a ticket if necessary.
>
> * Ran the upgrade testing script successfully from 0.28.2 -> 1.0.0-rc4
>
> Cheers,
> Greg
>
> On Tue, Jul 26, 2016 at 1:58 AM, haosdent <haosd...@gmail.com> wrote:
>
> +1
>
> * make check in CentOS 7.2
> * make check in Ubuntu 14.04
> * test upgrade from 0.28.2 to 1.0.0-rc4
>
>
> On Tue, Jul 26, 2016 at 8:33 AM, Kapil Arya <ka...@mesosphere.io> wrote:
>
> One can find the deb/rpm packages here:
>
> http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.0.0-rc4
>
> And here are the corresponding docker images based off of Ubuntu 14.04:
>
>    mesosphere/mesos:1.0.0-rc4
>    mesosphere/mesos-master:1.0.0-rc4
>    mesosphere/mesos-slave:1.0.0-rc4
>
> Kapil
>
> On Sat, Jul 23, 2016 at 1:40 AM, Vinod Kone <vinodk...@apache.org> wrote:
>
> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>
> *The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.*
>
> 1.0.0 includes the following:
>
>   * Scheduler and Executor v1 HTTP APIs are now considered stable.
>
>   * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
>     These APIs let operators and services (monitoring, load balancers) send
>     HTTP requests to '/api/v1' endpoint on master or agent. See
>     `docs/operator-http-api.md` for details.
>
>   * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>     has been added t

[VOTE] Release Apache Mesos 1.0.0 (rc3)

2016-07-22 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.0.

*The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.*

1.0.0 includes the following:



  * Scheduler and Executor v1 HTTP APIs are now considered stable.

  * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
    These APIs let operators and services (monitoring, load balancers) send
    HTTP requests to '/api/v1' endpoint on master or agent. See
    `docs/operator-http-api.md` for details.

  * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
    has been added to isolate disk resources more efficiently. Please refer
    to docs/mesos-containerizer.md for more details.

  * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
    added a new isolator 'docker/volume' which allows users to use external
    volumes in Mesos containerizer. Currently, the isolator interacts with
    the Docker volume plugins using a tool called 'dvdcli'. By speaking the
    Docker volume plugin API, most of the Docker volume plugins are
    supported.

  * [MESOS-4641] - **Experimental** A new network isolator, the
    `network/cni` isolator, has been introduced in the `MesosContainerizer`.
    The `network/cni` isolator implements the Container Network Interface
    (CNI) specification proposed by CoreOS. With CNI the `network/cni`
    isolator is able to allocate a network namespace to Mesos containers and
    attach the container to different types of IP networks by invoking
    network drivers called CNI plugins.

  * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
    in order to decouple the ACLs definition language from the interface.
    It additionally includes the option of retrieving `ObjectApprover`. An
    `ObjectApprover` can be used to synchronously check authorizations for a
    given object and is hence useful when authorizing a large number of
    objects and/or large objects (which need to be copied using request
    based authorization). NOTE: This is a **breaking change** for authorizer
    modules.

  * [MESOS-5405] - The `subject` and `object` fields in
    authorization::Request have been changed from required to optional. If
    either of these fields is not set, the request should only be authorized
    if any subject/object should be allowed.
    NOTE: This is a semantic change for authorizer modules.

  * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
    filtering enables operators to restrict what part of the cluster state a
    user is authorized to see.
    Consider for example the `/state` master endpoint: an operator can now
    authorize users to only see a subset of the running frameworks, tasks,
    or executors. The following endpoints support HTTP endpoint filtering:
    '/state', '/state-summary', '/tasks', '/frameworks', '/weights',
    and '/roles'. Additionally the following v1 API calls support filtering:
    'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and
    'GET_TASKS'.

  * [MESOS-4909] - Tasks can now specify a kill policy. They are
    best-effort, because machine failures or forcible terminations may
    occur. Currently, the only available kill policy is how long to wait
    between graceful and forcible task kill. In the future, more policies
    may be available (e.g. hitting an HTTP endpoint, running a command,
    etc). Note that it is the executor's responsibility to enforce kill
    policies. For executor-less command-based tasks, the kill is performed
    via sending a signal to the task process: SIGTERM for the graceful kill
    and SIGKILL for the forcible kill. For docker executor-less tasks the
    grace period is passed to 'docker stop --time'. This feature supersedes
    the '--docker_stop_timeout', which is now deprecated.

  * [MESOS-4908] - The task kill policy defined within 'TaskInfo' can now
    be overridden when the scheduler kills the task. This can be used by
    schedulers to forcefully kill a task which is already being killed,
    e.g. if something went wrong during a graceful kill and a forcible kill
    is desired. Note that it is the executor's responsibility to honor the
    'Event.kill.kill_policy' field and override the task's kill policy and
    kill policy from a previous kill task request. To use this feature,
    schedulers and executors must support HTTP API; use the
    '--http_command_executor' agent flag to ensure the agent launches the
    HTTP API based command executor.

  * [MESOS-4949] - The executor shutdown grace period can now be configured
    in

[RESULT][VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-27 Thread Vinod Kone
Hi all,

The vote for Mesos 1.0.0 (rc4) has passed with the following votes.


+1 (Binding)

--

Kapil Arya

Jie Yu

Benjamin Mahler


+1 (Non-binding)

--

Haosdent

Greg Mann

Zhitao Li


+0

-

Yan Xu


There were no  -1 votes.


*NOTE: There were a couple known issues [MESOS-5911
<https://issues.apache.org/jira/browse/MESOS-5911>, MESOS-5913
<https://issues.apache.org/jira/browse/MESOS-5913>] that couldn't be fixed
in time for the 1.0. We plan to do a patch release to fix these ASAP.*


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/1.0.0


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.0


The mesos-1.0.0.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,

On Fri, Jul 22, 2016 at 10:40 PM, Vinod Kone <vinodk...@apache.org> wrote:

> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>
> *The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.*
>
> 1.0.0 includes the following:
>
>   * Scheduler and Executor v1 HTTP APIs are now considered stable.
>
>   * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
>     These APIs let operators and services (monitoring, load balancers) send
>     HTTP requests to '/api/v1' endpoint on master or agent. See
>     `docs/operator-http-api.md` for details.
>
>   * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>     has been added to isolate disk resources more efficiently. Please refer
>     to docs/mesos-containerizer.md for more details.
>
>   * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
>     added a new isolator 'docker/volume' which allows users to use external
>     volumes in Mesos containerizer. Currently, the isolator interacts with
>     the Docker volume plugins using a tool called 'dvdcli'. By speaking the
>     Docker volume plugin API, most of the Docker volume plugins are
>     supported.
>
>   * [MESOS-4641] - **Experimental** A new network isolator, the
>     `network/cni` isolator, has been introduced in the `MesosContainerizer`.
>     The `network/cni` isolator implements the Container Network Interface
>     (CNI) specification proposed by CoreOS. With CNI the `network/cni`
>     isolator is able to allocate a network namespace to Mesos containers and
>     attach the container to different types of IP networks by invoking
>     network drivers called CNI plugins.
>
>   * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
>     in order to decouple the ACLs definition language from the interface.
>     It additionally includes the option of retrieving `ObjectApprover`. An
>     `ObjectApprover` can be used to synchronously check authorizations for a
>     given object and is hence useful when authorizing a large number of
>     objects and/or large objects (which need to be copied using request
>     based authorization). NOTE: This is a **breaking change** for authorizer
>     modules.
>
>   * [MESOS-5405] - The `subject` and `object` fields in
>     authorization::Request have been changed from required to optional. If
>     either of these fields is not set, the request should only be authorized
>     if any subject/object should be allowed.
>     NOTE: This is a semantic change for authorizer modules.
>
>   * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
>     filtering enables operators to restrict what part of the cluster state a
>     user is authorized to see.
>     Consider for example the `/state` master endpoint: an operator can now
>     authorize users to only see a subset of the running frameworks, tasks,
>     or executors. The following endpoints support HTTP endpoint f

Re: [RESULT][VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-27 Thread Vinod Kone
The 1.0 blog post is up: http://mesos.apache.org/blog/mesos-1-0-0-released/

Thank you all for making this possible!

@vinodkone

> On Jul 27, 2016, at 7:39 AM, Vinod Kone <vinodk...@apache.org> wrote:
> 
> Hi all,
> 
> The vote for Mesos 1.0.0 (rc4) has passed with the following votes.
> 
> 
> 
> +1 (Binding)
> 
> --
> 
> Kapil Arya
> 
> Jie Yu
> 
> Benjamin Mahler
> 
> 
> 
> +1 (Non-binding)
> 
> --
> 
> Haosdent
> 
> Greg Mann
> 
> Zhitao Li
> 
> 
> 
> +0
> 
> -
> 
> Yan Xu
> 
> 
> 
> There were no  -1 votes.
> 
> 
> 
> NOTE: There were a couple known issues [MESOS-5911, MESOS-5913] that couldn't 
> be fixed in time for the 1.0. We plan to do a patch release to fix these ASAP.
> 
> 
> 
> Please find the release at:
> 
> https://dist.apache.org/repos/dist/release/mesos/1.0.0
> 
> 
> 
> It is recommended to use a mirror to download the release:
> 
> http://www.apache.org/dyn/closer.cgi
> 
> 
> 
> The CHANGELOG for the release is available at:
> 
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.0
> 
> 
> 
> The mesos-1.0.0.jar has been released to:
> 
> https://repository.apache.org
> 
> 
> 
> The website (http://mesos.apache.org) will be updated shortly to reflect this 
> release.
> 
> 
> 
> Thanks,
> 
> 
>> On Fri, Jul 22, 2016 at 10:40 PM, Vinod Kone <vinodk...@apache.org> wrote:
>> Hi all,
>> 
>> 
>> 
>> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>> 
>> The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a majority 
>> of at least 3 +1 PMC votes are cast.
>> 
>> 
>> 1.0.0 includes the following:
>>
>>   * Scheduler and Executor v1 HTTP APIs are now considered stable.
>>
>>   * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
>>     These APIs let operators and services (monitoring, load balancers) send
>>     HTTP requests to '/api/v1' endpoint on master or agent. See
>>     `docs/operator-http-api.md` for details.
>>
>>   * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>>     has been added to isolate disk resources more efficiently. Please refer
>>     to docs/mesos-containerizer.md for more details.
>>
>>   * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
>>     added a new isolator 'docker/volume' which allows users to use external
>>     volumes in Mesos containerizer. Currently, the isolator interacts with
>>     the Docker volume plugins using a tool called 'dvdcli'. By speaking the
>>     Docker volume plugin API, most of the Docker volume plugins are
>>     supported.
>>
>>   * [MESOS-4641] - **Experimental** A new network isolator, the
>>     `network/cni` isolator, has been introduced in the `MesosContainerizer`.
>>     The `network/cni` isolator implements the Container Network Interface
>>     (CNI) specification proposed by CoreOS. With CNI the `network/cni`
>>     isolator is able to allocate a network namespace to Mesos containers and
>>     attach the container to different types of IP networks by invoking
>>     network drivers called CNI plugins.
>>
>>   * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
>>     in

Re: Possible authentication bug

2016-07-21 Thread Vinod Kone
On Thu, Jul 21, 2016 at 4:49 PM, Douglas Nelson  wrote:

> Just out of curiosity, is there a rough ETA for the stable release of
> 1.0.0? Or is anyone currently using rc2 in production?
>

I'm hoping to cut RC3 later today or tomorrow and barring any -ve votes do
the official release early next week.


Re: [VOTE] Release Apache Mesos 1.0.0 (rc3)

2016-07-22 Thread Vinod Kone
Looks like we missed a cherry pick. I'm cancelling this vote and spinning
up rc4.

On Fri, Jul 22, 2016 at 2:24 PM, Vinod Kone <vinodk...@apache.org> wrote:

> Hi all,
>
>
> Please vote on releasing the following candidate as Apache Mesos 1.0.0.
>
> *The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
> majority of at least 3 +1 PMC votes are cast.*
>
> 1.0.0 includes the following:
>
>
> 
>
>   * Scheduler and Executor v1 HTTP APIs are now considered stable.
>
>   * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs.
>     These APIs let operators and services (monitoring, load balancers) send
>     HTTP requests to '/api/v1' endpoint on master or agent. See
>     `docs/operator-http-api.md` for details.
>
>   * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
>     has been added to isolate disk resources more efficiently. Please refer
>     to docs/mesos-containerizer.md for more details.
>
>   * [MESOS-4355] - **Experimental** support for Docker volume plugin. We
>     added a new isolator 'docker/volume' which allows users to use external
>     volumes in Mesos containerizer. Currently, the isolator interacts with
>     the Docker volume plugins using a tool called 'dvdcli'. By speaking the
>     Docker volume plugin API, most of the Docker volume plugins are
>     supported.
>
>   * [MESOS-4641] - **Experimental** A new network isolator, the
>     `network/cni` isolator, has been introduced in the `MesosContainerizer`.
>     The `network/cni` isolator implements the Container Network Interface
>     (CNI) specification proposed by CoreOS.  With CNI the `network/cni`
>     isolator is able to allocate a network namespace to Mesos containers and
>     attach the container to different types of IP networks by invoking
>     network drivers called CNI plugins.
>
>   * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored
>     in order to decouple the ACLs definition language from the interface.
>     It additionally includes the option of retrieving `ObjectApprover`. An
>     `ObjectApprover` can be used to synchronously check authorizations for a
>     given object and is hence useful when authorizing a large number of
>     objects and/or large objects (which need to be copied using request based
>     authorization). NOTE: This is a **breaking change** for authorizer
>     modules.
>
>   * [MESOS-5405] - The `subject` and `object` fields in
>     authorization::Request have been changed from required to optional. If
>     either of these fields is not set, the request should only be authorized
>     if any subject/object should be allowed.
>     NOTE: This is a semantic change for authorizer modules.
>
>   * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
>     filtering enables operators to restrict what part of the cluster state a
>     user is authorized to see.
>     Consider for example the `/state` master endpoint: an operator can now
>     authorize users to only see a subset of the running frameworks, tasks, or
>     executors. The following endpoints support HTTP endpoint filtering:
>     '/state', '/state-summary', '/tasks', '/frameworks', '/weights',
>     and '/roles'. Additionally the following v1 API calls support filtering:
>     'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and
>     'GET_TASKS'.
>
>   * [MESOS-4909] - Tasks can now specify a kill policy. They are best-effort,
>     because machine failures or forcible terminations may occur. Currently,
>     the only available kill policy is how long to wait between graceful and
>     forcible task kill. In the future, more policies may be available (e.g.
>     hitting an HTTP endpoint, running a command, etc). Note that it is the
>     executor's responsibility to enforce kill policies. For executor-less
>     command-based tasks, the kill is performed via sending a signal to the
>     task process: SIGTERM for the g

[VOTE] Release Apache Mesos 1.0.0 (rc4)

2016-07-22 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.0.

*The vote is open until Tue Jul 25 11:00:00 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.*

1.0.0 includes the following:



  * Scheduler and Executor v1 HTTP APIs are now considered stable.

  * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs. These
    APIs let operators and services (monitoring, load balancers) send HTTP
    requests to '/api/v1' endpoint on master or agent. See
    `docs/operator-http-api.md` for details.

  * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
    has been added to isolate disk resources more efficiently. Please refer to
    docs/mesos-containerizer.md for more details.

  * [MESOS-4355] - **Experimental** support for Docker volume plugin. We added a
    new isolator 'docker/volume' which allows users to use external volumes in
    Mesos containerizer. Currently, the isolator interacts with the Docker
    volume plugins using a tool called 'dvdcli'. By speaking the Docker volume
    plugin API, most of the Docker volume plugins are supported.

  * [MESOS-4641] - **Experimental** A new network isolator, the
    `network/cni` isolator, has been introduced in the `MesosContainerizer`. The
    `network/cni` isolator implements the Container Network Interface (CNI)
    specification proposed by CoreOS.  With CNI the `network/cni` isolator is
    able to allocate a network namespace to Mesos containers and attach the
    container to different types of IP networks by invoking network drivers
    called CNI plugins.

  * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored in
    order to decouple the ACLs definition language from the interface.
    It additionally includes the option of retrieving `ObjectApprover`. An
    `ObjectApprover` can be used to synchronously check authorizations for a
    given object and is hence useful when authorizing a large number of objects
    and/or large objects (which need to be copied using request based
    authorization). NOTE: This is a **breaking change** for authorizer modules.

  * [MESOS-5405] - The `subject` and `object` fields in authorization::Request
    have been changed from required to optional. If either of these fields is
    not set, the request should only be authorized if any subject/object should
    be allowed.
    NOTE: This is a semantic change for authorizer modules.

  * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
    filtering enables operators to restrict what part of the cluster state a
    user is authorized to see.
    Consider for example the `/state` master endpoint: an operator can now
    authorize users to only see a subset of the running frameworks, tasks, or
    executors. The following endpoints support HTTP endpoint filtering:
    '/state', '/state-summary', '/tasks', '/frameworks', '/weights',
    and '/roles'. Additionally the following v1 API calls support filtering:
    'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and 'GET_TASKS'.

  * [MESOS-4909] - Tasks can now specify a kill policy. They are best-effort,
    because machine failures or forcible terminations may occur. Currently, the
    only available kill policy is how long to wait between graceful and forcible
    task kill. In the future, more policies may be available (e.g. hitting an
    HTTP endpoint, running a command, etc). Note that it is the executor's
    responsibility to enforce kill policies. For executor-less command-based
    tasks, the kill is performed via sending a signal to the task process:
    SIGTERM for the graceful kill and SIGKILL for the forcible kill. For docker
    executor-less tasks the grace period is passed to 'docker stop --time'. This
    feature supersedes the '--docker_stop_timeout', which is now deprecated.

  * [MESOS-4908] - The task kill policy defined within 'TaskInfo' can now be
    overridden when the scheduler kills the task. This can be used by schedulers
    to forcefully kill a task which is already being killed, e.g. if something
    went wrong during a graceful kill and a forcible kill is desired. Note that
    it is the executor's responsibility to honor the 'Event.kill.kill_policy'
    field and override the task's kill policy and kill policy from a previous
    kill task request. To use this feature, schedulers and executors must
    support HTTP API; use the '--http_command_executor' agent flag to ensure
    the agent launches the HTTP API based command executor.





  * [MESOS-4949] - The executor shutdown grace period can now be configured
in
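
For readers who want to try the experimental v1 Operator API mentioned in the MESOS-4791 entry, here is a minimal Python sketch of how a 'GET_STATE' call could be built. It assumes the master accepts a JSON body of the form {"type": "GET_STATE"} posted to '/api/v1', as described in `docs/operator-http-api.md`; the master address is a made-up placeholder and the helper name `build_operator_call` is mine, not part of Mesos.

```python
import json
import urllib.request


def build_operator_call(master_url, call_type):
    """Build (but do not send) a v1 Operator API request for the given call type."""
    body = json.dumps({"type": call_type}).encode("utf-8")
    return urllib.request.Request(
        url=master_url.rstrip("/") + "/api/v1",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# Hypothetical master address; nothing is sent over the network here.
request = build_operator_call("http://master.example.com:5050", "GET_STATE")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request).read()
```

The same helper should work for the other filtered calls listed above ('GET_TASKS', 'GET_FRAMEWORKS', and so on) by changing the call type, though I have only sketched the request shape and not verified every call.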


Re: Enabling basic access authentication

2016-08-01 Thread Vinod Kone
We separated out the default authentication mode for read only (default: no
authn) and read-write (default: authn) endpoints. Since the webui only
depends on the read-only endpoints you need to explicitly enable authn for
read-only endpoints if you need authn. See
https://github.com/apache/mesos/blob/master/docs/upgrades.md for more
details.
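
One way to check whether authn is actually enforced on the read-only endpoints is to request `/state` with and without credentials. The Python sketch below only builds the request; it assumes HTTP basic authentication is in use, and the address, user name and password are placeholders. As far as I recall the 1.0 upgrade notes name the relevant master flags `--authenticate_http_readonly` and `--authenticate_http_readwrite`, but please confirm the exact names in the linked document for your version.

```python
import base64
import urllib.request


def state_request(master_url, user=None, password=None):
    """Build a GET request for the read-only /state endpoint, optionally with basic auth."""
    request = urllib.request.Request(master_url.rstrip("/") + "/state")
    if user is not None:
        raw = f"{user}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request


# Placeholder values; nothing is sent here.
anonymous = state_request("http://master.example.com:5050")
with_login = state_request("http://master.example.com:5050", "operator", "secret")
print(anonymous.full_url, with_login.get_header("Authorization") is not None)
```

If read-only authn is enabled, sending the anonymous request should be rejected (typically with a 401) while the one carrying credentials succeeds; if both succeed, the read-only endpoints are most likely still unauthenticated.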

On Mon, Aug 1, 2016 at 12:20 PM, Douglas Nelson  wrote:

> It was working for me with mesos 1.0.0-rc2. Now that I made the switch to
> 1.0.0 the feature is missing for user/pass prompt at the WebUI. Was another
> flag added or was it decided that this feature wasn't necessary?
>
> On Tue, Jul 12, 2016 at 6:26 PM, Douglas Nelson 
> wrote:
>
>> Ah, I missed that in the vote message. That makes sense. I'm running
>> version 0.28.2 so that would be why.
>>
>> On Tue, Jul 12, 2016 at 6:22 PM, Zhitao Li  wrote:
>>
>>> Just went through this: I think the necessary endpoint `/master/state`
>>> is only authenticated after 1.0.0, which is still going through release
>>> vote.
>>>
>>> Can you share which version of Mesos you are running?
>>>
>>> On Tue, Jul 12, 2016 at 5:18 PM, Douglas Nelson 
>>> wrote:
>>>
>>>> With marathon you can enable basic access authentication to the WebUI
>>>> with the flag --http_credentials.
>>>>
>>>> I expected something similar with the flag --authenticate_http in mesos
>>>> but when I hit the WebUI I'm not prompted to give a username/pass. Is that
>>>> feature not included in mesos or is there a different configuration I need
>>>> to set?
>>>>
>>>> Thanks!

>>>
>>>
>>>
>>> --
>>> Cheers,
>>>
>>> Zhitao Li
>>>
>>
>>
>


1.0.1 release

2016-08-01 Thread Vinod Kone
Hi,

As discussed on the 1.0 voting thread, we plan to cut a 1.0.1 as early as
this week. So if you have anything that needs to absolutely go into the
patch release, please work with your shepherd and get it landed on trunk
and backported to the 1.0.x branch.

Thanks,


[VOTE] Release Apache Mesos 1.0.1 (rc1)

2016-08-10 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.1.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.1-rc1




The candidate for Mesos 1.0.1 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz


The tag to be voted on is 1.0.1-rc1:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.1-rc1


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.1-rc1/mesos-1.0.1.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1155


Please vote on releasing this package as Apache Mesos 1.0.1!


The vote is open until Mon Aug 15 17:29:33 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.1

[ ] -1 Do not release this package because ...


Thanks,
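
For anyone checking the candidate before voting, the MD5 comparison can be scripted. This is a minimal Python sketch, not the official release tooling: the helper names are mine, and it assumes you have already extracted the hex digest from the published .md5 file (the exact layout of that file may vary). Verifying the .asc signature against the KEYS file needs GPG and is not covered here.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, reading it in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5_matches(stream, published):
    """Compare against a published digest, ignoring case and whitespace."""
    expected = "".join(published.split()).lower()
    return md5_of_stream(stream) == expected


# Self-check on an in-memory stream. A real run would pass
# open("mesos-1.0.1.tar.gz", "rb") and the digest from the .md5 file.
sample = io.BytesIO(b"hello")
print(md5_matches(sample, "5D41402A BC4B2A76 B9719D91 1017C592"))
```

Reading in chunks keeps memory use flat for a large tarball; normalizing case and whitespace is only a convenience for digests that are published in grouped upper-case form.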


Re: test

2016-07-13 Thread Vinod Kone
Don't sweat about the test email. Not a big deal. Welcome to the community!

On Wed, Jul 13, 2016 at 1:51 PM, Rahul Palamuttam 
wrote:

> I'm truly sorry.
> Just kept getting several message denied errors, until I realized I needed
> to send a reply to user-subscribe.
> I will not do that again.
>
>
> On Wed, Jul 13, 2016 at 11:57 AM, daemeon reiydelle 
> wrote:
>
>> Why are you wasting our time with this? Lame.
>>
>>
>> *...*
>>
>>
>>
>> *Daemeon C.M. Reiydelle, USA (+1) 415.501.0198, London (+44) (0) 20 8144 9872*
>>
>> On Wed, Jul 13, 2016 at 11:56 AM, Rahul Palamuttam <
>> rahulpala...@gmail.com> wrote:
>>
>>>
>>>
>>
>


Re: Possible authentication bug

2016-07-18 Thread Vinod Kone
Might be related to MESOS-2043?

Can you paste master and agent logs?

On Mon, Jul 18, 2016 at 3:13 PM, Douglas Nelson  wrote:

> I have SSL enabled for mesos and for the most part everything seems to be
> working fine. But when I stop a slave node for long enough that it shows up
> with status LOST then I start up the slave again, registration with the
> master fails:
>
> I0718 15:51:45.646260 16791 master.cpp:5495] Authenticating slave(1)@
> 10.5.7.5:5051
> I0718 15:51:45.646960 16791 authenticator.cpp:98] Creating new server SASL
> connection
> I0718 15:51:50.648329 16790 master.cpp:5481] Queuing up authentication
> request from slave(1)@10.5.7.5:5051 because authentication is still in
> progress
> W0718 15:51:50.648696 16790 master.cpp:5522] Failed to authenticate
> slave(1)@10.5.7.5:5051: Authentication discarded
>
> It cycles through this over and over again until I restart the master
> node. Is restarting the master the only way to handle re-authentication? I
> expected it to be more automatic. Thanks!
>


Re: mesos agent not recovering after ZK init failure

2016-07-15 Thread Vinod Kone
On Fri, Jul 15, 2016 at 11:31 AM, Sharma Podila  wrote:

> We had this issue happen again and were able to debug further. The cause
> for agent not being able to restart is that one of the resources (disk)
> changed its total size since the last restart. However, this error does not
> show up in INFO/WARN/ERROR files. We saw it in stdout only when manually
> restarting the agent. It would be good to have all messages going to
> stdout/stderr show up in the logs. Is there a config setting for it that I
> missed?
>

When the master/agent exits due to an un-recoverable error they use a stout
library function `EXIT` which only prints to stderr. Agreed that this is
not great UX, mind filing a ticket? Note that even if we fix this in Mesos,
we can't easily fix this behavior in the 3rd party libraries that we use
(e.g., ZooKeeper).  The way we've dealt with this in production, in my
previous company, was to redirect stdout/stderr to a
mesos-{master,agent}.log. You can disable "--log_dir" to avoid double
logging.



> The disk size total is changing sometimes on our agents. It is off by a
> few bytes (seeing ~10 bytes difference out of, say, 600 GB). We use ZFS on
> our agents to manage the disk partition. From my colleague, Andrew (copied
> here):
>
> The current Mesos approach (i.e., `statvfs()` for total blocks and assume
>> that never changes) won’t work reliably on ZFS
>
>
As Jie alluded to, one strategy is to have a startup wrapper script that
calculates the resources and calls `mesos-agent` binary with `--resources`
flag set. This is what we used to do in production.
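
A rough Python sketch of such a wrapper follows. It is an illustration under my own assumptions, not the script we ran: the helper names, the 1 GB rounding granularity and the example paths are made up, and it relies on the `--master`, `--work_dir` and `--resources` agent flags with `disk` and `mem` expressed in MB. Rounding the disk total down to a coarse granularity is one way to keep the advertised value stable when the filesystem reports a total that drifts by a few bytes between restarts.

```python
import os


def round_down(value, granularity):
    """Round value down to the nearest multiple of granularity."""
    return (value // granularity) * granularity


def disk_total_mb(path, granularity_mb=1024):
    """Total filesystem size at path in MB, rounded down to a stable value."""
    stats = os.statvfs(path)
    total_mb = (stats.f_blocks * stats.f_frsize) // (1024 * 1024)
    return round_down(total_mb, granularity_mb)


def build_resources(cpus, mem_mb, disk_mb):
    """Format a value for the agent's --resources flag."""
    return f"cpus:{cpus};mem:{mem_mb};disk:{disk_mb}"


def agent_command(master, work_dir, cpus, mem_mb):
    """Assemble the mesos-agent command line with explicit resources."""
    resources = build_resources(cpus, mem_mb, disk_total_mb(work_dir))
    return [
        "mesos-agent",
        f"--master={master}",
        f"--work_dir={work_dir}",
        f"--resources={resources}",
    ]


# Example values only; a real wrapper would end with os.execvp(cmd[0], cmd).
print(build_resources(1, 2000, round_down(614399, 1024)))
```

Note that if the rounded value still changes between restarts (for example after the disk is resized), the agent will see a resource change again, so the granularity should be chosen larger than the drift you observe.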


Re: How to send a task to a running framework?

2016-07-08 Thread Vinod Kone
Are you asking how users can submit their tasks to your custom framework?
Your framework should probably expose an API for that.
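
To make that a bit more concrete, here is a minimal Python sketch of what such an API could look like: a small HTTP endpoint that accepts task submissions as JSON and puts them on a queue that the scheduler drains when it handles resource offers. Everything here (the field names, the port, the handler and function names) is hypothetical and only illustrates the idea; it is not part of Mesos.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

# Submissions waiting to be matched against resource offers.
PENDING = queue.Queue()

REQUIRED_FIELDS = ("name", "command", "cpus", "mem")


def parse_submission(raw):
    """Parse and validate one JSON task submission; raise ValueError if invalid."""
    spec = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(spec, dict):
        raise ValueError("submission must be a JSON object")
    for field in REQUIRED_FIELDS:
        if field not in spec:
            raise ValueError(f"missing field: {field}")
    return spec


class SubmitHandler(BaseHTTPRequestHandler):
    """Accept POSTed task specs and queue them for the scheduler."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            PENDING.put(parse_submission(self.rfile.read(length)))
            self.send_response(202)
        except ValueError:
            self.send_response(400)
        self.end_headers()


def drain_pending(limit):
    """Called from the scheduler's offer handling: take up to limit queued specs."""
    specs = []
    while len(specs) < limit:
        try:
            specs.append(PENDING.get_nowait())
        except queue.Empty:
            break
    return specs


# To serve (for example in a background thread next to the scheduler driver):
# HTTPServer(("0.0.0.0", 8080), SubmitHandler).serve_forever()
```

A client on another machine would then POST a JSON document to the framework's own endpoint rather than talking to Mesos directly; the framework decides when and where to launch the task once a suitable offer arrives.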

On Fri, Jul 8, 2016 at 4:30 AM, Bryan Fok  wrote:

> Hi all
>
> After I have my custom framework running in, for instance, a python
> process
>
> How do I submit a task to it through another python process from another
> machine? Through the framework name? Any document around this?
>
> BR
> Bryan
>


[VOTE] Release Apache Mesos 1.0.0 (rc2)

2016-07-07 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.0.


1.0.0 includes the following:



  * Scheduler and Executor v1 HTTP APIs are now considered stable.

  * [MESOS-4791] - **Experimental** support for v1 Master and Agent APIs. These
    APIs let operators and services (monitoring, load balancers) send HTTP
    requests to '/api/v1' endpoint on master or agent. See
    `docs/operator-http-api.md` for details.

  * [MESOS-4828] - **Experimental** support for a new `disk/xfs` isolator
    has been added to isolate disk resources more efficiently. Please refer to
    docs/mesos-containerizer.md for more details.

  * [MESOS-4355] - **Experimental** support for Docker volume plugin. We added a
    new isolator 'docker/volume' which allows users to use external volumes in
    Mesos containerizer. Currently, the isolator interacts with the Docker
    volume plugins using a tool called 'dvdcli'. By speaking the Docker volume
    plugin API, most of the Docker volume plugins are supported.

  * [MESOS-4641] - **Experimental** A new network isolator, the
    `network/cni` isolator, has been introduced in the `MesosContainerizer`. The
    `network/cni` isolator implements the Container Network Interface (CNI)
    specification proposed by CoreOS.  With CNI the `network/cni` isolator is
    able to allocate a network namespace to Mesos containers and attach the
    container to different types of IP networks by invoking network drivers
    called CNI plugins.

  * [MESOS-2948, MESOS-5403] - The authorizer interface has been refactored in
    order to decouple the ACLs definition language from the interface.
    It additionally includes the option of retrieving `ObjectApprover`. An
    `ObjectApprover` can be used to synchronously check authorizations for a
    given object and is hence useful when authorizing a large number of objects
    and/or large objects (which need to be copied using request based
    authorization). NOTE: This is a **breaking change** for authorizer modules.

  * [MESOS-5405] - The `subject` and `object` fields in authorization::Request
    have been changed from required to optional. If either of these fields is
    not set, the request should only be authorized if any subject/object should
    be allowed.
    NOTE: This is a semantic change for authorizer modules.

  * [MESOS-4931, MESOS-5709, MESOS-5704] - Authorization based HTTP endpoint
    filtering enables operators to restrict what part of the cluster state a
    user is authorized to see.
    Consider for example the `/state` master endpoint: an operator can now
    authorize users to only see a subset of the running frameworks, tasks, or
    executors. The following endpoints support HTTP endpoint filtering:
    '/state', '/state-summary', '/tasks', '/frameworks', '/weights',
    and '/roles'. Additionally the following v1 API calls support filtering:
    'GET_ROLES', 'GET_WEIGHTS', 'GET_FRAMEWORKS', 'GET_STATE', and 'GET_TASKS'.

  * [MESOS-4909] - Tasks can now specify a kill policy. They are best-effort,
    because machine failures or forcible terminations may occur. Currently, the
    only available kill policy is how long to wait between graceful and forcible
    task kill. In the future, more policies may be available (e.g. hitting an
    HTTP endpoint, running a command, etc). Note that it is the executor's
    responsibility to enforce kill policies. For executor-less command-based
    tasks, the kill is performed via sending a signal to the task process:
    SIGTERM for the graceful kill and SIGKILL for the forcible kill. For docker
    executor-less tasks the grace period is passed to 'docker stop --time'. This
    feature supersedes the '--docker_stop_timeout', which is now deprecated.

  * [MESOS-4908] - The task kill policy defined within 'TaskInfo' can now be
    overridden when the scheduler kills the task. This can be used by schedulers
    to forcefully kill a task which is already being killed, e.g. if something
    went wrong during a graceful kill and a forcible kill is desired. Note that
    it is the executor's responsibility to honor the 'Event.kill.kill_policy'
    field and override the task's kill policy and kill policy from a previous
    kill task request. To use this feature, schedulers and executors must
    support HTTP API; use the '--http_command_executor' agent flag to ensure
    the agent launches the HTTP API based command executor.

  * [MESOS-4949] - The executor shutdown grace period can now be configured in
    `ExecutorInfo`, which overrides the agent flag. When shutting down an
    executor the agent will wait in a best-effort manner for the grace period
    specified here before forcibly destroying the container. The executor must
    not assume that it will always be
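
As an illustration of the kill policy entries (MESOS-4909 and MESOS-4908), here is a small Python sketch of the JSON a scheduler using the v1 HTTP API might send. The field names (`kill_policy`, `grace_period`, `nanoseconds`, and the `KILL` call layout) follow my reading of the protobuf definitions and should be checked against the `.proto` files of the release; the IDs are placeholders and the helper names are mine.

```python
import json

NANOS_PER_SECOND = 1_000_000_000


def kill_policy(grace_seconds):
    """Kill policy with a grace period between graceful and forcible kill."""
    return {"grace_period": {"nanoseconds": int(grace_seconds * NANOS_PER_SECOND)}}


def kill_call(framework_id, task_id, grace_seconds=None):
    """Body of a scheduler KILL call, optionally overriding the task's kill policy."""
    kill = {"task_id": {"value": task_id}}
    if grace_seconds is not None:
        kill["kill_policy"] = kill_policy(grace_seconds)
    return {
        "framework_id": {"value": framework_id},
        "type": "KILL",
        "kill": kill,
    }


# Placeholder IDs: a zero grace period asks for an immediate forcible kill.
print(json.dumps(kill_call("framework-id-placeholder", "task-1", 0), indent=2))
```

The same `kill_policy` structure would go into the task's `TaskInfo` at launch time; passing a shorter grace period in a later KILL call is the override described in MESOS-4908.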

Support for tasks groups aka pods in Mesos

2016-08-08 Thread Vinod Kone
Hi folks,

One of the most requested features in Mesos has been first class support
for managing pod like containers. We finally have some time to focus and
shepherd this work.

The epic tracking this work is:
https://issues.apache.org/jira/browse/MESOS-2449

Design doc: https://issues.apache.org/jira/browse/MESOS-2449

Your feedback on the design will be most welcome. Once we get agreement on
the design, we can start breaking down the epic into tickets.

Thanks,
Vinod & Jie


Re: Support for tasks groups aka pods in Mesos

2016-08-08 Thread Vinod Kone
Sorry, sent the wrong link earlier for design doc.

Design doc: https://issues.apache.org/jira/browse/MESOS-6009
>

Direct link:
https://docs.google.com/document/d/1FtcyQkDfGp-bPHTW4pUoqQCgVlPde936bo-IIENO_ho/edit#heading=h.ip4t59nlogfz


Re: GPU channel on slack

2016-06-30 Thread Vinod Kone
Mind updating
https://github.com/apache/mesos/blob/master/docs/working-groups.md with
this info?

On Thu, Jun 30, 2016 at 8:44 AM, Kevin Klues  wrote:

> If you are interested in the ongoing GPU work on Mesos, please join the
> #gpus channel at mesos.slack.com. The big announcements for the GPU work
> will still happen on this mailing list, but the day to day discussions will
> likely happen on the slack channel going forward.
>


Re: 1.0.0 RC2

2016-06-30 Thread Vinod Kone
Update: We still have about 6 blockers for the RC2 cut :( Good news is that
all of them are either reviewable or in progress :). I'll cut RC2 whenever
they land, whether it's tomorrow or the coming Tuesday.

Dashboard to track progress:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12328715

On Tue, Jun 21, 2016 at 11:48 AM, Vinod Kone <vinodk...@apache.org> wrote:

> There are still 8 outstanding issues, including 1 blocker. We are waiting
> for these to land for RC2.
>
>
> On Fri, Jun 17, 2016 at 5:11 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> We still have 12 issues, including 1 blocker, targeted for 1.0.
>>
>> Dashboard: https://issues.apache.org/jira/secure/Dashboard.jspa
>>
>> So I'll wait until *Monday morning PST* to cut RC2, for the blocker to
>> get resolved and any other targeted issues to land.
>>
>> Also note that with RC2 we will create a 1.0.x branch and update the
>> version on trunk to 1.1.0. Any further fixes for RC2 will be cherry picked
>> on to that branch.
>>
>>
>> On Wed, Jun 15, 2016 at 4:09 PM, Vinod Kone <vinodk...@apache.org> wrote:
>>
>>> There are still 17 unresolved issues targeted for 1.0. We have only a
>>> couple more days left for the RC cut. Whoever is driving & shepherding
>>> these, please make sure to land them.
>>>
>>>
>>>
>>> On Mon, Jun 13, 2016 at 1:58 PM, Vinod Kone <vinodk...@apache.org>
>>> wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I'm planning to cut 1.0 RC2 later this week (likely friday). So please
>>>> make sure to get any patches targeted for 1.0 (esp. blockers) upstreamed.
>>>>
>>>> The dashboard for the release is here:
>>>> https://issues.apache.org/jira/issues/?filter=12335793
>>>>
>>>> Thanks,
>>>> Vinod
>>>>
>>>
>>>
>>
>


[VOTE] Release Apache Mesos 1.0.3 (rc2)

2017-01-31 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.3.


1.0.3 includes the following:



* [MESOS-6052] - Unable to launch containers on CNI networks on CoreOS


* [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.


* [MESOS-6621] - SSL downgrade path will CHECK-fail when using both
temporary and persistent sockets

* [MESOS-6676] - Always re-link with scheduler during re-registration.


* [MESOS-6917] - Segfault when the executor sets an invalid UUID when
sending a status update.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.3-rc2




The candidate for Mesos 1.0.3 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.3-rc2/mesos-1.0.3.tar.gz


The tag to be voted on is 1.0.3-rc2:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.3-rc2


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.3-rc2/mesos-1.0.3.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.3-rc2/mesos-1.0.3.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1174


Please vote on releasing this package as Apache Mesos 1.0.3!


The vote is open until Fri Feb  3 11:45:37 PST 2017 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.3

[ ] -1 Do not release this package because ...


Thanks,


Re: [VOTE] Release Apache Mesos 1.1.1 (rc1)

2017-02-08 Thread Vinod Kone
+1 (binding)

Tested on ASF CI.

*Revision*: 5d4c9962930c3f5c08e802caff40b670424cb091

   - refs/tags/1.1.1-rc1

Configuration Matrix

OS            Configuration                             Build      gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools  Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake      Success  Not run
centos:7      --verbose                                 autotools  Success  Not run
centos:7      --verbose                                 cmake      Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools  Success  Success
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake      Success  Success
ubuntu:14.04  --verbose                                 autotools  Success  Success
ubuntu:14.04  --verbose                                 cmake      Success  Success


On Wed, Feb 8, 2017 at 9:09 AM, Kapil Arya  wrote:

> +1 binding.
>
> Internal CI to build deb/rpm packages.
>
> The deb/rpm binary packages are available at:
> http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.1.1-rc1
>
>
> On Tue, Feb 7, 2017 at 5:39 PM, Alex R  wrote:
>
>> Hi all,
>>
>> Please vote on releasing the following candidate as Apache Mesos 1.1.1.
>>
>> 1.1.1 includes the following:
>> 
>> 
>> ** Bug
>>   * [MESOS-6002] - The whiteout file cannot be removed correctly using
>> aufs backend.
>>   * [MESOS-6010] - Docker registry puller shows decode error "No response
>> decoded".
>>   * [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
>>   * [MESOS-6360] - The handling of whiteout files in provisioner is not
>> correct.
>>   * [MESOS-6411] - Add documentation for CNI port-mapper plugin.
>>   * [MESOS-6526] - `mesos-containerizer launch --environment` exposes
>> executor env vars in `ps`.
>>   * [MESOS-6571] - Add "--task" flag to mesos-execute.
>>   * [MESOS-6597] - Include v1 Operator API protos in generated JAR and
>> python packages.
>>   * [MESOS-6621] - SSL downgrade path will CHECK-fail when 

[RESULT][VOTE] Release Apache Mesos 1.0.3 (rc2)

2017-02-06 Thread Vinod Kone
Hi all,


The vote for Mesos 1.0.3 (rc2) has passed with the

following votes.


+1 (Binding)

--

Vinod Kone

Adam Bordelon

Kapil Arya


There were no 0 or -1 votes.


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/1.0.3


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.3


The mesos-1.0.3.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,

On Mon, Feb 6, 2017 at 12:07 PM, Kapil Arya <ka...@mesosphere.io> wrote:

> +1 binding.
>
> Built rpm/deb packages on internal build jobs. Packages are available here:
>  http://open.mesosphere.com/downloads/mesos-rc/#apache-mesos-1.0.3-rc2
>
> On Mon, Feb 6, 2017 at 3:01 PM, Adam Bordelon <a...@mesosphere.io> wrote:
>
> > +1 binding
> >
> > Tests passed against DC/OS 1.8.8 (prerelease), which is based on the
> Apache
> > Mesos 1.0.x branch.
> > https://github.com/dcos/dcos/pull/1210
> >
> > On Wed, Feb 1, 2017 at 10:01 AM, Vinod Kone <vinodk...@apache.org>
> wrote:
> >
> > > +1 (binding)
> > >
> > > Tested on ASF CI
> > >
> > >
> > > *Revision*: c673fdd00e7f93ab7844965435d57fd691fb4d8d
> > >
> > >- refs/tags/1.0.3-rc2
> > >
> > > Configuration Matrix gcc clang
> > > centos:7 --verbose --enable-libevent --enable-ssl autotools
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > [image: Not run]
> > > cmake
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%
> > 7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > [image: Not run]
> > > --verbose autotools
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,
> > ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_
> > exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > [image: Not run]
> > > cmake
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%
> > 3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > [image: Not run]
> > > ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=autotools,COMPILER=clang,
> CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > cmake
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> > docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > > [image: Success]
> > > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/26/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=-
> > -verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> >

Re: Framework stops to receive the heartbeats and events and gets removed from master

2017-01-23 Thread Vinod Kone
Can you paste the logs of the master and the framework?

@vinodkone

> On Jan 23, 2017, at 8:05 AM, Vova Shelgunov  wrote:
> 
> Hi,
> 
> I faced a very strange situation with my framework that talks to mesos master 
> via Scheduler HTTP API:
> 
> Sometimes my framework stops receiving heartbeats and task updates from 
> the master.
> I read the Network partitions section of the Mesos documentation 
> (http://mesos.apache.org/documentation/latest/scheduler-http-api/) and I 
> see that if a framework does not receive heartbeats within some time, it 
> should reconnect to the master.
> 
> I have written a heartbeat monitor that reconnects if there have been no 
> heartbeats in the last n seconds, but after the reconnection I always 
> receive an ERROR from the Mesos master saying that my framework has been removed.
> 
> Why is it happening?
> 
> Regards,
> Uladzimir


Welcome Neil Conway as Mesos Committer and PMC member!

2017-01-20 Thread Vinod Kone
Hi folks,

Please welcome Neil Conway as the newest committer and PMC member of the
Apache Mesos project.

Neil has been an active contributor to Mesos for more than a year now. As
part of his work, he has contributed some major features (Partition aware
frameworks, floating point operations for resources). Neil also took the
initiative to improve the documentation of our project and shepherded
several improvements over time. Doing that even without being a committer
shows that he takes ownership of the project seriously.

Here is his more formal checklist for your perusal.

https://docs.google.com/document/d/137MYwxEw9QCZRH09CXfn1544p1LuMuoj9LxS-sk2_F4/edit

Thanks,
Vinod


Re: Question: Modify mesos agent to add custom resources that change dinamically

2017-02-09 Thread Vinod Kone
Don't think that's possible today and I cannot think of easy workarounds
for it.

On Thu, Feb 9, 2017 at 1:39 AM, Carnero Iglesias, Javier <
javier.carn...@atos.net> wrote:

> Hi guys, I've posted a *question* on StackOverflow that has not been
> answered by anyone. I thought I'd share it with you so maybe I can reach
> someone who has the answer:
>
> I'm developing a new mesos-slurm framework where jobs from outside mesos
> can also be pushed to slurm queues.
>
> The Mesos agent has a Slurm workload manager installed on the same
> computer, which orchestrates jobs in an HPC. This Slurm instance receives
> jobs both from the Mesos executor and by other means (for example,
> third-party users sending jobs directly to Slurm through ssh).
>
> Therefore I'd like the agent to know, before sending offers to Mesos,
> the state of the Slurm queues (number of jobs running and waiting to run),
> and to offer resources accordingly. This cannot be achieved only by knowing
> the tasks accepted by the executor, as other resources of the HPC could
> have been taken by third-party users using Slurm directly.
>
> In other words, what I'd like to do is customize the way the agent
> determines the resources available to offer, to take into account the
> current state of the Slurm queues.
>
> Is this possible? If so, how could it be achieved?
>
> Thanks in advance.
>
> Javier Carnero
> Software Architect
> Research and Innovation Group
> *ARI booklet*
> 
> Atos IT Solutions and Services Iberia SL
> *javier.carnero**@atos.net*
> 
> +34 955 25 41 03
>


Re: Framework stops to receive the heartbeats and events and gets removed from master

2017-01-23 Thread Vinod Kone
No problem. Glad you figured it out.

@vinodkone

> On Jan 23, 2017, at 8:38 AM, Vova Shelgunov  wrote:
> 
> Yes, it works. Sorry for the trouble; the first time I looked at the logs 
> I did not notice that failover_timeout was zero.
> 
> 2017-01-23 19:27 GMT+03:00 Vova Shelgunov :
>> Logs from mesos master:
>> 
>> I0123 15:53:44.523613 7 http.cpp:391] HTTP POST for 
>> /master/api/v1/scheduler from 172.18.0.1:58864 with User-Agent='AHC/2.0'
>> I0123 15:53:44.524159 7 master.cpp:4827] Processing ACKNOWLEDGE call 
>> ac9a6e5e-67b3-490a-930f-0024eab734b4 for task 10336 of framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework) on agent 
>> 16c100c1-13fe-47b8-a2a0-aed9bafbbf8c-S0
>> I0123 15:53:44.524849 7 master.cpp:7744] Removing task 10336 with 
>> resources cpus(*):0.1; mem(*):32 of framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 on agent 
>> 16c100c1-13fe-47b8-a2a0-aed9bafbbf8c-S0 at slave(1)@172.18.0.3:5051 
>> (mesos-slave)
>> I0123 15:53:44.529033 7 master.cpp:1297] Framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework) disconnected
>> I0123 15:53:44.529636 7 master.cpp:2902] Disconnecting framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework)
>> I0123 15:53:44.529974 7 master.cpp:2926] Deactivating framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework)
>> I0123 15:53:44.530299 7 master.cpp:1310] Giving framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework) 0ns to 
>> failover
>> I0123 15:53:44.530594 7 hierarchical.cpp:386] Deactivated framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005
>> I0123 15:53:44.531962 7 master.cpp:6369] Framework failover timeout, 
>> removing framework 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP 
>> Framework)
>> I0123 15:53:44.534992 7 master.cpp:7103] Removing framework 
>> 3edce0a6-2a9e-448f-a5c2-666e2c2c3086-0005 (Test HTTP Framework)
>> 
>> It seems the failover timeout is set to zero for the framework.
>> 
>> It may be a coding error on my side that shows up when the framework loses 
>> its connection to the master multiple times (I see that I do not pass the 
>> failover_timeout value during reconnection).
>> I will check whether passing it solves my issue.
>> 
>> Thanks
>> 
>> 2017-01-23 19:05 GMT+03:00 Vova Shelgunov :
>>> Hi,
>>> 
>>> I faced a very strange situation with my framework that talks to mesos 
>>> master via Scheduler HTTP API:
>>> 
>>> Sometimes my framework stops receiving heartbeats and task updates 
>>> from the master.
>>> I read the Network partitions section of the Mesos documentation 
>>> (http://mesos.apache.org/documentation/latest/scheduler-http-api/) and I 
>>> see that if a framework does not receive heartbeats within some time, it 
>>> should reconnect to the master.
>>> 
>>> I have written a heartbeat monitor that reconnects if there have been no 
>>> heartbeats in the last n seconds, but after the reconnection I always 
>>> receive an ERROR from the Mesos master saying that my framework has been removed.
>>> 
>>> Why is it happening?
>>> 
>>> Regards,
>>> Uladzimir
>> 
> 
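
A note for anyone hitting the same symptom: below is a minimal sketch in
Python of the SUBSCRIBE body for the Scheduler HTTP API, assuming the fix is
simply to resend failover_timeout on every (re)subscription. The helper name
build_subscribe_call and the one-week value are made up for illustration; the
field names follow the v1 scheduler API, and this is not the poster's actual
code.

```python
import json


def build_subscribe_call(user, name, failover_timeout_secs, framework_id=None):
    """Build the JSON body of a SUBSCRIBE call (hypothetical helper)."""
    framework_info = {
        "user": user,
        "name": name,
        # Resend this on every subscription: if the field is left out,
        # FrameworkInfo.failover_timeout falls back to its default of 0,
        # which matches the "0ns to failover" line in the master log above.
        "failover_timeout": float(failover_timeout_secs),
    }
    call = {"type": "SUBSCRIBE", "subscribe": {"framework_info": framework_info}}
    if framework_id is not None:
        # On re-subscription the framework ID goes both into FrameworkInfo
        # and into the top-level framework_id field of the call.
        framework_info["id"] = {"value": framework_id}
        call["framework_id"] = {"value": framework_id}
    return call


if __name__ == "__main__":
    one_week = 7 * 24 * 3600  # illustrative value only
    body = build_subscribe_call("root", "Test HTTP Framework", one_week,
                                framework_id="example-framework-id")
    print(json.dumps(body, indent=2))
```

The point of routing both the first subscription and every reconnection
through one helper is that the timeout cannot be silently dropped on the
second path.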


Mesos 1.0.3 release

2017-01-16 Thread Vinod Kone
Hi folks,

I'm planning to cut the 1.0.3 release tomorrow. If you need anything
to be backported, please mark the tickets as such.

Release dashboard:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12330112

Thanks,
Vinod


Re: Mesos logging

2016-08-21 Thread Vinod Kone
Did you figure this out? AFAICT, the LOG(INFO) line should be printed in
agent logs. What agent flags are you using?

On Tue, Aug 9, 2016 at 8:19 AM, Hendrik Haddorp 
wrote:

> I saw a few "Running ..." log entries from the docker support code but
> they seem to be all from VLOG(1) calls while for some reason the code
> that does the actual "docker run" call uses LOG(INFO) and that does not
> seem to come out by default, or I don't see it. But I'll try on ...
>
> On 09/08/16 11:38, haosdent wrote:
> > Hi, @Hendrik You should see INFO logs when running the Mesos agent at
> > the default level. Some docker run logs may exist in the stdout/stderr
> > of the executor.
> >
> > On Tue, Aug 9, 2016 at 12:27 PM, Hendrik Haddorp wrote:
> >
> > I would like to see the "docker run" trace from
> > https://github.com/apache/mesos/blob/master/src/docker/docker.cpp
> > line 811.
> > What verbosity level does INFO map to?
> >
> > On 09/08/16 05:06, Charles Allen wrote:
> > > Which glog are you trying to capture? You can set the verbosity level
> > > with the environment variable GLOG_v
> > >
> > > And you can also set it through things like Spark. So if you want a
> > > lot of ZK chatter at the mesos level in your spark logs, add
> > >
> > > spark.executorEnv.GLOG_v=9
> > >
> > > to your spark context
> > >
> > > On Mon, Aug 8, 2016 at 2:53 PM Hendrik Haddorp wrote:
> > >
> > > Hi,
> > >
> > > the Mesos code contains log statements using LOG(INFO) and VLOG(1),
> > > for example. So far I found that Mesos is using the Google Logging
> > > Library. Looking in the logs I only seem to be able to find output
> > > from VLOG statements. What do I need to do to get the output from the
> > > LOG statements? Where would I typically find the output? I'm using
> > > CentOS 7.2 and found the output so far in the files below
> > > /var/log/mesos.
> > >
> > > thanks,
> > > Hendrik
> > >
> >
> >
> >
> >
> > --
> > Best Regards,
> > Haosdent Huang
>
>


Fwd: REMINDER: MesosCon Asia’s CFP Deadline is September 9! Submit your Proposal Today

2016-09-08 Thread Vinod Kone
Hi folks,

Just a friendly reminder that the CFP deadline for MesosCon Asia is fast
approaching! If you were planning to submit a talk, please do so ASAP. If
you weren't, please do :)

Thanks,
Vinod

-- Forwarded message --
From: Linux Foundation Events 
Date: Fri, Aug 26, 2016 at 2:09 AM
Subject: REMINDER: MesosCon Asia’s CFP Deadline is September 9! Submit your
Proposal Today
To: vi...@mesosphere.io


MesosCon Asia fosters greater collaboration around the Apache Mesos
community, bringing
together users and developers to learn and share while accelerating growth
of the project's ecosystem. The Apache Mesos community wants to hear from
you! Share lessons learned, best practices or pitch an idea for a hands-on
workshop or in-depth tutorial.

Check out the list of suggested topics for MesosCon Asia.

Don’t delay - submit your proposal now. The deadline to submit proposals is
September 9.


Re: Setting log path for mesos java client library

2016-09-12 Thread Vinod Kone
Looks like the Mesos logging flags for these override the corresponding
GLOG related flags.

Try setting "MESOS_LOG_DIR=" and "MESOS_QUIET=true"

On Mon, Sep 12, 2016 at 12:09 PM, Wil Yegelwel  wrote:

> I’m trying to set the log path (and later the log format) for the mesos
> java lib. From the docs in
> http://mesos.apache.org/api/latest/java/org/apache/mesos/MesosSchedulerDriver.html
> it appears I need to set the correct GLOG environment variable in order to
> get this to work, but I can’t seem to get it. I’ve tried setting the
> environment variables “GLOG_log_dir=…” and “GLOG_logtostderr=0”, but neither
> seems to change the behavior and it is still logging to stderr. Has anyone
> been able to set the path the mesos java client library writes to and, if
> so, how?
>
>


Re: Updating ExecutorInfo after framework failover or best practice

2016-09-29 Thread Vinod Kone
We cannot easily make ExecutorInfo mutable because there might be existing
tasks with executors with the old ExecutorInfo. If there are two different
ExecutorInfos for the same ExecutorID it gets confusing for Mesos (e.g.,
SHUTDOWN executor id 'foo' kills which executor?).

One possible solution is to not re-use ExecutorID, but that depends on what
semantics you want for your executor.

On Thu, Sep 29, 2016 at 3:01 AM, Kota UENISHI <
ueni...@nautilus-technologies.com> wrote:

> Hi there,
>
> I'm going to implement scheduler failover into my framework, and hit
> an issue - while I know it's how Mesos works for now:
>
> My framework lets Mesos agents fetch my custom executor jar file from
> scheduler process's HTTP endpoint. Suppose framework process restarted
> by Marathon or whatever in a different machine after failure, the URL
> of the HTTP endpoint to download executor jar file from changes to
> that of new scheduler process. This causes ExecutorInfo validation
> failure, like [1]. And I think this is why Spark's
> MesosClusterDispatcher is not ready for HA yet.
>
> As a (major?) workaround, [1] avoids this by assuming URL identity by
> DNS or load balancer-ish stuff. Another short-sighted kludge
> workaround would be relaxing the ExecutorInfo validation for the
> failover case - which I believe solves many framework developers'
> headache.
>
> Also, the best workaround in Mesos code would be just clearing
> ExecutorInfo after the master detects a scheduler failover. I think
> ExecutorInfo must be 1:1 with FrameworkInfo, but it does not have to be
> immutable. Under a partition, it may diverge across masters, but an LWW
> merge after the partition heals would be enough to keep it unique.
>
> Thoughts?
>
> [1] https://github.com/mesosphere/kubernetes-mesos/issues/15
>
> Kota UENISHI
>


1.0.2 release

2016-10-05 Thread Vinod Kone
Hi,

As the Release Manager for 1.0, I'm responsible for all subsequent patch
releases.

I'm planning to cut the next patch release (1.0.2) within a week. So, if
you have any patches that need to get into 1.0.2 make sure that either it
is already in the 1.0.x branch or the corresponding ticket has a target
version set to 1.0.2.

I'll send a link to the release dashboard shortly.

Thanks,
-- Vinod


Re: 1.0.2 release

2016-10-05 Thread Vinod Kone
Release dashboard:
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329719

I'm waiting for 2 issues to be resolved. Once that's done, I'll start
prepping the release.

On Wed, Oct 5, 2016 at 4:11 PM, Vinod Kone <vinodk...@apache.org> wrote:

> Hi,
>
> As the Release Manager for 1.0, I'm responsible for all subsequent patch
> releases.
>
> I'm planning to cut the next patch release (1.0.2) within a week. So, if
> you have any patches that need to get into 1.0.2 make sure that either it
> is already in the 1.0.x branch or the corresponding ticket has a target
> version set to 1.0.2.
>
> I'll send a link to the release dashboard shortly.
>
> Thanks,
> -- Vinod
>


Re: 1.1.0 release

2016-10-07 Thread Vinod Kone
I think you need to clean up the JIRA a bit.

1) Make sure unresolved tickets do not have fix version (1.1.0) set.
2) Move "Fix version 1.1.0" to "Target version 1.1.0".

2) might obviate the need for 1).



On Fri, Oct 7, 2016 at 7:24 AM, Till Toenshoff  wrote:

> Hi everyone!
>
> it's us who will be the Release Managers for 1.1.0 - Alex and Till!
>
> We are planning to cut the next release (1.1.0) within three workdays -
> that would be Wednesday next week. So, if you have any patches that need to
> get into 1.1.0 make sure that either it is already in the master branch or the
> corresponding ticket has a target version set to 1.1.0.
>
> The release dashboard:
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329720
>
> Alex & Till
>


Re: Threshold-based CPU and Memory Oversubscription

2016-09-21 Thread Vinod Kone
Awesome. Great to see this!

Looking forward to the blog post on how this helped utilization in
production :P

On Wed, Sep 21, 2016 at 10:26 AM, Erb, Stephan 
wrote:

> Hi everyone,
>
>
>
> we are happy to announce that we have open sourced two simple
> threshold-based oversubscription modules for Mesos. We use them for CPU and
> memory oversubscription and have them running in production.
>
>
>
> https://github.com/blue-yonder/mesos-threshold-oversubscription
>
>
>
> The threshold-based approach enabled us to double the peak CPU and peak
> memory utilization in our Mesos/Aurora clusters. Your mileage may vary, so
> please take this statement with a grain of salt.
>
>
>
> Best Regards,
>
> Stephan Erb
>
> PS: Retweets welcome :-)
> https://twitter.com/BlueYonderTech/status/778630174996893696
>


Re: mesos libraries

2016-08-23 Thread Vinod Kone
If you are writing a new scheduler, I would highly recommend using the new
HTTP API instead of the Java bindings. This would eliminate the dep on the
native library.

If you still want to use the old bindings, the easiest way might be to
install mesos deb package in your docker image.

On Tue, Aug 23, 2016 at 11:27 AM, Hendrik Haddorp 
wrote:

> Hi,
>
> I wrote a Mesos scheduler using the Java bindings, which worked great so
> far. Now I would like to run my scheduler as a docker container on
> Marathon. The problem is now that I'm missing the required native
> libraries. What is the best way to install them (in Ubuntu) without
> pulling heaps of other stuff?
>
> Thanks,
> Hendrik
>


Re: Mesos 1.1.0 release date

2016-10-03 Thread Vinod Kone
We are planning to release it in a week or so.

Till has agreed to be the release manager for the release and will be
supported by AlexR.

@Till: Can you create a release dashboard and reply to this thread?


Re: Target version vs Fixed Version

2016-10-03 Thread Vinod Kone
Yes.

On Mon, Oct 3, 2016 at 7:58 PM, haosdent <haosd...@gmail.com> wrote:

> For resolved issues, is it OK to do similar things? For example, this issue
> https://issues.apache.org/jira/browse/MESOS-5613 makes mesos-local not
> work in 1.0.x, and I think it would be better to cherry-pick the fix into
> 1.0.x.
>
> On Tue, Oct 4, 2016 at 9:17 AM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> Hi,
>>
>> Going forward, if you want an unresolved issue to be targeted for a
>> specific version please set the "Target Version". The committer that
>> commits the fix and resolves the ticket will set the appropriate "Fix
>> Version".
>> This applies to backports as well.
>>
>> Thanks,
>> Vinod
>>
>> -- Forwarded message --
>> From: Vinod Kone (JIRA) <j...@apache.org>
>> Date: Mon, Oct 3, 2016 at 6:13 PM
>> Subject: [jira] [Updated] (MESOS-6026) Tasks mistakenly marked as FAILED
>> due to race b/w sendExecutorTerminatedStatusUpdate() and
>> _statusUpdate()
>> To: iss...@mesos.apache.org
>>
>>
>>
>>  [ https://issues.apache.org/jira/browse/MESOS-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Vinod Kone updated MESOS-6026:
>> --
>> Target Version/s: 1.0.2
>>Fix Version/s: (was: 1.0.2)
>>
>
>
>
> --
> Best Regards,
> Haosdent Huang
>


Re: Does libprocess support multi-port?

2016-10-26 Thread Vinod Kone
No it doesn't.

On Wed, Oct 26, 2016 at 1:10 AM, Suteng  wrote:

> Hi,
>
> Does libprocess support multi port? Some process bind to a port, and some
> other process bind to another port in the same OS process.
>
>
>
> Thanks,
>
> Teng
>
>
>
>
>
>
>
>
>
> Su Teng  00241668
>
>
>
> Distributed and Parallel Software Lab
>
> Huawei Technologies Co., Ltd.
>
> Email:sut...@huawei.com
>
>
>
>
>


[VOTE] Release Apache Mesos 1.0.2 (rc3)

2016-11-07 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.2.


This is a bug fix release.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.2-rc3




The candidate for Mesos 1.0.2 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc3/mesos-1.0.2.tar.gz


The tag to be voted on is 1.0.2-rc3:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.2-rc3


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc3/mesos-1.0.2.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc3/mesos-1.0.2.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1168


Please vote on releasing this package as Apache Mesos 1.0.2!


The vote is open until Thu Nov 10 11:22:30 PST 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.2

[ ] -1 Do not release this package because ...


Thanks,


Re: On Mesos versioning and deprecation policy

2016-10-14 Thread Vinod Kone
We will chat about this in the upcoming community sync (Thursday 3 PM). So,
please make sure to attend if you are interested.

On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu  wrote:

>
> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu  wrote:
>
>> Thanks Alex for starting this!
>>
>> In addition to comments below, I think it'll be helpful to keep the
>> existing versioning doc concise and user-friendly while having a dedicated
>> doc for the "implementation details" where precise requirements and
>> procedures go. Maybe some duplication/cross-referencing is needed but Mesos
>> developers will find the latter much more helpful while the users/framework
>> developer will find the former easy to read.
>>
>> e.g., a similar split:
>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
>> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
>> (which has a lot of details on how the kubernetes
>> community is thinking about similar issues, which we can learn from)
>>
>> Jiang Yan Xu 
>>
>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov 
>> wrote:
>>
>>> Folks,
>>>
>>> There have been a bunch of online [1, 2] and offline discussions about our
>>> deprecation and versioning policy. I found that people, including
>>> myself, read the versioning doc [3] differently; moreover some aspects are
>>> not captured there. I would like to start a discussion around this topic by
>>> sharing my confusions and suggestions. This will hopefully help us stay on
>>> the same page and have similar expectations. The second goal is to
>>> eliminate ambiguities from the versioning doc (thanks Vinod for
>>> volunteering to update it).
>>>
>>
>> +1 Let me know if there are things I can help with.
>>
>>
>>>
>>> 1. API vs. semantic changes.
>>> The current versioning guide treats features (e.g. flags, metrics, endpoints)
>>> and API differently: incompatible changes for the former are allowed after
>>> a 6 month deprecation cycle, while for the latter they require bumping a
>>> major version. I suggest we consolidate these policies.
>>>
>>
>> I feel that the distinction is not API vs. semantic changes. A backwards
>> compatible API guarantee should imply backwards compatible semantics (of
>> the API),
>> i.e., if a change in API doesn't cause the message to be dropped on the
>> floor but leads to a behavior change that causes problems in the system, it
>> still breaks compatibility.
>>
>> IMO the distinction is more between:
>> - Compatibility between components that are impossible/very unpleasant to
>> upgrade in lockstep - high priority for compatibility guarantee.
>> - Compatibility between components that are generally bundled (modules)
>> or things that usually aren't built into automated tooling (e.g., the
>> /state endpoint) - more relaxed for now but we should explicitly exclude
>> them from the guarantee.
>>
>>
>>>
>>> We should also define and clearly explain what changes require bumping the
>>> major version. I have no strong opinion here and would love to hear what
>>> people think. The original motivation for maintaining backwards
>>> compatibility is to make sure vN schedulers can correctly work with vN API
>>> without being updated. But what about semantic changes that do not touch
>>> the API? For example, what if we decide to send fewer task health updates to
>>> schedulers based on some health policy? It influences the flow of task
>>> status updates; should such a change be considered compatible? Taking it to
>>> an extreme, we may not even be able to fix some bugs because someone may
>>> already rely on this behaviour!
>>>
>>
>> API changes should warrant a major version bump. Also the API is not just
>> what the machine reads but all the documentation associated with it, right?
>> It depends on what the documentation says; what the user _should_ expect.
>>
>> That said, I feel that these things are hard to be talked about in the
>> abstract. Even with a guideline, we still need to make case-by-case
>> decisions. (e.g., has the documentation precisely defined this precise
>> behavior? If not, is it reasonable for the users to expect some behavior
>> because it's common sense? How bad is it if some behavior just changes a
>> tiny bit?) Therefore we need to make sure the process for API changes is
>> more rigorously defined.
>>
>> Whether something is a bug depends on whether the API does what it says
>> it'll do. The line may sometimes be blurry but in general I don't feel it's
>> a problem. If someone is relying on the behavior that is a bug, we should
>> still help them fix it but the bug shouldn't count as "our guarantee".
>>
>>
>>>
>>> Another tightly related thing we should explicitly call out is
>>> upgradability and rollback capabilities inside a major release.
>>> Committing
>>> to this may significantly limit what we can change within a major
>>> release;
>>> on the other side it will give users more time and a 

Re: Force offer from all of the slaves

2016-11-28 Thread Vinod Kone
Once you set GLOG_v, you should be able to see lines like these "Framework
 filtered agent   for <123> seconds"

On Sun, Nov 27, 2016 at 8:18 AM, haosdent  wrote:

> > I choose the right offer and decline the rest.
> Hi, @krishnanvr Do you use up all available resources in that agent's
> offer? If so, that agent could not provide offers anymore until the
> resource release.
>
> And you may consider starting the master with the `GLOG_v=1` environment
> variable which would print more detail logs to help you debug this.
>
> On Sat, Nov 26, 2016 at 5:05 PM, Krishnanarayanan VR <
> krishna...@phonepe.com> wrote:
>
>> Hello:
>>
>> Is there a way to force ResourceOffers to get offers from all available
>> slaves ?
>>
>> Let me clarify:
>>
>> I have a single framework in my cluster. Each time ResourceOffers gets
>> the list of offers, I choose the right offer and decline the rest. But I
>> notice that next time a callback to ResourceOffers occurs, only a subset of
>> slaves is present in the offer. The slave from offer that was chosen in the
>> previous iteration is invariably absent.
>>
>> I also tried to set refuse_seconds to 0 in both LaunchTasks and
>> Decline (e.g. below):
>>
>> driver.DeclineOffer(offer.Id, {RefuseSeconds: proto.Float64(0)})
>>
>> ^^ but that didn't seem to help.
>>
>> Any pointers on how I can make sure I am presented with offers from all the
>> slaves all the time?
>>
>> Thanks
>>
>>
>>
>>
>
>
> --
> Best Regards,
> Haosdent Huang
>
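
For reference, the two knobs discussed in this thread look roughly like this
when expressed as v1 Scheduler HTTP API call bodies: a DECLINE with an
explicit refuse_seconds filter, and a REVIVE, which clears filters the
framework set earlier. This is only a sketch in Python, not the poster's Go
driver code; the helper names and the example IDs are made up.

```python
import json


def decline_call(framework_id, offer_ids, refuse_seconds):
    """Build a DECLINE call body with an explicit refuse_seconds filter."""
    return {
        "framework_id": {"value": framework_id},
        "type": "DECLINE",
        "decline": {
            "offer_ids": [{"value": oid} for oid in offer_ids],
            # If no filter is given, refuse_seconds defaults to 5 seconds.
            "filters": {"refuse_seconds": float(refuse_seconds)},
        },
    }


def revive_call(framework_id):
    """Build a REVIVE call body, which removes previously set filters."""
    return {"framework_id": {"value": framework_id}, "type": "REVIVE"}


if __name__ == "__main__":
    print(json.dumps(decline_call("example-fw", ["offer-1", "offer-2"], 0)))
    print(json.dumps(revive_call("example-fw")))
```

Neither call brings back resources that are in use: as haosdent notes above,
an agent whose offered resources were all consumed by a launched task has
nothing left to offer until that task finishes.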


Re: Mesos making offers for no CPU

2016-11-27 Thread Vinod Kone
On Sun, Nov 27, 2016 at 7:53 PM, Christopher Hunt <
christopher.h...@lightbend.com> wrote:

> My question here though is in the event of receiving a resource offer with
> no CPU in it, and then declining it, why shouldn’t my framework receive
> offers regarding other nodes with CPU? Surely a resource offer declination
> is an indicator that the particular node in question isn’t suitable and
> Mesos should move on…
>

You should receive offers from other nodes. Are there other frameworks in
your cluster that are starving out this framework? Can you check the master
logs (and paste them here) to see who is being sent offers for the other nodes?


[RESULT][VOTE] Release Apache Mesos 1.0.2 (rc3)

2016-11-15 Thread Vinod Kone
Hi all,


The vote for Mesos 1.0.2 (rc3) has passed with the following votes.


+1 (Binding)

--

Alex Rukletsov

Till Toenshoff

Yan Xu


There were no 0 or -1 votes.


Please find the release at:

https://dist.apache.org/repos/dist/release/mesos/1.0.2


It is recommended to use a mirror to download the release:

http://www.apache.org/dyn/closer.cgi


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.2


The mesos-1.0.2.jar has been released to:

https://repository.apache.org


The website (http://mesos.apache.org) will be updated shortly to reflect
this release.


Thanks,


[VOTE] Release Apache Mesos 1.0.2 (rc2)

2016-10-31 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.2.


This is a bug fix release.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.2-rc2




The candidate for Mesos 1.0.2 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc2/mesos-1.0.2.tar.gz


The tag to be voted on is 1.0.2-rc2:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.2-rc2


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc2/mesos-1.0.2.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.2-rc2/mesos-1.0.2.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1164


Please vote on releasing this package as Apache Mesos 1.0.2!


The vote is open until Thu Nov  3 16:34:20 PDT 2016 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.2

[ ] -1 Do not release this package because ...


Thanks,


Re: outstanding offers

2016-10-31 Thread Vinod Kone
Are you running a custom framework?

Can you see in scheduler logs which offers you are receiving? Am I
understanding your question correctly that Mesos thinks offers are being
sent to your framework but (you think) your framework hasn't received them?

Note that you can increase logging on the framework (driver) and Mesos
master by setting GLOG_v=1 in the environment.
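
As a rough illustration of "setting GLOG_v=1 in the environment", here is a small Python sketch that prepares such an environment for a child process. The helper name `verbose_glog_env` and the `./my-framework` binary in the comment are hypothetical; only the GLOG_v variable itself comes from the advice above.

```python
import os


def verbose_glog_env(base_env=None, level=1):
    """Return a copy of the environment with GLOG_v set to `level`."""
    env = dict(os.environ if base_env is None else base_env)
    env["GLOG_v"] = str(level)
    return env


# Hypothetical usage: start a scheduler binary with verbose driver logging.
# import subprocess
# subprocess.run(["./my-framework"], env=verbose_glog_env(), check=True)
```

Copying the environment rather than mutating `os.environ` keeps the extra logging limited to the one process being debugged.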

On Mon, Oct 31, 2016 at 12:42 AM, Hendrik Haddorp 
wrote:

> Hi,
>
> I have a Mesos 0.28.2 system and generally things seem to run fine. The
> "Outstanding Offers" normally shows nothing, which I believe is normal.
> However at some point my framework gets disconnected for some odd reason,
> might be due to some high load or so. A few seconds later I receive a
> reregistered call from Mesos. However it looks like around this time offers
> start to get listed on the "Outstanding Offers" page. Even more strangely, no
> Mesos log file contains any information for the offer IDs shown.
> Unfortunately the default logging does not show which offer IDs are being
> sent out, although it shows the IDs that are declined or accepted. So
> I don't know when these offers were actually sent out.
>
> How can I deal with such situation? Should I:
> Stop the SchedulerDriver when I get disconnected instead of waiting
> for a reregistered call?
> Is it advised to set --offer_timeout to recover from such a situation?
> Is there any way to reconcile offers like one can do for tasks?
>
> thanks,
> Hendrik
>


Re: On Mesos versioning and deprecation policy

2016-10-28 Thread Vinod Kone
We had an extended discussion around this in the last community sync.
Thanks to those who participated!

To sum up the discussion:

--> As mesos devs, we should strive to not make incompatible changes in
APIs, flags, environment variables.

--> In the rare case where an incompatible change is preferred (e.g., to
reduce code complexity), we should give users a clear 6-month heads-up that
a breaking change is going to take place.

--> Breaking changes do not necessitate a major version bump. This is
because we want to allow live upgrades between major versions (e.g., 1.10
to 2.0).

--> Compatibility guarantees do not apply to experimental features (incl.
APIs).

--> We need to have clear documentation about the procedure that devs should
follow when deprecating/removing stable features and adding experimental
features.

--> We need to improve upgrades.md to make it easy for operators to know
what features are deprecated/removed between versions X and Y.

--> We should decouple internal protos used by Mesos from the unversioned
protos used by driver based frameworks.
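
To make the 6-month heads-up concrete, here is a small Python sketch that computes the earliest date a deprecated feature could be removed. It assumes "6 months" means calendar months counted from the announcement date, which the summary above does not spell out; `add_months` and `earliest_removal` are illustrative names, not part of any Mesos tooling.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole calendar months, clamping to the month's end."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def earliest_removal(announced, notice_months=6):
    """Earliest removal date under the assumed calendar-month reading."""
    return add_months(announced, notice_months)


print(earliest_removal(date(2016, 10, 28)))
```

Under this reading, a deprecation announced on the date of this message could not take effect before late April 2017.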

I will spend some time in the next few weeks to create/update the
documentation reflecting these points.

Anything else I missed?

Thanks,

On Sat, Oct 15, 2016 at 11:47 AM, haosdent <haosd...@gmail.com> wrote:

> Thanks for @yan's great input! I agree with almost all of it.
>
> > Also the API is not just what the machine reads but all the documentation
> associated with it, right? It depends on what the documentation says; what
> the user _should_ expect.
>
> I think different users may have different expectations, and the person who
> developed the APIs may have a different understanding from some users as
> well. Our documentation should cover most cases.
>
> But in cases where we didn't, or forgot to, write it explicitly in the
> documentation, should we give up on updating the API? For example, user Alice
> may say this is a bug while user Bob says it is a feature. I think we still
> need to discuss it case by case to ensure most users are not affected by
> breaking API changes.
>
> On Sat, Oct 15, 2016 at 6:55 AM, Vinod Kone <vinodk...@apache.org> wrote:
>
> > We will chat about this in the upcoming community sync (thursday 3 PM).
> > So, please make sure to attend if you are interested.
> >
> > On Fri, Oct 14, 2016 at 3:44 PM, Yan Xu <xuj...@apple.com> wrote:
> >
> >>
> >> On Fri, Oct 14, 2016 at 3:37 PM, Yan Xu <xuj...@apple.com> wrote:
> >>
> >>> Thanks Alex for starting this!
> >>>
> >>> In addition to comments below, I think it'll be helpful to keep the
> >>> existing versioning doc concise and user-friendly while having a
> >>> dedicated doc for the "implementation details" where precise
> >>> requirements and procedures go. Maybe some duplication/cross-referencing
> >>> is needed but Mesos developers will find the latter much more helpful
> >>> while the users/framework developers will find the former easy to read.
> >>>
> >>> e.g., a similar split:
> >>> https://github.com/kubernetes/kubernetes/blob/master/docs/api.md
> >>> https://github.com/kubernetes/kubernetes/blob/master/docs/devel/api_changes.md
> >>> (which has a lot of details on how the kubernetes
> >>> community is thinking about similar issues, which we can learn from)
> >>>
> >>> Jiang Yan Xu 
> >>>
> >>> On Wed, Oct 12, 2016 at 9:34 AM, Alex Rukletsov <a...@mesosphere.com>
> >>> wrote:
> >>>
> >>>> Folks,
> >>>>
> >>>> There have been a bunch of online [1, 2] and offline discussions about
> >>>> our deprecation and versioning policy. I found that people, including
> >>>> myself, read the versioning doc [3] differently; moreover some aspects
> >>>> are not captured there. I would like to start a discussion around this
> >>>> topic by sharing my confusions and suggestions. This will hopefully help
> >>>> us stay on the same page and have similar expectations. The second goal
> >>>> is to eliminate ambiguities from the versioning doc (thanks Vinod for
> >>>> volunteering to update it).
> >>>>
> >>>
> >>> +1 Let me know if there are things I can help with.
> >>>
> >>>
> >>>>
> >>>> 1. API vs. semantic changes.
> >>>> Current versioning guide treat features (e.g. flags, metrics,
> endpoints)
> >>>> and API dif

Re: Quota

2016-12-09 Thread Vinod Kone
How many resources does the agent register with the master? How many
resources does spark task need?

I'm guessing marathon is not registered with "test" role so it is only
getting un-reserved resources which are not enough for spark task?
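
For reference, the quota described in the quoted message below (role "test", 0.5 CPU, 1G memory) would correspond to a request body roughly like the one this Python sketch builds. This is written from memory of the v1 operator API's SET_QUOTA call and is not taken from the thread, so the field names should be checked against the operator API docs for the Mesos version in use. The 1024 value assumes "1G" means 1024 MB; `scalar` and `set_quota_call` are illustrative helper names.

```python
import json


def scalar(name, value):
    """Build a scalar resource entry (name and value only)."""
    return {"name": name, "type": "SCALAR", "scalar": {"value": value}}


def set_quota_call(role, cpus, mem_mb):
    """Assumed shape of a v1 SET_QUOTA call body sent to the master."""
    return {
        "type": "SET_QUOTA",
        "set_quota": {
            "quota_request": {
                "force": False,
                "role": role,
                "guarantee": [scalar("cpus", cpus), scalar("mem", mem_mb)],
            }
        },
    }


body = json.dumps(set_quota_call("test", 0.5, 1024), indent=2)
print(body)
```

Note that the guarantee names a role; a framework registered under a different role is not the intended recipient of those resources, which is the point of the question above.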

On Fri, Dec 9, 2016 at 2:54 PM, Vijay Srinivasaraghavan <
vijikar...@yahoo.com> wrote:

> I have a standalone DCOS setup (Single node Vagrant VM running DCOS
> v.1.9-dev build + Mesos 1.0.1 + Marathon 1.3.0). Both master and agent are
> running on same VM.
>
> Resource: 4 CPU, 16GB Memory, 20G Disk
>
> I have created a quota using new V1 API which creates a role "test" with
> resource constraints of 0.5 CPU and 1G Memory.
>
> When I try to deploy Spark package, Marathon receives the request but the
> task is in "waiting" state since it did not receive any offers from Master
> though I don't see any resource constraints from the hardware perspective.
>
> However, when I deleted the quota, Marathon is able to move forward with
> the deployment and Spark was deployed/up and running. I could see from the
> Mesos master logs that it had sent an offer to the Marathon framework.
>
> To debug the issue, I was trying to create a quota but this time did not
> provide any CPU and Memory (0 cpu and 0 mem). After this, when I try to
> deploy Spark from DCOS UI, I could see Marathon getting offer from Master
> and able to deploy Spark without the need to delete the quota this time.
>
> Did anyone notice similar behavior?
>
> Regards
> Vijay
>


Re: Quota

2016-12-09 Thread Vinod Kone
And how many resources does spark need?
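
The agent side of the question can be read off the state JSON quoted below. As a minimal sketch, the Python snippet here takes a trimmed copy of that agent entry and checks whether a task of a given size would fit in what is unreserved and unused. The task size is hypothetical because the thread does not say what Spark asks for, and subtracting `used_resources` from `unreserved_resources` is a simplification that ignores reservations and ports.

```python
import json

# Trimmed copy of the agent entry quoted in this thread.
STATE = json.loads("""
{"slaves": [{
  "hostname": "xxx.xxx.xxx.100",
  "unreserved_resources": {"disk": 12099.0, "mem": 14863.0, "gpus": 0.0, "cpus": 4.0},
  "used_resources": {"disk": 0.0, "mem": 0.0, "gpus": 0.0, "cpus": 0.0}
}]}
""")


def free_unreserved(agent):
    """Unreserved minus used, per scalar resource (ports are ignored)."""
    unreserved = agent["unreserved_resources"]
    used = agent["used_resources"]
    return {
        name: amount - used.get(name, 0.0)
        for name, amount in unreserved.items()
        if isinstance(amount, (int, float))
    }


def fits(available, request):
    """True if every requested amount is available."""
    return all(available.get(name, 0.0) >= amount for name, amount in request.items())


free = free_unreserved(STATE["slaves"][0])
# Hypothetical task size; replace with what the Spark task actually requests.
print(free, fits(free, {"cpus": 1.0, "mem": 1024.0}))
```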

On Fri, Dec 9, 2016 at 4:05 PM, Vijay Srinivasaraghavan <
vijikar...@yahoo.com> wrote:

> Here is the slave state info. I see marathon is registered as
> "slave_public" role and is configured with "default_accepted_resource_roles"
> as "*"
>
> "slaves":[
>   {
>     "id":"69356344-e2c4-453d-baaf-22df4a4cc430-S0",
>     "pid":"slave(1)@xxx.xxx.xxx.100:5051",
>     "hostname":"xxx.xxx.xxx.100",
>     "registered_time":1481267726.19244,
>     "resources":{
>       "disk":12099.0,
>       "mem":14863.0,
>       "gpus":0.0,
>       "cpus":4.0,
>       "ports":"[1025-2180, 2182-3887, 3889-5049, 5052-8079, 8082-8180, 8182-32000]"
>     },
>     "used_resources":{
>       "disk":0.0,
>       "mem":0.0,
>       "gpus":0.0,
>       "cpus":0.0
>     },
>     "offered_resources":{
>       "disk":0.0,
>       "mem":0.0,
>       "gpus":0.0,
>       "cpus":0.0
>     },
>     "reserved_resources":{
>     },
>     "unreserved_resources":{
>       "disk":12099.0,
>       "mem":14863.0,
>       "gpus":0.0,
>       "cpus":4.0,
>       "ports":"[1025-2180, 2182-3887, 3889-5049, 5052-8079, 8082-8180, 8182-32000]"
>     },
>     "attributes":{
>     },
>     "active":true,
>     "version":"1.0.1"
>   }
> ],
>
> Regards
> Vijay
> On Friday, December 9, 2016 3:48 PM, Vinod Kone <vinodk...@apache.org>
> wrote:
>
>
> How many resources does the agent register with the master? How many
> resources does spark task need?
>
> I'm guessing marathon is not registered with "test" role so it is only
> getting un-reserved resources which are not enough for spark task?
>
> On Fri, Dec 9, 2016 at 2:54 PM, Vijay Srinivasaraghavan <
> vijikar...@yahoo.com> wrote:
>
> I have a standalone DCOS setup (Single node Vagrant VM running DCOS
> v.1.9-dev build + Mesos 1.0.1 + Marathon 1.3.0). Both master and agent are
> running on same VM.
>
> Resource: 4 CPU, 16GB Memory, 20G Disk
>
> I have created a quota using new V1 API which creates a role "test" with
> resource constraints of 0.5 CPU and 1G Memory.
>
> When I try to deploy Spark package, Marathon receives the request but the
> task is in "waiting" state since it did not receive any offers from Master
> though I don't see any resource constraints from the hardware perspective.
>
> However, when I deleted the quota, Marathon is able to move forward with
> the deployment and Spark was deployed/up and running. I could see from the
> Mesos master logs that it had sent an offer to the Marathon framework.
>
> To debug the issue, I was trying to create a quota but this time did not
> provide any CPU and Memory (0 cpu and 0 mem). After this, when I try to
> deploy Spark from DCOS UI, I could see Marathon getting offer from Master
> and able to deploy Spark without the need to delete the quota this time.
>
> Did anyone notice similar behavior?
>
> Regards
> Vijay
>
>
>
>
>


Welcome Haosdent Huang as Mesos Committer and PMC member!

2016-12-16 Thread Vinod Kone
Hi folks,

Please join me in formally welcoming Haosdent Huang as Mesos Committer and
PMC member.

Haosdent has been an active contributor to the project for more than a year
now. He has contributed a number of patches and features to the Mesos code
base, most notably the unified cgroups isolator and health check
improvements. The most impressive thing about him is that he always
volunteers to help out people in the community, be it on slack/IRC or
mailing lists. The fact that he does all this even though working on Mesos
is not part of his day job is even more impressive.

Here is his more formal checklist for your perusal.

Thanks,
Vinod

P.S: Sorry for the delay in sending the welcome email.


Re: Mesos YouTube Channel

2017-01-09 Thread Vinod Kone
Thanks for doing this MPark!

On Mon, Jan 9, 2017 at 6:21 PM, Michael Park  wrote:

> I've created a brand channel for Mesos on YouTube for community activities:
> https://www.youtube.com/channel/UC0wxLxgX8ilUn0m31lCpzAw.
>
> The only community activities currently captured in the channel are:
>   - Developer Community Meetings, and
>   - MesosCon presentations I've collected as "Saved Playlists".
>
> Going forward, I think we can use this channel for work group meetings as
> well
> once those are a little more fleshed out.
>
> Thanks,
>
> MPark
>


Re: Authentication module

2016-12-04 Thread Vinod Kone
Authentication is enabled for Mesos APIs used by schedulers (to talk to
master), operators (to talk to master/agent) and agents (to talk to
master). Executor to agent communication is not currently authenticated.

This might throw some light:
https://github.com/apache/mesos/blob/master/docs/authentication.md

On Fri, Dec 2, 2016 at 11:48 AM, Alexander Gallego 
wrote:

>
> For the authentication module:
> http://mesos.apache.org/documentation/latest/modules/ does it mean
> kerberos, ldap, etc. for tasks, for framework registration, or for
> machine registration?
>
> Are there any more docs on this?
>
>
>


Re: Proposal for evaluating Mesos scalability and robustness through stress test.

2017-01-06 Thread Vinod Kone
Great to hear!

Haven't looked at the doc yet, but I know some folks from Twitter were also
interested in this: https://issues.apache.org/jira/browse/MESOS-6768

Probably worth seeing whether the ideas can be consolidated?

On Fri, Jan 6, 2017 at 6:57 PM, Zhitao Li  wrote:

> (sending this again since previous attempt seemed bumped back)
>
> Hi folks,
>
> Like all of you, we are super excited to use Mesos to manage thousands of
> different applications on our large-scale clusters. As the number of
> applications and hosts keeps increasing, we are getting more and more curious
> about what the potential scalability limits/bottlenecks of Mesos'
> centralized architecture would be, and how robust it is in the face of
> various failures. If we can identify them in advance, we can probably manage
> and optimize them before we suffer any performance degradation.
>
> To explore Mesos' capability and close the knowledge gap, we have a
> proposal to evaluate Mesos scalability and robustness through stress tests,
> the draft of which can be found at: draft_link
>  qpXzHYFQAZGWjCdS3cZA/edit?usp=sharing>.
> Please feel free to provide your suggestions and feedback through comments
> on the draft.
>
> Probably many of you have similar questions as we have. We will be happy to
> share our findings in these experiments with the Mesos community. Please
> stay tuned.
>
> --
> Cheers,
>
> Ao Ma & Zhitao Li
>


Re: Mesos 1.1.1 release dashboard

2016-12-22 Thread Vinod Kone
Same deal with the next patch release for 1.0.x ;)

@vinodkone

> On Dec 22, 2016, at 10:15 AM, Alex Rukletsov  wrote:
> 
> Folks,
> 
> We are planning to cut the 1.1.1 release early next week. If you have any
> patches that need to get into 1.1.1, please make sure that either it is
> already in the 1.1.x branch or the corresponding ticket has a target
> version including 1.1.1 *by Monday* Dec 26.
> 
> The release dashboard:
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12329892
> 
> AlexR & Till.


Re: Welcome Guangya Liu as Mesos Committer and PMC member!

2016-12-16 Thread Vinod Kone
Congrats Guangya! Welcome to the PMC!

On Fri, Dec 16, 2016 at 7:03 PM, Sam  wrote:

> congratulations Guangya
>
> Sent from my iPhone
>
> On 17 Dec 2016, at 3:23 AM, Avinash Sridharan 
> wrote:
>
> Congrats Guangya !!
>
> On Fri, Dec 16, 2016 at 11:20 AM, Greg Mann  wrote:
>
>> Congratulations Guangya!!! :D
>>
>> On Fri, Dec 16, 2016 at 11:10 AM, Jie Yu  wrote:
>>
>>> Hi folks,
>>>
>>> Please join me in formally welcoming Guangya Liu as Mesos Committer and
>>> PMC member.
>>>
>>> Guangya has worked on the project for more than a year now and has been a
>>> very active contributor to the project. I think one of the most important
>>> contributions he has made to the community is that he helped grow the
>>> Mesos community in China. He initiated the Xian-Mesos-User-Group and
>>> successfully organized two meetups which attracted more than 100 people
>>> from Xi’an, China. He wrote a handful of blogs and articles in Chinese tech
>>> media which attracted a lot of interest in Mesos. He has given several
>>> talks about Mesos at conferences in China.
>>>
>>> His major coding contribution to the project was the docker volume driver
>>> isolator. He has also been involved in allocator performance improvement,
>>> gpu support for docker containerizer, Mesos Tiers/Optimistic Offer
>>> design, scarce resources discussion, and many others.
>>>
>>> His formal checklist is here:
>>> https://docs.google.com/document/d/1tot79kyJCTTgJHBhzStFKrVkDK4pXqfl-LHCLOovNtI/edit?usp=sharing
>>>
>>> Thanks,
>>> - Jie
>>>
>>
>>
>
>
> --
> Avinash Sridharan, Mesosphere
> +1 (323) 702 5245
>
>


Re: [VOTE] Release Apache Mesos 1.2.0 (rc2)

2017-03-03 Thread Vinod Kone
+1 (binding)

The perf issue and the flaky test that I reported earlier don't seem to be
blockers.

On Fri, Mar 3, 2017 at 4:01 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> I haven't heard any -1's so I'm going to go ahead and vote myself, from a
> DC/OS perspective:
>
> +1 (binding)
>
> I ran 1.2.0-rc2 through the DC/OS integration tests on top of the
> 1.9.0-rc1, which covers many Mesos features and tests multiple frameworks.
> See CI results of https://github.com/dcos/dcos/pull/1295
>
> This was then merged into DC/OS 1.9.0-rc2 which passed another suite of
> integration tests. Available for testing at
> https://dcos.io/releases/1.9.0-rc2/
>
>
> On Thu, Mar 2, 2017 at 12:02 AM, Adam Bordelon <a...@mesosphere.io> wrote:
>
>> TL;DR: No consensus yet. Let's extend the vote for a day or two, until we
>> have 3 +1s or a legit -1.
>> During that time we can test further, and investigate any issues that
>> have shown up.
>>
>> Here's a summary of what's been reported on the 1.2.0-rc2 vote thread:
>>
>> - There was a perf core dump on ASF CI, which is not necessarily a
>> blocker:
>> MESOS-7160  Parsing of perf version segfaults
>>   Perhaps fixed by backporting MESOS-6982: PerfTest.Version fails on
>> recent Arch Linux
>>
>> - There were a couple of (known/unsurprising) flaky tests:
>> MESOS-7185  DockerRuntimeIsolatorTest.ROOT_INTERNET_CURL_DockerDefaultEntryptRegistryPuller is flaky
>> MESOS-4570  DockerFetcherPluginTest.INTERNET_CURL_FetchImage seems flaky.
>>
>> - If we were to have an rc3, the following Critical bugs could be
>> included:
>> MESOS-7050  IOSwitchboard FDs leaked when containerizer launch fails --
>> leads to deadlock
>> MESOS-6982  PerfTest.Version fails on recent Arch Linux
>>
>> - Plus doc updates:
>> MESOS-7188 Add documentation for Debug APIs to Operator API doc
>> MESOS-7189 Add nested container launch/wait/kill APIs to agent API
>> docs.
>>
>>
>> On Wed, Mar 1, 2017 at 11:30 AM, Neil Conway <neil.con...@gmail.com>
>> wrote:
>>
>>> The perf core dump might be addressed if we backport this change:
>>>
>>> https://reviews.apache.org/r/56611/
>>>
>>> Although my guess is that this isn't a severe problem: for some
>>> as-yet-unknown reason, running `perf` on the host segfaulted, which
>>> causes the test to fail.
>>>
>>> Neil
>>>
>>> On Wed, Mar 1, 2017 at 11:09 AM, Vinod Kone <vinodk...@apache.org>
>>> wrote:
>>> > Tested on ASF CI.
>>> >
>>> > Saw 2 configurations fail. One was the perf core dump issue
>>> > <https://issues.apache.org/jira/browse/MESOS-7160>. Other is a known
>>> > (since 0.28.0) flaky test with Docker fetcher plugin
>>> > <https://issues.apache.org/jira/browse/MESOS-4570>.
>>> >
>>> > Withholding the vote until we know the severity of the perf core dump.
>>> >
>>> >
>>> > *Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff
>>> >
>>> >- refs/tags/1.1.1-rc2
>>> >
>>> > Configuration Matrix gcc clang
>>> > centos:7 --verbose --enable-libevent --enable-ssl autotools
>>> > [image: Success]
>>> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--ver
>>> bose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=
>>> 1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%
>>> 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>> > [image: Not run]
>>> > cmake
>>> > [image: Success]
>>> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose
>>> %20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
>>> 20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoo
>>> p)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>> > [image: Not run]
>>> > --verbose autotools
>>> > [image: Success]
>>> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Rel
>>> ease/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--ver
>>> bose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,
>>> label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
>>> > [image: Not run]
>>> > cmake
>>> > [image: Success]
>>> > <h

Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-03 Thread Vinod Kone
+1 (binding)

Since the perf issue I reported earlier doesn't seem to be a blocker.

On Fri, Mar 3, 2017 at 12:14 AM, Alex Rukletsov <a...@mesosphere.com> wrote:

> Was this perf issue introduced by one of the fixes included in 1.1.1-rc2?
> If not, I would suggest we vote for 1.1.1-rc2 and back port the perf fix
> into 1.1.2. IIUC, time based patch releases should *not be worse*, hence if
> the perf issue was already in 1.1.0 it is *fine* to fix it in 1.1.2. I
> would like to avoid postponing already belated 1.1.1 for even longer.
>
> On Wed, Mar 1, 2017 at 8:02 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
> > Tested on ASF CI.
> >
> > Saw 2 configurations fail with
> > https://issues.apache.org/jira/browse/MESOS-7160
> >
> > I think @jpeach and @bbannier were looking into this. Not sure about the
> > severity of the issue, so withholding my vote.
> >
> >
> > *Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff
> >
> >- refs/tags/1.1.1-rc2
> >
> > Configuration Matrix gcc clang
> > centos:7 --verbose --enable-libevent --enable-ssl autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%
> > 7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > --verbose autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,
> > ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_
> > exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%
> > 3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Not run]
> > ubuntu:14.04 --verbose --enable-libevent --enable-ssl autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Failed]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=clang,
> CONFIGURATION=--verbose%20--
> > enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%
> > 20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%
> > 7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > cmake
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--
> > verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> > docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=-
> > -verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=
> > GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(
> > docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > --verbose autotools
> > [image: Success]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,
> > ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,
> > label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
> > [image: Failed]
> > <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-
> > Release/30/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,
> > ENVIRONMENT=GLOG_v=1%20MESOS_VERBO

Re: [VOTE] Release Apache Mesos 1.1.1 (rc2)

2017-03-01 Thread Vinod Kone
Tested on ASF CI.

Saw 2 configurations fail with
https://issues.apache.org/jira/browse/MESOS-7160

I think @jpeach and @bbannier were looking into this. Not sure about the
severity of the issue, so withholding my vote.


*Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff

   - refs/tags/1.1.1-rc2

Configuration Matrix (results per compiler):

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Failed
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Failed
ubuntu:14.04  --verbose                                 cmake       Success  Success
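
To pick the failing cells out of the matrix in this message mechanically, one could transcribe it into a small structure and filter on status. The Python sketch below does that with the results reported here; the `RESULTS` layout and the `failed` helper are illustrative choices, not how ASF CI exposes its data.

```python
SSL = "--verbose --enable-libevent --enable-ssl"
PLAIN = "--verbose"

# (os, configure flags, build tool) -> status per compiler, as reported above.
RESULTS = {
    ("centos:7", SSL, "autotools"): {"gcc": "Success", "clang": "Not run"},
    ("centos:7", SSL, "cmake"): {"gcc": "Success", "clang": "Not run"},
    ("centos:7", PLAIN, "autotools"): {"gcc": "Success", "clang": "Not run"},
    ("centos:7", PLAIN, "cmake"): {"gcc": "Success", "clang": "Not run"},
    ("ubuntu:14.04", SSL, "autotools"): {"gcc": "Success", "clang": "Failed"},
    ("ubuntu:14.04", SSL, "cmake"): {"gcc": "Success", "clang": "Success"},
    ("ubuntu:14.04", PLAIN, "autotools"): {"gcc": "Success", "clang": "Failed"},
    ("ubuntu:14.04", PLAIN, "cmake"): {"gcc": "Success", "clang": "Success"},
}


def failed(results):
    """Return sorted (os, flags, tool, compiler) tuples whose status is Failed."""
    return sorted(
        (os_name, flags, tool, compiler)
        for (os_name, flags, tool), cells in results.items()
        for compiler, status in cells.items()
        if status == "Failed"
    )


for cell in failed(RESULTS):
    print(cell)
```

Both failures land on ubuntu:14.04 with clang and autotools, which matches the "2 configurations" mentioned at the top of this message.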


On Mon, Feb 27, 2017 at 5:54 AM, Alex Rukletsov  wrote:

> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 1.1.1.
>
> 1.1.1 includes the following:
> 
> 
> ** Bug
>   * [MESOS-6002] - The whiteout file cannot be removed correctly using
> aufs backend.
>   * [MESOS-6010] - Docker registry puller shows decode error "No response
> decoded".
>   * [MESOS-6142] - Frameworks may RESERVE for an arbitrary role.
>   * [MESOS-6360] - The handling of whiteout files in provisioner is not
> correct.
>   * [MESOS-6411] - Add documentation for CNI port-mapper plugin.
>   * [MESOS-6526] - `mesos-containerizer launch --environment` exposes
> executor env vars in `ps`.
>   * [MESOS-6571] - Add "--task" flag to mesos-execute.
>   * [MESOS-6597] - Include v1 Operator API protos in generated JAR and
> python packages.
>   * [MESOS-6606] - Reject optimized builds with libcxx before 3.9.
>   * [MESOS-6621] - SSL downgrade path will CHECK-fail when using both
> 

Re: [VOTE] Release Apache Mesos 1.2.0 (rc2)

2017-03-01 Thread Vinod Kone
Tested on ASF CI.

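Saw 2 configurations fail. One was the perf core dump issue
(https://issues.apache.org/jira/browse/MESOS-7160). The other is a known
(since 0.28.0) flaky test with the Docker fetcher plugin
(https://issues.apache.org/jira/browse/MESOS-4570).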

Withholding the vote until we know the severity of the perf core dump.


*Revision*: b9d8202a7444d0d1e49476bfc9817eb4583beaff

   - refs/tags/1.1.1-rc2

Configuration Matrix (results per compiler):

OS            Configuration                             Build tool  gcc      clang
centos:7      --verbose --enable-libevent --enable-ssl  autotools   Success  Not run
centos:7      --verbose --enable-libevent --enable-ssl  cmake       Success  Not run
centos:7      --verbose                                 autotools   Success  Not run
centos:7      --verbose                                 cmake       Success  Not run
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  autotools   Success  Failed
ubuntu:14.04  --verbose --enable-libevent --enable-ssl  cmake       Success  Success
ubuntu:14.04  --verbose                                 autotools   Success  Failed
ubuntu:14.04  --verbose                                 cmake       Success  Success


On Wed, Mar 1, 2017 at 9:24 AM, Greg Mann  wrote:

> I wanted to give a heads up on a flaky test failure I've encountered while
> testing this RC:
> 'DockerRuntimeIsolatorTest.ROOT_INTERNET_CURL_DockerDefaultEntryptRegistryPuller'.
> One issue related to this test was resolved recently
> (https://issues.apache.org/jira/browse/MESOS-6001), but this seems to be a
> separate issue (https://issues.apache.org/jira/browse/MESOS-7185). I haven't
> had time to triage yet so I'm not sure if this represents a legitimate bug,
> but I thought I'd email here to increase visibility while the vote is out.
>
> Cheers,
> Greg
>
>
> On Fri, Feb 24, 2017 at 1:14 AM, Adam Bordelon  wrote:
>
> > Dear Mesos developers and users,
> >
> > Please vote on releasing the following candidate as Apache Mesos 1.2.0.
> >
> > 1.2.0 includes the following:
> > 
> > 
> >   * 

Re: resourceOffer

2017-03-07 Thread Vinod Kone
Hmm. These logs do not have enough information. All I see is a master
starting up and an agent re-registering with a bunch of orphan tasks.  I
don't see the framework re-registering with the master at all.

On Tue, Mar 7, 2017 at 9:41 AM, Oeg Bizz <oegb...@yahoo.com> wrote:

> Sure, there they are.
>
>
> On Tuesday, March 7, 2017 12:34 PM, Vinod Kone <vinodk...@gmail.com>
> wrote:
>
>
> Can you share master log?
>
> @vinodkone
>
> On Mar 7, 2017, at 2:54 AM, Oeg Bizz <oegb...@yahoo.com> wrote:
>
> Hi,
>    I am new to Mesos and have started exploring its usability for a new
> project I will be involved in. I wrote a scheduler and an executor and I am
> able to send one task, which is executed properly. After the first task is
> finished I no longer get resourceOffer() invocations to my Scheduler. What am
> I missing? If I do not send a task I can see the resourceOffer calls
> consistently every 5 seconds or so. Also, does Mesos send all of the
> resources every time or just a partial list? Thanks in advance for any
> help,
>
> Oscar
>
>
>
>


[VOTE] Release Apache Mesos 1.0.4 (rc1)

2017-04-17 Thread Vinod Kone
Hi all,

Please vote on releasing the following candidate as Apache Mesos 1.0.4.


1.0.4 includes the following:

* [MESOS-2537] - AC_ARG_ENABLED checks are broken
* [MESOS-6606] - Reject optimized builds with libcxx before 3.9
* [MESOS-7008] - Quota not recovered from registry in empty cluster.
* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire
  persistent volume content.
* [MESOS-7383] - Docker executor logs possibly sensitive parameters.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.4-rc1




The candidate for Mesos 1.0.4 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc1/mesos-1.0.4.tar.gz


The tag to be voted on is 1.0.4-rc1:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.4-rc1


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc1/mesos-1.0.4.tar.gz.md5


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc1/mesos-1.0.4.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1184


Please vote on releasing this package as Apache Mesos 1.0.4!


The vote is open until Thu Apr 20 15:42:56 PDT 2017 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.4

[ ] -1 Do not release this package because ...
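
The passing rule above can be written down as a tiny check. This Python sketch encodes one reading of "a majority of at least 3 +1 PMC votes": at least three binding +1 votes, and more binding +1 than -1 votes. That reading is an assumption based on the usual ASF release-vote convention, and `vote_passes` is an illustrative name.

```python
def vote_passes(binding_plus_ones, binding_minus_ones, minimum=3):
    """True if there are enough binding +1s and they outnumber the -1s."""
    return (
        binding_plus_ones >= minimum
        and binding_plus_ones > binding_minus_ones
    )


print(vote_passes(3, 0))  # three binding +1s, no -1s
print(vote_passes(2, 0))  # too few binding +1s
```

Non-binding votes are deliberately left out of the count; they inform the release manager but do not change the outcome under this reading.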


Thanks,


Re: [VOTE] Release Apache Mesos 1.0.4 (rc1)

2017-04-24 Thread Vinod Kone
+1 (binding)

Tested on ASF CI.

*Revision*: 71e41f166f671c988e36c1bf04728ec3589eb509

   - refs/tags/1.0.4-rc1

Configuration Matrix

centos:7, --verbose --enable-libevent --enable-ssl, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

centos:7, --verbose --enable-libevent --enable-ssl, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

centos:7, --verbose, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

centos:7, --verbose, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

ubuntu:14.04, --verbose --enable-libevent --enable-ssl, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

ubuntu:14.04, --verbose --enable-libevent --enable-ssl, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

ubuntu:14.04, --verbose, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

ubuntu:14.04, --verbose, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/31/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

On Mon, Apr 17, 2017 at 4:49 PM, Adam Bordelon <a...@mesosphere.io> wrote:

> -0, wish we could include the fix for
> https://issues.apache.org/jira/browse/MESOS-7265 in 1.0.4, but I won't
> hold the release for it.
>
> On Mon, Apr 17, 2017 at 3:44 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
>> Hi all,
>>
>> Please vote on releasing the following candidate as Apache Mesos 1.0.4.
>>
>>
>> 1.0.4 includes the following:
>>
>> 
>> 
>>
>> * [MESOS-2537] - AC_ARG_ENABLED checks are broken
>>
>>
>> * [MESOS-6606] - Reject optimized builds with libcxx before 3.9
>>
>>
>> * [MESOS-7008] - Quota not recovered from registry in empty cluster.
>>
>>
>> * [MESOS-7366] - Agent sandbox gc could accidentally delete the
>> entire persistent volume content.
>>

Re: [VOTE] Release Apache Mesos 1.3.1 (rc1)

2017-08-01 Thread Vinod Kone
+1 (binding)

Tested on ASF CI. The 2 red builds are known flaky tests (health checks)
and a perf core dump issue that's fixed on HEAD.

*Revision*: 1beaede8c13f0832d4921121da34f924deec8950

   - refs/tags/1.3.1-rc1

Configuration Matrix

centos:7, --verbose --enable-libevent --enable-ssl, autotools
  gcc: Failed
  clang: Not run

centos:7, --verbose --enable-libevent --enable-ssl, cmake
  gcc: Success
  clang: Not run

centos:7, --verbose, autotools
  gcc: Success
  clang: Not run

centos:7, --verbose, cmake
  gcc: Success
  clang: Not run

ubuntu:14.04, --verbose --enable-libevent --enable-ssl, autotools
  gcc: Success
  clang: Success

ubuntu:14.04, --verbose --enable-libevent --enable-ssl, cmake
  gcc: Success
  clang: Success

ubuntu:14.04, --verbose, autotools
  gcc: Success
  clang: Failed

ubuntu:14.04, --verbose, cmake
  gcc: Success
  clang: Success

On Fri, Jul 28, 2017 at 5:45 PM, Michael Park  wrote:

> Hi all,
>
> Please vote on releasing the following candidate as Apache Mesos 1.3.1.
>
> The CHANGELOG for the release is available at:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.3.1-rc1
> 
> 
>
> The candidate for Mesos 1.3.1 release is available at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.1-rc1/mesos-1.3.1.tar.gz
>
> The tag to be voted on is 1.3.1-rc1:
> https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.3.1-rc1
>
> The MD5 checksum of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.1-rc1/mesos-1.3.1.tar.gz.md5
>
> The signature of the tarball can be found at:
> https://dist.apache.org/repos/dist/dev/mesos/1.3.1-rc1/mesos-1.3.1.tar.gz.asc
>
> The PGP key used to sign the release is here:
> https://dist.apache.org/repos/dist/release/mesos/KEYS
>
> The JAR is up in Maven in a staging repository here:
> https://repository.apache.org/content/repositories/orgapachemesos-1200
>
> Please vote on 

Re: How to determine Mesos Capabilities?

2017-07-05 Thread Vinod Kone
When a scheduler registers or re-registers with the master, `MasterInfo` is
passed to the callback. It includes the master's version, which can be used
to infer which capabilities the master has. This is admittedly not great;
there is a ticket to introduce master capabilities and include them in
MasterInfo: https://issues.apache.org/jira/browse/MESOS-5675
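
A minimal sketch of the version-based check described above, in Python. It
assumes the version arrives as a "major.minor.patch" string (optionally with a
suffix such as "-rc1"). The helper names and the threshold value are
hypothetical placeholders, not a statement of which release introduced any
particular feature.

```python
import re


def parse_version(version):
    """Turn '1.3.0' or '1.0.4-rc1' into a comparable tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def master_at_least(master_version, minimum):
    """Return True if master_version is the same as or newer than minimum."""
    return parse_version(master_version) >= parse_version(minimum)


# Hypothetical threshold: replace with the release that actually added the
# feature your framework depends on.
MIN_VERSION_FOR_FEATURE = "1.2.0"
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such
as "1.10.0" sorting before "1.9.0".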

On Wed, Jul 5, 2017 at 3:58 AM, Tomek Janiszewski  wrote:

> Here is the context of this problem:
> https://github.com/mesosphere/marathon/pull/5406#discussion_r125454193
> I want to backport support for Mesos HealthChecks to Marathon 1.3. How can
> I ensure that Mesos supports HTTP/TCP health checks from Marathon's
> perspective?
>
> On Tue, Jul 4, 2017 at 5:56 PM, Tomek Janiszewski wrote:
>
>> Hi
>>
>> Mesos allows frameworks to declare their capabilities. How can I get the
>> capabilities of Mesos itself from the framework's perspective?
>>
>> For example, I'm developing a framework that would use Mesos health
>> checks. How can I determine whether the Mesos version supports them? I
>> think this should be part of the subscription response. Currently I need
>> to query the Mesos API after subscription to get the Mesos version and
>> configuration. What is the best practice for this?
>>
>> Thanks
>> Tomek
>>
>


Fwd: Github's disappearing mirrors

2017-04-28 Thread Vinod Kone
FYI

-- Forwarded message --
From: Chris Lambertus 
Date: Fri, Apr 28, 2017 at 12:22 PM
Subject: Github's disappearing mirrors
To: committers 


Hello committers,

We have received quite a few reports of github mirrors gone missing. We’ve
tracked this down to an errant process at Github which appears to be deleting
not only ours but also other orgs’ mirrors. We contacted Github but have yet
to receive a reply. Another organization also contacted github and received
the following reply:

"Hi there, Sorry for the trouble! We've now had a couple of reports of this
problem, and we've opened an issue internally to investigate.  I don't have
an ETA on a fix, but we'll be in touch if we need more information from you
or if we have any information to share.  Regards, Laura GitHub Support"


We have no further information at this time. We have been restoring the
mirrors wherever possible, but until the root cause is resolved on Github’s
side, we expect mirrors to continue to be erroneously removed.

Access to the repos via the usual https://git-wip-us.apache.org/ channel
remains functional.

-Chris
ASF Infra




Re: dynamic resource reservations

2017-07-28 Thread Vinod Kone
Typically a framework with no role cannot use resources reserved for
another role. So, it would be interesting to see what happened.

Also, please be aware that directly upgrading from 0.28.0 to 1.3.0 is not
supported. You need to go from 0.28.0 to 1.0.0 and then jump from 1.0.0 to
1.3.0.
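
As a rough illustration of the reservation rule in the first paragraph (this
is not Mesos code), the check can be sketched in Python as below. It assumes
each framework has a single role, that "*" denotes the default role and
unreserved resources, and it ignores hierarchical roles; the function name is
hypothetical.

```python
def can_use(framework_role, reserved_role):
    """Return True if a framework in framework_role may use a resource
    whose reservation role is reserved_role ('*' means unreserved)."""
    return reserved_role == "*" or reserved_role == framework_role
```

Under these assumptions a framework running with role "*" can only use
unreserved resources, which is why the behaviour reported below is surprising.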

On Fri, Jul 28, 2017 at 4:01 AM, Hendrik Haddorp 
wrote:

> Hi,
>
> we did a migration from Mesos 0.28 to 1.3.0 and somehow it looks like one
> framework "stole" resources another framework had reserved earlier.
> Unfortunately I do not have any logs for the time frame so I'm not certain
> what exactly happened. Currently we have one framework running with a role
> and principal while the others are running with roles * and no principal.
> Would a framework running with no role be able to use a resource that
> another framework reserved for a specific role?
>
> regards,
> Hendrik
>


Re: Containerizers & Executors

2017-07-30 Thread Vinod Kone
See my answers inline.


> 1. Mesos Containerizer
> - posix isolators
> - cgroups isolators
>

Mesos containerizer also allows you to use custom isolators.



> 2. Docker containerizer
> - docker isolators
>

Docker containerizer doesn't have a concept of isolator(s).



> 3. Custom containerizer
> - my isolators
>

It is up to the custom containerizer how it wants to do containerization;
it may or may not have a concept of isolators.


> - Executors:
> Generally: Each executor has the minimum resources assigned by default
> (0.01 CPU & 32MB MEM)
> Executor expands its resources when a task is assigned
> (executor default resources + task resources)
>

Only the built-in "default" executor needs to have a minimum amount of
resources. Other built-in executors and custom executors can technically
have zero resources.



> 1. Mesos commandExecutor
> - run shell commands or docker
> - Each executor is a container that can have only one task to
> execute, you can't specify group of tasks
> - Isolation between executors/containers so isolation between
> tasks because each task runs in one container
>

Note that the executor that runs shell commands is called the "command"
executor (run by the Mesos containerizer), whereas the one that runs Docker
images is called the "docker" executor (run by the Docker containerizer).



> 2. Mesos defaultExecutor
> - can run shell commands or a custom executor file e.g
> TestExecutor.java (from tests)
> - can execute one task per executor/container or multiple tasks (1
> group).
> - No resource isolation between tasks of the same container
>

The "default" executor is another built-in executor. It can run a group of
tasks. It does not run any other (custom) executor.



> 3. Custom Executor
> - ?
>

You could write a custom executor that can run a single task or a group of
tasks. It is totally up to you.



> So, i guess i can use one offer to run some tasks on the same agent with
> commandExecutor or with defaultExecutor….
> But how would somebody specify if the offer corresponds to one agent or
> multiple agents?
>

Each offer has an 'AgentId' which corresponds to one agent.
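
A small Python sketch of how a scheduler might use that field to see which
offers belong to which agent. It assumes offers are available as dicts shaped
like `{"agent_id": {"value": "..."}}`; that shape and the function name are
assumptions for illustration, so adapt them to whatever client library you
use.

```python
from collections import defaultdict


def group_offers_by_agent(offers):
    """Map each agent ID to the list of offers received for that agent."""
    grouped = defaultdict(list)
    for offer in offers:
        # Assumed layout: the agent ID is nested under "agent_id" -> "value".
        grouped[offer["agent_id"]["value"]].append(offer)
    return dict(grouped)
```

Tasks launched against a given offer run on that offer's agent, so grouping
like this shows what can be co-located.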

HTH,
Vinod


[VOTE] Release Apache Mesos 1.0.4 (rc2)

2017-05-02 Thread Vinod Kone
Hi all,


Please vote on releasing the following candidate as Apache Mesos 1.0.4.


1.0.4 includes the following:



* [MESOS-2537] - AC_ARG_ENABLED checks are broken


* [MESOS-6606] - Reject optimized builds with libcxx before 3.9


* [MESOS-7008] - Quota not recovered from registry in empty cluster.


* [MESOS-7265] - Containerizer startup may cause sensitive data to leak
into sandbox logs.

* [MESOS-7366] - Agent sandbox gc could accidentally delete the entire
persistent volume content.

* [MESOS-7383] - Docker executor logs possibly sensitive parameters.


* [MESOS-7422] - Docker containerizer should not leak possibly
sensitive data to agent log.


The CHANGELOG for the release is available at:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob_plain;f=CHANGELOG;hb=1.0.4-rc2




The candidate for Mesos 1.0.4 release is available at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc2/mesos-1.0.4.tar.gz


The tag to be voted on is 1.0.4-rc2:

https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=commit;h=1.0.4-rc2


The MD5 checksum of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc2/mesos-1.0.4.tar.gz.md5
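
For anyone checking the tarball by hand, here is a minimal Python sketch of
the MD5 comparison. It assumes you have already downloaded the tarball and
copied the expected hex digest out of the .md5 file; the function names are
illustrative, and the layout of the .md5 file itself is not assumed here.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return md5_hex(path) == expected_hex.strip().lower()
```

A matching digest only shows the download is intact; checking the signature
against the KEYS file is a separate step.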


The signature of the tarball can be found at:

https://dist.apache.org/repos/dist/dev/mesos/1.0.4-rc2/mesos-1.0.4.tar.gz.asc


The PGP key used to sign the release is here:

https://dist.apache.org/repos/dist/release/mesos/KEYS


The JAR is up in Maven in a staging repository here:

https://repository.apache.org/content/repositories/orgapachemesos-1186


Please vote on releasing this package as Apache Mesos 1.0.4!


The vote is open until Fri May  5 12:02:42 PDT 2017 and passes if a
majority of at least 3 +1 PMC votes are cast.


[ ] +1 Release this package as Apache Mesos 1.0.4

[ ] -1 Do not release this package because ...
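
One reading of the passing rule above, sketched in Python: at least three
binding (PMC) +1 votes, and more binding +1 than binding -1 votes. The
function name is hypothetical, and votes such as -0 are treated as counting
for neither side.

```python
def vote_passes(binding_plus_ones, binding_minus_ones, minimum_plus_ones=3):
    """Return True if the binding votes satisfy the majority rule."""
    return (binding_plus_ones >= minimum_plus_ones
            and binding_plus_ones > binding_minus_ones)
```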


Thanks,


Re: [VOTE] Release Apache Mesos 1.0.4 (rc2)

2017-05-03 Thread Vinod Kone
+1 (binding)

*Revision*: 4154f66d6c6dde8fd2cf2bbf0bfa155f24ac55d4

   - refs/tags/1.0.4-rc2

Configuration Matrix

centos:7, --verbose --enable-libevent --enable-ssl, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

centos:7, --verbose --enable-libevent --enable-ssl, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

centos:7, --verbose, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

centos:7, --verbose, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=centos%3A7,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Not run

ubuntu:14.04, --verbose --enable-libevent --enable-ssl, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

ubuntu:14.04, --verbose --enable-libevent --enable-ssl, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose%20--enable-libevent%20--enable-ssl,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

ubuntu:14.04, --verbose, autotools
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=autotools,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

ubuntu:14.04, --verbose, cmake
  gcc: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=gcc,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>
  clang: Success
  <https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Release/32/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1,OS=ubuntu%3A14.04,label_exp=(docker%7C%7CHadoop)&&(!ubuntu-us1)&&(!ubuntu-eu2)/>

On Tue, May 2, 2017 at 4:03 PM, Benjamin Mahler <bmah...@apache.org> wrote:

> +1 make check passes on macOS 10.12.4 with clang
>
> On Tue, May 2, 2017 at 12:04 PM, Vinod Kone <vinodk...@apache.org> wrote:
>
> > Hi all,
> >
> >
> > Please vote on releasing the following candidate as Apache Mesos 1.0.4.
> >
> >
> > 1.0.4 includes the following:
> >
> > 
> > 
> >
> > * [MESOS-2537] - AC_ARG_ENABLED checks are broken
> >
> >
> > * [MESOS-6606] - Reject optimized builds with libcxx before 3.9
> >
> >
> > * [MESOS-7008] - Quota not recovered from registry in empty cluster.
> >
> >
> > * [MESOS-7265] - Containerizer startup may cause sensitive data to leak
> > into sandbox logs.
> >
> > * [MESOS-7366] - Agent 
