Re: MesosCon 2018 Location Change

2018-08-26 Thread David Palaitis
I’d be happy to support a mini-con / meetup in NYC at the same time. Reach out 
if you’re interested. 

> On Aug 26, 2018, at 1:02 AM, Vaibhav Khanduja  
> wrote:
> 
> +1 for bay area.
> 
> Thx
> 
>> On Sat, Aug 25, 2018, 3:20 PM Jörg Schad  wrote:
>> Just one more comment on the reasoning here: 
>> We (i.e., the PC) want MesosCon to be a user-driven conference and hence 
>> have the conference at a location where we can gather most users.
>> We understand it might be more difficult to travel to the Bay Area from 
>> Europe, but we are already considering EU-timezone-friendly working group 
>> meetings which could be joined remotely. Stay tuned here.
>> We understand this is a beyond-last-minute change, but we are considering it 
>> as a result of community (i.e., everyone here) feedback.
>> 
>> Please also consider this is the first time we are organizing MesosCon as 
>> a community ourselves (in previous years it was organized by the Linux 
>> Foundation) and so far I must say kudos to everyone involved. It is great to 
>> see everyone working on making it a great Mesos (+Marathon, + Paasta, + ...) 
>> community conference!
>> 
>> Also feel free to reach out personally if you have questions! 
>> 
>> 
>> 
>>> On Fri, Aug 24, 2018 at 2:23 PM, Sunil Shah  wrote:
>>> Hey everyone,
>>> 
>>> As we continue to organise this year's MesosCon, I wanted to ask for your 
>>> preferences on location of the conference. Several community members have 
>>> expressed a desire to have the conference in the Bay Area (as opposed to 
>>> New York, as currently planned).
>>> 
>>> As a reminder, this year's MesosCon is a community run conference and is 
>>> planned for November 5th to 7th.
>>> 
>>> Please let me know if you have any strong feelings one way or another and 
>>> I'll take a summary back to the MesosCon Committee.
>>> 
>>> Cheers,
>>> 
>>> Sunil
>>> (P.S., If you haven't submitted a talk already, please do!)
>>> 
>>> 
>> 


Re: Scheduler for distributed builds

2016-04-27 Thread David Palaitis
Cook is a fit in the case where you have more jobs than resources available to 
run them. It manages large job queues, prioritizes jobs and balances resources 
fairly across users, roles, etc.

Cook is definitely interesting for a build farm, e.g. distributed Bazel, 
although the scheduling overhead for the job sizes you mention doesn't seem 
worth it. If you were to do parallelism at the code base level then I could see 
a possible fit with Cook. 
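
To put rough numbers on the overhead point, here is a back-of-the-envelope sketch. The 0.3 s task length stands in for "a few hundred ms" and the 1 s per-task scheduling latency stands in for "within seconds of submission"; both figures are assumptions for illustration, not measurements of Cook.

```python
def useful_fraction(task_seconds: float, overhead_seconds: float) -> float:
    """Fraction of wall-clock time spent on real work for one scheduled unit."""
    return task_seconds / (task_seconds + overhead_seconds)


# Assumed figures, for illustration only.
TASK_SECONDS = 0.3      # stand-in for "a few hundred ms" per build task
OVERHEAD_SECONDS = 1.0  # placeholder per-job scheduling latency

# One scheduled job per tiny task: most of the time goes to scheduling.
per_task = useful_fraction(TASK_SECONDS, OVERHEAD_SECONDS)

# One scheduled job per batch of 100 tasks: the same overhead is amortized.
per_batch = useful_fraction(100 * TASK_SECONDS, OVERHEAD_SECONDS)

print(f"per-task scheduling: {per_task:.0%} useful work")
print(f"batches of 100:      {per_batch:.0%} useful work")
```

Under these assumed numbers, scheduling each tiny task on its own wastes most of the wall-clock time, while scheduling at a coarser level (for example per code base) amortizes the cost.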

Alternatively, schedule a set of long-running build workers with Marathon and 
proxy work requests through Kafka. 
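
A minimal sketch of that worker pattern, with an in-process queue.Queue standing in for Kafka and threads standing in for the Marathon-managed workers. The names (worker, run_builds) and the "built ..." result strings are hypothetical; a real setup would use a Kafka client and separate processes.

```python
import queue
import threading


def worker(requests: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Long-running worker: pull build requests until a None sentinel arrives."""
    while True:
        job = requests.get()
        if job is None:  # sentinel: shut this worker down
            break
        outcome = f"built {job}"  # placeholder for the real build step
        with lock:
            results.append(outcome)


def run_builds(jobs, num_workers=4):
    """Feed jobs to a fixed pool of workers and collect the results."""
    requests = queue.Queue()
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(requests, results, lock))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for job in jobs:
        requests.put(job)
    for _ in threads:  # one sentinel per worker
        requests.put(None)
    for thread in threads:
        thread.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_builds(["a.c", "b.c", "c.c"], num_workers=2))
```

The workers are started once and stay up, so no scheduler is involved per task; only the queue hand-off sits on the critical path.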

I'd be happy to discuss in more detail if you think Cook might be a fit. 

Sent from my iPhone

> On Apr 27, 2016, at 6:06 PM, Erb, Stephan  wrote:
> 
> FWIW, Apache Aurora also supports ad-hoc jobs. In contrast to Chronos, 
> however, it does so without job dependencies.
> From: Guangya Liu 
> Sent: Wednesday, April 27, 2016 09:59
> To: user@mesos.apache.org
> Subject: Re: Scheduler for distributed builds
>  
> Chronos may also help in your case: http://mesos.github.io/chronos/
> 
>> On Wed, Apr 27, 2016 at 6:40 AM, David Greenberg  
>> wrote:
>> http://github.com/twosigma/cook could be a good fit for this. It supports 
>> scheduling arbitrary jobs within seconds of submission, and it has advanced 
>> QoS features.
>> 
>>> On Tue, Apr 26, 2016 at 3:13 PM Paulo Gallo  wrote:
>>> Hi,
>>> 
>>> I'm looking for a scheduler to do distributed builds, i.e. most of the 
>>> tasks would be short lived, like a few hundred ms long.
>>> 
>>> Is there any Mesos-based scheduler that would be a good fit for this?
>>> 
>>> Any help is appreciated.
>>> 
>>> Thanks,
>>> -Paulo
>>> 
>>> PS: I know that Jenkins supports distributed builds and integrates with 
>>> Mesos, but we're looking for alternatives.
> 


RE: Cassandra Mesos Framework Issue

2014-10-19 Thread David Palaitis
I'm using 
https://mesosphere.com/2014/02/12/cassandra-on-mesos-scalable-enterprise-storage/

I'm using Mesos 0.19 here and I think that may be contributing to the 
issue. I'm giving it a try with a clean build of 0.20.



-----Original Message-----
From: rasput...@gmail.com [mailto:rasput...@gmail.com] On Behalf Of Dick Davies
Sent: Sunday, October 19, 2014 7:35 AM
To: user@mesos.apache.org
Subject: Re: Cassandra Mesos Framework Issue

Issue seems to be with how the tasks are asking for port resources - I'd guess 
whichever tutorial you're using may be using an old/invalid syntax.
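
For illustration: Mesos models ports as a ranges resource (as in the '[31000-32000]' flag value quoted below), so a task asking for the scalar ports value 0 has nothing valid to bind to. One reading of the log is that the framework treats ports as a scalar. The sketch below is plain Python, not the framework's code, and the helper names are hypothetical; it shows what picking ports out of an offered range looks like.

```python
def parse_ranges(text):
    """Parse a ranges string such as '[31000-32000]' into (low, high) tuples."""
    body = text.strip().strip("[]")
    ranges = []
    for part in body.split(","):
        low, high = part.strip().split("-")
        ranges.append((int(low), int(high)))
    return ranges


def port_is_offered(port, offered):
    """True if the port falls inside any offered (low, high) range."""
    return any(low <= port <= high for low, high in offered)


def pick_ports(count, offered):
    """Take the first `count` ports from the offered ranges."""
    picked = []
    for low, high in offered:
        for port in range(low, high + 1):
            if len(picked) == count:
                return picked
            picked.append(port)
    if len(picked) < count:
        raise ValueError("not enough ports in the offer")
    return picked


if __name__ == "__main__":
    offered = parse_ranges("[31000-32000]")
    print(port_is_offered(0, offered))  # port 0 is outside the offered range
    print(pick_ports(2, offered))
```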

What tutorial are you working from?

On 18 October 2014 15:08, David Palaitis david.palai...@twosigma.com wrote:
 I am having trouble getting Cassandra Mesos to work in a simple test 
 environment. The framework connects, but tasks get lost with the 
 following error.



 215872 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - Got new 
 resource offers ArrayBuffer(abc.def.ghi.com)
 215875 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - resources 
 offered: List((cpus,32.0), (mem,127877.0), (disk,2167529.0), (ports,0.0))
 215875 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - resources 
 required: List((cpus,1.0), (mem,2048.0), (ports,0.0), (disk,1000.0))
 215877 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - Accepted 
 offer: abc.def.ghi.com
 215889 [Thread-114] INFO mesosphere.cassandra.CassandraScheduler  - Received 
 status update for task task1413640484861: TASK_LOST (Task uses invalid 
 resources: ports(*):0)



 I tried configuring a port resource in the slave and restarting but 
 still get the same error e.g.



 ${INSTALL_DIR}/sbin/mesos-slave \
 --master=zk://abc.def.ghi.com:2181/mesos \
 --resources='mem:245760;ports(*):[31000-32000]'



 Any leads?








Cassandra Mesos Framework Issue

2014-10-18 Thread David Palaitis
I am having trouble getting Cassandra Mesos to work in a simple test 
environment. The framework connects, but tasks get lost with the following 
error.

215872 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - Got new 
resource offers ArrayBuffer(abc.def.ghi.com)
215875 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - resources 
offered: List((cpus,32.0), (mem,127877.0), (disk,2167529.0), (ports,0.0))
215875 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - resources 
required: List((cpus,1.0), (mem,2048.0), (ports,0.0), (disk,1000.0))
215877 [Thread-113] INFO mesosphere.cassandra.CassandraScheduler  - Accepted 
offer: abc.def.ghi.com
215889 [Thread-114] INFO mesosphere.cassandra.CassandraScheduler  - Received 
status update for task task1413640484861: TASK_LOST (Task uses invalid 
resources: ports(*):0)

I tried configuring a port resource in the slave and restarting but still get 
the same error e.g.

${INSTALL_DIR}/sbin/mesos-slave \
--master=zk://abc.def.ghi.com:2181/mesos \
--resources='mem:245760;ports(*):[31000-32000]'

Any leads?





RE: Spark scheduler receives no offers after some time

2014-08-07 Thread David Palaitis
I’m running up against this issue w/ Mesos 0.19 and Spark 1.0.1

-  https://www.mail-archive.com/issues@spark.apache.org/msg06732.html

Caused by: javax.security.auth.login.LoginException: unable to find LoginModule 
class: org/apache/hadoop/security/UserGroupInformation$HadoopLoginModule

Any clues of where to look?


RE: stale framework registrations

2014-08-06 Thread David Palaitis
Thanks, I’m looking forward to that feature.
Thanks also for pointing out the /help section. I hadn’t seen that.

I did a rolling reboot of our mesos masters last night and the frameworks have 
all synchronized now. It's likely that CTRL-C to Marathon doesn’t play nicely.



From: vi...@twitter.com [mailto:vi...@twitter.com] On Behalf Of Vinod Kone
Sent: Tuesday, August 05, 2014 8:11 PM
To: user@mesos.apache.org
Subject: Re: stale framework registrations


On Tue, Aug 5, 2014 at 4:58 PM, David Palaitis 
david.palai...@twosigma.com wrote:
It’s still registered after a few hours…


How did you stop marathon? Also, any log messages on the master pertaining to 
this event would be useful to diagnose.

I don’t see a shutdown in the list of endpoints for /master. What version was 
that introduced?

I spoke too soon. This is only on the master branch and will be included in the 
upcoming 0.20.0 release.



stale framework registrations

2014-08-05 Thread David Palaitis

I recently stopped Marathon but it is still registered with the Mesos Masters. 
I started a new instance of Marathon and it has re-registered successfully with 
a new framework Id.

I'd like to understand how to force deregistration of the stale framework.




RE: stale framework registrations

2014-08-05 Thread David Palaitis
It’s still registered after a few hours…

I don’t see a shutdown in the list of endpoints for /master. What version was 
that introduced?

Here’s the list I see:
/master/health <http://usrs1021.pit.twosigma.com:5050/help/master/health>
/master/observe <http://usrs1021.pit.twosigma.com:5050/help/master/observe>
/master/redirect <http://usrs1021.pit.twosigma.com:5050/help/master/redirect>
/master/roles.json <http://usrs1021.pit.twosigma.com:5050/help/master/roles.json>
/master/state.json <http://usrs1021.pit.twosigma.com:5050/help/master/state.json>
/master/stats.json <http://usrs1021.pit.twosigma.com:5050/help/master/stats.json>
/master/tasks.json <http://usrs1021.pit.twosigma.com:5050/help/master/tasks.json>
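
Of the endpoints listed, /master/state.json is the one that reports registered frameworks, so it can be used to spot the stale entry. A rough sketch follows; the field names ("frameworks", "id", "name", "active") and the sample payload are assumptions for illustration and should be checked against the real output of your master.

```python
import json


def inactive_frameworks(state):
    """Return (id, name) pairs for frameworks the master reports as inactive."""
    return [
        (fw.get("id"), fw.get("name"))
        for fw in state.get("frameworks", [])
        if not fw.get("active", True)
    ]


# Hypothetical payload, shaped the way state.json is assumed to look.
SAMPLE_STATE = json.loads("""
{
  "frameworks": [
    {"id": "fw-0001", "name": "marathon", "active": false},
    {"id": "fw-0002", "name": "marathon", "active": true}
  ]
}
""")

if __name__ == "__main__":
    for framework_id, name in inactive_frameworks(SAMPLE_STATE):
        print(framework_id, name)
```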

From: vi...@twitter.com [mailto:vi...@twitter.com] On Behalf Of Vinod Kone
Sent: Tuesday, August 05, 2014 1:54 PM
To: user@mesos.apache.org
Subject: Re: stale framework registrations


On Tue, Aug 5, 2014 at 9:48 AM, David Palaitis 
david.palai...@twosigma.com wrote:
I recently stopped Marathon but it is still registered with the Mesos Masters. 
I started a new instance of Marathon and it has re-registered successfully with 
a new framework Id.

I’d like to understand how to force deregistration of the stale framework.

Master should remove the old framework after its failover timeout has elapsed. 
If you want to force it, there is also a /master/shutdown (see: 
masterip:5050/help/master/shutdown) endpoint on the master.
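
A hedged sketch of what a call to that endpoint could look like. The POST method and the frameworkId form parameter are assumptions here; the help page at masterip:5050/help/master/shutdown is the authority on the exact parameters. The function only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_shutdown_request(master, framework_id):
    """Build (but do not send) a POST to the master's shutdown endpoint.

    Assumption: the endpoint takes a form-encoded `frameworkId` parameter.
    """
    body = urlencode({"frameworkId": framework_id}).encode("ascii")
    return Request(f"http://{master}/master/shutdown", data=body, method="POST")


if __name__ == "__main__":
    # "fw-0001" is a placeholder for the stale framework's ID.
    request = build_shutdown_request("masterip:5050", "fw-0001")
    print(request.get_method(), request.full_url, request.data)
    # To actually send it: urllib.request.urlopen(request)
```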