Re: spark and mesos issue

2014-09-16 Thread Gurvinder Singh
It might not be related only to the memory issue. The memory issue is also there, as you mentioned; I have seen that one too. The fine-grained mode issue is mainly Spark considering that it got two different block managers for the same ID, whereas if I search for the ID in the Mesos slave, it exists only on the one

Re: spark and mesos issue

2014-09-16 Thread Tim St Clair
inline ----- Original Message ----- From: CCAAT cc...@tampabay.rr.com To: user@mesos.apache.org Cc: cc...@tampabay.rr.com Sent: Monday, September 15, 2014 5:33:08 PM Subject: Re: spark and mesos issue Hello Brenden/Vinod, Is your installation using systemd? Has anyone documented

Re: spark and mesos issue

2014-09-15 Thread Brenden Matthews
I started hitting a similar problem, and it seems to be related to memory overhead and tasks getting OOM killed. I filed a ticket here: https://issues.apache.org/jira/browse/SPARK-3535 On Wed, Jul 16, 2014 at 5:27 AM, Ray Rodriguez rayrod2...@gmail.com wrote: I'll set some time aside today to

Re: spark and mesos issue

2014-09-15 Thread CCAAT
Hello Brenden/Vinod, Is your installation using systemd? Has anyone documented systemd configurations/issues for the various Linux distros running Mesos/Spark? What if a cluster is running on a mixture of systems that use/do_not_use systemd; are there any issues, related to systemd and

Re: spark and mesos issue

2014-07-16 Thread Vinod Kone
On Fri, Jul 4, 2014 at 2:05 AM, Gurvinder Singh gurvinder.si...@uninett.no wrote: ERROR storage.BlockManagerMasterActor: Got two different block manager registrations on 201407031041-1227224054-5050-24004-0 Googling about it seems that mesos is starting slaves at the same time and giving

Re: spark and mesos issue

2014-07-16 Thread Vinod Kone
On Tue, Jul 15, 2014 at 11:02 PM, Vinod Kone vi...@twitter.com wrote: On Fri, Jul 4, 2014 at 2:05 AM, Gurvinder Singh gurvinder.si...@uninett.no wrote: ERROR storage.BlockManagerMasterActor: Got two different block manager registrations on 201407031041-1227224054-5050-24004-0 Googling

Re: spark and mesos issue

2014-07-16 Thread Ray Rodriguez
I'll set some time aside today to gather and post some logs and details about this issue from our end. On Wed, Jul 16, 2014 at 2:05 AM, Vinod Kone vinodk...@gmail.com wrote: On Tue, Jul 15, 2014 at 11:02 PM, Vinod Kone vi...@twitter.com wrote: On Fri, Jul 4, 2014 at 2:05 AM, Gurvinder Singh

spark and mesos issue

2014-07-04 Thread Gurvinder Singh
We are getting this issue when running jobs with close to 1000 workers. Spark is from the GitHub version and Mesos is 0.19.0. ERROR storage.BlockManagerMasterActor: Got two different block manager registrations on 201407031041-1227224054-5050-24004-0 Googling about it seems that mesos is

Re: spark and mesos issue

2014-07-04 Thread Ray Rodriguez
I've been running into the same issue with task counts greater than 600 or so, using Spark with Mesos in fine-grained mode. On Fri, Jul 4, 2014 at 5:06 AM, Gurvinder Singh gurvinder.si...@uninett.no wrote: We are getting this issue when we are running jobs with close to 1000 workers. Spark is
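
For anyone gathering logs on this, a rough Python sketch for pulling the affected IDs out of a Spark driver log might help. The pattern is an assumption based only on the single error line quoted in this thread, so the exact wording may differ between Spark versions; the function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed pattern, derived from the one error line quoted in this thread.
PATTERN = re.compile(
    r"ERROR storage\.BlockManagerMasterActor: "
    r"Got two different block manager registrations on (\S+)"
)


def duplicate_registrations(lines):
    """Count duplicate-registration errors per ID (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample input: the first line is the error from the thread,
# the second is an invented filler line that should be ignored.
sample = [
    "ERROR storage.BlockManagerMasterActor: Got two different block manager "
    "registrations on 201407031041-1227224054-5050-24004-0",
    "INFO scheduler.TaskSetManager: unrelated line",
]

counts = duplicate_registrations(sample)
for reported_id, n in counts.items():
    print(reported_id, n)
```

The resulting IDs can then be compared against the Mesos slave logs to check whether each one really appears on more than one slave.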