Indeed. Also keep in mind that there is talk that fine-grained mode might be 
removed in Spark 2.

Kind regards,

Radek Gruchalski

[email protected]
de.linkedin.com/in/radgruchalski/

Confidentiality:
This communication is intended for the above-named person and may be 
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor must 
you copy or show it to anyone; please delete/destroy and inform the sender 
immediately.

On Tuesday, 24 November 2015 at 11:32, 木内満歳 wrote:

> I agree. I think coarse-grained mode is the "take me to the safe side" mode.
> Unfortunately, this issue looks harder to resolve...
>  
> Anyway, I appreciate your advice. Thanks very much!
>  
> Mitsutoshi
>  
>  
> 2015-11-24 19:23 GMT+09:00 Rad Gruchalski <[email protected]>:
> > Ah, I see. I am experiencing something similar in fine-grained mode, where 
> > one of the tasks stays in staging and fails the whole job, but never in 
> > coarse mode.
> >  
> > On Tuesday, 24 November 2015 at 10:07, 木内満歳 wrote:
> >  
> > > Hi Rad,
> > >  
> > > I've tried both and experienced the same symptom in both cases.
> > >  
> > > Thanks,
> > > Mitsutoshi Kiuchi
> > >  
> > >  
> > > 2015-11-24 17:57 GMT+09:00 Rad Gruchalski <[email protected]>:
> > > > Mitsutoshi,
> > > >  
> > > > Is this in fine-grained mode?
> > > >  
> > > > On Tuesday, 24 November 2015 at 07:18, 木内満歳 wrote:
> > > >  
> > > > > Hi, Tim
> > > > >  
> > > > > I've reproduced the problem and captured debug logs (attached).
> > > > > I can't tell what is going on, but it seems that the slave is 
> > > > > repeatedly sending ACCEPT messages to the master.
> > > > >  
> > > > > Please let me know what you think.
> > > > >  
> > > > > Best Regards,
> > > > > Mitsutoshi Kiuchi
> > > > >  
> > > > >  
> > > > > 2015-11-24 5:28 GMT+09:00 Tim Chen <[email protected]>:
> > > > > > Hi Mitsutoshi,
> > > > > >  
> > > > > > Can you enable TRACE-level logging in Spark (modify your 
> > > > > > log4j.properties file)?
> > > > > >  
> > > > > > The logs should then show more about why offers are being rejected, 
> > > > > > but most of the time it's because your cluster doesn't have enough 
> > > > > > resources to launch your Spark job. You can either increase your 
> > > > > > slaves' resources or lower your job's cpu/memory requirement 
> > > > > > through configuration.
> > > > > >  
> > > > > > Tim
> > > > > >  
> > > > > > On Mon, Nov 23, 2015 at 6:30 AM, 木内満歳 <[email protected]> wrote:
> > > > > > > Hi,
> > > > > > >  
> > > > > > > Some Spark tasks on Mesos 0.25 occasionally won't start.
> > > > > > > Could you advise how to get more detail about this?
> > > > > > >  
> > > > > > > Here is the slave log for a bad task:
> > > > > > >  
> > > > > > > Nov 23 08:54:26 mesos-s2 mesos-slave[18499]: I1123 
> > > > > > > 08:54:26.677291 18516 slave.cpp:2379] Got registration for 
> > > > > > > executor '235498ca-6603-4cfe-bfc7-94005bb235fb-S5' of framework 
> > > > > > > 235498ca-6603-4cfe-bfc7-94005bb235fb-1442 from 
> > > > > > > executor(1)@10.130.91.16:60295
> > > > > > > Nov 23 08:54:26 mesos-s2 mesos-slave[18499]: I1123 
> > > > > > > 08:54:26.679875 18516 slave.cpp:1760] Sending queued task '0' to 
> > > > > > > executor '235498ca-6603-4cfe-bfc7-94005bb235fb-S5' of framework 
> > > > > > > 235498ca-6603-4cfe-bfc7-94005bb235fb-1442
> > > > > > > (no more log about this task)
> > > > > > >  
> > > > > > > When a task runs successfully, the slave log looks like this:
> > > > > > >  
> > > > > > > Nov 23 08:44:39 al-mesos-s3 mesos-slave[8644]: I1123 
> > > > > > > 08:44:39.637285  8658 slave.cpp:2379] Got registration for 
> > > > > > > executor '235498ca-6603-4cfe-bfc7-94005bb235fb-S6' of framework 
> > > > > > > 235498ca-6603-4cfe-bfc7-94005bb235fb-1437 from 
> > > > > > > executor(1)@10.130.98.65:52273
> > > > > > > Nov 23 08:44:39 al-mesos-s3 mesos-slave[8644]: I1123 
> > > > > > > 08:44:39.639233  8658 slave.cpp:1760] Sending queued task '6' to 
> > > > > > > executor '235498ca-6603-4cfe-bfc7-94005bb235fb-S6' of framework 
> > > > > > > 235498ca-6603-4cfe-bfc7-94005bb235fb-1437
> > > > > > > Nov 23 08:44:42 al-mesos-s3 mesos-slave[8644]: I1123 
> > > > > > > 08:44:42.608182  8658 slave.cpp:2717] Handling status update 
> > > > > > > TASK_RUNNING (UUID: ff5a2278-0753-4541-bd33-a55f3a09fb69) for 
> > > > > > > task 6 of framework 235498ca-6603-4cfe-bfc7-94005bb235fb-1437 
> > > > > > > from executor(1)@10.130.98.65:52273
> > > > > > > Nov 23 08:44:42 al-mesos-s3 mesos-slave[8644]: I1123 
> > > > > > > 08:44:42.612318  8658 status_update_manager.cpp:322] Received 
> > > > > > > status update TASK_RUNNING (UUID: 
> > > > > > > ff5a2278-0753-4541-bd33-a55f3a09fb69) for task 6 of framework 
> > > > > > > 235498ca-6603-4cfe-bfc7-94005bb235fb-1437
> > > > > > >  
> > > > > > > Any advice is welcome.
> > > > > > >  
> > > > > > > Best Regards,
> > > > > > > Mitsutoshi Kiuchi
> > > > > > >  
> > > > > >  
> > > > >  
> > > > >  
> > > > > Attachments:
> > > > > - log.driverStdErr.gz
> > > > > - log.mesosMaster.gz
> > > > > - log.mesosSlave.gz
> > > > >  
> > > >  
> > > >  
> > >  
> >  
>  

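For anyone following along, the two knobs discussed in this thread (coarse vs. fine-grained mode, and lowering the job's cpu/memory requirement) correspond to Spark configuration properties. Below is a minimal sketch of passing them to spark-submit. `spark_submit_args` is a hypothetical helper, and the master URL and the values are placeholders, not taken from the thread.

```python
# Hypothetical helper: the function name, master URL and default values are
# illustrative assumptions, not settings taken from this thread.
def spark_submit_args(master, coarse=True, max_cores=2, executor_memory="1g"):
    """Build a spark-submit argument list for a Spark-on-Mesos job."""
    conf = {
        # "true" selects coarse-grained mode, "false" fine-grained mode
        "spark.mesos.coarse": str(coarse).lower(),
        # cap on total cores requested (applies in coarse-grained mode)
        "spark.cores.max": str(max_cores),
        # memory requested per executor
        "spark.executor.memory": executor_memory,
    }
    args = ["spark-submit", "--master", master]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder master address; replace with your own Mesos master.
    print(" ".join(spark_submit_args("mesos://master.example:5050")))
```

Lowering `max_cores` or `executor_memory` is one way to try Tim's suggestion of reducing the job's requirement so that offers are more likely to fit.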