Hi Niphlod,

I think I solved the problem. Following up on my last post, the issue is in 
the shared memory used by Python's multiprocessing module. Following the 
solution at this link 
(http://stackoverflow.com/questions/2009278/python-multiprocessing-permission-denied
 
), I mounted a tmpfs (device name "none") on /dev/shm, and the problem is fixed. 
The interesting thing is that on EC2 instances in the US regions I don't need 
to do this mounting and multiprocessing works fine, but for the 
instance in the China region I do.
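For anyone hitting the same symptom, a quick way to check whether /dev/shm is usable from Python is a sketch like the one below (shared_memory_usable is just an illustrative helper name, not anything from web2py):

```python
import multiprocessing


def shared_memory_usable():
    """Return True if multiprocessing's shared-memory primitives work.

    On Linux, multiprocessing synchronization primitives (Lock, Semaphore,
    Queue, ...) are backed by POSIX semaphores, which need a writable tmpfs
    mounted on /dev/shm. If /dev/shm is missing or mounted wrongly, creating
    a Lock raises OSError (e.g. "Permission denied").
    """
    try:
        multiprocessing.Lock()
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("shared memory OK" if shared_memory_usable() else "shared memory broken")
```

Running this on the affected instance before and after the mount makes it easy to confirm the fix.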

Thanks for all your help!
Pengfei

On Thursday, April 30, 2015 at 8:47:48 AM UTC-4, Niphlod wrote:
>
>
>
> On Wednesday, April 29, 2015 at 11:53:56 PM UTC+2, Pengfei Yu wrote:
>>
>> Hi Niphlod and Dave,
>>
>> I think the worker is deleted because its last heartbeat doesn't 
>> change for a while. When I comment out the line "dead_workers.delete()" in 
>> gluon/scheduler.py, the task is stuck as QUEUED with no new run record 
>> created, and the worker that was supposed to be deleted keeps an unchanged 
>> last_heartbeat. When I keep the line "dead_workers.delete()", the 
>> situation remains the same as in my original email. 
>>
>
> a worker that can't "piggyback" its presence NEEDS to be assumed dead.
>  
>
>>
>> I think the task is never processed: I added a single test line, 
>> "os.system('touch <some_folder>/test.log')", at the beginning of the task 
>> function, but this "test.log" file is never created on the server. I guess 
>> the worker dies right after it is assigned a task, even before the task 
>> is processed, and then the task is QUEUED again. 
>>
>
> in this case, you won't have any scheduler_run records. Those are created 
> when the task gets picked up.
>  
>
>>
>> I have the configuration in /etc/init/web2py-scheduler.conf, and the 
>> service is running; that is why a new worker pops up when the previous one 
>> dies. 
>>
>
> Perfect, but maybe you should monitor whether they ALWAYS fail and avoid 
> restarting them?!
>  
>
>>
>> When I set the timeout as 10s for the task, it is still the same as 
>> before: multiple runs are generated for the same task, and they do not end 
>> even though the assigned workers have died. I guess for the task the time 
>> from RUNNING back to QUEUED is very short.
>>
>
> This seems to point out that the problem isn't a really long task that 
> gets stuck, but rather that your task has issues even before the 10 
> seconds elapse, or that the scheduler has trouble picking it up.
>
>>
>>
>> One difference between the server in the original region (which works 
>> fine) and the current region is that in the current region I use port 8000 
>> for HTTPS access instead of the default 443. This is because the current 
>> one is in China and there are restrictions on HTTP/HTTPS access on ports 
>> 80/8080/443. I am not sure if this affects the workers, since the worker 
>> name only contains IP information, not the port.
>>
>>
> Shouldn't be a problem.
>  
>
>> Niphlod, 
>>
>> May I ask how I can enable DEBUG mode and check the logging? I do see 
>> logging calls in the script gluon/scheduler.py, but I don't know where to 
>> find the logging output in a log file. 
>>
>> Thanks
>>
>
> use logging.conf (enable the DEBUG level for the root and web2py loggers, 
> routed to the console handler) and start the scheduler with
>
> web2py.py -K appname -D 0
>  
>
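Adding a note for future readers: a minimal logging.conf along the lines Niphlod describes could look like the sketch below. This is a sketch in Python's standard logging fileConfig format, assuming the layout of web2py's stock logging.example.conf; adjust handler and formatter names to match your file.

```
[loggers]
keys=root,web2py

[handlers]
keys=consoleHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=DEBUG
handlers=consoleHandler

[logger_web2py]
level=DEBUG
handlers=consoleHandler
qualname=web2py
propagate=0

[handler_consoleHandler]
class=StreamHandler
level=DEBUG
formatter=simpleFormatter
args=(sys.stdout,)

[formatter_simpleFormatter]
format=%(asctime)s - %(levelname)s - %(name)s - %(message)s
```

With that in place, starting the scheduler with "web2py.py -K appname -D 0" (as above) sends the scheduler's DEBUG output to the console.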

-- 
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
--- 
You received this message because you are subscribed to the Google Groups 
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.
