@Niphlod: A year ago I decided to use web2py for our projects. Now I
have a problem and I have to solve it soon, with or without the
group. You are very focused on the scheduler code itself, while I am
looking for any hint that helps me understand the problem. It doesn't help
me to know that mankind has never had a problem with the scheduler in
decades of wonderful years. ;)
In the meantime I have found a way to handle "my" problematic situations
(unfortunately only a workaround). The cause itself is still unexplained.
A new try: let's abstract away from long-lived tasks and timeouts during a
worker's normal task processing. The reason for "my" TIMEOUT is a different
one (please trust me, I have read the entire code of scheduler.py and know
the general concept).
scheduler.py (worker): infinite loop -> pop a task -> call the async function
-> create the process environment -> start the process [1] -> wait for the
process to complete, or have no mercy and terminate the process if a
timeout is caught.
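To make this flow concrete, here is a minimal sketch of the pattern. It is not the actual scheduler.py code: it is written for Python 3, it assumes a Unix host where subprocesses are created with fork(), and the names run_with_timeout and slow_task as well as the returned status strings are made up for illustration.

```python
import multiprocessing
import queue
import time

# Assumption: Unix host, subprocesses are created with fork().
ctx = multiprocessing.get_context("fork")


def executor(func, args, results):
    """Entry point (target) of the subprocess: run the task, report the result."""
    results.put(func(*args))


def slow_task(seconds):
    """Hypothetical task that simply sleeps for the given number of seconds."""
    time.sleep(seconds)
    return seconds


def run_with_timeout(func, args=(), timeout=1.0):
    """Run func in a subprocess and terminate it if it exceeds the timeout.

    Only suitable for small results: the parent joins before it reads
    the queue, so a very large result could block the child.
    """
    results = ctx.Queue()
    child = ctx.Process(target=executor, args=(func, args, results))
    child.start()                # [1] start the process
    child.join(timeout)          # wait for completion, at most `timeout` seconds
    if child.is_alive():         # timeout caught: no mercy
        child.terminate()        # sends SIGTERM on Unix
        child.join(1.0)          # give the child a moment to exit
        return "TIMEOUT", None
    try:
        return "COMPLETED", results.get(timeout=1.0)
    except queue.Empty:          # child exited without delivering a result
        return "FAILED", None
```

Calling run_with_timeout(slow_task, (5,), timeout=0.2) takes the timeout branch, while a task that finishes in time returns its result. Note that the sketch does not check whether the child is really gone after terminate().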
So far so good. Now a more detailed look at [1].
The process environment has an entry function as its target (start point)
for starting the subprocess. In the case of scheduler.py this is the
function 'executor'. And the entry point of this function is never reached
in the case of the 'zombie' candidates. With pstack I saw the reason: the
creation of the new process is waiting for a semaphore in sem_wait(). At
this point in the subprocess nothing of the 'executor' function has run
yet, and therefore nothing of my actual task has been processed, simply
because 'executor' never called "my" task function.
So the scheduler's executor catches the timeout (the subprocess is still
waiting for a semaphore) and calls terminate() on the subprocess. But this
process keeps waiting, again and again and again ... Meanwhile scheduler.py
has registered the task as STOPPED and goes on to pick up the next task.
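One way to at least detect such survivors is to check whether the subprocess is really gone after terminate() and to fall back to SIGKILL if it is not. The following is only a sketch under assumptions: Python 3 on a Unix host with fork(), and the names stubborn_child and stop_child are hypothetical. The stand-in child simply ignores SIGTERM; it does not reproduce the sem_wait() situation and says nothing about its cause.

```python
import multiprocessing
import os
import signal
import time

# Assumption: Unix host, subprocesses are created with fork().
ctx = multiprocessing.get_context("fork")


def stubborn_child(ready):
    """Stand-in for a subprocess that does not react to SIGTERM."""
    signal.signal(signal.SIGTERM, signal.SIG_IGN)
    ready.set()                  # tell the parent that the handler is installed
    time.sleep(60)


def stop_child(child, grace=1.0):
    """Terminate the child; if it survives SIGTERM, fall back to SIGKILL.

    Returns the name of the signal that was finally needed.
    """
    child.terminate()            # sends SIGTERM on Unix
    child.join(grace)            # wait up to `grace` seconds for the exit
    if not child.is_alive():
        return "SIGTERM"
    os.kill(child.pid, signal.SIGKILL)   # cannot be caught or ignored
    child.join()                 # reap the child so that it leaves the process tree
    return "SIGKILL"
```

With a child started from stubborn_child, stop_child() has to escalate to SIGKILL; a well-behaved child already exits on SIGTERM.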
Back to my pstree output, with my own comments added, as an example:

bash(16731)                                      // my shell
 \---python2.7(24545)                            // scheduler.py (-K)
      \-+-python2.7(24564)---{python2.7}(24565)  // idling worker
        |-python2.7(24572)                       // worker with picked task
          \-python2.7(1110)                      // still waiting for semaphore (TIMEOUT)
          \-python2.7(8647)                      // still waiting for semaphore (TIMEOUT)
          \-python2.7(11747)                     // still waiting for semaphore (TIMEOUT)
          \-python2.7(14117)                     // runs the actual task (RUNNING)
          \-python2.7(14302)                     // still waiting for semaphore (TIMEOUT)

The actual reason is "waiting for a semaphore". But why? And of course, in
all probability it is not a problem of the scheduler.py code itself ;)
Thx again for your endurance.
Erwn