a subtle bug appeared in the last patch; please re-download scheduler.py,
should all be ok right now (as of revision
1cc2decfddb4ec2b9a2cd8e098754504856f1990)
On Sunday, October 21, 2012 4:10:07 AM UTC+2, Adi wrote:
hmm... seems like we still have the same problem, unless I was supposed to
will do it right now :)
On Sun, Oct 21, 2012 at 10:30 AM, Niphlod niph...@gmail.com wrote:
Confirming that it works PERFECTLY :) handling all three queues (groups)
concurrently, as it should.
Now loading the first 600k tasks to see if it will degrade performance, and if
ok, then a couple more around 2-3M each...
Niphlod and Massimo, thank you!
When do you plan to include the new scheduler into web2py?
It is already in web2py 2.2.1 ;-)
On Sunday, 21 October 2012 14:06:53 UTC-5, Adi wrote:
No priority is available (it's hard to manage: does a task queued 3 hours ago
with prio 7 come before or after one with prio 8 queued 2 hours ago?).
The hackish way: tasks are picked up ordered by next_run_time, so queue your
tasks with next_run_time = request.now - datetime.timedelta(hours=1), that kind of thing.
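That hack can be pictured without web2py at all: a minimal sketch of the pick order, assuming (as the scheduler does) that queued tasks are taken in ascending next_run_time order. The task names and the in-memory list are hypothetical, not scheduler API:

```python
import datetime

# Toy model of the "backdate next_run_time" hack (names are made up; the
# real scheduler picks QUEUED scheduler_task rows ordered by next_run_time,
# which is what sorted() emulates here).
now = datetime.datetime(2012, 10, 21, 12, 0, 0)

tasks = [
    # queued normally: runs whenever its turn comes
    {"name": "bulk_job", "next_run_time": now},
    # backdated by an hour: jumps ahead of everything queued "now"
    {"name": "urgent_job", "next_run_time": now - datetime.timedelta(hours=1)},
]

pick_order = [t["name"] for t in sorted(tasks, key=lambda t: t["next_run_time"])]
print(pick_order)  # -> ['urgent_job', 'bulk_job']
```

The backdated task sorts first, so it is effectively "higher priority" even though no priority field exists.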
all clear :) in process of implementing.
Is the new API defined in scheduler.py? I don't see it in there (2.1.1
(2012-10-17 17:00:46) dev), but I'm modifying the existing code to employ
fast_track, since order confirmations are getting behind. This will be
really good :) Thanks again!
A couple of things happened...
The main group worker got created even though I didn't call it... Not sure
why, but I guess it's because there are a lot of leftover tasks queued (500k)
and some were assigned when I stopped the process.
Even though the fast_track worker started, nothing is getting picked up or assigned
I can confirm that the size of the queued records has something to do with
the delay in processing different queues... once I deleted all outstanding
records from the main group, the fast_track group started working as
expected... sorry for the long thread, but I think it's a very neat idea to
load the scheduler with lots of tasks.
You're right, there's a bug: with zillions of tasks queued at the same
priority (i.e. those that got queued first), the bunch we assign on every
loop doesn't take into account that there might be 10 or 20 tasks that could
be assigned and executed at a faster pace in the following bunch(es). Nice catch!
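One way to picture the misbehaviour (a toy model; the group names, bunch size, and both helper functions are hypothetical, not the scheduler's actual code): if the bunch assigned on each loop is taken globally in queue order, a huge backlog in one group starves the others, while taking a bunch per group lets every group make progress:

```python
# Each loop the scheduler assigns a "bunch" of tasks in queue order.
BUNCH = 5

# main's 100 tasks were queued before fast_track's 3, so they sort first.
backlog = [("main", i) for i in range(100)] + [("fast_track", i) for i in range(3)]

def assign_global(tasks, n):
    # one global bunch: the oldest n tasks, whatever their group
    return tasks[:n]

def assign_per_group(tasks, n):
    # a bunch per group: every group gets served each loop
    picked = []
    for group in ("main", "fast_track"):
        picked += [t for t in tasks if t[0] == group][:n]
    return picked

print(assign_global(backlog, BUNCH))     # only main tasks: fast_track starves
print(assign_per_group(backlog, BUNCH))  # main and fast_track both get a slice
```

This matches the symptom reported above: fast_track tasks only started running once the main backlog was deleted.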
reviewing the
just sent the patch to Massimo. If you're in a hurry, as soon as it is
committed, just replace your gluon/scheduler.py with the one from trunk.
Thanks for pointing out this misbehaviour of the scheduler.
On Saturday, October 20, 2012 8:25:12 PM UTC+2, Niphlod wrote:
Remember that a group_name for tasks is required for the scheduler to work.
However, the
will try replacing scheduler.py in production and load some serious data
again, since everything is set up there for the full process, so we can have
a real test :)
I understand the concept with main being the default group, but wasn't sure
if I was doing something wrong. All clear now :) Thanks for
hmm... seems like we still have the same problem, unless I was supposed to
copy more files than just scheduler.py.
I loaded around 12,000 records into slow_track, while fast_track has very
few, but some should be executed by now...
3 workers are properly running (main, slow_track, fast_track), but
I just tried to perform a select() on a MySQL table with 700k records. All
available memory on my Mac was consumed, to the point that I had to kill the
process.
I reduced the number of fields to just id, and got the results after some
time, but I'm wondering if there is a better approach.
_last_id = 0
_items_per_page = 1000
for row in db(db.table.id > _last_id).select(limitby=(0, _items_per_page),
                                             orderby=db.table.id):
    # do something
    _last_id = row.id
--
Thank you Vasile for your quick response. This will be perfect.
On Friday, October 19, 2012 2:02:41 PM UTC-4, Vasile Ermicioi wrote:
the set returned by select() is always a full result set, because it is
extracted and parsed all together.
Slicing with limitby is good (and recommended, if possible). Just remember
that you can save a lot of time and memory by passing cacheable=True to the
select() function. References will be
I'm afraid limitby will not work, since it returns a limited set, and I
guess it's not possible to dynamically change the limit, so I'll have to
loop through some kind of subqueries, or use the original query
with a limited set of fields (takes 60 secs for 700k records, not ready to
test on
Have you tried it, and it doesn't work?
Do you understand the logic?
--
increase _items_per_page to 20 000
--
Yes Vasile, I tried, and I understand the logic... I may change it slightly
and use it as a subquery with an offset. The problem is that I'm dealing with
legacy tables that go up to 3 million rows and have a lot of columns that
need to be checked, so your solution will work, and I will be loading data
in
You don't need to change anything to load all the data: this code loads
all the records in slices, as you need.
I put it in exactly as it is, but it stopped working after 1000 records...
will double check again.
On Fri, Oct 19, 2012 at 3:47 PM, Vasile Ermicioi elff...@gmail.com wrote:
Also, _last_id = row.id after your code inside the loop is required.
--
it's missing the outer loop.

_last_id = 0
_items_per_page = 1000
while True:
    rows = db(db.table.id > _last_id).select(limitby=(0, _items_per_page),
                                             orderby=db.table.id)
    if len(rows) == 0:
        break
    for row in rows:
        # do something
        _last_id = row.id

Should work.
On Friday,
Does work. Thank you both very much!
Now that I have thousands of queued/backlogged tasks in the scheduler, I
noticed that my regular tasks, which are of higher priority, will be on hold
until everything else gets processed. Maybe it would be a good idea to
have a field for the priority of a task?