[web2py] Re: Send Errors To Email
If you want a realtime solution, you can trigger an email on each error. Here's a sketch:

1. Add an error handler to routes.py:

   routes_onerror = [('appname/*', '/appname/default/show_error')]

2. and in default.py:

   def show_error():
       # sj = simplejson; queue a scheduler task to send the email
       db.scheduler_task.insert(function_name='send_self_email',
                                vars=sj.dumps({'ticket': request.vars.ticket}))
       return 'Server Error. Super sorry about that! We be lookin into it.'

3. Now in a models file add the scheduler email function:

   def send_self_email(ticket):
       # server, port, email_address, and subject are defined elsewhere
       message = 'http://%s:%s/admin/default/ticket/%s' % (server, port, ticket)
       SENDMAIL = '/usr/sbin/sendmail'  # sendmail location
       import os
       p = os.popen('%s -t' % SENDMAIL, 'w')
       p.write('To: ' + email_address + '\n')
       p.write('Subject: ' + subject + '\n')
       p.write('\n')  # blank line separating headers from body
       p.write(message)
       p.write('\n')
       status = p.close()

   Scheduler(db)

4. Run the scheduler with the server with:

   python web2py.py -K appname -X

On Thursday, September 27, 2012 5:32:36 AM UTC-7, Hassan Alnatour wrote: Dear ALL, Is there a way to make all the errors be sent to my email when they happen? Best regards, --
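If SMTP is available, the sendmail pipe can be replaced with Python's standard library. A minimal sketch of the same email (the admin address, SMTP host, and ticket URL format here are placeholders, not from the original post):

```python
import smtplib
from email.message import EmailMessage

# Placeholder settings -- adjust for your deployment.
ADMIN_EMAIL = "admin@example.com"
SMTP_HOST = "localhost"

def build_ticket_email(ticket, server="example.com", port="443"):
    # Same one-line body as the sendmail version: a link to the
    # admin ticket viewer for this error ticket.
    msg = EmailMessage()
    msg["To"] = ADMIN_EMAIL
    msg["Subject"] = "web2py error ticket"
    msg.set_content("http://%s:%s/admin/default/ticket/%s" % (server, port, ticket))
    return msg

def send_ticket_email(ticket):
    # Called from the scheduler task; an exception here ends up in the
    # scheduler_run traceback, so failures are still visible.
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(build_ticket_email(ticket))
```

The message construction is separate from the sending so it can be tested without a mail server.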
[web2py] Re: web2py 2.0.2 is out
I just discovered this sweet hidden improvement:

   >>> db(db.mytable.id > 1).select()
   Rows (648)

The Rows object now prints out the number of rows in the repr() function! That's so useful! Thanks everyone! --
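The behaviour is easy to picture with a toy class; this is just a sketch of the idea, not web2py's actual implementation:

```python
class Rows(object):
    """Toy stand-in for the DAL's Rows object, illustrating a
    __repr__ that reports the row count."""
    def __init__(self, records):
        self.records = records

    def __len__(self):
        return len(self.records)

    def __repr__(self):
        # Show the count instead of dumping every record.
        return 'Rows (%d)' % len(self)

rows = Rows([{'id': i} for i in range(648)])
print(repr(rows))  # Rows (648)
```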
[web2py] Re: Usage example for 'executesql(...,fields=,columns=) allows parsing of results in Rows'
This is awesome! Thanks for the example! On Thursday, August 30, 2012 1:56:09 PM UTC-7, Anthony wrote:

   db.define_table('person', Field('name'), Field('email'))
   db.define_table('dog', Field('name'), Field('owner', 'reference person'))

   db.executesql([SQL code returning person.name and dog.name fields],
                 fields=[db.person.name, db.dog.name])
   db.executesql([SQL code returning all fields from db.person],
                 fields=db.person)
   db.executesql([SQL code returning all fields from both tables],
                 fields=[db.person, db.dog])
   db.executesql([SQL code returning person.name and all db.dog fields],
                 fields=[db.person.name, db.dog]) --
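What the fields= parameter enables is, in essence, pairing each raw result tuple with known column names so results come back keyed instead of positional. A toy illustration of that idea (web2py's real parser returns Rows of Row objects; a dict is enough to show the shape):

```python
def parse_rows(raw_rows, colnames):
    """Zip each raw SQL result tuple with its column names,
    turning positional tuples into keyed rows."""
    return [dict(zip(colnames, row)) for row in raw_rows]

# e.g. raw tuples from a SQL join of person and dog:
raw = [('Alice', 'Rex'), ('Bob', 'Fido')]
rows = parse_rows(raw, ['person.name', 'dog.name'])
print(rows[0]['person.name'])  # Alice
```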
[web2py] Re: web2py 2.0.2 is out
Wow, this is cool! But I'm hitting a bug in rewrite_on_error: http://code.google.com/p/web2py/issues/detail?id=964 --
[web2py] Re: web2py 2.0.2 is out
I'm really excited about the new scheduler -X option. What do -E -b -L do? I don't see them in --help or in the widget.py code. On Wednesday, August 29, 2012 10:17:48 PM UTC-7, Michael Toomim wrote: Wow, this is cool! But I'm hitting a bug in rewrite_on_error: http://code.google.com/p/web2py/issues/detail?id=964 --
[web2py] Re: A new and easier way to test your apps
Sometimes you write things that are just really exciting. --
Re: [web2py] Re: web2py 2.0 stats
That is true! It makes me think, perhaps it would be worthwhile at some point to pause, take stock of all the features, which ones might be better than which other ones, and write up a set of best practices to put into the book. On Wednesday, August 29, 2012 12:49:29 PM UTC-7, Richard wrote: With 2.0 I am sometimes thinking of recoding my app, started almost 3 years ago, from scratch, since there are so many new features since that time... And it's very difficult to know sometimes what the best practices are, or which experimental feature will stay in the future or not. But I love web2py :) Richard On Wed, Aug 29, 2012 at 3:45 PM, vinic...@gmail.com wrote: +1. That's a big weakness in our beloved framework. But, unfortunately, I cannot help with that. I've tried in the past, but it's difficult to follow the pace of new features. -- Vinicius Assef On 08/29/2012 02:32 PM, apps in tables wrote: Examples and self-explanatory documentation will save you a lot of questions, especially from me :) Ashraf --
[web2py] Re: Changing DB passwords, and database/*.table files
This makes sense to me too! The simple way would break backwards compatibility. But this could be avoided if the hash function first checks to see if a schema file exists WITH the password and, if so, returns that; else it returns a hash without the password. On Tuesday, August 28, 2012 10:17:02 AM UTC-7, Chris wrote: (2) I wonder if a change to the current behavior would be better -- change the logic to build a hash using all of the URI except the password part? Changing the server or DB name feels like a real change to me; and in general changing just the user ID is too, since the user may have different permissions, views etc. in the same database. But changing just the password? That should not change the underlying identity of the database connection or database object definitions. (In my view of the world.) What do you think? --
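The proposal amounts to hashing the connection URI with only the password portion masked out. A rough sketch of that idea (the URI format handling and hash choice here are assumptions, not web2py's actual hashing code):

```python
import hashlib
import re

def table_hash(uri):
    """Hash the connection URI with only the password masked, so a
    password change keeps the same hash while any other change
    (user, host, db name) produces a new one."""
    # Mask the ':password@' part of e.g. postgres://user:secret@host/db
    masked = re.sub(r'(://[^:/@]+):[^@]+@', r'\1:***@', uri)
    return hashlib.md5(masked.encode()).hexdigest()

# Password change: same hash, so the database/*.table files still match.
assert table_hash('postgres://u:oldpw@h/mydb') == table_hash('postgres://u:newpw@h/mydb')
# Host change: different hash, as it should be.
assert table_hash('postgres://u:oldpw@h/mydb') != table_hash('postgres://u:oldpw@h2/mydb')
```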
[web2py] Re: web2py 2.0.2 is out
Oh, I see, these are scheduler.py options!

   -b: sets the heartbeat time
   -L: sets the logging level
   -E: sets the max empty runs

On Wednesday, August 29, 2012 10:23:29 PM UTC-7, Michael Toomim wrote: What do -E -b -L do? I don't see them in --help or in the widget.py code. --
Re: [web2py] Re: possible scheduler bug
Thanks for the great work on the scheduler niphlod! On Wednesday, August 1, 2012 1:19:48 PM UTC-7, Niphlod wrote: The consideration behind that is that if your function doesn't return anything, you don't need the results. Backward compatibility is quite broken in that sense (but the scheduler is still marked as experimental). If the function returns something, the scheduler_run record is preserved (unless the discard_results parameter is set to True). We thought that if for some strange reason a user still needs the scheduler_run record, he could easily add a return 1 at the end and the behaviour of the old scheduler is preserved. Tracebacks are always saved (no matter what). On Wednesday, August 1, 2012 9:56:55 PM UTC+2, Vincenzo Ampolo wrote: On 08/01/2012 12:49 PM, Niphlod wrote: Happy to see someone noticing actual improvements :P If you need any further explanations / tips & tricks, just ask. PS: another italian added to web2py-users ;) Thanks :) just some considerations though. I noticed that now in scheduler_run all the COMPLETED tasks get cancelled. Will that happen even if the completed ones have an output? If there is a FAILED task, am I going to see the traceback from the scheduler_task - scheduler_run relation like I was doing before? btw went up to 14 workers. only 3.00 load :) woot woot (that means that the overall scheduling system is *really* lightweight) -- Vincenzo Ampolo http://vincenzo-ampolo.net http://goshawknest.wordpress.com --
Re: [web2py] Re: Broken admin in trunk?
That's not the problem. I've already hacked it. It displays "admin disabled because insecure channel". On Jul 10, 2012, at 3:00 PM, Dave wrote: the /admin app only works either over https or via 127.0.0.1 / ::1. You can hack the app to change that, but it is the default behavior. On Tuesday, July 10, 2012 5:04:43 PM UTC-4, Michael Toomim wrote: I just upgraded to the trunk. I'm trying to log into the admin, but there's no password entry box. What's wrong? How can I debug this?
Re: [web2py] Re: Broken admin in trunk?
Ah, so I was wrong, great, thank you! On Jul 10, 2012, at 3:20 PM, Massimo Di Pierro wrote: The normal behavior is, as Dave indicated, that you must be over https or on localhost. The change in trunk is that if the condition is false, the login form is not even displayed, to prevent you from accidentally submitting credentials over an insecure channel. You have hacked admin to bypass those conditions. Now you have one more place to hack it, in the admin/default/index.html view.
[web2py] Re: please help us test web2py
On Friday, July 6, 2012 6:35:43 PM UTC-7, Massimo Di Pierro wrote: 2. Remove Share link from welcome app I think we agreed to remove Share link because it's not used very much. I think we agreed to remove the link to addtoany. Do you really want to remove the share tab at the bottom? I think it can be useful. +1 on removing it... The internet is covered with too many meaningless share buttons already. It looks bad. There's nothing to share in the welcome app.
[web2py] Re: Best practice using scheduler as a task queue?
This is a nice solution, and clever, thanks! The upside (compared to postgres locks, as discussed above) is this works for any database. The downside is it creates a whole new table. On Thursday, July 5, 2012 2:49:36 PM UTC-7, nick name wrote: This might have been solved this week, but in case it wasn't: You're tackling a general database problem, not a specific task queue or web2py problem. So you need to solve it with the database: set up another table to refer to the task table, such as:

   db.define_table('tasks_that_were_set_up',
       Field('name', 'string', unique=True),
       Field('scheduler', db.scheduler_task, notnull=True, required=True,
             unique=True, ondelete='CASCADE'),
   )

Make your insert code insert a task into this table as well:

   try:
       rid = db.scheduler_task.insert(name='task7')
       db.tasks_that_were_set_up.insert(name='task7', scheduler=rid)
       print 'Task was just added to db'
   except db.integrity_error_class():
       print 'Tasks were already in db'

Now, if the task gets removed from the scheduler_task table, because of the cascading on_delete (which is the default - I just put it there for emphasis), the record in tasks_that_were_set_up will be removed as well. And otherwise, because of the unique constraint, the insert into tasks_that_were_set_up can only succeed once -- and thanks to the transactional nature -- therefore so does the scheduler_task.insert.
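The pattern is plain SQL, so it can be demonstrated with any database. A self-contained sqlite3 sketch of the same idea (table and column names follow the post, but this is a simplified stand-in, not web2py's DAL):

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.execute("CREATE TABLE scheduler_task (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE tasks_that_were_set_up"
           " (name TEXT UNIQUE, scheduler INTEGER)")

def queue_once(name):
    """Insert a task plus its guard row; the UNIQUE constraint makes a
    second attempt fail, and the rollback discards the task row too."""
    try:
        with db:  # one transaction: commit on success, rollback on error
            cur = db.execute("INSERT INTO scheduler_task (name) VALUES (?)",
                             (name,))
            db.execute("INSERT INTO tasks_that_were_set_up VALUES (?, ?)",
                       (name, cur.lastrowid))
        return True
    except sqlite3.IntegrityError:
        return False

print(queue_once('task7'))  # True: queued
print(queue_once('task7'))  # False: UNIQUE violation rolled everything back
```

The key point is that both inserts share one transaction, so the constraint violation on the guard table also undoes the scheduler_task insert.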
[web2py] Re: Best practice using scheduler as a task queue?
On Wednesday, June 27, 2012 5:02:26 PM UTC-7, ptressel wrote: This won't solve your installation / setup issue, but I wonder if it would help with the overrun and timeout problems... Instead of scheduling a periodic task, what about having the task reschedule itself? When it's done with the queue, it schedules itself for later. Remove the time limit so it can take whatever time it needs to finish the queue. Or maybe launch a process on startup outside of the scheduler -- when it exhausts the queue, have it sleep and either wake periodically to check the queue, or have it woken when something is inserted. I don't see why we'd do this instead of just setting stop_time=infinity and repeats=0. Is the transaction processing issue you encountered with PostgreSQL preventing you from setting up your queue as a real producer consumer queue, where you could have multiple workers? No, I only want one worker. The scheduler itself works great as a producer/consumer queue. I may have misled you with the title of this thread—I'm trying to set up a general repeating background process, not a task queue in particular. Processing a task queue was just one use of the repeating background process function.
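The long-lived-worker idea from the quoted suggestion can be sketched as a plain loop: drain the queue, sleep, repeat, instead of a scheduled repeating task. A minimal sketch (the queue contents and poll interval are invented for illustration; max_cycles exists only so the example terminates):

```python
import time
from queue import Empty, Queue

work = Queue()    # stand-in for the task's queue of items
processed = []

def worker_loop(poll_seconds=5, max_cycles=None):
    """One long-lived background function instead of a repeating
    scheduler task: drain the queue, wait a few seconds, repeat.
    A real daemon would loop forever (max_cycles=None)."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            while True:  # drain everything currently queued
                processed.append(work.get_nowait())
        except Empty:
            pass
        cycles += 1
        time.sleep(poll_seconds)  # then wait before polling again

work.put('item1')
work.put('item2')
worker_loop(poll_seconds=0, max_cycles=1)
print(processed)  # ['item1', 'item2']
```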
Re: [web2py] Installing scheduler on linux
Maybe this should go into the docs somewhere. Maybe the scheduler docstring next to the upstart script? Maybe post an issue on google code to update the docs? http://code.google.com/p/web2py/issues/list On Jun 28, 2012, at 5:41 AM, Tyrone wrote: Hi Guys, Although this script works great when upstart is available, when upstart is not available you can't do it like this. I struggled to find a solution on here or anywhere online, so I have made an alternative script which will work on sysvinit (/etc/init.d):

   #!/bin/sh
   DAEMON=/usr/local/bin/python
   PARAMETERS="/var/www/html/web2py/web2py.py -K init"
   LOGFILE=/var/log/web2py-scheduler.log

   start() {
       echo -n "starting up $DAEMON "
       RUN=`$DAEMON $PARAMETERS >> $LOGFILE 2>&1 &`
       if [ $? -eq 0 ]; then
           echo "Done."
       else
           echo "FAILED."
       fi
   }

   stop() {
       killall $DAEMON
   }

   status() {
       killall -0 $DAEMON
       if [ $? -eq 0 ]; then
           echo "Running."
       else
           echo "Not Running."
       fi
   }

   case "$1" in
       start)
           start
           ;;
       restart)
           stop
           sleep 2
           start
           ;;
       stop)
           stop
           ;;
       status)
           status
           ;;
       *)
           echo "usage: $0 {start|restart|stop|status}"
           ;;
   esac
   exit 0

I installed this on CentOS release 5.5, however I needed to follow these instructions to install:

1.) # vi web2py-scheduler
2.) Paste in the above
3.) Add the following 2 lines to the web2py-scheduler file (required for centos i believe):
    # chkconfig: 2345 90 10
    # description: web2py-scheduler
4.) make it executable: # chmod 755 web2py-scheduler
5.) add it to startup: # chkconfig --add web2py-scheduler
Re: [web2py] Re: Best practice using scheduler as a task queue?
I'm totally interested in solutions! It's a big problem I need to solve. The recurring maintenance task does not fix the initialization problem—because now you need to initialize the recurring maintenance task. This results in the same race condition. It does fine with the 40,000 records problem. But it's just a lot of complexity we're introducing to solve a simple problem (looping tasks) with a complex solution (scheduler). I'd still love to find a clean way to do this. Maybe we should extend the scheduler like this: • Add a daemon_tasks parameter when you call it from models: Scheduler(db, daemon_tasks=[func1, func2]) • When the scheduler boots up, it handles locks and everything and ensures there are two tasks that just call these functions • Then it dispatches the worker processes as usual ...ah, shoot, looking in widget.py, it looks like the code that starts schedulers doesn't have access to the parameters passed to Scheduler(), because models haven't been run yet. Hmph. On Wednesday, June 27, 2012 12:56:52 AM UTC-7, Niphlod wrote: I don't know if continuing to give you fixes and alternative implementations is to be considered as harassment at this point, stop me if you're not interested in those. There is a very biiig problem in your statements: if your vision is "Woohoo, this scheduler will *automatically handle locks*—so I don't need to worry about stray background processes running in parallel automatically, and it will *automatically start/stop the processes* with the web2py server with -K, which makes it much easier to deploy the code!" then the scheduler is the right tool for you. It's your app that doesn't handle locks, because of your initialization code put into models. At least 2 of your problems (initialization and 40,000 scheduler_run records) could be fixed by a recurring maintenance task that will do check_daemon() without advisory locks and prune the scheduler_run table. 
BTW: I'm pretty sure that when you say scheduler should be terminated alongside web2py you're not perfectly grasping how webdevelopment in production works. If you're using standalone versions, i.e. not mounted on a webserver, you can start your instances as web2py -a mypassword web2py -K myapp and I'm pretty sure when hitting ctrl+c both will shutdown.
Re: [web2py] Re: Best practice using scheduler as a task queue?
The problem with terminating the processes is: • sometimes they don't respond to control-c, and need a kill -9 • or sometimes that doesn't work, maybe the os is messed up • or sometimes the developer might run two instances simultaneously, forgetting that one was already running You're right that usually I can shut them both down with control-c, but I need a safeguard. My application spends money on mechanical turk and I'll spend erroneous money and upset my users if it goes wrong by accident. On Wednesday, June 27, 2012 12:56:52 AM UTC-7, Niphlod wrote: BTW: I'm pretty sure that when you say scheduler should be terminated alongside web2py you're not perfectly grasping how webdevelopment in production works. If you're using standalone versions, i.e. not mounted on a webserver, you can start your instances as web2py -a mypassword web2py -K myapp and I'm pretty sure when hitting ctrl+c both will shutdown.
Re: [web2py] Re: Best practice using scheduler as a task queue?
:) Because I'm a perfectionist, and I want other developers to be able to install my system by just unzipping the code, running ./serve, and have it just work. So I want to use the built-in webserver and scheduler. There's no reason they shouldn't be able to manage these race conditions correctly. I'm super excited that the Ctrl-C bug was fixed! Your idea of putting the initializer in the @cron reboot is very appealing! I will think about this and see if I can come up with a nice solution with it. Ideally I could re-use this daemon_task setup for other projects as well, as I find it to be a quite common scenario. I understand you do not find it to be common. I am not sure why we have different experiences. Would portalocker be a good thing to use for this situation? I would like to be cross-platform instead of relying on postgres locks. On Wednesday, June 27, 2012 12:37:27 PM UTC-7, Niphlod wrote: uhm. why not having them started with systemd or upstart or supervisord ? Scheduler is by design allowed to run with multiple instances (to process a longer queue you may want to start more of them), but if you're really losing money why didn't you rely on those services to be sure that there's only one instance running? 
There are a lot of nice implementations out there and the ones I mentioned are pretty much state-of-the-art :D (while contributing to fix current issues) BTW: - responding to ctrl+c fixed in trunk recently - a messed up os may require you to check the os, python programs can't be omniscient :D - messy developers, no easy fix for that either On Wednesday, June 27, 2012 9:18:06 PM UTC+2, Michael Toomim wrote: The problem with terminating the processes is: • sometimes they don't respond to control-c, and need a kill -9 • or sometimes that doesn't work, maybe the os is messed up • or sometimes the developer might run two instances simultaneously, forgetting that one was already running You're right that usually I can shut them both down with control-c, but I need a safeguard. My application spends money on mechanical turk and I'll spend erroneous money and upset my users if it goes wrong by accident. On Wednesday, June 27, 2012 12:56:52 AM UTC-7, Niphlod wrote: BTW: I'm pretty sure that when you say scheduler should be terminated alongside web2py you're not perfectly grasping how webdevelopment in production works. If you're using standalone versions, i.e. not mounted on a webserver, you can start your instances as web2py -a mypassword web2py -K myapp and I'm pretty sure when hitting ctrl+c both will shutdown.
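On the portalocker question: as I understand it, portalocker wraps the platform file-locking primitives (fcntl on posix, msvcrt on windows), so any file-based lock gives cross-platform single-instance protection. One caveat versus postgres advisory locks: a lock file is not automatically released if the holder dies, so stale locks must be handled. A stdlib-only sketch of the single-instance idea (path and function names invented):

```python
import os
import tempfile

def acquire_single_instance(path):
    """Try to become the single running instance by creating a lock
    file exclusively. Returns True if we now hold the lock, False if
    another holder exists. NOTE: unlike pg advisory locks, a crashed
    holder leaves a stale file behind; real code should also check
    whether the recorded pid is still alive."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.write(fd, str(os.getpid()).encode())  # record who holds it
    os.close(fd)
    return True

def release_single_instance(path):
    if os.path.exists(path):
        os.remove(path)

lockfile = os.path.join(tempfile.gettempdir(), 'myapp-daemon.lock')
if acquire_single_instance(lockfile):
    try:
        pass  # run the daemon work here
    finally:
        release_single_instance(lockfile)
```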
Re: [web2py] Quoting reserved words in DAL
Ok, here's one step closer to a fix! This is what had to be manually changed in order to quote the names of tables/fields to load a postgres database into MySQL. This won't work for other databases. And this only works for importing a database—table creation, with foreign keys, and insert statements. Nothing else (e.g. selects, updates) was tested. And we had to guide it manually through the table creation, to get foreign key constraints satisfied in the right order. But this will show an interested developer exactly where to begin fixing the code for this bug... that seems to be biting a few of us. On Jun 20, 2012, at 1:19 PM, Rene Dohmen wrote: I'm having the same problem: https://groups.google.com/d/msg/web2py/hCsxVaDLfT4/K6UMbG5p5uAJ On Mon, Jun 18, 2012 at 9:30 AM, Michael Toomim too...@gmail.com wrote: I just got bit by the reserved-word problem: https://groups.google.com/d/msg/web2py/aSPtD_mGXdM/c7et_2l_54wJ I am trying to port a postgres database to a friend's mysql database, but we are stuck because the DAL does not quote identifiers. This problem has been discussed a fair amount: https://groups.google.com/d/msg/web2py/QKktETHk3yo/Mwm3D-JhNmAJ ...and it seems all we need to do is add some quoting mechanisms to the DAL. In postgresql you surround names with quotes and in mysql you use `backticks`. Does anyone have ideas for what to do? 
-- [Attachment: dal.py.changes]
Re: [web2py] Re: database locking web2py vs. external access...
This is all a great unearthing of the Mystery of Transactions. Thanks for the investigation, Doug. This was difficult for me to learn when I got into web2py as well. Perhaps we could write up all this knowledge somewhere, now that you're figuring it out? Can we have a section on Transactions in the book, or somewhere?
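The core of the mystery is isolation: a second connection only sees rows after the first connection commits, and web2py commits for you at the end of each request. A minimal self-contained demonstration with sqlite3 (the two connections stand in for web2py and an external client):

```python
import os
import sqlite3
import tempfile

# Two separate connections to the same database file.
path = os.path.join(tempfile.mkdtemp(), 'demo.db')
web2py_conn = sqlite3.connect(path)
external_conn = sqlite3.connect(path)

web2py_conn.execute("CREATE TABLE notes (body TEXT)")
web2py_conn.commit()

web2py_conn.execute("INSERT INTO notes VALUES ('hello')")
# Not committed yet: the external connection cannot see the row.
before = external_conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]

web2py_conn.commit()  # web2py does this automatically after each request
after = external_conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
print(before, after)  # 0 1
```

The same applies in reverse: an external tool's uncommitted writes (or held locks) are invisible to, or block, the web2py connection until committed.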
Re: [web2py] Re: Best practice using scheduler as a task queue?
All, thank you for the excellent discussion! I should explain why I posted that recommendation. The vision of using the scheduler for background tasks was: Woohoo, this scheduler will *automatically handle locks*—so I don't need to worry about stray background processes running in parallel automatically, and it will *automatically start/stop the processes* with the web2py server with -K, which makes it much easier to deploy the code! It turned out: • Setting up scheduler tasks was complicated in itself. • 3 static tasks had to be inserted into every new db. This requires new installations of my software to run a setup routine. Yuck. • When I made that automatic in models/, it required locks to avoid db race condition. (I used postgresql advisory locks. Not cross-platform, but I dunno a better solution.) • The goal was to avoid locks in the first place! • When things go wrong, it's harder to debug. • The scheduler adds a new layer of complexity. • Because now I have to make sure my tasks are there properly. • And then look for the scheduler_run instances to see how they went. I must admit that this second problem would probably go away if we fixed all the scheduler's bugs! But it still leaves me uneasy. And I don't like having 40,000 scheduler_run instances build up over time. At this point, I realized that what I really want is a new feature in web2py that: • Runs a function in models (akin to scheduler's executor function) in a subprocess repeatedly • Ensures, with locks etc., that: • Only one is running at a time • That it dies if the parent web2py process dies And it seems better to just implement this as a web2py feature, than to stranglehold the scheduler into a different design. Cron's @reboot is very close to this. I used to use it. The problems: • I still had to implement my own locks and kills. (what I was trying to avoid) • It spawns 2 python subprocesses for each cron task (ugly, but not horrible) • It was really buggy. @reboot didn't work. 
I think massimo fixed this. • Syntax is gross. I basically just got scared of cron. Now I guess I'm scared of everything. :/ Hopefully this detailed report of my experience will be of help to somebody. I'm sure that fixing the bugs will make things 5x better. I will try your new scheduler.py Niphlod! On Tuesday, June 26, 2012 12:13:32 PM UTC-7, Niphlod wrote: The problem here started as "I can't ensure my app to insert only one task per function", and that is not a scheduler problem per se: it's a common database problem. It would have been the same if someone created db.define_table('mytable', Field('name'), Field('uniquecostraint')) and had to ensure, without specifying Field('uniquecostraint', unique=True), that there are no records with the same value in the column uniquecostraint. From there to "now I have tasks stuck in RUNNING status, please avoid using the scheduler" without any further details, the leap is quite undocumented. And please do note that the scheduler in trunk has undergone some changes: there was a point in time where abnormally killed schedulers (as in kill -SIGKILL the process) left tasks in RUNNING status that would not be picked up by subsequent scheduler processes. That was a design issue: if a task is RUNNING and you kill the scheduler while the task is being processed, you have absolutely no way to tell what the function did (say, send a batch of 500 emails) before it was actually killed. If the task was not planned properly it could send e.g. 359 mails, be killed, and if it was picked up again by another scheduler after the first killed round, 359 of your recipients would get 2 identical mails. It has been decided to requeue RUNNING tasks without any active worker doing that (i.e. leave to the function the eventual check of what has been done), so now RUNNING tasks with a dead worker assigned get requeued. 
With other changes (soon in trunk, the previously attached file) you're able to stop workers, so they may be killed ungracefully while being sure that they're not processing tasks. If you need more details, as always, I'm happy to help, and other developers too, I'm sure :D --
Re: [web2py] Re: Best practice using scheduler as a task queue?
In case it is useful to someone, here is the full code I used with locking, using postgresql advisory locks. The benefits of using postgresql's locks are that: • It locks on the database—works across multiple clients • The locks are automatically released if a client disconnects from the db • I think it's fast

   def check_daemon(task_name, period=None):
       period = period or 4
       tasks_query = ((db.scheduler_task.function_name == task_name)
                      & db.scheduler_task.status.belongs(
                          ('QUEUED', 'ASSIGNED', 'RUNNING', 'ACTIVE')))
       # Launch a launch_queue task if there isn't one already
       tasks = db(tasks_query).select()
       if len(tasks) > 1:  # Check for error
           raise Exception('Too many open %s tasks! There are %s'
                           % (task_name, len(tasks)))
       if len(tasks) < 1:
           if not db.executesql('select pg_try_advisory_lock(1);')[0][0]:
               debug('Tasks table is already locked.')
               return
           # Check again now that we're locked
           if db(tasks_query).count() >= 1:
               debug('Caught a race condition! Glad we got outa there!')
               db.executesql('select pg_advisory_unlock(1);')
               return
           debug('Adding a %s task!', task_name)
           db.scheduler_task.insert(function_name=task_name,
                                    application_name='utility/utiliscope',
                                    task_name=task_name,
                                    stop_time=now + timedelta(days=9),
                                    repeats=0, period=period)
           db.commit()
           db.executesql('select pg_advisory_unlock(1);')
       elif tasks[0].period != period:
           debug('Updating period for task %s', task_name)
           tasks[0].update_record(period=period)
           db.commit()

   check_daemon('process_launch_queue_task')
   check_daemon('refresh_hit_status')
   check_daemon('process_bonus_queue')

On Tuesday, June 26, 2012 7:57:25 PM UTC-7, Michael Toomim wrote: All, thank you for the excellent discussion! I should explain why I posted that recommendation. 
[web2py] Re: Best practice using scheduler as a task queue?
This scenario is working out worse and worse. Now I'm getting tasks stuck in the 'RUNNING' state... even when there aren't any scheduler processes running behind them! I'm guessing the server got killed mid-process, and now it doesn't know how to recover. Looks like a bug in the scheduler. I don't recommend using the scheduler as a task queue to anybody. On Tuesday, June 12, 2012 10:24:15 PM UTC-7, Michael Toomim wrote: Here's a common scenario. I'm looking for the best implementation using the scheduler. I want to support a set of background tasks (task1, task2...), where each task: • processes a queue of items • waits a few seconds It's safe to have task1 and task2 running in parallel, but I cannot have two task1s running in parallel. They would both process the same queue of items. I found the scheduler supports this nicely with parameters like:

   db.scheduler_task.insert(function_name='task1', task_name='task1',
                            stop_time=now + timedelta(days=9),
                            repeats=0, period=10)

I can launch 3 workers, and they coordinate amongst themselves to make sure that only one will run the task at a time. Great! This task will last forever... ...but now we encounter my problem... What happens if it crashes, or passes stop_time? Then the task will turn off, and the queue is no longer processed. Or what happens if I reset the database, or install this code on a new server? It isn't nice if I have to re-run the insert function by hand. So how can I ensure there is always EXACTLY ONE of each task in the database? 
I tried putting this code into models:

   def initialize_task_queue(task_name):
       num_tasks = db((db.scheduler_task.function_name == task_name)
                      & ((db.scheduler_task.status == 'QUEUED')
                         | (db.scheduler_task.status == 'ASSIGNED')
                         | (db.scheduler_task.status == 'RUNNING')
                         | (db.scheduler_task.status == 'ACTIVE'))).count()
       # Add a task if there isn't one already
       if num_tasks < 1:
           db.scheduler_task.insert(function_name=task_name,
                                    task_name=task_name,
                                    stop_time=now + timedelta(days=9),
                                    repeats=0, period=period)
           db.commit()

   initialize_task_queue('task1')
   initialize_task_queue('task2')
   initialize_task_queue('task3')

This worked, except it introduces a race condition! If you start three web2py processes simultaneously (e.g., for three scheduler processes), they will insert duplicate tasks:

   process 1: count number of 'task1' tasks
   process 2: count number of 'task1' tasks
   process 1: there are less than 1, insert a 'task1' task
   process 2: there are less than 1, insert a 'task1' task

I was counting on postgresql's MVCC transaction support to make each of these atomic. Unfortunately, that's not how it works. I do not understand why. As a workaround, I'm currently wrapping the code inside initialize_task_queue with a postgresql advisory lock:

   if not db.executesql('select pg_try_advisory_lock(1);')[0][0]:
       return
   ... count tasks, add one if needed ...
   db.executesql('select pg_advisory_unlock(1);')

But this sucks. What's a better way to ensure there is always 1 infinite-repeat task in the scheduler? Or... am I using the wrong design entirely? --
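The interleaving above is the classic check-then-act race, and the advisory-lock workaround is double-checked locking: a cheap unlocked check, then a re-check while holding the lock. The same shape in pure Python, with threads standing in for multiple web2py processes and threading.Lock standing in for pg_try_advisory_lock (a sketch of the pattern, not the actual postgres code):

```python
import threading

tasks = []               # stand-in for the scheduler_task table
lock = threading.Lock()  # stand-in for pg_advisory_lock(1)

def initialize_task_queue(task_name):
    # First check without the lock (cheap), then re-check while
    # holding it: another process may have inserted its task between
    # our check and our lock acquisition.
    if tasks.count(task_name) < 1:
        with lock:
            if tasks.count(task_name) < 1:
                tasks.append(task_name)

threads = [threading.Thread(target=initialize_task_queue, args=('task1',))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(tasks)  # ['task1'] -- exactly one insert despite ten racers
```

Without the locked re-check, two racers can both pass the first count and both insert, which is exactly the duplicate-task interleaving described above.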
[web2py] Re: Best practice using scheduler as a task queue?
Er, let me rephrase: I don't recommend using the scheduler for *infinitely looping background tasks*. On Monday, June 25, 2012 4:54:30 PM UTC-7, Michael Toomim wrote: This scenario is working out worse and worse. Now I'm getting tasks stuck in the 'RUNNING' state... even when there aren't any scheduler processes running behind them! I'm guessing the server got killed mid-process, and now it doesn't know how to recover. Looks like a bug in the scheduler. I don't recommend using the scheduler as a task queue to anybody. On Tuesday, June 12, 2012 10:24:15 PM UTC-7, Michael Toomim wrote: Here's a common scenario. I'm looking for the best implementation using the scheduler. I want to support a set of background tasks (task1, task2...), where each task: • processes a queue of items • waits a few seconds It's safe to have task1 and task2 running in parallel, but I cannot have two task1s running in parallel. They would process the same queue of items in duplicate. I found the scheduler supports this nicely with parameters like: db.scheduler_task.insert(function_name='task1', task_name='task1', stop_time = now + timedelta(days=9), repeats=0, period=10) I can launch 3 workers, and they coordinate amongst themselves to make sure that only one will run the task at a time. Great! This task will last forever... ...but now we encounter my problem... What happens if it crashes, or passes stop_time? Then the task will turn off, and the queue is no longer processed. Or what happens if I reset the database, or install this code on a new server? It isn't nice if I have to re-run the insert function by hand. So how can I ensure there is always EXACTLY ONE of each task in the database? 
I tried putting this code into models: def initialize_task_queue(task_name): num_tasks = db((db.scheduler_task.function_name == task_name) & ((db.scheduler_task.status == 'QUEUED') | (db.scheduler_task.status == 'ASSIGNED') | (db.scheduler_task.status == 'RUNNING') | (db.scheduler_task.status == 'ACTIVE'))).count() # Add a task if there isn't one already if num_tasks < 1: db.scheduler_task.insert(function_name=task_name, task_name=task_name, stop_time = now + timedelta(days=9), repeats=0, period=period) db.commit() initialize_task_queue('task1') initialize_task_queue('task2') initialize_task_queue('task3') This worked, except it introduces a race condition! If you start three web2py processes simultaneously (e.g., for three scheduler processes), they will insert duplicate tasks: process 1: count number of 'task1' tasks process 2: count number of 'task1' tasks process 1: there are fewer than 1, insert a 'task1' task process 2: there are fewer than 1, insert a 'task1' task I was counting on postgresql's MVCC transaction support to make each of these atomic. Unfortunately, that's not how it works. I do not understand why. As a workaround, I'm currently wrapping the code inside initialize_task_queue with a postgresql advisory lock: if not db.executesql('select pg_try_advisory_lock(1);')[0][0]: return ... count tasks, add one if needed ... db.executesql('select pg_advisory_unlock(1);') But this sucks. What's a better way to ensure there is always 1 infinite-repeat task in the scheduler? Or... am I using the wrong design entirely? --
[web2py] Quoting reserved words in DAL
I just got bit by the reserved-word problem: https://groups.google.com/d/msg/web2py/aSPtD_mGXdM/c7et_2l_54wJ I am trying to port a postgres database to a friend's mysql database, but we are stuck because the DAL does not quote identifiers. This problem has been discussed a fair amount: https://groups.google.com/d/msg/web2py/QKktETHk3yo/Mwm3D-JhNmAJ ...and it seems all we need to do is add some quoting mechanisms to the DAL. In postgresql you surround names with "double quotes" and in mysql you use `backticks`. Does anyone have ideas for what to do?
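For concreteness, here's a rough sketch of the kind of per-backend quoting the DAL would need. This is not DAL code; the function name and dialect strings are illustrative assumptions:

```python
# Sketch: quote a SQL identifier per backend. Postgres uses double quotes,
# MySQL uses backticks; embedded quote characters are escaped by doubling.
def quote_ident(name, dialect):
    if dialect == 'postgres':
        return '"%s"' % name.replace('"', '""')
    elif dialect == 'mysql':
        return '`%s`' % name.replace('`', '``')
    raise ValueError('unknown dialect: %s' % dialect)
```

With something like this in place, a reserved word such as `order` becomes `"order"` on postgres and `` `order` `` on mysql, so the same table definitions could port between the two.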
[web2py] Re: Best practice using scheduler as a task queue?
Thanks for the response, niphlod! Let me explain: The task can be marked FAILED or EXPIRED if: • The code in the task throws an exception • A run of the task exceeds the timeout • The system clock goes past stop_time And it will just plain not exist if: • You have just set up the code • You install the code on a new database In order to implement a perpetually processing task queue, we need to ensure that there is always an active process-queue task ready in the scheduler. So what's the best way to do this without creating race conditions? This method creates a race condition: 1. Check if task exists 2. Insert if not ...if multiple processes are checking and inserting at the same time. The only solution I've found is to wrap that code within a postgresql advisory lock, but this only works on postgres. (And the update_or_insert() function you mentioned does the same thing as this internally, so it still suffers from a race condition.) On Wednesday, June 13, 2012 7:16:56 AM UTC-7, Niphlod wrote: Maybe I didn't get exactly what you need, but... you have 3 tasks that need to be unique. Also, you want to be sure that a task that crashes doesn't remain hanged. This should never happen with the scheduler: the worst situation is that if a worker crashes (here "crashes" means it disconnects from the database) it leaves the task status as running, but as soon as another scheduler checks whether that one sends heartbeats, it removes the dead worker and requeues that task. If your task goes into timeout and it's a repeating task, the best practice is to raise the timeout. Assured this, you need to initialize the database if someone truncates the scheduler_task table, inserting the 3 records in one transaction. If you need to be sure, why all the hassle when you can make the task_name column unique and then do db.update_or_insert(task_name==myuniquetaskname, **task_record) ? PS: code in models gets executed every request. 
What if you have no users accessing the site and you need to call initialize_task_queue? Isn't it better to insert the values and then start the workers? BTW: a task that needs to run forever but can't be launched in two instances seems to suffer from some design issues, but hey, everyone needs to be able to do what he wants ;-)
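Niphlod's unique-column idea is worth sketching, because it sidesteps the race entirely: with a UNIQUE constraint on task_name, the database itself serializes the check-and-insert, so two racing processes can never both succeed. A minimal illustration using sqlite3 in place of the real scheduler_task table (the table and function here are simplified stand-ins, not scheduler code):

```python
import sqlite3

def ensure_task(conn, task_name):
    """Insert a task row; return True if we created it, False if it existed.
    The UNIQUE constraint makes this atomic: a duplicate insert raises
    IntegrityError instead of silently creating a second row."""
    try:
        conn.execute("insert into scheduler_task (task_name) values (?)",
                     (task_name,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False

conn = sqlite3.connect(':memory:')
conn.execute("create table scheduler_task (task_name text unique)")
```

Every web2py process can call ensure_task() at startup; whichever one wins the race inserts the row, and the others just get False back.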
[web2py] Re: Best practice using scheduler as a task queue?
To respond to your last two points: You're right that model code only runs on every request... I figured if my website isn't getting any usage then the tasks don't matter anyway. :P Yes, I think there are design issues here, but I haven't found a better solution. I'm very interested in hearing better overall solutions! The obvious alternative is to write a standalone script that loops forever, and launch it separately using something like python web2py.py -S app/controller -M -N -R background_work.py -A foo. But this requires solving the following problems that are *already solved* by the scheduler: • During development, restarting and reloading models as I change code • Killing these background processes when I quit the server • Ensuring that no more than one background process runs at a time On Wednesday, June 13, 2012 7:16:56 AM UTC-7, Niphlod wrote: Maybe I didn't get exactly what you need, but... you have 3 tasks that need to be unique. Also, you want to be sure that a task that crashes doesn't remain hanged. This should never happen with the scheduler: the worst situation is that if a worker crashes (here "crashes" means it disconnects from the database) it leaves the task status as running, but as soon as another scheduler checks whether that one sends heartbeats, it removes the dead worker and requeues that task. If your task goes into timeout and it's a repeating task, the best practice is to raise the timeout. Assured this, you need to initialize the database if someone truncates the scheduler_task table, inserting the 3 records in one transaction. If you need to be sure, why all the hassle when you can make the task_name column unique and then do db.update_or_insert(task_name==myuniquetaskname, **task_record) ? PS: code in models gets executed every request. What if you have no users accessing the site and you need to call initialize_task_queue? Isn't it better to insert the values and then start the workers? 
BTW: a task that needs to be running forever but can't be launched in two instances seems to suffer some design issues but hey, everyone needs to be able to do what he wants ;-)
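The standalone looping script mentioned above can be quite small. A hedged sketch: process_queue stands in for the real task body (it is not a web2py API), and the max_iterations parameter exists only so the loop can be exercised in tests rather than running forever:

```python
import time

def run_worker(process_queue, period=10, max_iterations=None):
    """Process the queue, then sleep `period` seconds, forever
    (or for max_iterations rounds when testing)."""
    done = 0
    while max_iterations is None or done < max_iterations:
        process_queue()   # drain the queue of items
        done += 1
        if max_iterations is None or done < max_iterations:
            time.sleep(period)  # wait a few seconds between rounds
    return done
```

This gets you the loop, but as the post says, you'd still have to solve model reloading, shutdown, and single-instance locking yourself.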
[web2py] Best practice using scheduler as a task queue?
Here's a common scenario. I'm looking for the best implementation using the scheduler. I want to support a set of background tasks (task1, task2...), where each task: • processes a queue of items • waits a few seconds It's safe to have task1 and task2 running in parallel, but I cannot have two task1s running in parallel. They would process the same queue of items in duplicate. I found the scheduler supports this nicely with parameters like: db.scheduler_task.insert(function_name='task1', task_name='task1', stop_time = now + timedelta(days=9), repeats=0, period=10) I can launch 3 workers, and they coordinate amongst themselves to make sure that only one will run the task at a time. Great! This task will last forever... ...but now we encounter my problem... What happens if it crashes, or passes stop_time? Then the task will turn off, and the queue is no longer processed. Or what happens if I reset the database, or install this code on a new server? It isn't nice if I have to re-run the insert function by hand. So how can I ensure there is always EXACTLY ONE of each task in the database? I tried putting this code into models: def initialize_task_queue(task_name): num_tasks = db((db.scheduler_task.function_name == task_name) & ((db.scheduler_task.status == 'QUEUED') | (db.scheduler_task.status == 'ASSIGNED') | (db.scheduler_task.status == 'RUNNING') | (db.scheduler_task.status == 'ACTIVE'))).count() # Add a task if there isn't one already if num_tasks < 1: db.scheduler_task.insert(function_name=task_name, task_name=task_name, stop_time = now + timedelta(days=9), repeats=0, period=period) db.commit() initialize_task_queue('task1') initialize_task_queue('task2') initialize_task_queue('task3') This worked, except it introduces a race condition! 
If you start three web2py processes simultaneously (e.g., for three scheduler processes), they will insert duplicate tasks: process 1: count number of 'task1' tasks process 2: count number of 'task1' tasks process 1: there are less than 1, insert a 'task1' task process 2: there are less than 1, insert a 'task1' task I was counting on postgresql's MVCC transaction support to make each of these atomic. Unfortunately, that's not how it works. I do not understand why. As a workaround, I'm currently wrapping the code inside initialize_task_queue with postgresql advisory lock: if not db.executesql('select pg_try_advisory_lock(1);')[0][0]: return ... count tasks, add one if needed ... db.executesql('select pg_advisory_unlock(1);') But this sucks. What's a better way to ensure there is always 1 infinite-repeat task in the scheduler? Or... am I using the wrong design entirely?
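One small hardening of the advisory-lock workaround above: as written, if the count-and-insert code raises an exception, pg_advisory_unlock is never called. Wrapping the lock in a context manager with try/finally fixes that. This is a sketch against the same db.executesql() interface used in the post, verified here only against a stub, not a real postgres:

```python
from contextlib import contextmanager

@contextmanager
def advisory_lock(db, key):
    """Try to take a postgres advisory lock; always release it on exit,
    even if the body raises. Yields True if the lock was acquired."""
    got_it = db.executesql('select pg_try_advisory_lock(%d);' % key)[0][0]
    try:
        yield got_it
    finally:
        if got_it:
            db.executesql('select pg_advisory_unlock(%d);' % key)
```

Usage would then look like: `with advisory_lock(db, 1) as locked:` followed by the count-tasks-and-insert code, guarded by `if locked:`.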
[web2py] Cron problems
I'm finding multiple problems getting cron to start the scheduler. Here's the cron line: @reboot dummyuser python web2py.py -K utility ...but it does not work without modifying web2py source. First, let's get an easy bug out of the way. The web2py book gives this example for @reboot: @reboot * * * * root *mycontroller/myfunction But those asterisks shouldn't be there for @reboot tasks. Can we remove them from the book? Now, when I put that line into my crontab and run web2py, it gives me this error: web2py Web Framework Created by Massimo Di Pierro, Copyright 2007-2011 Version 1.99.7 (2012-03-04 22:12:08) stable Database drivers available: SQLite3, pymysql, psycopg2, pg8000, CouchDB, IMAP Starting hardcron... please visit: http://192.168.56.101:8000 use "kill -SIGTERM 10818" to shutdown the web2py server Exception in thread Thread-2: Traceback (most recent call last): File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner self.run() File "/home/toomim/projects/utility/web2py/gluon/newcron.py", line 234, in run shell=self.shell) File "/usr/lib/python2.6/subprocess.py", line 633, in __init__ errread, errwrite) File "/usr/lib/python2.6/subprocess.py", line 1139, in _execute_child raise child_exception OSError: [Errno 2] No such file or directory This is an error in subprocess.Popen. I inserted some print statements and found that it's calling it like this: subprocess.Popen('python web2py.py -K utility') This is incorrect; it should be: subprocess.Popen(['python', 'web2py.py', '-K', 'utility']) I was able to make it work by adding a call to split(), as you can see here (in newcron.py: cronlauncher.run()): def run(self): import subprocess proc = subprocess.Popen(self.cmd.split(), But I do not understand how anybody could have made this work before without adding a split() call. 
And what confuses me further is that there is an explicit join() call in the __init__() method that runs immediately beforehand, as if we really did NOT want to have lists: elif isinstance(cmd,list): cmd = ' '.join(cmd) So does cron @reboot work for anybody running a script? It seems impossible for it to work right now. Is this a bug? Finally, it would be great if we did not have to pass in a dummy user to each cron line that does nothing...
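A side note on the split() patch: plain str.split() breaks as soon as a cron command contains a quoted argument, which is why subprocess callers usually reach for shlex.split() instead. A quick illustration (the command string here is made up for the example):

```python
import shlex

# A command with a quoted, space-containing argument.
cmd = 'python web2py.py -K "my app"'

naive = cmd.split()        # splits inside the quotes: ['..., '"my', 'app"']
proper = shlex.split(cmd)  # splits the way a shell would: [..., 'my app']
```

So if the patch were to land in newcron.py, shlex.split(self.cmd) would be the safer spelling than self.cmd.split().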
[web2py] Changing the controller on the fly
I need to be able to dispatch to a different controller based on a database lookup. So a user will go to a url (say '/dispatch'), and we'll look up in the database some information on that user, choose a new controller and function, and call that controller and function with its view. I've almost got this working below, but the models are not being loaded into the new controller. Is there a way to fix that? In default.py: def dispatch(): controller,function = ... load these from the database ... response.view = '%s/%s.html' % (controller, function) if not os.path.exists(request.folder + '/views/' + response.view): response.view = 'generic.html' from gluon.shell import exec_environment controller = exec_environment('%s/controllers/%s.py' % (request.folder, controller), request=request, response=response, session=session) return controller[request.task_function]() Unfortunately, the controller being called has access to request, response, and session, but none of the global variables defined in my models. Is there a way to get exec_environment() to run a function in another controller WITHOUT losing all the model definitions? Or is there a better way to do this?
[web2py] Re: Changing the controller on the fly
This is working great! It's exactly what I needed, and makes my code much simpler. Thank you very much! I love it! On Friday, May 11, 2012 5:03:51 AM UTC-7, Anthony wrote: Or to avoid a redirect, you can change the function and controller in a model file: db = DAL(...) if request.function == 'dispatch': request.controller, request.function = [fetch from db] response.view = '%s/%s.%s' % (request.controller, request.function, request.extension) response.generic_patterns = ['html'] # to enable the generic.html view if needed Anthony On Friday, May 11, 2012 6:07:56 AM UTC-4, simon wrote: You can do: def dispatch(): controller,function = ... load these from the database ... redirect(URL(c=controller, f=function, vars=request.vars, args=request.args)) On Friday, 11 May 2012 10:17:19 UTC+1, Michael Toomim wrote: I need to be able to dispatch to a different controller based on a database lookup. So a user will go to a url (say '/dispatch'), and we'll look up in the database some information on that user, choose a new controller and function, and call that controller and function with its view. I've almost got this working below, but the models are not being loaded into the new controller. Is there a way to fix that? In default.py: def dispatch(): controller,function = ... load these from the database ... response.view = '%s/%s.html' % (controller, function) if not os.path.exists(request.folder + '/views/' + response.view): response.view = 'generic.html' from gluon.shell import exec_environment controller = exec_environment('%s/controllers/%s.py' % (request.folder, controller), request=request, response=response, session=session) return controller[request.task_function]() Unfortunately, the controller being called has access to request, response, and session, but none of the global variables defined in my models. Is there a way to get exec_environment() to run a function in another controller WITHOUT losing all the model definitions? Or is there a better way to do this?
[web2py] Re: Installing scheduler on linux
Here's a solution I wrote. Is there a good place (like the web2py book) to share this recipe? Put this in /etc/init/web2py-scheduler.conf: description "web2py task scheduler" start on (local-filesystems and net-device-up IFACE=eth0) stop on shutdown # Give up if restart occurs 8 times in 60 seconds. respawn limit 8 60 exec sudo -u user python /home/user/web2py/web2py.py -K friendbo respawn This assumes your web2py is in user's home directory, running with user's permissions. Replace user with the right user id and change eth0 if your server uses a different network interface. On Thursday, May 3, 2012 1:22:25 PM UTC-7, Michael Toomim wrote: Anyone have a recipe to make the scheduler run on boot? I'm using ubuntu. Web2py is run in apache (using the recipe in the book), so I can't just use the cron @reboot line. This is the line that needs to be run when my system boots: python /home/web2py/web2py/web2py.py -K appname It seems ubuntu uses Upstart instead of sysvinit. And it might be possible to tell Upstart to launch the scheduler WITH apache, using a line like start on started apache2 but I don't know how it works, or whether apache2 is set up to use upstart. It would be nice to have a recipe for the scheduler that we could add into the book, to synchronize it with apache.
[web2py] Re: Installing scheduler on linux
Also: 1. replace friendbo with the name of your app. 2. To start/stop the scheduler, use sudo start web2py-scheduler sudo stop web2py-scheduler sudo status web2py-scheduler ...etc. On Saturday, May 5, 2012 6:47:33 PM UTC-7, Michael Toomim wrote: Here's a solution I wrote. Is there a good place (like the web2py book) to share this recipe? Put this in /etc/init/web2py-scheduler.conf: description web2py task scheduler start on (local-filesystems and net-device-up IFACE=eth0) stop on shutdown # Give up if restart occurs 8 times in 60 seconds. respawn limit 8 60 exec sudo -u user python /home/user/web2py/web2py.py -K friendbo respawn This assumes your web2py is in user's home directory, running with permission user. Replace user with the right user id and change eth0 if your server uses a different network interface. On Thursday, May 3, 2012 1:22:25 PM UTC-7, Michael Toomim wrote: Anyone have a recipe to make the scheduler run on boot? I'm using ubuntu. Web2py is run in apache (using the recipe in the book), so I can't just use the cron @reboot line. This is the line that needs to be run when my system boots: python /home/web2py/web2py/web2py.py -K appname It seems ubuntu uses Upstart instead of sysvinit. And it might be possible to tell Upstart to launch the scheduler WITH apache, using a line like start on started apache2 but I don't know how it works, and apache2 to use upstart. It would be nice to have a recipe for the scheduler that we could add into the book, to synchronize it with apache.
[web2py] Installing scheduler on linux
Anyone have a recipe to make the scheduler run on boot? I'm using ubuntu. Web2py is run in apache (using the recipe in the book), so I can't just use the cron @reboot line. This is the line that needs to be run when my system boots: python /home/web2py/web2py/web2py.py -K appname It seems ubuntu uses Upstart instead of sysvinit. And it might be possible to tell Upstart to launch the scheduler WITH apache, using a line like start on started apache2 but I don't know how it works, or whether apache2 is set up to use upstart. It would be nice to have a recipe for the scheduler that we could add into the book, to synchronize it with apache.
[web2py] Re: Why Bottle and Web2py?
I think the best combination of web2py and bottle would be, as you suggested, importing the web2py DAL into bottle. The DAL is the most important thing that bottle lacks, and the web2py DAL is great to plug into other projects. I use it a lot for that. That said, in my experience, you will quickly want a templating language to use with bottle as well. Because you will be making HTML. And then you will want a way to connect views to controllers... and soon I imagine you will just want to be using web2py. Web2py is very simple. I suggest you spend just a little bit more time getting to know how web2py works, and then decide whether to use bottle+web2py or web2py. On Wednesday, May 2, 2012 3:56:24 PM UTC-7, David Johnston wrote: Can someone explain to me why someone would want to use web2py with Bottle? I am considering the combination. Is it for when you want to keep the app very simple but need some of the more advanced functionality of Web2py such as the DAL? I am trying to learn web programming and wonder if this might be a good place to start. I find the full-blown web2py a little intimidating especially since I can't find what I am looking for in the documentation. I find Bottle the opposite. Too sparse. For example, I don't mind making forms but don't really want to write SQL queries. I am coming from a (short) background in Wordpress so I am looking for something like that but in Python and better for web app development. Bottle feels like building a webapp from the bottle up. Web2py feels like building it from the top down, (i.e. take the example apps and reverse engineer them to what you want). I guess I would like something in the middle. Wordpress wasn't THAT bad but I want easier CRUD and would rather ditch PHP. Web2py is supposed to be easier than Django but Django has so much better documentation, books etc that I might find this the opposite. Suggestions? Dave
[web2py] Re: Web2py templates for HamlPY
This is cool! How do we use it? On Sunday, January 9, 2011 5:07:28 PM UTC-8, Dane wrote: Hey all, thought you might be interested to know that I just patched a project, HamlPy, a library for converting a pythonic haml-like syntax to django templates/html, to work with web2py templates. It allows for a less crufty, indentation-based syntax. Because it's indentation-based, {{ pass }} is no longer needed to close flow-control statements, and blocks are self-closed. Overall I think it's a much cleaner and quicker way of creating templates. And if you do want to use traditional {{ }} syntax, you can also do that in the .hamlpy files and they'll be converted as-is. https://github.com/danenania/HamlPy Hope someone gets some use out of this. My first real open source effort!
[web2py] Create indices from DAL
Here's an improved way to create indices in the DAL. Works only with postgresql and sqlite. def create_indices(*fields): ''' Creates a set of indices if they do not exist Use like: create_indices(db.posts.created_at, db.users.first_name, etc...) ''' for field in fields: table = field.tablename column = field.name db = field.db if db._uri.startswith('sqlite:'): db.executesql('create index if not exists %s_%s_index on %s (%s);' % (table, column, table, column)) elif db._uri.startswith('postgres:'): # Our indexes end with _index, but web2py autogenerates # one named _key for fields created with unique=True. # So let's check to see if one exists in either form. index_exists = \ db.executesql("select count(*) from pg_class where relname='%s_%s_index' or relname='%s_%s_key';" % (table, column, table, column))[0][0] == 1 if not index_exists: db.executesql('create index %s_%s_index on %s (%s);' % (table, column, table, column)) db.commit() This improves on the one I posted a while back: http://groups.google.com/group/web2py/browse_thread/thread/8f6179915a6df8ee/cb58f509ae0a478d?lnk=gstq=create+index#cb58f509ae0a478d
[web2py] Re: Help with OAuth20 facebook infinite redirect
Well, I don't need to debug this anymore. I switched to a different facebook app, and I'm no longer having the problem. On Dec 21, 7:55 pm, Michael Toomim too...@gmail.com wrote: I just upgraded from a modified 1.98.2 to 1.99.4 and now I'm getting an infinite redirect when logging in with OAuth20 and facebook. I'm having trouble debugging. Can someone help? What happens: User goes to /user/login This calls this code in tools.py: # we need to pass through login again before going on next = self.url('user',args='login') redirect(cas.login_url(next)) which calls this in contrib/login_methods/oauth20_account.py: def login_url(self, next="/"): self.__oauth_login(next) return next and __oauth_login(next) will eventually redirect the user to this Facebook url to authenticate: https://graph.facebook.com/oauth/authorize?scope=emailredirect_uri=m... ...the user then logs in at facebook, and facebook returns back a code to us at /user/login?code=gobble dee gook Ok! Now we're at /user/login again. This calls the same functions as above (cas.login_url(next), which again calls __oauth_login(next)), but this time the code variable is set, so we get an access token created. Great! BUT then __oauth_login() returns to login_url() which returns /user/login to the redirect function I pasted earlier: # we need to pass through login again before going on next = self.url('user',args='login') redirect(cas.login_url(next)) ...And the whole thing redirects BACK to /user/login. And then the whole cycle repeats itself from scratch! The login function redirects us to facebook, facebook gives us a code, sends us back to login, login creates an access_token, and then this all returns to tools.py which redirects us back to /user/login. Where is this supposed to stop cycling and go to a normal url instead of /user/login?
[web2py] Help with OAuth20 facebook infinite redirect
I just upgraded from a modified 1.98.2 to 1.99.4 and now I'm getting an infinite redirect when logging in with OAuth20 and facebook. I'm having trouble debugging. Can someone help? What happens: User goes to /user/login This calls this code in tools.py: # we need to pass through login again before going on next = self.url('user',args='login') redirect(cas.login_url(next)) which calls this in contrib/login_methods/oauth20_account.py: def login_url(self, next="/"): self.__oauth_login(next) return next and __oauth_login(next) will eventually redirect the user to this Facebook url to authenticate: https://graph.facebook.com/oauth/authorize?scope=email&redirect_uri=myapp.com%2Fuser%2Flogin&response_type=code&client_id=181047918589726 ...the user then logs in at facebook, and facebook returns back a code to us at /user/login?code=gobble dee gook Ok! Now we're at /user/login again. This calls the same functions as above (cas.login_url(next), which again calls __oauth_login(next)), but this time the code variable is set, so we get an access token created. Great! BUT then __oauth_login() returns to login_url() which returns /user/login to the redirect function I pasted earlier: # we need to pass through login again before going on next = self.url('user',args='login') redirect(cas.login_url(next)) ...And the whole thing redirects BACK to /user/login. And then the whole cycle repeats itself from scratch! The login function redirects us to facebook, facebook gives us a code, sends us back to login, login creates an access_token, and then this all returns to tools.py which redirects us back to /user/login. Where is this supposed to stop cycling and go to a normal url instead of /user/login?
[web2py] Can't use custom classes in cache
Hi all, it appears I can't use any of my own classes in the cache: class Blah: pass b = Blah() cache.disk('blah', lambda: b) This results in: AttributeError: 'module' object has no attribute 'Blah' I think this is because the things I'm defining (e.g. in models/) aren't accessible from cache.py in gluon/. Is there a way around this?
[web2py] Re: Can't use custom classes in cache
So let me tease these problems apart. 1. Some objects are not pickleable. These cannot be cached to disk. 2. If the object's class is not defined in the scope available to gluon/cache.py, then the object cannot be unpickled. Both of these problems can be avoided by using cache.ram. (That's what I'm doing, and probably why Bruno's works.) Another workaround would be for the user to pickle and unpickle objects himself, in his own code, and then pass the string to the cache. We could also eliminate problem #2 by making the set of additional class definitions available to gluon/cache.py. Perhaps with something like cache.set_custom_class_definitions([Blah]). This solution seems like the shortest path... but adding new APIs like this feels distasteful. :P (Massimo, I'm not sure what the problem is you're referring to with creating an instance before I cache.) On Nov 14, 7:41 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: Even if they are pickleable, it is possible that they get pickled but web2py cannot unpickle them. On Nov 14, 9:10 pm, Bruno Rocha rochacbr...@gmail.com wrote: I notice that if you are planning to run on GAE, your cached object needs to be pickleable; in my case, the config object instance is a kind of Storage, which acts like a dict. So follow what Massimo said, and store only dict-like objects in GAE, otherwise you will have this issue: PicklingError: Can't pickle the object foo.bar.baz
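The do-it-yourself workaround described above (pickle in your own code, hand the cache only a string) can be sketched like this. The point is that pickling and unpickling both happen in app scope, where the class is resolvable, so gluon/cache.py never has to look up the class itself. Blah is the example class from the thread; the helper names are made up for illustration:

```python
import pickle

class Blah(object):
    """Stand-in for a custom class defined in a models file."""
    def __init__(self, x):
        self.x = x

def dump_for_cache(obj):
    # Runs in app code, where Blah is defined, so pickling succeeds.
    return pickle.dumps(obj)

def load_from_cache(blob):
    # Also runs in app code, so unpickling can resolve Blah.
    return pickle.loads(blob)
```

You would then cache the bytes returned by dump_for_cache() (a plain string is always picklable) and call load_from_cache() on whatever the cache hands back.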
[web2py] Status of new VirtualFields design?
What's the status of the new VirtualFields design? The latest I've seen is: db.item.adder=Field.lazy(lambda self:self.item.a + self.item.b) I have two design proposals: (1) Instead of calling them VirtualFields and Lazy VirtualFields, I think they should be VirtualFields and Methods. Two completely different things. The primary usage of lazy virtual fields seems to be adding helper methods to rows, so that one can call them as row.func() instead of func_on_table(row). This is a method, not a virtual field. (2) Put this API into db.define_table() instead of a separate set of statements. This puts all objects you can access on a row (Field, VirtualField, ComputedField, and Method) in one place in your code. And compared to separate class definitions, it reduces the lines of code necessary. Here's an example: db.define_table('item', Field('a', 'double'), Field('b', 'integer'), VirtualField('added', lambda self: self.item.a + self.item.b), Method('notify', lambda self: send_email(self.item.a)))
[web2py] Re: RFC: New design for {{include file.html}} in views
Yes this is a much better design pattern. I will just make a file of these helpers and include them all at once. Thanks massimo! On Sep 9, 10:32 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: Why not do: in friend_selector.html {{def selector(id=None):}} <div class="friend_selector"> <div class="access_photos" users="[]"></div> <input {{if id:}}id="{{=id}}"{{pass}} name="access_input" class="access_input" size="30" type="text" value="" /> </div> {{return}} and in the extending view {{include 'friend_selector.html'}} {{selector(id='main-access')}} This is already supported and allows to define more than one function in an include. And it is more pythonic. On Sep 9, 10:12 pm, Michael Toomim too...@gmail.com wrote: I frequently write short snippets of HTML that I want to replicate in many places, like: <div class="friend_selector"> <div class="access_photos" users="[]"></div> <input id="main_access_input" name="access_input" class="access_input" size="30" type="text" value="" /> </div> So I can put this into a .html file friend_selector.html, and then include it whenever I need this. But I often need to parameterize it. For instance, I might want to set a different id each time. So I do it like this: {{vars = {'id' : 'main_access_input'} }} {{include 'friend_selector.html'}} {{vars = {} }} And then parameterize my html like this: <div class="friend_selector"> <div class="access_photos" users="[]"></div> <input {{if vars['id']:}}id="{{=vars['id']}}"{{pass}} name="access_input" class="access_input" size="30" type="text" value="" /> </div> Basically, I'm re-inventing a function call via the {{include}} feature. Wouldn't it be awesome if this just happened automatically??? Like this: {{include 'friend_selector.html' (id='main_access_input')}} Would you like this feature? Does this sound hard to implement? Appendix: 1. You can also do this I think with template inheritance, but that's just complicated. Or you could define and call parameterized functions in python, but editing embedded html in python strings is gross. 2. 
The (id='main_access_input') part would ideally be a full python- style parameter list, supporting e.g. (a=1, b=3). This is simpler to implement. But to support positional args, like (1,3,5, a=1), the html blocks would need to declare their supported parameters. They could do so with by including a declare snippet like this: {{ declare (a, b, c, id=None): }} div class=friend_selector div class=access_photos users=[]/div input {{if id:}}id={{=id}}{{pass}} name=access_input class=access_input size=30 type=text value= / /div
[web2py] Re: problems changing database field type
I think we need more tools for fixing broken migrations! When I have something broken, sometimes I go into the sql console, edit the database manually, and then use these functions to tell web2py that I've changed the table in sql. (However, I haven't had to use these for at least a year... maybe some dal bugs have been fixed and migrations don't break as frequently anymore?) def db_hash(): import hashlib return hashlib.md5(database).hexdigest() def get_migrate_status(table_name): import cPickle, hashlib f = open('applications/init/databases/%s_%s.table' % (hashlib.md5(database).hexdigest(), table_name), 'rb') result = cPickle.load(f) f.close() return result def save_migrate_status(table_name, status): import cPickle, hashlib f = open('applications/init/databases/%s_%s.table' % (hashlib.md5(database).hexdigest(), table_name), 'wb') cPickle.dump(status, f) f.close() print 'saved' def del_migrate_column(table_name, column_name): a = get_migrate_status(table_name) del a[column_name] save_migrate_status(table_name, a) On Sep 9, 11:17 am, pbreit pbreitenb...@gmail.com wrote: I think we need some more documentation on migrations, especially around fixing broken migrations, how to modify the schema manually, how the sql.log and .table files work, and how/when to recreate them, etc.
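A small runnable sketch of what the helpers above are doing: web2py records each table's migration state as a pickled dict in a databases/<hash>_<table>.table file. This round-trips a made-up status dict through a temp file (Python 3 pickle instead of cPickle; the connection string and filename hashing are illustrative only, not the real web2py layout).

```python
# Round-trip a fake migration-status dict, the way get/save_migrate_status do.
import hashlib
import os
import pickle
import tempfile

database = 'sqlite://storage.sqlite'   # assumed DB connection string
status = {'id': 'INTEGER PRIMARY KEY', 'name': 'VARCHAR(512)'}

# web2py names the file <md5-of-connection-string>_<tablename>.table
fname = os.path.join(
    tempfile.mkdtemp(),
    '%s_mytable.table' % hashlib.md5(database.encode()).hexdigest())

with open(fname, 'wb') as f:           # binary mode matters for pickle
    pickle.dump(status, f)
with open(fname, 'rb') as f:
    loaded = pickle.load(f)

# Dropping a column from the stored state, like del_migrate_column():
del loaded['name']
```

After this, writing `loaded` back would make web2py believe the column never existed, which is the trick the post uses to reconcile a manually edited schema.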
[web2py] RFC: New design for {{include file.html}} in views
I frequently write short snippets of HTML that I want to replicate in many places, like: <div class="friend_selector"> <div class="access_photos" users="[]"></div> <input id="main_access_input" name="access_input" class="access_input" size="30" type="text" value="" /> </div> So I can put this into an .html file, friend_selector.html, and then include it whenever I need it. But I often need to parameterize it. For instance, I might want to set a different id each time. So I do it like this: {{vars = {'id': 'main_access_input'} }} {{include 'friend_selector.html'}} {{vars = {} }} And then parameterize my html like this: <div class="friend_selector"> <div class="access_photos" users="[]"></div> <input {{if vars['id']:}}id="{{=vars['id']}}"{{pass}} name="access_input" class="access_input" size="30" type="text" value="" /> </div> Basically, I'm re-inventing a function call via the {{include}} feature. Wouldn't it be awesome if this just happened automatically? Like this: {{include 'friend_selector.html' (id='main_access_input')}} Would you like this feature? Does this sound hard to implement? Appendix: 1. You could also do this, I think, with template inheritance, but that's just complicated. Or you could define and call parameterized functions in python, but editing embedded html in python strings is gross. 2. The (id='main_access_input') part would ideally be a full python-style parameter list, supporting e.g. (a=1, b=3). This is simpler to implement. But to support positional args, like (1, 3, 5, a=1), the html blocks would need to declare their supported parameters. They could do so by including a declare snippet like this: {{ declare (a, b, c, id=None): }} <div class="friend_selector"> <div class="access_photos" users="[]"></div> <input {{if id:}}id="{{=id}}"{{pass}} name="access_input" class="access_input" size="30" type="text" value="" /> </div>
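For readers outside the template layer, the parameterized-snippet idea boils down to an ordinary function that renders the HTML fragment. Here is a plain-Python sketch of the {{def selector(id=None):}} helper; in web2py this would live in friend_selector.html as template code, so the function body here is just an illustration.

```python
# Plain-Python model of the {{def selector(id=None):}} template helper:
# a function that returns the snippet, with the id attribute optional.
def selector(id=None):
    id_attr = ' id="%s"' % id if id else ''
    return ('<div class="friend_selector">'
            '<div class="access_photos" users="[]"></div>'
            '<input%s name="access_input" class="access_input" '
            'size="30" type="text" value="" />'
            '</div>') % id_attr

print(selector(id='main-access'))   # snippet with a custom id
print(selector())                   # same snippet, no id attribute
```

Calling `selector(id='main-access')` is exactly the function call the post says {{include}} was being contorted into.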
[web2py] Re: Lazy virtual fields - strange result!
After some thought, I'm really liking this design for virtual fields... what if lazy/virtual fields were declared directly in db.define_table()? Like so: db.define_table('item', Field('unit_price','double'), Field('quantity','integer'), VirtualField('total_price', lambda self: self.item.unit_price*self.item.quantity, lazy=True)) It's so simple. I still kinda feel like we might find better names than lazy/virtual, though. So here's a design I like even more: db.define_table('item', Field('unit_price','double'), Field('quantity','integer'), Method('total_price', lambda self: self.item.unit_price*self.item.quantity, precompute=False)) 'precompute' means not lazy and would default to False. On Aug 25, 6:14 am, Massimo Di Pierro massimo.dipie...@gmail.com wrote: We are moving away from this because of many problems. Try this instead. It is still experimental but may go into stable soon. def vfields(): db.define_table('item', Field('unit_price','double'), Field('quantity','integer')) db(db.item.id>0).delete() db.item.lazy_total_price=Field.lazy(lambda self: self.item.unit_price*self.item.quantity) db.item.bulk_insert([{'unit_price':12.00, 'quantity': 15}, {'unit_price':10.00, 'quantity': 99}, {'unit_price':120.00, 'quantity': 2},]) res = [] for r in db(db.item.id>0).select(): res.append([r.unit_price, r.quantity, r.lazy_total_price()]) return dict(res=res)
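The proposed Method('total_price', ...) can be modeled in a few lines of plain Python: a row whose virtual field is a callable computed on demand from the stored columns. Everything here (the Row class, make_method) is a toy stand-in for the DAL, not web2py API.

```python
# Minimal model of a lazy virtual field: the value is computed when called,
# not at select time.
class Row(dict):
    """Dict with attribute access, like the DAL's Row."""
    def __getattr__(self, k):
        return self[k]

def make_method(row, fn):
    # Bind the row now; evaluate fn only when the caller asks.
    return lambda: fn(row)

row = Row(unit_price=12.0, quantity=15)
row['total_price'] = make_method(row, lambda r: r.unit_price * r.quantity)

row.total_price()   # computed here, on demand
```

The lazy=True / precompute=False distinction in the post is just whether this multiplication runs once per row during the select or only when `row.total_price()` is actually invoked.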
[web2py] Re: Lazy virtual fields - strange result!
Interesting approach to use lambdas. Since lambdas don't do side-effects, I checked out my virtualfields to see if my uses have side effects. In my app I have: 12 methods total across 3 database tables 10 of those methods have no side-effects 2 have side-effects The two methods with side-effects are: 1. row.delete() ... deletes the relevant rows in related tables 2. row.toggle_x() ... toggles the boolean field of x and updates cached data on other tables and sends email notifications Perhaps this information is useful in designing the next lazy virtualfields. On Aug 25, 9:14 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: We are moving away from this because of many problems. Try this instead. It is still experimental but may go into stable soon. def vfields(): db.define_table('item', Field('unit_price','double'), Field('quantity','integer')) db(db.item.id>0).delete() db.item.lazy_total_price=Field.lazy(lambda self: self.item.unit_price*self.item.quantity) db.item.bulk_insert([{'unit_price':12.00, 'quantity': 15}, {'unit_price':10.00, 'quantity': 99}, {'unit_price':120.00, 'quantity': 2},]) res = [] for r in db(db.item.id>0).select(): res.append([r.unit_price, r.quantity, r.lazy_total_price()]) return dict(res=res) On Aug 25, 7:50 am, Martin Weissenboeck mweis...@gmail.com wrote: I wanted to learn more about lazy virtual fields and therefore I have repeated the example from the book: def vfields(): db.define_table('item', Field('unit_price','double'), Field('quantity','integer')) db(db.item.id>0).delete() class MyVirtualFields: def lazy_total_price(self): return lambda self=self: self.item.unit_price*self.item.quantity db.item.virtualfields.append(MyVirtualFields()) db.item.bulk_insert([{'unit_price':12.00, 'quantity': 15}, {'unit_price':10.00, 'quantity': 99}, {'unit_price':120.00, 'quantity': 2},]) res = [] for r in db(db.item.id>0).select(): res.append([r.unit_price, r.quantity, r.lazy_total_price()]) return dict(res=res) The expected output is: [[12.0, 15,
180.0], [10.0, 99, 990.0], [120.0, 2, 240.0]] But I got: [[12.0, 15, 240.0], [10.0, 99, 240.0], [120.0, 2, 240.0]] Three times the same result. I have read the book and my program over and over again, but I cannot see any error. Does somebody have an idea? Martin
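The "same result three times" symptom is the classic late-binding closure bug: every stored lambda reads a shared variable that ends up holding the last row processed. Binding the current value as a default argument (the `self=self` trick in the book's example) freezes it per lambda. A self-contained demonstration of both behaviors, using plain dicts instead of DAL rows:

```python
# Late-binding closure bug vs. the default-argument fix.
rows = [{'unit_price': 12.0, 'quantity': 15},
        {'unit_price': 10.0, 'quantity': 99},
        {'unit_price': 120.0, 'quantity': 2}]

buggy = []
for r in rows:
    # Closes over the loop variable itself; all three lambdas share it.
    buggy.append(lambda: r['unit_price'] * r['quantity'])

fixed = []
for r in rows:
    # Default argument evaluates now, binding the current row per lambda.
    fixed.append(lambda r=r: r['unit_price'] * r['quantity'])

[f() for f in buggy]   # every lambda sees the last row
[f() for f in fixed]   # each lambda sees its own row
```

The buggy list reproduces Martin's output (240.0 three times); the fixed list gives the expected 180.0, 990.0, 240.0.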
[web2py] Re: Lazy virtual fields - strange result!
I'm sorry, that was a doofus comment. Of course lambdas allow side-effects! I wish mailing lists supported delete. On Aug 27, 1:08 pm, Michael Toomim too...@gmail.com wrote: Interesting approach to use lambdas. Since lambdas don't do side-effects, I checked out my virtualfields to see if my uses have side effects. In my app I have: 12 methods total across 3 database tables 10 of those methods have no side-effects 2 have side-effects The two methods with side-effects are: 1. row.delete() ... deletes the relevant rows in related tables 2. row.toggle_x() ... toggles the boolean field of x and updates cached data on other tables and sends email notifications Perhaps this information is useful in designing the next lazy virtualfields. On Aug 25, 9:14 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: We are moving away from this because of many problems. Try this instead. It is still experimental but may go into stable soon. def vfields(): db.define_table('item', Field('unit_price','double'), Field('quantity','integer')) db(db.item.id>0).delete() db.item.lazy_total_price=Field.lazy(lambda self: self.item.unit_price*self.item.quantity) db.item.bulk_insert([{'unit_price':12.00, 'quantity': 15}, {'unit_price':10.00, 'quantity': 99}, {'unit_price':120.00, 'quantity': 2},]) res = [] for r in db(db.item.id>0).select(): res.append([r.unit_price, r.quantity, r.lazy_total_price()]) return dict(res=res) On Aug 25, 7:50 am, Martin Weissenboeck mweis...@gmail.com wrote: I wanted to learn more about lazy virtual fields and therefore I have repeated the example from the book: def vfields(): db.define_table('item', Field('unit_price','double'), Field('quantity','integer')) db(db.item.id>0).delete() class MyVirtualFields: def lazy_total_price(self): return lambda self=self: self.item.unit_price*self.item.quantity db.item.virtualfields.append(MyVirtualFields()) db.item.bulk_insert([{'unit_price':12.00, 'quantity': 15}, {'unit_price':10.00, 'quantity': 99}, {'unit_price':120.00,
'quantity': 2},]) res = [] for r in db(db.item.id>0).select(): res.append([r.unit_price, r.quantity, r.lazy_total_price()]) return dict(res=res) The expected output is: [[12.0, 15, 180.0], [10.0, 99, 990.0], [120.0, 2, 240.0]] But I got: [[12.0, 15, 240.0], [10.0, 99, 240.0], [120.0, 2, 240.0]] Three times the same result. I have read the book and my program over and over again, but I cannot see any error. Does somebody have an idea? Martin
[web2py] Re: Virtual Fields. Strange behavior when saving db result sets in built-in types
I don't have a direct solution, but FYI I added this info to a bug report on a related topic. http://code.google.com/p/web2py/issues/detail?id=374 Thanks for pointing out the problem. On Aug 14, 8:57 am, Santiago Gilabert santiagogilab...@gmail.com wrote: anyone? I found that someone else asked about the same issue 5 months ago but there are no comments about it. http://groups.google.com/group/web2py/browse_thread/thread/845e6cdef5... Thanks Santiago On Sat, Aug 13, 2011 at 7:07 PM, Santiago santiagogilab...@gmail.com wrote: Hello, I have the following definitions in db.py class ElectionVirtualField(object): ... def is_presidential(self): def lazy(self=self): return self.election.category == 'P' return lazy ... db.define_table('election', ... Field('category', length=1, label=T('Category') ... The problem I have is that when I add a bunch of elections to a dict() (key = election.id, value = election row) and then traverse the dict, I get wrong values from is_presidential(). Example: elections = dict() for election in db(db.election).select(): elections[election.id] = election for (k, v) in elections.items(): print k , ' ', v.category, ' ', v.is_presidential() Output: 81 D True 79 P True As you can see, it returns True for both, but for the first one, it should return False. If I change the code to reload the election from the database, the output is different. Example: elections = dict() for election in db(db.election).select(): elections[election.id] = election for (k, v) in elections.items(): reloaded_election = db.election(k) print k , ' ', v.category, ' ', v.is_presidential() Output: 81 D False 79 P True Does this mean that we can't save rows from the DB in built-in types? Thanks in advance, Santiago
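A likely mechanism for the stale True values, sketched in plain Python: the virtualfields instance is shared and its row state is rebound for each row, so a stored closure that captures `self` sees whatever row was bound last. Copying `self` when the closure is created (as the `lazy()` decorator elsewhere in this thread does with copy.copy) snapshots the row. The VF class below is a toy stand-in, not web2py's actual machinery.

```python
# Why a stored lazy closure can read the wrong row: shared mutable self.
import copy

class VF:
    """Toy stand-in for a shared virtualfields instance."""
    def make_lazy_bad(self):
        # Captures self; any later rebinding of self.row leaks in.
        return lambda: self.row['category'] == 'P'

    def make_lazy_good(self):
        # Snapshot current state at closure-creation time.
        frozen = copy.copy(self)
        return lambda: frozen.row['category'] == 'P'

vf = VF()
vf.row = {'category': 'D'}                  # election 81
bad, good = vf.make_lazy_bad(), vf.make_lazy_good()
vf.row = {'category': 'P'}                  # framework moves to election 79

bad()    # reads the rebound row: wrongly True
good()   # reads the snapshot: correctly False
```

Reloading the row from the database, as in the second example, works for the same reason: it forces fresh closures bound to fresh state.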
[web2py] Re: Bug in virtualfields w/ session
Ok, it's here http://code.google.com/p/web2py/issues/detail?id=374 Thank you for looking into this, Massimo! I do not know the best way to do this... my code is just a first reaction to making something faster. On Aug 11, 2:55 am, Massimo Di Pierro massimo.dipie...@gmail.com wrote: This is really interesting. Please give me some time to study it; meanwhile, so that I do not forget, please open an issue and post the code there. Massimo On Aug 10, 7:11 pm, Michael Toomim too...@gmail.com wrote: Ok. The basic idea is to allow you to define helper methods on rows, sort of like the Models of rails/django. You use it like this... I put this in models/db_methods.py: @extra_db_methods class Users(): def name(self): return '%s %s' % ((self.first_name or ''), (self.last_name or '')) def fb_name(self): p = self.person() return (p and p.name) or 'Unknown dude' def person(self): return db.people(db.people.fb_id == self.fb_id) def friends(self): return [Storage(name=f[0], id=f[1]) for f in sj.loads(self.friends_cache)] @extra_db_methods class People(): ... etc. These are for tables db.users and db.people. It looks up the table name from the class name. For each table that you want to extend, you make a class and put @extra_db_methods on top. It's implemented with the following @extra_db_methods decorator and a patch to dal.py. The decorator just traverses the class, pulls out all methods, and throws them into a methods variable on the appropriate table in the dal. Then the dal's parse() routine adds these methods to each row object, using the python types.MethodType() routine for retargeting a method from one class to another object. The downside is extending the dal with yet ANOTHER way of adding methods to objects. That makes 3 APIs to maintain for similar things (virtualfields, computedfields, and this). And I'm not sure about the names (like extra_db_methods) for these things yet.
Also I think we might be able to get it even faster by being more clever with python inheritance in the Row class. Right now it has roughly 10% overhead on selects in my tests (uncompiled code). At the bottom of this message is the decorator that implements the same functionality using the existing virtualfields mechanism and your lazy decorator. Its downside is a 2x to 3x overhead on selects, and instead of self.field you have to say self.tablename.field in the method bodies. def extra_db_methods(clss): tablename = clss.__name__.lower() if not tablename in db: raise Error('There is no `%s\' table to put virtual methods in' % tablename) for k in clss.__dict__.keys(): method = clss.__dict__[k] if type(method).__name__ == 'function' or type(method).__name__ == 'instancemethod': db[tablename].methods.update({method.__name__ : method}) return clss --- k/web2py/gluon/dal.py 2011-08-03 16:46:39.0 -0700 +++ web2py/gluon/dal.py 2011-08-10 17:04:48.344795251 -0700 @@ -1459,6 +1459,7 @@ new_rows.append(new_row) rowsobj = Rows(db, new_rows, colnames, rawrows=rows) for tablename in virtualtables: + rowsobj.setmethods(tablename, db[tablename].methods) for item in db[tablename].virtualfields: try: rowsobj = rowsobj.setvirtualfields(**{tablename:item}) @@ -4559,6 +4560,7 @@ tablename = tablename self.fields = SQLCallableList() self.virtualfields = [] + self.methods = {} fields = list(fields) if db and self._db._adapter.uploads_in_blob==True: @@ -5574,6 +5576,14 @@ self.compact = compact self.response = rawrows + def setmethods(self, tablename, methods): + if len(methods) == 0: return + for row in self.records: + if tablename not in row: break # Abort on this and all rows. For efficiency.
+ for (k,v) in methods.items(): + r = row[tablename] + r.__dict__[k] = types.MethodType(v, r) + return self def setvirtualfields(self,**keyed_virtualfields): if not keyed_virtualfields: return self --- And here's the implementation using virtualfields: def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g def extra_db_methods_vf(clss): ''' This decorator clears virtualfields on the table and replaces them with the methods on this class. ''' # First let's make the methods lazy for k in clss.__dict__.keys(): if type(getattr(clss, k)).__name__ == 'instancemethod': setattr(clss, k, lazy(getattr(clss, k))) tablename =
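The core of the @extra_db_methods trick, harvesting functions off a plain class and rebinding them onto row objects with types.MethodType, runs fine outside web2py. Here is a standalone sketch; Row and the module-level `methods` dict are toy stand-ins for the DAL table and its patched `methods` attribute.

```python
# Standalone model of @extra_db_methods: lift functions off a class and
# rebind them to each row with types.MethodType.
import types

class Row(dict):
    """Dict with attribute access, like the DAL's Row."""
    def __getattr__(self, k):
        return self[k]

methods = {}   # would live on db[tablename].methods in the real patch

def extra_db_methods(cls):
    # Harvest plain functions defined on the class body.
    for name, fn in vars(cls).items():
        if isinstance(fn, types.FunctionType):
            methods[name] = fn
    return cls

@extra_db_methods
class Users:
    def name(self):
        return '%s %s' % (self.first_name or '', self.last_name or '')

# What the patched setmethods() does for each selected row:
row = Row(first_name='Ada', last_name='Lovelace')
for k, v in methods.items():
    setattr(row, k, types.MethodType(v, row))

row.name()   # method body sees the row as self
```

Because each row gets its own bound method, there is no shared mutable state to go stale, which is why this approach avoids the virtualfields closure pitfalls discussed elsewhere in the thread.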
[web2py] Here's a helper for db.table.first() and last()
Often I'm at the shell and want to quickly pull up the most recent entry in a table. I wrote a couple of helpers for this. For instance, in a blog app: db.posts.last() ...will get the most recent post. By putting this code at the bottom of db.py, it'll automatically create a first() and last() method for each table in your app. import types for table in db.tables: def first(self): return db(self.id>0).select(orderby=self.id, limitby=(0,1)).first() def last(self): return db(self.id>0).select(orderby=~self.id, limitby=(0,1)).first() t = db[table] t.first = types.MethodType(first, t) t.last = types.MethodType(last, t)
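The binding trick in the snippet above is generic Python: types.MethodType attaches a function to one specific object as a bound method. A toy version with a stand-in Table class (no DAL involved) shows the mechanics:

```python
# Toy version of the first()/last() helpers: per-object method binding.
import types

class Table:
    """Stand-in for a DAL table; rows is what a select() would return."""
    def __init__(self, rows):
        self.rows = rows

def first(self):
    return self.rows[0] if self.rows else None

def last(self):
    return self.rows[-1] if self.rows else None

posts = Table([{'id': 1}, {'id': 2}, {'id': 3}])
posts.first = types.MethodType(first, posts)   # bound to this table only
posts.last = types.MethodType(last, posts)

posts.last()   # most recent entry
```

Note that binding to the instance (rather than the class) matters here: each DAL table object gets its own first/last, just as in the db.py loop above.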
[web2py] Re: RFC about issue
I agree not a big deal: http://www.quora.com/Should-buttons-in-web-apps-be-capitalized On Aug 11, 3:24 am, Massimo Di Pierro massimo.dipie...@gmail.com wrote: What do people think? http://code.google.com/p/web2py/issues/detail?id=370 I do not have a strong opinion.
[web2py] Re: Bug in virtualfields w/ session
Ok. The basic idea is to allow you to define helper methods on rows, sort of like the Models of rails/django. You use it like this... I put this in models/db_methods.py: @extra_db_methods class Users(): def name(self): return '%s %s' % ((self.first_name or ''), (self.last_name or '')) def fb_name(self): p = self.person() return (p and p.name) or 'Unknown dude' def person(self): return db.people(db.people.fb_id == self.fb_id) def friends(self): return [Storage(name=f[0], id=f[1]) for f in sj.loads(self.friends_cache)] @extra_db_methods class People(): ... etc. These are for tables db.users and db.people. It looks up the table name from the class name. For each table that you want to extend, you make a class and put @extra_db_methods on top. It's implemented with the following @extra_db_methods decorator and a patch to dal.py. The decorator just traverses the class, pulls out all methods, and throws them into a methods variable on the appropriate table in the dal. Then the dal's parse() routine adds these methods to each row object, using the python types.MethodType() routine for retargeting a method from one class to another object. The downside is extending the dal with yet ANOTHER way of adding methods to objects. That makes 3 APIs to maintain for similar things (virtualfields, computedfields, and this). And I'm not sure about the names (like extra_db_methods) for these things yet. Also I think we might be able to get it even faster by being more clever with python inheritance in the Row class. Right now it has roughly 10% overhead on selects in my tests (uncompiled code). At the bottom of this message is the decorator that implements the same functionality using the existing virtualfields mechanism and your lazy decorator. Its downside is a 2x to 3x overhead on selects, and instead of self.field you have to say self.tablename.field in the method bodies.
def extra_db_methods(clss): tablename = clss.__name__.lower() if not tablename in db: raise Error('There is no `%s\' table to put virtual methods in' % tablename) for k in clss.__dict__.keys(): method = clss.__dict__[k] if type(method).__name__ == 'function' or type(method).__name__ == 'instancemethod': db[tablename].methods.update({method.__name__ : method}) return clss --- k/web2py/gluon/dal.py 2011-08-03 16:46:39.0 -0700 +++ web2py/gluon/dal.py 2011-08-10 17:04:48.344795251 -0700 @@ -1459,6 +1459,7 @@ new_rows.append(new_row) rowsobj = Rows(db, new_rows, colnames, rawrows=rows) for tablename in virtualtables: + rowsobj.setmethods(tablename, db[tablename].methods) for item in db[tablename].virtualfields: try: rowsobj = rowsobj.setvirtualfields(**{tablename:item}) @@ -4559,6 +4560,7 @@ tablename = tablename self.fields = SQLCallableList() self.virtualfields = [] + self.methods = {} fields = list(fields) if db and self._db._adapter.uploads_in_blob==True: @@ -5574,6 +5576,14 @@ self.compact = compact self.response = rawrows + def setmethods(self, tablename, methods): + if len(methods) == 0: return + for row in self.records: + if tablename not in row: break # Abort on this and all rows. For efficiency. + for (k,v) in methods.items(): + r = row[tablename] + r.__dict__[k] = types.MethodType(v, r) + return self def setvirtualfields(self,**keyed_virtualfields): if not keyed_virtualfields: return self --- And here's the implementation using virtualfields: def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g def extra_db_methods_vf(clss): ''' This decorator clears virtualfields on the table and replaces them with the methods on this class.
''' # First let's make the methods lazy for k in clss.__dict__.keys(): if type(getattr(clss, k)).__name__ == 'instancemethod': setattr(clss, k, lazy(getattr(clss, k))) tablename = clss.__name__.lower() if not tablename in db: raise Error('There is no `%s\' table to put virtual methods in' % tablename) del db[tablename].virtualfields[:] # We clear virtualfields each time db[tablename].virtualfields.append(clss()) return clss You use this just like before but with @extra_db_methods_vf instead of @extra_db_methods, and append tablename to each use of self. On Aug 9, 11:16 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: let us see it! On Aug 9, 9:36 pm, Michael Toomim too...@gmail.com wrote: Result: Fixed by upgrading. I was seeing this bug: http://code.google.com/p/web2py/issues/detail?id=345 However, virtualfields still take more time than they should.
[web2py] Re: Bug in virtualfields w/ session
Result: Fixed by upgrading. I was seeing this bug: http://code.google.com/p/web2py/issues/detail?id=345 However, virtualfields still take more time than they should. My selects take 2-3x longer with virtualfields enabled than without. I implemented a little hack in the dal that adds methods to rows with only a 10% overhead (instead of 200-300%) and can share that if anyone's interested. On Aug 8, 8:38 pm, Michael Toomim too...@gmail.com wrote: It turns out the speed problem is REALLY bad. I have a table with virtualfields of 14,000 rows. When I run raw sql: a = db.executesql('select * from people;') ...the query returns in 121ms. But when I run it through the DAL on only a subset of the data: a = db(db.people.id > 0).select(limitby=(0,1000)) ...it returns in 141096.431ms. That's... 141 seconds. So 1000x longer on less than .1 of the database. My virtualfields are all lazy functions. I'm looking into what's causing it and will report back when I find out. It seems it might have something to do with the lazy decorator func, because when I hit C-c the code is often stuck there... inside import copy or something. def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g Anyway, I'll send an update when I have more info. On Aug 2, 3:03 pm, Michael Toomim too...@gmail.com wrote: That's way better syntax! Great idea! On Aug 2, 2011, at 2:31 AM, Massimo Di Pierro wrote: We need to work on the speed. This can perhaps help the syntax: db=DAL() db.define_table('a',Field('b','integer')) for i in range(10): db.a.insert(b=i) def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g class Scale: @lazy def c(self,scale=1): return self.a.b*scale db.a.virtualfields.append(Scale()) for row in db(db.a).select(): print row.b, row.c(1), row.c(2), row.c(3) On Aug 1, 3:10 pm, Michael Toomim too...@gmail.com wrote: Maybe it helps for me to explain my use-case.
I mainly use virtual fields as lazy methods, to help traverse related tables. I was actually surprised that lazy evaluation wasn't the default. I noticed a few implications of this: - Large queries are slowed by virtualfields, even if they won't be needed, esp if they query the db - My definitions for virtualfields aren't as clean as they could be, because I have many nested lazy funcs in the class definition - We can't serialize all objects into session variables So really I'm just using this because it's a nicer notation to call row.otherthing() instead of getotherthing(row). Maybe I really want some different feature here? On Aug 1, 2011, at 5:40 AM, Anthony Bastardi wrote: Note, after looking at this some more, Massimo recalled that the reason auth_user virtual fields were excluded from auth.user (and therefore from saving in the session) is because some virtual fields are objects that cannot be pickled and therefore cannot be serialized to store in the session. So, we're thinking of either creating an option to store auth_user virtual fields in auth.user, or maybe testing to make sure the virtual fields can be pickled, and excluding them if not. Anthony On Mon, Aug 1, 2011 at 5:30 AM, Michael Toomim too...@cs.washington.edu wrote: Awesome! I did not know there was an issue submission system. On Jul 30, 2011, at 7:02 AM, Anthony wrote: An issue has been submitted, and this should be corrected soon. Anthony On Friday, July 29, 2011 9:57:30 PM UTC-4, Anthony wrote: auth.user is Storage(table_user._filter_fields(user, id=True)). The _filter_fields method of the auth_user table only selects actual table fields, not virtual fields, so auth.user will not include any virtual fields. Perhaps this should be changed. Anthony On Friday, July 29, 2011 9:05:39 PM UTC-4, Michael Toomim wrote: I think I found a bug in virtualfields.
I have the following controller: def posts(): user = session.auth.user n = user.name # returns None Where name is defined as a virtualfield on user: class Users(): def name(self): return self.users.first_name + ' ' + self.users.last_name db.users.virtualfields.append(Users()) The problem is that user.name returns None, because apparently the virtualfield isn't loaded into the session variable of user. I made this work with the following modification to the controller: def posts(): user = db.users[session.auth.user.id] n = user.name # returns the user name correctly! I just had to refetch the user from the database.
[web2py] Re: Bug in virtualfields w/ session
It turns out the speed problem is REALLY bad. I have a table with virtualfields of 14,000 rows. When I run raw sql: a = db.executesql('select * from people;') ...the query returns in 121ms. But when I run it through the DAL on only a subset of the data: a = db(db.people.id > 0).select(limitby=(0,1000)) ...it returns in 141096.431ms. That's... 141 seconds. So 1000x longer on less than .1 of the database. My virtualfields are all lazy functions. I'm looking into what's causing it and will report back when I find out. It seems it might have something to do with the lazy decorator func, because when I hit C-c the code is often stuck there... inside import copy or something. def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g Anyway, I'll send an update when I have more info. On Aug 2, 3:03 pm, Michael Toomim too...@gmail.com wrote: That's way better syntax! Great idea! On Aug 2, 2011, at 2:31 AM, Massimo Di Pierro wrote: We need to work on the speed. This can perhaps help the syntax: db=DAL() db.define_table('a',Field('b','integer')) for i in range(10): db.a.insert(b=i) def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g class Scale: @lazy def c(self,scale=1): return self.a.b*scale db.a.virtualfields.append(Scale()) for row in db(db.a).select(): print row.b, row.c(1), row.c(2), row.c(3) On Aug 1, 3:10 pm, Michael Toomim too...@gmail.com wrote: Maybe it helps for me to explain my use-case. I mainly use virtual fields as lazy methods, to help traverse related tables. I was actually surprised that lazy evaluation wasn't the default.
I noticed a few implications of this: - Large queries are slowed by virtualfields, even if they won't be needed, esp if they query the db - My definitions for virtualfields aren't as clean as they could be, because I have many nested lazy funcs in the class definition - We can't serialize all objects into session variables So really I'm just using this because it's a nicer notation to call row.otherthing() instead of getotherthing(row). Maybe I really want some different feature here? On Aug 1, 2011, at 5:40 AM, Anthony Bastardi wrote: Note, after looking at this some more, Massimo recalled that the reason auth_user virtual fields were excluded from auth.user (and therefore from saving in the session) is because some virtual fields are objects that cannot be pickled and therefore cannot be serialized to store in the session. So, we're thinking of either creating an option to store auth_user virtual fields in auth.user, or maybe testing to make sure the virtual fields can be pickled, and excluding them if not. Anthony On Mon, Aug 1, 2011 at 5:30 AM, Michael Toomim too...@cs.washington.edu wrote: Awesome! I did not know there was an issue submission system. On Jul 30, 2011, at 7:02 AM, Anthony wrote: An issue has been submitted, and this should be corrected soon. Anthony On Friday, July 29, 2011 9:57:30 PM UTC-4, Anthony wrote: auth.user is Storage(table_user._filter_fields(user, id=True)). The _filter_fields method of the auth_user table only selects actual table fields, not virtual fields, so auth.user will not include any virtual fields. Perhaps this should be changed. Anthony On Friday, July 29, 2011 9:05:39 PM UTC-4, Michael Toomim wrote: I think I found a bug in virtualfields.
I have the following controller: def posts(): user = session.auth.user n = user.name # returns None Where name is defined as a virtualfield on user: class Users(): def name(self): return self.users.first_name + ' ' + self.users.last_name db.users.virtualfields.append(Users()) The problem is that user.name returns None, because apparently the virtualfield isn't loaded into the session variable of user. I made this work with the following modification to the controller: def posts(): user = db.users[session.auth.user.id] n = user.name # returns the user name correctly! I just had to refetch the user from the database.
[web2py] Re: Bug in virtualfields w/ session
Mid-status note: it would be great if the profiler worked with the web2py shell! Then I could run commands at the command prompt in isolation and see how long they take. On Aug 8, 8:38 pm, Michael Toomim too...@gmail.com wrote: It turns out the speed problem is REALLY bad. I have a table withvirtualfieldsof 14,000 rows. When I run raw sql: a = db.executesql('select * from people;') ...the query returns in 121ms. But when I run it through the DAL on only a subset of the data: a = db(db.people.id 0).select(limitby=(0,1000)) ...it returns in 141096.431ms. That's... 141 seconds. So 1000x longer on .1 of the database. Myvirtualfieldsare all lazy functions. I'm looking into what's causing it and will report back when I find out. It seems it might have something to do with the lazy decorator func because when I hit C- c the code is often stuck there... inside import copy or something. def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g Anyway, I'll send an update when I have more info. On Aug 2, 3:03 pm, Michael Toomim too...@gmail.com wrote: That's way better syntax! Great idea! On Aug 2, 2011, at 2:31 AM, Massimo Di Pierro wrote: We need to work on the speed. This can perhaps help the syntax: db=DAL() db.define_table('a',Field('b','integer')) for i in range(10): db.a.insert(b=i) def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g class Scale: @lazy def c(self,scale=1): return self.a.b*scale db.a.virtualfields.append(Scale()) for row in db(db.a).select(): print row.b, row.c(1), row.c(2), row.c(3) On Aug 1, 3:10 pm, Michael Toomim too...@gmail.com wrote: Maybe it helps for me to explain my use-case. I mainly use virtual fields as lazy methods, to help traverse related tables. I was actually surprised that lazy evaluation wasn't the default. 
I noticed a few implications of this: - Large queries are slowed byvirtualfields, even if they won't be needed, esp if they query db - My definitions forvirtualfieldsaren't as clean as they could be, because I have many nested lazy funcs in the class definition - We can't serialize all objects intosessionvariables So really I'm just using this because it's a nicer notation to call row.otherthing() instead of getotherthing(row). Maybe I really want some different feature here? On Aug 1, 2011, at 5:40 AM, Anthony Bastardi wrote: Note, after looking at this some more, Massimo recalled that the reason auth_user virtual fields were excluded from auth.user (and therefore from saving in thesession) is because some virtual fields are objects that cannot be pickled and therefore cannot be serialized to store in thesession. So, we're thinking of either creating an option to store auth_user virutual fields in auth.user, or maybe testing to make sure the virtual fields can be pickled, and excluding them if not. Anthony On Mon, Aug 1, 2011 at 5:30 AM, Michael Toomim too...@cs.washington.edu wrote: Awesome! I did not know there was an issue submission system. On Jul 30, 2011, at 7:02 AM, Anthony wrote: An issue has been submitted, and this should be corrected soon. Anthony On Friday, July 29, 2011 9:57:30 PM UTC-4, Anthony wrote: auth.user is Storage(table_user._filter_fields(user, id=True)). The _filter_fields method of the auth_user table only selects actual table fields, not virtual fields, so auth.user will not include any virtual fields. Perhaps this should be changed. Anthony On Friday, July 29, 2011 9:05:39 PM UTC-4, Michael Toomim wrote: I think I found a bug invirtualfields. 
I have the following controller:

def posts():
    user = session.auth.user
    n = user.name  # returns None

Where name is defined as a virtualfield on user:

class Users():
    def name(self):
        return self.users.first_name + ' ' + self.users.last_name
db.users.virtualfields.append(Users())

The problem is that user.name returns None, because apparently the virtualfield isn't loaded into the session variable of user. I made this work with the following modification to the controller:

def posts():
    user = db.users[session.auth.user.id]
    n = user.name  # returns the user name correctly!

I just had to refetch the user from the database.
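The lazy decorator discussed in this thread can be exercised outside web2py. Below is a minimal sketch; the Row class and the manual attachment step are simplified stand-ins for what web2py does when it binds virtualfields to selected rows (real rows expose table-qualified fields like self.a.b, simplified here to self.b):

```python
import copy

def lazy(f):
    # Defer evaluation: calling the decorated method returns a closure
    # over a shallow copy of the row-like object, so the real work runs
    # only when (and if) the caller invokes it.
    def g(self, f=f):
        self = copy.copy(self)
        return lambda *a, **b: f(self, *a, **b)
    return g

class Row(object):
    """Hypothetical stand-in for a web2py Row (real rows are richer)."""
    def __init__(self, b):
        self.b = b

class Scale(object):
    @lazy
    def c(self, scale=1):
        return self.b * scale

row = Row(7)
# simulate web2py attaching the virtualfield to a row at select time:
row.c = Scale.c(row)
print(row.c(), row.c(2), row.c(3))  # 7 14 21
```

Note that because g copies the row when the field is attached, later mutations of the original row do not affect the deferred computation; that copy.copy call is also the hot spot Michael suspected in the performance thread above.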
Re: [web2py] Re: Bug in virtualfields w/ session
That's way better syntax! Great idea! On Aug 2, 2011, at 2:31 AM, Massimo Di Pierro wrote: We need to work on the speed. This can perhaps help the syntax: db=DAL() db.define_table('a',Field('b','integer')) for i in range(10): db.a.insert(b=i) def lazy(f): def g(self,f=f): import copy self=copy.copy(self) return lambda *a,**b: f(self,*a,**b) return g class Scale: @lazy def c(self,scale=1): return self.a.b*scale db.a.virtualfields.append(Scale()) for row in db(db.a).select(): print row.b, row.c(1), row.c(2), row.c(3) On Aug 1, 3:10 pm, Michael Toomim too...@gmail.com wrote: Maybe it helps for me to explain my use-case. I mainly use virtual fields as lazy methods, to help traverse related tables. I was actually surprised that lazy evaluation wasn't the default. I noticed a few implications of this: - Large queries are slowed by virtualfields, even if they won't be needed, esp if they query db - My definitions for virtualfields aren't as clean as they could be, because I have many nested lazy funcs in the class definition - We can't serialize all objects into session variables So really I'm just using this because it's a nicer notation to call row.otherthing() instead of getotherthing(row). Maybe I really want some different feature here? On Aug 1, 2011, at 5:40 AM, Anthony Bastardi wrote: Note, after looking at this some more, Massimo recalled that the reason auth_user virtual fields were excluded from auth.user (and therefore from saving in the session) is because some virtual fields are objects that cannot be pickled and therefore cannot be serialized to store in the session. So, we're thinking of either creating an option to store auth_user virutual fields in auth.user, or maybe testing to make sure the virtual fields can be pickled, and excluding them if not. Anthony On Mon, Aug 1, 2011 at 5:30 AM, Michael Toomim too...@cs.washington.edu wrote: Awesome! I did not know there was an issue submission system. 
On Jul 30, 2011, at 7:02 AM, Anthony wrote: An issue has been submitted, and this should be corrected soon. Anthony On Friday, July 29, 2011 9:57:30 PM UTC-4, Anthony wrote: auth.user is Storage(table_user._filter_fields(user, id=True)). The _filter_fields method of the auth_user table only selects actual table fields, not virtual fields, so auth.user will not include any virtual fields. Perhaps this should be changed. Anthony On Friday, July 29, 2011 9:05:39 PM UTC-4, Michael Toomim wrote: I think I found a bug in virtualfields. I have the following controller: def posts(): user = session.auth.user n = user.name # returns None Where person is defined as a virtualfield on user: class Users(): def name(self): return self.users.first_name + ' ' + self.users.last_name db.users.virtualfields.append(Users()) The problem is that user.name returns None, because apparently the virtualfield isn't loaded into the session variable of user. I made this work with the following modification to the controller: def posts(): user = db.users[session.auth.user.id] n = user.name # returns the user name correctly! I just had to refetch the user from the database.
Re: [web2py] Bug in virtualfields w/ session
Maybe it helps for me to explain my use-case. I mainly use virtual fields as lazy methods, to help traverse related tables. I was actually surprised that lazy evaluation wasn't the default. I noticed a few implications of this: - Large queries are slowed by virtualfields, even if they won't be needed, esp if they query db - My definitions for virtualfields aren't as clean as they could be, because I have many nested lazy funcs in the class definition - We can't serialize all objects into session variables So really I'm just using this because it's a nicer notation to call row.otherthing() instead of getotherthing(row). Maybe I really want some different feature here? On Aug 1, 2011, at 5:40 AM, Anthony Bastardi wrote: Note, after looking at this some more, Massimo recalled that the reason auth_user virtual fields were excluded from auth.user (and therefore from saving in the session) is because some virtual fields are objects that cannot be pickled and therefore cannot be serialized to store in the session. So, we're thinking of either creating an option to store auth_user virutual fields in auth.user, or maybe testing to make sure the virtual fields can be pickled, and excluding them if not. Anthony On Mon, Aug 1, 2011 at 5:30 AM, Michael Toomim too...@cs.washington.edu wrote: Awesome! I did not know there was an issue submission system. On Jul 30, 2011, at 7:02 AM, Anthony wrote: An issue has been submitted, and this should be corrected soon. Anthony On Friday, July 29, 2011 9:57:30 PM UTC-4, Anthony wrote: auth.user is Storage(table_user._filter_fields(user, id=True)). The _filter_fields method of the auth_user table only selects actual table fields, not virtual fields, so auth.user will not include any virtual fields. Perhaps this should be changed. Anthony On Friday, July 29, 2011 9:05:39 PM UTC-4, Michael Toomim wrote: I think I found a bug in virtualfields. 
I have the following controller: def posts(): user = session.auth.user n = user.name # returns None Where person is defined as a virtualfield on user: class Users(): def name(self): return self.users.first_name + ' ' + self.users.last_name db.users.virtualfields.append(Users()) The problem is that user.name returns None, because apparently the virtualfield isn't loaded into the session variable of user. I made this work with the following modification to the controller: def posts(): user = db.users[session.auth.user.id] n = user.name # returns the user name correctly! I just had to refetch the user from the database.
[web2py] Bug in virtualfields w/ session
I think I found a bug in virtualfields. I have the following controller:

def posts():
    user = session.auth.user
    n = user.name  # returns None

Where name is defined as a virtualfield on user:

class Users():
    def name(self):
        return self.users.first_name + ' ' + self.users.last_name
db.users.virtualfields.append(Users())

The problem is that user.name returns None, because apparently the virtualfield isn't loaded into the session variable of user. I made this work with the following modification to the controller:

def posts():
    user = db.users[session.auth.user.id]
    n = user.name  # returns the user name correctly!

I just had to refetch the user from the database.
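Anthony's explanation in this thread (some virtual fields cannot be pickled, hence they are excluded from what gets stored in the session) can be demonstrated without web2py. A sketch, where Storage is a simplified stand-in for web2py's gluon.storage.Storage (the real one also defines __getstate__/__setstate__, for exactly this reason):

```python
import pickle

class Storage(dict):
    # minimal attribute-access dict in the spirit of web2py's Storage;
    # __getstate__/__setstate__ keep pickle from tripping over the
    # __getattr__ fallback on older Python versions
    __getattr__ = dict.get
    def __setattr__(self, key, value):
        self[key] = value
    def __getstate__(self):
        return dict(self)
    def __setstate__(self, d):
        self.update(d)

user = Storage(first_name='Ada', last_name='Lovelace')
# plain fields round-trip through pickle fine:
assert pickle.loads(pickle.dumps(user)).first_name == 'Ada'

# a virtual-field-like callable attached to the row blocks serialization:
user.name = lambda: user.first_name + ' ' + user.last_name
try:
    pickle.dumps(user)
    picklable = True
except Exception:  # PicklingError/TypeError depending on Python version
    picklable = False
print(picklable)  # False: the lambda cannot be pickled
```

This is why refetching the row from the database works: the freshly selected row gets its virtualfields bound at select time, whereas the session copy only ever contained what survived pickling.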
[web2py] Re: Apache, Wsgi problem
Great!! I also had threads=25 and changed this to threads=1 processes=5, so it makes sense that I was encountering the same problem. It sounds like something in web2py might not be thread-safe. The next time I run a production test I will report if this fixes the problem. On Feb 10, 2:38 pm, VP vtp2...@gmail.com wrote: Alright people short answer: I think I figured this out (at least with my configuration) After testing various configurations, here's the result with: ab -kc 100 -t 20 https://domain.com/imageblog/default/index/ (same imageblog app, 100 connections, 20 seconds stress test). Two things you will notice with this result. 1. ZERO failed requests. No more wsgi premature script errors 2. Complete requests is 1234, 61 requests/second on average. Compare to prior configuration: 588 total requests, 29 requests/sec on average. Not to mention 15 failed requests due to wsgi premature script errors!!! This is insane!!! So how did I configure this? here it is:

WSGIDaemonProcess web2py user=username group=username \
    display-name=%{GROUP} processes=5 threads=1

The important option being 5 processes, 1 thread. With this configuration, my real app also did not get wsgi premature script errors anymore. And guess what... the requests/sec triples. I am still curious about this. While my real app can possibly be not thread-safe, the imageblog app should be thread safe (the index was simply a listing of images, i.e. read only). Why would there be a problem with more than 1 thread?
Document Path:          /imageblog/default/index
Document Length:        13083 bytes
Concurrency Level:      100
Time taken for tests:   20.008 seconds
Complete requests:      1234
Failed requests:        0
Write errors:           0
Keep-Alive requests:    1234
Total transferred:      16827432 bytes
HTML transferred:       16171262 bytes
Requests per second:    61.68 [#/sec] (mean)
Time per request:       1621.377 [ms] (mean)
Time per request:       16.214 [ms] (mean, across all concurrent requests)
Transfer rate:          821.33 [Kbytes/sec] received

Connection Times (ms)
             min  mean[+/-sd] median   max
Connect:       0    1    9.4      0     73
Processing:   82  481  890.5    317   5475
Waiting:      76  443  878.7    274   5393
Total:        82  483  894.7    317   5503

Percentage of the requests served within a certain time (ms)
  50%    317
  66%    342
  75%    360
  80%    372
  90%    416
  95%    489
  98%   5351
  99%   5397
 100%   5503 (longest request)
[web2py] Re: Django vs web2py pain points
The biggest django pain points to me:
- Templating system is a PAIN. You have to learn a new language, and in the end it's not as powerful as python.
- Database ORM can be a pain. Same reasons. You have to learn a big special-purpose API in addition to SQL, and learn how it translates between the two.
In general, django tries to build too many special-purpose abstractions for everything. This makes it more complicated. You have to learn more. And they get in your way when they aren't powerful enough to do what you need. Then you have to hack them.
More details on django pain I've encountered:
- template system requires learning a new language
- that new language is limited, e.g. difficult to define a variable, makes simple things hard
- because it uses ORM instead of DAL, you need to learn a new object language for selecting and updating the database, and learn how it maps to SQL, because eventually you need to understand the raw tables too
- you have to type things at the command line more often to make changes, so you shut down the server, type a command, restart the server. Some of these things are automatic in web2py: you just edit the code, and everything reloads automatically.
- database migrations require using south, an external plugin system
- errors in your use of django will often crash deep within django. You'll be looking at a stack trace to some internal django file and not realize it's because you added a '/' to the end of a path in settings.py
- more complicated to set up, mucking around in settings.py
- constrained in how you pass variables to views from controllers, e.g. can't set global variables that apply across multiple views, like a breadcrumb path; you have to specify it manually in each controller and view
- have to type arcane things sometimes like return render_to_response(blah) at the end of a controller instead of just return blah
[web2py] Re: Apache, Wsgi problem
Yes, this echoes my experiences exactly! Using the apache ab benchmark alone would NOT trigger the error. I had plenty of RAM available. Seems to be a concurrency bug. On Jan 19, 10:53 am, VP vtp2...@gmail.com wrote: What is curious is that RAM is still available, with this error. Monitoring CPU (using top) shows CPU % is about 70% for apache2 (70% is probably capped by my VPS host, given to each VPS slice). And this occurs for a very simple app. I hope Massimo or someone else can reproduce this error with this app. Note that while testing with ab, you might have to browse the site, clicking on images, etc. Otherwise, ab just hits the same domain name repeatedly.
[web2py] Re: Apache, Wsgi problem
I have pool_size=100, and get the error. On Jan 17, 12:20 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: You should really have db = DAL('postgres://name:password@localhost:5432/db',pool_size=20) The reason is that client-server databases may set a max to number of open connections and it takes time to perform the 3-way handshake to establish a new connection at every http request. Without pooling you may hit the max sooner than you should. This will also make your app faster. Massimo On Jan 17, 1:39 pm, VP vtp2...@gmail.com wrote: Here it is with private information replaced with generic information. if request.env.web2py_runtime_gae: # if running on Google App Engine db = DAL('gae') # connect to Google BigTable session.connect(request, response, db = db) # and store sessions and tickets there else: # else use a normal relational database db = DAL('postgres://name:password@localhost:5432/db') # db = DAL('sqlite://storage.sqlite') On Jan 17, 1:19 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: Are you using connection pooling? Can I see the line db = DAL(...)? On Jan 16, 2:24 pm, VP vtp2...@gmail.com wrote: Kenneth (see post above) had the same problem with MySQL. On Jan 16, 3:55 am, ron_m ron.mco...@gmail.com wrote: Is there any chance to try another server based database such as MySQL. Since SQLite did not exhibit a problem that is a valid data point but it is an embedded database so called directly by the web2py server code instead of using a driver. There is a possibility there is a problem with the driver on PostgreSQL if the connections are being reused in a pool. This is very difficult to find without some further logging. I gather the page request you see failing as a user simply never completes. I saw this once while testing the application but the loading is only myself running tests. 
I use PostgreSQL in this app but I also have a version that runs on MySQL which has only a connection string difference plus the content of the databases directory. Maybe I can try this tomorrow, it is almost 2 am here so I don't want to start something new right now. I just need to learn how to use ab which I take it isn't going to be much to figure out.
[web2py] Re: Apache, Wsgi problem
The problem for me is that this occurs on a webapp used by mechanical turk, and it fails when I have hundreds of mechanical turkers using my app... which only happens when I pay them hundreds of dollars. So it's hard to reproduce right now without hundreds of dollars. I am excited to try using VP's ab benchmark to try to reproduce the error more cheaply. When I can do that, and get a minimal system exhibiting the error, I will let you know. I'm also currently using an older version of web2py. One pre-DAL. On Jan 17, 12:20 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: You should really have db = DAL('postgres://name:password@localhost:5432/db',pool_size=20) The reason is that client-server databases may set a max to number of open connections and it takes time to perform the 3-way handshake to establish a new connection at every http request. Without pooling you may hit the max sooner than you should. This will also make your app faster. Massimo On Jan 17, 1:39 pm, VP vtp2...@gmail.com wrote: Here it is with private information replaced with generic information. if request.env.web2py_runtime_gae: # if running on Google App Engine db = DAL('gae') # connect to Google BigTable session.connect(request, response, db = db) # and store sessions and tickets there else: # else use a normal relational database db = DAL('postgres://name:password@localhost:5432/db') # db = DAL('sqlite://storage.sqlite') On Jan 17, 1:19 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: Are you using connection pooling? Can I see the line db = DAL(...)? On Jan 16, 2:24 pm, VP vtp2...@gmail.com wrote: Kenneth (see post above) had the same problem with MySQL. On Jan 16, 3:55 am, ron_m ron.mco...@gmail.com wrote: Is there any chance to try another server based database such as MySQL. Since SQLite did not exhibit a problem that is a valid data point but it is an embedded database so called directly by the web2py server code instead of using a driver. 
There is a possibility there is a problem with the driver on PostgreSQL if the connections are being reused in a pool. This is very difficult to find without some further logging. I gather the page request you see failing as a user simply never completes. I saw this once while testing the application but the loading is only myself running tests. I use PostgreSQL in this app but I also have a version that runs on MySQL which has only a connection string difference plus the content of the databases directory. Maybe I can try this tomorrow, it is almost 2 am here so I don't want to start something new right now. I just need to learn how to use ab which I take it isn't going to be much to figure out.
[web2py] Re: Apache, Wsgi problem
1.74.5. I will upgrade when I can reproduce the problem locally. On Jan 17, 5:13 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: How old web2py? We have had bugs in the past that may cause your problem. You should try upgrade. Massimo On Jan 17, 6:58 pm, Michael Toomim too...@gmail.com wrote: The problem for me is that this occurs on a webapp used by mechanical turk, and it fails when I have hundreds of mechanical turkers using my app... which only happens when I pay them hundreds of dollars. So it's hard to reproduce right now without hundreds of dollars. I am excited to try using VP's ab benchmark to try to reproduce the error more cheaply. When I can do that, and get a minimal system exhibiting the error, I will let you know. I'm also currently using an older version of web2py. One pre-DAL. On Jan 17, 12:20 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: You should really have db = DAL('postgres://name:password@localhost:5432/db',pool_size=20) The reason is that client-server databases may set a max to number of open connections and it takes time to perform the 3-way handshake to establish a new connection at every http request. Without pooling you may hit the max sooner than you should. This will also make your app faster. Massimo On Jan 17, 1:39 pm, VP vtp2...@gmail.com wrote: Here it is with private information replaced with generic information. if request.env.web2py_runtime_gae: # if running on Google App Engine db = DAL('gae') # connect to Google BigTable session.connect(request, response, db = db) # and store sessions and tickets there else: # else use a normal relational database db = DAL('postgres://name:password@localhost:5432/db') # db = DAL('sqlite://storage.sqlite') On Jan 17, 1:19 pm, Massimo Di Pierro massimo.dipie...@gmail.com wrote: Are you using connection pooling? Can I see the line db = DAL(...)? On Jan 16, 2:24 pm, VP vtp2...@gmail.com wrote: Kenneth (see post above) had the same problem with MySQL. 
On Jan 16, 3:55 am, ron_m ron.mco...@gmail.com wrote: Is there any chance to try another server based database such as MySQL. Since SQLite did not exhibit a problem that is a valid data point but it is an embedded database so called directly by the web2py server code instead of using a driver. There is a possibility there is a problem with the driver on PostgreSQL if the connections are being reused in a pool. This is very difficult to find without some further logging. I gather the page request you see failing as a user simply never completes. I saw this once while testing the application but the loading is only myself running tests. I use PostgreSQL in this app but I also have a version that runs on MySQL which has only a connection string difference plus the content of the databases directory. Maybe I can try this tomorrow, it is almost 2 am here so I don't want to start something new right now. I just need to learn how to use ab which I take it isn't going to be much to figure out.
[web2py] Re: Output of sum(), simplifying the JSON
I find it easiest and cleanest to reformat data structures in python, using list comprehensions. Javascript sucks for loops. So instead of jsonifying the raw database output, fix it first:

export_optimizer_records = [{'FreezeTime': r.panel_1hrs.FreezeTime,
                             'StringID': r.panel_1hrs.StringID,
                             'Po_avg_sum': r._extra['sum(panel_1hrs.Po_avg)']}
                            for r in export_optimizer_records]

Basically, just add this line to your controller. On Jan 14, 5:58 pm, Lorin Rivers lriv...@mosasaur.com wrote: Controller: export_optimizer_records = dbset.select(db.table.FreezeTime, db.table.StringID, db.table.Po_avg.sum(), groupby=db.table.FreezeTime|db.table.StringID).as_list() View: var optimizerdata = {{response.write(json(export_optimizer_records), escape=False)}}; The JSON looks like this:

[{ "panel_1hrs": { "FreezeTime": "2010-12-12 19:00:00", "StringID": "S0001" }, "_extra": { "sum(panel_1hrs.Po_avg)": 519.912549612443 }},
 { "panel_1hrs": { "FreezeTime": "2010-12-12 19:00:00", "StringID": "S0002" }, "_extra": { "sum(panel_1hrs.Po_avg)": 532.390706326218 } }]

What I want is this:

[{ "FreezeTime": "2010-12-12 19:00:00", "StringID": "S0001", "Po_avg_sum": 519.912549612443},
 { "FreezeTime": "2010-12-12 19:00:00", "StringID": "S0002", "Po_avg_sum": 532.390706326218 }]

What's the easiest way to get that? -- Lorin Rivers Mosasaur: Killer Technical Marketing http://www.mosasaur.com mailto:lriv...@mosasaur.com 512/203.3198 (m)
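The same reshaping can be tried outside web2py with plain dicts, which is what .as_list() returns (note that on plain dicts you index with strings rather than the attribute access used on Row objects above); sample values copied from the post:

```python
import json

# rows as .as_list() would return them: nested plain dicts
raw = [
    {'panel_1hrs': {'FreezeTime': '2010-12-12 19:00:00', 'StringID': 'S0001'},
     '_extra': {'sum(panel_1hrs.Po_avg)': 519.912549612443}},
    {'panel_1hrs': {'FreezeTime': '2010-12-12 19:00:00', 'StringID': 'S0002'},
     '_extra': {'sum(panel_1hrs.Po_avg)': 532.390706326218}},
]

# flatten each record into the shape the view wants
flat = [{'FreezeTime': r['panel_1hrs']['FreezeTime'],
         'StringID': r['panel_1hrs']['StringID'],
         'Po_avg_sum': r['_extra']['sum(panel_1hrs.Po_avg)']}
        for r in raw]

print(json.dumps(flat, indent=1))
```

One list comprehension in the controller keeps the view's javascript free of restructuring loops, which is the point being made above.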
[web2py] Re: Apache, Wsgi problem
I'm still having this problem too (previous posts linked below). I would love to find a solution. I'm not sure how to debug. VP: Can you provide instructions for reproducing this bug using ab? I had trouble using ab in the past. I am also on a VPS. Since my last post (linked below), I have tried the following: 1) setting migrate=False 2) compiling the app These did not fix the problem. On Jan 9, 9:30 am, VP vtp2...@gmail.com wrote: It is indeed the case that there was a segmentation fault, reported in the apache error log. Perhaps it's not clear, but this problem occurs under postgres under debian lenny, not sqlite. I am not running web2py as a CGI script. I am using the web2py deployment script (for setting up apache and web2py):

VirtualHost *:80
    WSGIDaemonProcess web2py user=myusername group=myusername \
        display-name=%{GROUP}
    WSGIProcessGroup web2py
    WSGIScriptAlias / /home/myusername/web2py/wsgihandler.py

On Jan 8, 9:59 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: You were possibly using an old version of sqlite which isn't safe to use in a multithreaded configuration. The MPM settings are not going to help in this case as that error could only come about because you are using mod_wsgi daemon mode and so the application is running in a distinct process and not those affected by the MPM or its settings. The only other way you could get that error is that you are actually running web2py as a CGI script. Overall, that specific error message means your daemon mode process that is running web2py crashed. You would likely find that there are segmentation fault messages in the main Apache error log as well at that time. Crashing could be because of sqlite thread problems, but could also be because you are forcing web2py to run in the main interpreter of the daemon processes and at the same time are using a third party C extension module for Python that is not safe for use in sub interpreters. So, ensure sqlite is up to date.
And ensure that you have:

WSGIApplicationGroup %{GLOBAL}

in the configuration to force use of the main interpreter. Graham On Sunday, January 9, 2011 6:44:14 AM UTC+11, VP wrote: We occasionally got an Apache error so the page didn't get displayed. So I decided to stress test using Apache Bench (ab). It seems the site suffered failure up to 50-100 concurrent connections. The Apache error log showed this error: Premature end of script headers: wsgihandler.py After digging around, I found similar discussions and changed apache2.conf like this:

# prefork MPM
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 256
MaxRequestsPerChild 500
ServerLimit 256

Didn't seem to help. A few notes: + It appears when I switched to sqlite instead of postgres, I didn't have the problem. (Sqlite had other problems, such as occasional database locking, which is more serious) + I am on a VPS with 768MB with 1GB burstable. While I'm doing the stress test with Apache Bench (ab), using free on the server revealed memory usage was about 450MB. (Which is a lot, but is still under the limit.) = In summary, memory was available. But we got this wsgi error in Apache with multiple requests. Any idea please? Thanks.
[web2py] Re: Apache, Wsgi problem
Thanks, I just investigated this, but it looks like it did not fix the problem. In 8.4.6 Postgres changed the default wal_sync_method to fdatasync, because the old default open_datasync failed on ext4. I use ext3 (on ubuntu 9.10), but I tried changing this option in my postgres database anyway. I changed it, restarted postgres and apache, but still get the error.
[web2py] How to create indexes on postgres if not exists
I wanted the equivalent of sqlite's create index if not exists on postgresql. Here's a solution for web2py. It is useful whenever you set up a new database, or migrate new tables to an existing database after a code update and want to ensure the right indexes are set up.

def create_indices_on_postgres():
    '''Creates a set of indices if they do not exist'''
    ## Edit this list of table columns to index
    ## The format is [('table', 'column')...]
    indices = [('actions', 'study'), ('actions', 'assid'),
               ('actions', 'hitid'), ('actions', 'time'),
               ('actions', 'workerid'), ('countries', 'code'),
               ('continents', 'code'), ('ips', 'from_ip'),
               ('ips', 'to_ip')]
    for table, column in indices:
        index_exists = db.executesql(
            "select count(*) from pg_class where relname='%s_%s_idx';"
            % (table, column))[0][0] == 1
        if not index_exists:
            db.executesql('create index %s_%s_idx on %s (%s);'
                          % (table, column, table, column))
    db.commit()
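As an aside, PostgreSQL 9.5 and later accept IF NOT EXISTS directly on CREATE INDEX, so on a new enough server the pg_class lookup can be dropped. A sketch of the statement builder, reusing the post's table_column_idx naming convention (the helper name is mine):

```python
def index_sql(table, column):
    # PostgreSQL 9.5+ understands IF NOT EXISTS on CREATE INDEX,
    # making a separate pg_class existence check unnecessary
    return ('create index if not exists %s_%s_idx on %s (%s);'
            % (table, column, table, column))

# e.g. db.executesql(index_sql('actions', 'study')); db.commit()
print(index_sql('actions', 'study'))
# -> create index if not exists actions_study_idx on actions (study);
```

The pg_class approach in the post remains the portable option for servers older than 9.5.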
[web2py] Re: Error in wsgi/apache
Ah, preventing multithreading is a good idea to try too. It wasn't a file descriptor problem either, I had Files used: 1376 out of 75556 On Jul 20, 9:14 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 21, 1:41 pm, Michael Toomim too...@gmail.com wrote: I'm using daemon mode... I didn't realize that the directive won't matter in daemon mode. Yes, I think I probably will run into the problem again when I get more usage. However, I'm still not convinced it's a memory problem, because I had 30mb free on my 740mb machine when I was having the problem, with 0 swap usage. Well, as I explained before, it perhaps is a resource leakage such as file descriptors. You can exhaust kernel file descriptors and still have lots of memory available. I have seen various cases before for different peoples applications on different frameworks where file objects weren't being explicitly closed, or database connection pools not being managed properly, such that the number of open file descriptors went up and up and eventually they ran out. This can cause all sorts of weird errors to manifest when it occurs, including it impacting other applications if is exhausted system wide. For example, in a shell, not being able to execute commands to even debug the problem. I would suggest you become familiar with some of the basic monitoring commands such as 'lsof' or 'ofiles', depending on what your system provides. You can then use these to monitor file descriptor usage by your processes. Also be aware that such problems may only arise when multithreading kicks in and concurrent requests run. In other words, due to code which isn't thread safe. If you don't get the concurrency, you may well see your application run quite happily. Thus, one suggestion is to not use multiple threads for daemon mode processes and instead use something like 'processes=5 threads=1'. This will avoid the potential of it being caused by multithreading issues at least. 
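Graham's suggestion to watch file descriptor usage with lsof can also be approximated from Python via /proc (Linux only; the helper name is mine):

```python
import os

def open_fd_count(pid='self'):
    # number of open file descriptors for a process, read from /proc;
    # roughly what `lsof -p <pid>` would list for that process (Linux only)
    return len(os.listdir('/proc/%s/fd' % pid))

# log this periodically in a long-running process: a count that climbs
# steadily across requests suggests a descriptor leak
print(open_fd_count())
```

Sampling this per request makes a leak visible long before the kernel limit is hit.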
Graham I don't know what I'll do if I this happens again. My code just does simple database lookups and updates, it doesn't create circular references nor store anything in global variables, so if there's a memory leak I worry it's somewhere further up the stack. I don't know any ways to investigate memory consumption to see where it's being used. On Jul 20, 8:23 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 21, 1:03 pm, Michael Toomim too...@gmail.com wrote: THANK YOU ALL SO MUCH for your help! I just learned a LOT. It looks like resource consumption was the problem, because things are doing better on the bigger machine and scaled down code. I've also added the MaxRequestsPerChild directive. Are you using mod_wsgi embedded mode or daemon mode? That directive should not be required if you are using daemon mode of mod_wsgi. It is generally a bad idea to make arbitrary changes without understanding whether they are necessary. Changing to a larger machine without understanding why your application is using lots of memory in the first place is also questionable. All you have done is increased your head room but potentially not solved the original underlying problem. You may well just find that it all just blows up again when you get hit with a larger amount of traffic or a request which generates a lot of data. What are you going to do then, get an even bigger machine? Graham I am soo happy to have this web server working, and very pleased to know what to do when I hit a such a scaling wall again! And flask looks interesting, but I must say I really really like how web2py's execfile puts things into global scope from the controllers and automatically reloads code with each request. On Jul 20, 5:02 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 21, 8:18 am, mdipierro mdipie...@cs.depaul.edu wrote: Can you comment on memory usage? 
I have seen this once: after a while web serving slows; it appeared to be due to a memory leak somewhere (did not experience it with web2py+Rocket but only in web2py+mod_wsgi+apache). I googled it and I found Django was having the same problem on some hosts: Not sure how you can draw a parallel to that as it is a completely different framework and just because another framework, or more specifically one person's code, has issues, doesn't imply there is an issue with the underlying web hosting. These sorts of problems are isolated cases. If there was an issue with memory leakage in the hosting mechanism it would be affecting everyone and there have been no such reports of mod_wsgi itself leaking memory. That said, ensure you read: http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html This describes how Python itself leaks memory. For mod_wsgi 2.X and older
[web2py] Re: Error in wsgi/apache
Great! I want to understand it too! Your breakdown helps me think about how to look into this. I will do some more analysis, but for now: - I'm not using forms - I had migrate=True and the app was not compiled earlier. Now it's compiled and migrate=False and I have a bigger machine; things are running ok, but I changed quite a few variables. Would migrate=True create locking issues? The problems seemed similar to locking issues. - My machine's ram slowly filled up from ~700mb/1500mb to 1400mb/1500mb over 5 hours. But I haven't looked to see which process grew yet, I will investigate. On Jul 21, 1:13 am, mdipierro mdipie...@cs.depaul.edu wrote: I still want to understand why you are having this problem. I see the following possibility: 1) There is a threshold in requests/seconds that depends on memory available and it is different for different frameworks. web2py does more than Flask (which is a microframework by definition) and this threshold may be lower. If this is the case the problem should go away with more ram. 2) There is a bug in web2py or one of its modules that causes a memory leak (did you experience a memory leak?) or a locking issue (for example a process crashes before a locked resource is released). I remember you mentioning having this problem exclusively with actions using SQLFORM under heavy load. Is that correct? This could help us narrow down the problem. Massimo On Jul 20, 7:17 pm, Thadeus Burgess thade...@thadeusb.com wrote: The solution: I switched to Flask. And the problems dissipated completely, without modifying any configuration of the web server. I would not, and will not use web2py for any application that is mission critical. For personal sites, or quick projects that I know won't receive that much attention, web2py is fine. For quickly prototyping something, web2py excels. For stability, reliability, and scalability, use Flask or Django.
The DAL is great though, nothing quite like it, that's why I am working on a Flask-DAL extension, and I am working to re-write parts of the DAL and strip out the web2py cohesion (such as SQLFORM validators). Using the DAL inside of flask works fine, and I do not run into these errors. This means that the DAL is not the cause of these errors, but web2py core code. Most likely pages that have forms are where these errors arise. Web2py core is messy, and it ignores the wsgi specification for the most part. I am sure that these errors arise from the fact that web2py uses execfile in many places over and over again, which is a discouraged practice among the python community, and you see why now. -- Thadeus On Tue, Jul 20, 2010 at 4:17 PM, Michael Toomim too...@gmail.com wrote: Thank you for the clarification. My wsgi.conf has default values, so I have not set maximum-requests. Perhaps there are settings there I should look into? I still have free memory, so perhaps there is not a memory leak issue. I'm also not sure how one would get memory leaks in web2py, since isn't the environment wiped clean with each request? This looks similar to the issue here: http://groups.google.com/group/web2py/browse_thread/thread/49a7ecabf4... Was there any resolution? I use logging by having the following file in models/0_log.py: import logging def get_log(): try: f = open(logging.handlers[0].baseFilename, 'r') c = f.readlines() f.close() return {'log':TABLE(*[TR(str(item)) for item in c])} except: return {} # This model file defines some magic to implement app_wide_log. 
def _init_log(): # Does not work on GAE import os,logging,logging.handlers logger = logging.getLogger(request.application) logger.setLevel(logging.DEBUG) handler = logging.handlers.RotatingFileHandler( os.path.join( # so that it can be served as http://.../yourapp/static/applog.txt request.folder,'static','applog.txt'),'a',1024*1024,1) handler.setLevel(logging.DEBUG) handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(filename)s:%(lineno)d %(funcName)s(): %(message)s')) logger.addHandler(handler) return logger logging = cache.ram('app_wide_log',lambda:_init_log(),time_expire=None) On Jul 20, 2:03 am, mdipierro mdipie...@cs.depaul.edu wrote: Thanks for the clarification. @Michael, do you use the logging module? How? On Jul 20, 4:00 am, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 20, 5:17 pm, mdipierro mdipie...@cs.depaul.edu wrote: The problem with IOError, I can understand. As Graham says, if the client closes the connection before the server responds or if the server times out, the socket is closed and apache logs the IOError. That isn't what I said. If you see that message when using daemon mode, the Apache server process
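The cache.ram trick above is web2py's way of running _init_log only once per process. The same guard can be written as plain Python; here is a minimal standalone sketch (the function name, lock, and arguments are mine for illustration, not from the thread) of initializing a rotating-file logger exactly once so handlers are never registered twice:

```python
import logging
import logging.handlers
import threading

_lock = threading.Lock()

def get_app_logger(appname, logfile):
    """Return a per-app logger, attaching its handler at most once.

    Checking logger.handlers under a lock prevents the duplicated
    log lines that appear when several threads race to register a
    handler for the same logger.
    """
    logger = logging.getLogger(appname)
    with _lock:
        if not logger.handlers:  # only the first caller attaches a handler
            handler = logging.handlers.RotatingFileHandler(
                logfile, 'a', 1024 * 1024, 1)
            handler.setFormatter(logging.Formatter(
                '%(asctime)s %(levelname)s %(filename)s:%(lineno)d '
                '%(funcName)s(): %(message)s'))
            logger.addHandler(handler)
            logger.setLevel(logging.DEBUG)
    return logger
```

Calling get_app_logger twice with the same app name returns the same logger object with a single handler, which is exactly the property the models/0_log.py file is trying to guarantee.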
[web2py] Re: Error in wsgi/apache
Thank you for the clarification. My wsgi.conf has default values, so I have not set maximum-requests. Perhaps there are settings there I should look into? I still have free memory, so perhaps there is not a memory leak issue. I'm also not sure how one would get memory leaks in web2py, since isn't the environment wiped clean with each request? This looks similar to the issue here: http://groups.google.com/group/web2py/browse_thread/thread/49a7ecabf4910bcc/b6fac1806ffebcd1?lnk=gstq=ioerror#b6fac1806ffebcd1 Was there any resolution? I use logging by having the following file in models/0_log.py: import logging def get_log(): try: f = open(logging.handlers[0].baseFilename, 'r') c = f.readlines() f.close() return {'log':TABLE(*[TR(str(item)) for item in c])} except: return {} # This model file defines some magic to implement app_wide_log. def _init_log(): # Does not work on GAE import os,logging,logging.handlers logger = logging.getLogger(request.application) logger.setLevel(logging.DEBUG) handler = logging.handlers.RotatingFileHandler( os.path.join( # so that it can be served as http://.../yourapp/static/applog.txt request.folder,'static','applog.txt'),'a',1024*1024,1) handler.setLevel(logging.DEBUG) handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(filename)s:%(lineno)d %(funcName)s(): %(message)s')) logger.addHandler(handler) return logger logging = cache.ram('app_wide_log',lambda:_init_log(),time_expire=None) On Jul 20, 2:03 am, mdipierro mdipie...@cs.depaul.edu wrote: Thanks for the clarification. @Michael, do you use the logging module? How? On Jul 20, 4:00 am, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 20, 5:17 pm, mdipierro mdipie...@cs.depaul.edu wrote: The problem with IOError, I can understand. As Graham says, if the client closes the connection before the server responds or if the server times out, the socket is closed and apache logs the IOError. That isn't what I said. 
If you see that message when using daemon mode, the Apache server process that is proxying to the daemon process is crashing. This is different to the HTTP client closing the connection. You would only see that message from an HTTP client closing the connection if using embedded mode. I know they are using daemon mode as that is the only situation where they could also see the message about premature end of script headers. What I really do not understand is why some requests are handled by multiple threads. web2py is agnostic to this (unless you use Rocket which you do not). web2py only provides a wsgi application which is executed - per thread - by the web server. It is the web server (in your case apache) that spawns the threads, maps requests to threads, calls the web2py wsgi application for each of them. If this is happening it is a problem with apache or with mod_wsgi. More likely the problem is that they are registering the logging module from multiple places and that is why logging is displayed more than once. They should log the thread ID as well, as that would confirm whether it is actually from the same thread where the logging module handler has been registered multiple times. Multiple registrations of the logging handler could occur if it isn't done in a thread-safe way, i.e., so as to avoid multiple threads doing it at the same time. Graham Can you tell us more about the version of ubuntu, apache and mod_wsgi that you are using? Any additional information will be very useful. Massimo On Jul 19, 9:01 pm, Michael Toomim too...@gmail.com wrote: I'm getting errors like these in my apache error logs: [Mon Jul 19 18:55:20 2010] [error] [client 65.35.93.74] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A7KADKCHTB1IJS3Z5CR16OZM4V... [Mon Jul 19 18:55:20 2010] [error] [client 143.166.226.43] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A9FV5YBGVV54NALMIRILFKHPT1... 
[Mon Jul 19 18:55:50 2010] [error] [client 117.204.99.178] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py
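Graham's distinction between embedded mode and daemon mode determines which tuning directives actually apply. A hypothetical daemon-mode configuration for the setup described in this thread might look like the following; the process group name, process/thread counts, and script path are illustrative assumptions, not values taken from the posts:

```apache
# Run web2py in mod_wsgi daemon mode. maximum-requests recycles the
# daemon process periodically (the daemon-mode counterpart of
# MaxRequestsPerChild), and display-name makes the daemon processes
# easy to spot in `ps` when hunting for memory growth.
WSGIDaemonProcess web2py display-name=%{GROUP} \
    processes=2 threads=15 maximum-requests=1000
WSGIProcessGroup web2py
WSGIScriptAlias / /home/user/web2py/wsgihandler.py

# Prevent the application from accidentally running in embedded mode
# inside the Apache child processes (see Graham's blog post above).
WSGIRestrictEmbedded On
```

With this in place, MaxRequestsPerChild only governs the Apache children doing the proxying, while maximum-requests governs the processes that actually run web2py.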
[web2py] Re: Error in wsgi/apache
Let me also summarize the issues so far. Originally: - I got three types of error messages in apache logs - Logging messages were often duplicated 2, 3, 5 times - I got the IOError ticket a few times - After a while the web serving slowed (some requests took up to a minute) and then quit completely After rebooting: - I get one type of error message in apache logs, in big batches - I get the IOError ticket once or twice - After a while web serving slows (sometimes 150s per request) and stops So I haven't been seeing the duplicate log messages anymore. I upgraded to a bigger machine and am changing my code to remove ajax (will reduce load by 60x by decreasing functionality). I don't know what else to do. On Jul 20, 2:03 am, mdipierro mdipie...@cs.depaul.edu wrote: Thanks for the clarification. @Michael, do you use the logging module? How? On Jul 20, 4:00 am, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 20, 5:17 pm, mdipierro mdipie...@cs.depaul.edu wrote: The problem with IOError, I can understand. As Graham says, if the client closes the connection before the server responds or if the server timesout the socket is closed and apache logs the IOError. That isn't what I said. If you see that message when using daemon mode, the Apache server process that is proxying to the daemon process is crashing. This is different to the HTTP client closing the connection. You would only see that message if HTTP client closed connection if using embedded mode. I know they are using daemon mode as that is the only situation where they could also see the message about premature end of script headers. What I really do not understand is why some requests are handled by multiple threads. web2py is agnostic to this (unless you use Rocket which you do not). web2py only provides a wsgi application which is executed - per thread - by the web server. 
It is the web server (in your case apache) that spans the thread, maps requests to threads, calls the web2py wsgi application for each of them. If this is happening it is a problem with apache or with mod_wsgi. More likely the problem is that they are registering the logging module from multiple places and that is why logging is displayed more than once. They should log the thread ID as well as that would confirm whether actually from the same thread where logging module handler has been registered multiple times. Multiple registrations of logging handler could occur if it isn't done in a thread safe why, ie., so as to avoid multiple threads doing it at the same time. Graham Can you tell us more about the version of ubuntu, apache and mod_wsgi that you are using? Any additional information will be very useful. Massimo On Jul 19, 9:01 pm, Michael Toomim too...@gmail.com wrote: I'm getting errors like these in my apache error logs: [Mon Jul 19 18:55:20 2010] [error] [client 65.35.93.74] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A7KADKCHTB1IJS3Z5CR16OZM4V... [Mon Jul 19 18:55:20 2010] [error] [client 143.166.226.43] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A9FV5YBGVV54NALMIRILFKHPT1... [Mon Jul 19 18:55:50 2010] [error] [client 117.204.99.178] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. 
[Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data My web app gets about 7 requests per second. At first, things work fine. Then after a while it seems like every request gets handled
[web2py] Re: Error in wsgi/apache
THANK YOU ALL SO MUCH for your help! I just learned a LOT. It looks like resource consumption was the problem, because things are doing better on the bigger machine and scaled down code. I've also added the MaxRequestsPerChild directive. I am so happy to have this web server working, and very pleased to know what to do when I hit such a scaling wall again! And flask looks interesting, but I must say I really really like how web2py's execfile puts things into global scope from the controllers and automatically reloads code with each request. On Jul 20, 5:02 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 21, 8:18 am, mdipierro mdipie...@cs.depaul.edu wrote: Can you comment on memory usage? I have seen this once: after a while web serving slows; it appeared to be due to a memory leak somewhere (did not experience it with web2py+Rocket but only in web2py+mod_wsgi+apache). I googled it and I found Django was having the same problem on some hosts: Not sure how you can draw a parallel to that as it is a completely different framework and just because another framework, or more specifically one person's code, has issues, doesn't imply there is an issue with underlying web hosting. These sorts of problems are isolated cases. If there was an issue with memory leakage in the hosting mechanism it would be affecting everyone and there have been no such reports of mod_wsgi itself leaking memory. That said, ensure you read: http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html This describes how Python itself leaks memory. For mod_wsgi 2.X and older, or if you are still loading mod_python into your Apache server, then you can be affected by this, but not if using mod_wsgi 3.X. That post also explains how to completely disable initialisation of Python in Apache server child processes, i.e., embedded, if you aren't using it. http://stackoverflow.com/questions/229/django-memory-usage-going-.. 
I followed the advice from a comment in the last post to limit the number of requests served by each process: Which is actually a useless thing to do if you are using daemon mode, which I understood you were, as the MaxRequestsPerChild directive only affects Apache server child processes, i.e., those used for embedded mode, and not daemon mode processes. If using that directive helped and you were using daemon mode, then you likely have a memory leak in some other Apache module. What you should have ensured you were doing was using the display-name option to WSGIDaemonProcess to name the process. That way in 'ps' you can easily distinguish the mod_wsgi daemon mode processes from the Apache processes and work out which is leaking memory. If it is the daemon processes, it is likely to be a Python web application issue. If the Apache parent process is getting fatter and you perform a lot of Apache restart/reloads, then it could be that you are still using mod_wsgi 2.X or mod_python is loaded at the same time, and you are using a version of Python that has lots of memory leaks on restarts. If your daemon processes are not getting fat and the Apache server child processes are, then you may through incorrect configuration not even be running the Python web application in daemon mode. This is where WSGIRestrictEmbedded as described in my post is good, as it will highlight when the configuration is screwed up.

# prefork MPM
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 256
MaxRequestsPerChild 500
ServerLimit 30

instead of the default:

# prefork MPM
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 256

The problem disappeared. The exact values that fix the problem may depend on the ram available. The other difference with the above is that I think by setting ServerLimit to 30, you have effectively overridden MaxClients down to 30 even though it is set to 256. 
You have thus in part limited the exact problems described in: http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usa... if it so happens you were using embedded mode and not daemon mode. Graham Massimo On Jul 20, 4:30 pm, Michael Toomim too...@gmail.com wrote: Let me also summarize the issues so far. Originally: - I got three types of error messages in apache logs - Logging messages were often duplicated 2, 3, 5 times - I got the IOError ticket a few times - After a while the web serving slowed (some requests took up to a minute) and then quit completely After rebooting: - I get one type of error message in apache logs, in big batches - I get the IOError ticket once or twice - After a while web serving slows (sometimes 150s per request) and stops So I haven't been seeing the duplicate log messages anymore. I upgraded to a bigger machine and am changing my code to remove ajax (will reduce load by 60x by decreasing functionality). I don't know
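The reason ServerLimit matters here is back-of-the-envelope memory arithmetic: with prefork, the worst case is every child busy at once, so resident memory is roughly MaxClients times the per-child footprint. A small sketch of that check (the numbers below are made up for illustration, not measured from this thread):

```python
def max_safe_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Rough upper bound on prefork MaxClients before swapping.

    Worst case all children are busy simultaneously, so keep
    MaxClients * per-child RSS under RAM minus what the OS, the
    database, and other services need.
    """
    return (total_ram_mb - reserved_mb) // per_process_mb

# e.g. a 1500 MB machine, 500 MB reserved, ~30 MB per Apache child:
print(max_safe_clients(1500, 500, 30))  # -> 33, close to ServerLimit 30
```

The default MaxClients of 256 would need several gigabytes under the same assumptions, which is why capping the process count made the load spikes disappear.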
[web2py] Re: Error in wsgi/apache
I'm using daemon mode... I didn't realize that the directive won't matter in daemon mode. Yes, I think I probably will run into the problem again when I get more usage. However, I'm still not convinced it's a memory problem, because I had 30mb free on my 740mb machine when I was having the problem, with 0 swap usage. I don't know what I'll do if this happens again. My code just does simple database lookups and updates, it doesn't create circular references nor store anything in global variables, so if there's a memory leak I worry it's somewhere further up the stack. I don't know any ways to investigate memory consumption to see where it's being used. On Jul 20, 8:23 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 21, 1:03 pm, Michael Toomim too...@gmail.com wrote: THANK YOU ALL SO MUCH for your help! I just learned a LOT. It looks like resource consumption was the problem, because things are doing better on the bigger machine and scaled down code. I've also added the MaxRequestsPerChild directive. Are you using mod_wsgi embedded mode or daemon mode? That directive should not be required if you are using daemon mode of mod_wsgi. It is generally a bad idea to make arbitrary changes without understanding whether they are necessary. Changing to a larger machine without understanding why your application is using lots of memory in the first place is also questionable. All you have done is increased your head room but potentially not solved the original underlying problem. You may well just find that it all just blows up again when you get hit with a larger amount of traffic or a request which generates a lot of data. What are you going to do then, get an even bigger machine? Graham I am so happy to have this web server working, and very pleased to know what to do when I hit such a scaling wall again! 
And flask looks interesting, but I must say I really really like how web2py's execfile puts things into global scope from the controllers and automatically reloads code with each request. On Jul 20, 5:02 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 21, 8:18 am, mdipierro mdipie...@cs.depaul.edu wrote: Can you comment on memory usage? I have see this once: after a while web serving slows it appeared to be due to a memory leak somewhere (did not experience it with web2py+Rocket but only in web2py+mod_wsgi+apache). I googled it and I found Django was having the same problem on some hosts: Not sure how you can draw a parallel to that as it is a completely different framework and just because another framework, or more specifically one persons code, has issues, doesn't imply there is an issue with underlying web hosting. These sorts of problems are isolated cases. If there was an issue with memory leakage in the hosting mechanism it would be affecting everyone and there have been no such reports of mod_wsgi itself leaking memory. That said, ensure you read: http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html This describes how Python itself leaks memory. For mod_wsgi 2.X and older, or if you are still loading mod_python into your Apache server, then you can be affected by this, but not if using mod_wsgi 3.X. That post also explains how to completely disable initialisation of Python in Apache server child processes, ie., embedded, if you aren't using it. http://stackoverflow.com/questions/229/django-memory-usage-going-.. I followed the advice from a comment in the last post to limit the number of requests served by each process: Which is actually a useless thing to do if you are using daemon mode which I understood you were, as MaxRequestsPerChild directive only affects Apache server child process, ie., those use for embedded mode, and not daemon mode processes. 
If using that directive helped and you were using daemon mode, then you likely have a memory leak in some other Apache module. What you should have ensured you were doing was using display-name option to WSGIDaemonProcess to name the process. That way in 'ps' you can easily distinguish the mod_wsgi daemon mode processes from the Apache processes and work out which is leaking memory. If it is the daemon processes, it is likely to be a Python web application issue. If the Apache parent process is getting fatter and you perform a lot of Apache restart/reloads, then it could be that you are still using mod_wsgi 2.X or mod_python is loaded at same time, and you are using a version of Python that has lots of memory leaks on restarts. If your daemon processes are not getting fat and the Apache server child processes are, then you may through incorrect configuration not even be running Python web application in daemon mode. This is where WSGIRestrictEmbedded as described in my post is good
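On "I don't know any ways to investigate memory consumption": besides naming the daemon processes and watching them in ps as Graham suggests, a process can report its own footprint. A minimal sketch using the standard library (Unix-only; the helper name is mine, and logging it per request is one possible way to make a slow leak visible over time):

```python
import resource
import sys

def rss_mb():
    """Peak resident set size of the current process, in MB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on
    macOS, so normalize per platform before returning.
    """
    usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == 'darwin' else 1024
    return usage / divisor

print('peak RSS: %.1f MB' % rss_mb())
```

If the value grows steadily across thousands of requests while traffic stays flat, the leak is in the daemon process; if it stays flat while the machine fills up, something else (another Apache module, another service) is the culprit.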
[web2py] Error in wsgi/apache
I'm getting errors like these in my apache error logs: [Mon Jul 19 18:55:20 2010] [error] [client 65.35.93.74] Premature end of script headers: wsgihandler.py, referer: http://yuno.us/init/hits/hit?assignmentId=1A7KADKCHTB1IJS3Z5CR16OZM4VLSQhitId=1NAV09D0NWNU2X87QR3I6RXXG0ER8NworkerId=A37YC0D90LZF2MturkSubmitTo=https%3A%2F%2Fwww.mturk.com [Mon Jul 19 18:55:20 2010] [error] [client 143.166.226.43] Premature end of script headers: wsgihandler.py, referer: http://yuno.us/init/hits/hit?assignmentId=1A9FV5YBGVV54NALMIRILFKHPT1O3IhitId=1G15BSUI1DBLMZPV54KGZFTE6JM0Z3workerId=A3I5DLZHYT46GSturkSubmitTo=https%3A%2F%2Fwww.mturk.com [Mon Jul 19 18:55:50 2010] [error] [client 117.204.99.178] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. 
[Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data My web app gets about 7 requests per second. At first, things work fine. Then after a while it seems like every request gets handled by MULTIPLE threads, because my logging.debug() statements print multiple copies of each message and it seems my database gets multiple entries. And I get these errors in the apache logs (with LogLevel debug). Any idea what to do? Where to look? I'm on ubuntu.
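The duplicated logging.debug() output described above is the classic symptom of the same logger accumulating handlers, exactly as Graham later suggests. A minimal reproduction of the effect (self-contained, using an in-memory stream rather than web2py's file handler):

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger('dup-demo')
logger.setLevel(logging.DEBUG)

# Registering a handler per request (instead of once per process)
# multiplies every log line by the number of registrations.
for _ in range(3):
    logger.addHandler(logging.StreamHandler(stream))

logger.debug('one message')
print(stream.getvalue().count('one message'))  # -> 3
```

So three registrations yield three copies of every message from a single thread and a single debug call; no multi-threaded request handling is needed to produce the symptom.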
[web2py] Re: Error in wsgi/apache
This message about bucket brigade is also appearing in the apache error log: [Mon Jul 19 19:01:53 2010] [error] [client 183.87.223.111] (9)Bad file descriptor: mod_wsgi (pid=7940): Unable to get bucket brigade for request., referer: http://yuno.us/init/hits/hit?assignmentId=1WL68USPJR0HY1ENS50GN6IJ33ZY32hitId=1TK6NH2ZSBU3RCI3F8FK7JE1YXMG96workerId=AJ8R357DF74FFturkSubmitTo=https%3A%2F%2Fwww.mturk.com On Jul 19, 7:01 pm, Michael Toomim too...@gmail.com wrote: I'm getting errors like these in my apache error logs: [Mon Jul 19 18:55:20 2010] [error] [client 65.35.93.74] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A7KADKCHTB1IJS3Z5CR16OZM4V... [Mon Jul 19 18:55:20 2010] [error] [client 143.166.226.43] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A9FV5YBGVV54NALMIRILFKHPT1... [Mon Jul 19 18:55:50 2010] [error] [client 117.204.99.178] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. 
[Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data My web app gets about 7 requests per second. At first, things work fine. Then after a while it seems like every request gets handled by MULTIPLE threads, because my logging.debug() statements print multiple copies of each message and it seems my database gets multiple entries. And I get these errors in the apache logs (with LogLevel debug). Any idea what to do? Where to look? I'm on ubuntu.
[web2py] Re: Error in wsgi/apache
And after a while apache completely freezes. On Jul 19, 7:05 pm, Michael Toomim too...@gmail.com wrote: This message about bucket brigade is also appearing in the apache error log: [Mon Jul 19 19:01:53 2010] [error] [client 183.87.223.111] (9)Bad file descriptor: mod_wsgi (pid=7940): Unable to get bucket brigade for request., referer:http://yuno.us/init/hits/hit?assignmentId=1WL68USPJR0HY1ENS50GN6IJ33Z... On Jul 19, 7:01 pm, Michael Toomim too...@gmail.com wrote: I'm getting errors like these in my apache error logs: [Mon Jul 19 18:55:20 2010] [error] [client 65.35.93.74] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A7KADKCHTB1IJS3Z5CR16OZM4V... [Mon Jul 19 18:55:20 2010] [error] [client 143.166.226.43] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A9FV5YBGVV54NALMIRILFKHPT1... [Mon Jul 19 18:55:50 2010] [error] [client 117.204.99.178] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. 
[Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data My web app gets about 7 requests per second. At first, things work fine. Then after a while it seems like every request gets handled by MULTIPLE threads, because my logging.debug() statements print multiple copies of each message and it seems my database gets multiple entries. And I get these errors in the apache logs (with LogLevel debug). Any idea what to do? Where to look? I'm on ubuntu.
[web2py] Re: Error in wsgi/apache
Thanks! I tried rebooting the OS. Now my resources seem ok (but I didn't check before the reboot): Files used: 1376 out of 75556 Mem used: 580mb out of 796mb Swap used: 0 CPU: 88-99% idle And I no longer see the "Exception occurred" or IOError messages; however, I DO still see "Premature end of script headers". These errors come in batches: every 10-20 seconds or so I get a continuous block of 10-20 "Premature end of script headers" errors from different clients. These are followed by errors notifying me that clients' ajax requests failed. I also found three of these in my web2py tickets: Traceback (most recent call last): File "gluon/main.py", line 337, in wsgibase parse_get_post_vars(request, environ) File "gluon/main.py", line 222, in parse_get_post_vars request.body = copystream_progress(request) ### stores request body File "gluon/main.py", line 95, in copystream_progress copystream(source, dest, size, chunk_size) File "gluon/fileutils.py", line 301, in copystream data = src.read(size) IOError: request data read error However, I've gotten around 3000 "premature end of script" errors, and only 3 of these IOErrors. Is there a way to identify what is causing the "Premature end of script" errors? On Jul 19, 7:50 pm, Graham Dumpleton graham.dumple...@gmail.com wrote: On Jul 20, 12:01 pm, Michael Toomim too...@gmail.com wrote: I'm getting errors like these in my apache error logs: [Mon Jul 19 18:55:20 2010] [error] [client 65.35.93.74] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A7KADKCHTB1IJS3Z5CR16OZM4V... [Mon Jul 19 18:55:20 2010] [error] [client 143.166.226.43] Premature end of script headers: wsgihandler.py, referer:http://yuno.us/init/hits/hit?assignmentId=1A9FV5YBGVV54NALMIRILFKHPT1... The above is because the daemon process you are running web2py in crashed.
[Mon Jul 19 18:55:50 2010] [error] [client 117.204.99.178] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data In the case of daemon mode being used, this is because the Apache server child process crashed. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] mod_wsgi (pid=7730): Exception occurred processing WSGI script '/home/toomim/ projects/utility/web2py/wsgihandler.py'. [Mon Jul 19 18:55:50 2010] [error] [client 117.201.42.84] IOError: failed to write data My web app gets about 7 requests per second. At first, things work fine. Then after a while it seems like every request gets handled by MULTIPLE threads, because my logging.debug() statements print multiple copies of each message and it seems my database gets multiple entries. And I get these errors in the apache logs (with LogLevel debug). Any idea what to do? Where to look? I'm on ubuntu. Look at your systems resource usage, ie., memory, open files etc. 
The above are symptomatic of your operating system running out of resources and processes not coping too well with that. Graham
[web2py] Best practice for logging w/ wsgi?
Now that I'm on apache, I find that the logging library iceberg wrote no longer works: http://groups.google.com/group/web2py/browse_thread/thread/ae37920ce03ba165/6e5d746f6222f70a I suspect this is because of the stdout/stderr problem with wsgi, but I thought that would only affect print statements... which is the reason for using logging.debug(). But my logging.debug() doesn't work on apache. Is there a way to fix this? How do you guys debug on apache? -- You received this message because you are subscribed to the Google Groups web2py-users group. To post to this group, send email to web...@googlegroups.com. To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/web2py?hl=en.
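A plausible explanation, hedged: older mod_wsgi versions restrict writes to sys.stdout by default, and an unconfigured root logger drops DEBUG-level records anyway, so print-style and bare logging.debug() calls both disappear under Apache. A sketch that sidesteps both problems by attaching an explicit FileHandler on an absolute path (the logger name "myapp" and the path /tmp/myapp.log are made up for illustration; the path must be writable by the Apache worker user):

```python
import logging

# A named logger with its own FileHandler works the same under mod_wsgi
# as under the built-in server, because it never touches stdout/stderr.
logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)
if not logger.handlers:  # guard against duplicate handlers on re-import
    handler = logging.FileHandler("/tmp/myapp.log")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)

logger.debug("controller reached")
```

In a web2py app this configuration would typically live in a model file, with controllers calling logger.debug(...) instead of print.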
[web2py] Re: webserver slow, misreported
Thanks guys. Each time I run a test, though, it costs me money because I'm paying people on mechanical turk. And if it's slow, it gives me a bad reputation. So I don't want to run more slow tests unless we have good request time logging in place and a good hypothesis to test. Wouldn't cron only make it slow every minute or something? On Apr 5, 5:49 am, Timothy Farrell tfarr...@swgen.com wrote: You are right, I was going to add that feature and then forgot about it. Someone reported a PyPI bug over the weekend (it would not affect web2py). I'll see if I can make the logging a bit more flexible and release a 1.1 in the next few days. In the meantime, look into the cron thing. -tim On 4/4/2010 6:44 PM, Michael Toomim wrote: I see, thank you. I want to measure the web server's response time when I deploy this on turk... Unfortunately the rocket log does not report time to serve a request. Do you think it is easy to get that information from rocket? Do you store the start and stop times for each request? I see start times stored in connections, but I'm not sure that's the right object. On Mar 30, 6:09 am, Timothy Farrell tfarr...@swgen.com wrote: I don't think upgrading will help much since Cherrypy was also slow. However, doing so would help cover all your bases. If you want to use the http log from Rocket you can do this. I'm assuming you invoke web2py.py from a bash script or just run it manually. Paste the following code into the top of web2py.py:
    import logging
    log = logging.getLogger('Rocket.Requests')
    log.setLevel(logging.INFO)
    log.addHandler(logging.FileHandler('rocket.log'))
(FileHandler lives in the core logging module, and note the closing parenthesis.) I, like Yarko, do think this has more to do with something else. At one point web2py had a profiler built-in. That could be a good tool for finding slow spots. -tim On 3/29/2010 7:59 PM, Michael Toomim wrote: Yes, this is on linux! Do you recommend upgrading and trying again?
mturk doesn't affect anything, I am just serving webpages that appear in iframes on the mturk website. From our perspective, I'm serving webpages. Do you have a method of logging how much time it takes to serve a page with rocket? Something that I can use instead of httpserver.log? It seems important for me to measure real-world performance, which ab does not do. My server has 768MB ram, and the only thing it does is run this web2py server. I assumed ram was not full, but did not check. I will check next time. On Mar 29, 12:10 pm, Timothy Farrell tfarr...@swgen.com wrote: [snip]
[web2py] Re: webserver slow, misreported
and I'm using postgres not sqlite. On Apr 5, 12:44 pm, Michael Toomim too...@gmail.com wrote: Thanks guys. Each time I run a test, though, it costs me money because I'm paying people on mechanical turk. And if it's slow, it gives me a bad reputation. So I don't want to run more slow tests unless we have good request time logging in place and a good hypothesis to test. Wouldn't cron only make it slow every minute or something? On Apr 5, 5:49 am, Timothy Farrell tfarr...@swgen.com wrote: You are right, I was going to add that feature and then forgot about it. Someone reported a PyPI bug over the weekend (it would not affect web2py). I'll see if I can make the logging a bit more flexible and release a 1.1 in the next few days. In the meantime, look into the cron thing. -tim On 4/4/2010 6:44 PM, Michael Toomim wrote: I see, thank you. I want to measure the web server's response time when I deploy this on turk... Unfortunately the rocket log does not report time to serve a request. Do you think it is easy to get that information from rocket? Do you store the start and stop times for each request? I see start times stored in connections, but I'm not sure that's the right object. On Mar 30, 6:09 am, Timothy Farrelltfarr...@swgen.com wrote: I don't think upgrading will help much since Cherrypy was also slow. However, doing so would help cover all your bases. If you want to use the http log from Rocket you can do this. I'm assuming you invoke web2py.py from a bash script or just run it manually. Paste the following code into the top of web2py.py import logging import logging.handlers log = logging.getLogger('Rocket.Requests') log.setLevel(logging.INFO) log.addHandler(logging.handlers.FileHandler('rocket.log') I, like Yarko, do think this has more to do with something else. At one point web2py had a profiler built-in. That could be a good tool for finding slow spots. -tim On 3/29/2010 7:59 PM, MichaelToomimwrote: Yes, this is on linux! 
Do you recommend upgrading and trying again? mturk doesn't affect anything, I am just serving webpages that appear in iframes on the mturk website. From our perspective, I'm serving webpages. Do you have a method of logging how much time it takes to serve a page with rocket? Something that I can use instead of httpserver.log? It seems important for me to measure real-world performance, which ab does not do. My server has 768MB ram, and the only thing it does is run this web2py server. I assumed ram was not full, but did not check. I will check next time. On Mar 29, 12:10 pm, Timothy Farrell tfarr...@swgen.com wrote: [snip]
[web2py] Re: webserver slow, misreported
I see, thank you. I want to measure the web server's response time when I deploy this on turk... Unfortunately the rocket log does not report time to serve a request. Do you think it is easy to get that information from rocket? Do you store the start and stop times for each request? I see start times stored in connections, but I'm not sure that's the right object. On Mar 30, 6:09 am, Timothy Farrell tfarr...@swgen.com wrote: I don't think upgrading will help much since Cherrypy was also slow. However, doing so would help cover all your bases. If you want to use the http log from Rocket you can do this. I'm assuming you invoke web2py.py from a bash script or just run it manually. Paste the following code into the top of web2py.py import logging import logging.handlers log = logging.getLogger('Rocket.Requests') log.setLevel(logging.INFO) log.addHandler(logging.handlers.FileHandler('rocket.log') I, like Yarko, do think this has more to do with something else. At one point web2py had a profiler built-in. That could be a good tool for finding slow spots. -tim On 3/29/2010 7:59 PM, MichaelToomimwrote: Yes, this is on linux! Do you recommend upgrading and trying again? mturk doesn't affect anything, I am just serving webpages that appear in iframes on the mturk website. From our perspective, I'm serving webpages. Do you have a method of logging how much time it takes to serve a page with rocket? Something that I can use instead of httpserver.log? It seems important for me to measure real-world performance, which ab does not do. My server has 768MB ram, and the only thing it does is run this web2py server. I assumed ram was not full, but did not check. I will check next time. On Mar 29, 12:10 pm, Timothy Farrelltfarr...@swgen.com wrote: snip/ -- You received this message because you are subscribed to the Google Groups web2py-users group. To post to this group, send email to web...@googlegroups.com. 
[web2py] Re: webserver slow, misreported
You are both right that I do not know where the slowness is coming from. My goal is to measure it so that I can narrow in on the problem. So far I know that it is external to web2py because it does not show up in httpserver.log, so my reasoning is to look at rocket which wraps the web2py part. On Apr 4, 4:44 pm, Michael Toomim too...@gmail.com wrote: I see, thank you. I want to measure the web server's response time when I deploy this on turk... Unfortunately the rocket log does not report time to serve a request. Do you think it is easy to get that information from rocket? Do you store the start and stop times for each request? I see start times stored in connections, but I'm not sure that's the right object. On Mar 30, 6:09 am, Timothy Farrell tfarr...@swgen.com wrote: I don't think upgrading will help much since Cherrypy was also slow. However, doing so would help cover all your bases. If you want to use the http log from Rocket you can do this. I'm assuming you invoke web2py.py from a bash script or just run it manually. Paste the following code into the top of web2py.py import logging import logging.handlers log = logging.getLogger('Rocket.Requests') log.setLevel(logging.INFO) log.addHandler(logging.handlers.FileHandler('rocket.log') I, like Yarko, do think this has more to do with something else. At one point web2py had a profiler built-in. That could be a good tool for finding slow spots. -tim On 3/29/2010 7:59 PM, MichaelToomimwrote: Yes, this is on linux! Do you recommend upgrading and trying again? mturk doesn't affect anything, I am just serving webpages that appear in iframes on the mturk website. From our perspective, I'm serving webpages. Do you have a method of logging how much time it takes to serve a page with rocket? Something that I can use instead of httpserver.log? It seems important for me to measure real-world performance, which ab does not do. My server has 768MB ram, and the only thing it does is run this web2py server. 
I assumed ram was not full, but did not check. I will check next time. On Mar 29, 12:10 pm, Timothy Farrell tfarr...@swgen.com wrote: [snip]
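Since httpserver.log measures only time spent inside web2py, the gap has to be measured at the WSGI boundary. One way to do that, sketched below with hypothetical names (this is not a web2py facility), is a thin middleware wrapped around the WSGI application that logs wall-clock time from the moment the server invokes the app until the response body is fully consumed, which approximately includes time spent streaming bytes to a slow client:

```python
import time

def timing_middleware(app, log):
    """Wrap a WSGI app; report wall-clock seconds per request,
    measured until the response iterable is exhausted."""
    def wrapped(environ, start_response):
        start = time.time()
        result = app(environ, start_response)
        try:
            for chunk in result:
                yield chunk
        finally:
            # Runs after the server has pulled the whole body.
            if hasattr(result, "close"):
                result.close()
            log("%s %.3fs" % (environ.get("PATH_INFO", "?"),
                              time.time() - start))
    return wrapped

# Minimal demo app standing in for web2py's wsgibase:
def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

timings = []
app = timing_middleware(demo_app, timings.append)
body = b"".join(app({"PATH_INFO": "/init"}, lambda status, headers: None))
```

In wsgihandler.py one could then wrap the exported application object: application = timing_middleware(application, open('/tmp/timings.log', 'a').write), and compare these numbers against httpserver.log for the same requests.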
[web2py] Re: webserver slow, misreported
I started using apache with mod_wsgi, and now it's fast! So this indicates it's a problem that only occurs when using rocket or cherrypy, but again I'm only measuring it with firebug in my browser. I have 768MB of ram, ~500MB free. I use cron for @reboot only. I only run it on a remote machine (on slicehost VPS). I do not think it is network problems because it is fast in my web browser until I get hit by a bunch of users. Of course I'd like to eliminate the network from my measurements, but for that I need webserver processing- time logs. On Apr 4, 9:08 pm, mdipierro mdipie...@cs.depaul.edu wrote: Some more questions: how much ram? can you check memory usage? A memory leak may cause slowness. are you using cron? when cron starts it may spike memory usage. are you experience the slowness from localhost or from remote machines? On Apr 4, 6:46 pm, Michael Toomim too...@gmail.com wrote: You are both right that I do not know where the slowness is coming from. My goal is to measure it so that I can narrow in on the problem. So far I know that it is external to web2py because it does not show up in httpserver.log, so my reasoning is to look at rocket which wraps the web2py part. On Apr 4, 4:44 pm, Michael Toomim too...@gmail.com wrote: I see, thank you. I want to measure the web server's response time when I deploy this on turk... Unfortunately the rocket log does not report time to serve a request. Do you think it is easy to get that information from rocket? Do you store the start and stop times for each request? I see start times stored in connections, but I'm not sure that's the right object. On Mar 30, 6:09 am, Timothy Farrell tfarr...@swgen.com wrote: I don't think upgrading will help much since Cherrypy was also slow. However, doing so would help cover all your bases. If you want to use the http log from Rocket you can do this. I'm assuming you invoke web2py.py from a bash script or just run it manually. 
Paste the following code into the top of web2py.py import logging import logging.handlers log = logging.getLogger('Rocket.Requests') log.setLevel(logging.INFO) log.addHandler(logging.handlers.FileHandler('rocket.log') I, like Yarko, do think this has more to do with something else. At one point web2py had a profiler built-in. That could be a good tool for finding slow spots. -tim On 3/29/2010 7:59 PM, MichaelToomimwrote: Yes, this is on linux! Do you recommend upgrading and trying again? mturk doesn't affect anything, I am just serving webpages that appear in iframes on the mturk website. From our perspective, I'm serving webpages. Do you have a method of logging how much time it takes to serve a page with rocket? Something that I can use instead of httpserver.log? It seems important for me to measure real-world performance, which ab does not do. My server has 768MB ram, and the only thing it does is run this web2py server. I assumed ram was not full, but did not check. I will check next time. On Mar 29, 12:10 pm, Timothy Farrelltfarr...@swgen.com wrote: snip/ -- You received this message because you are subscribed to the Google Groups web2py-users group. To post to this group, send email to web...@googlegroups.com. To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/web2py?hl=en.
[web2py] Re: webserver slow, misreported
I was having slowness problems with cherrypy too! That's why I switched to rocket. So perhaps it's something common to cherrypy and rocket, or perhaps they are both slow in their own ways? This is using web2py from march 16th, so it's not the latest rocket. Do you think something important changed for concurrency? On Mar 29, 5:56 am, Timothy Farrell tfarr...@swgen.com wrote: Perhaps a simpler set of questions: Did you have this working with Cherrypy beforehand? If so, is Rocket the only thing to have changed? The latest changes to Rocket were committed to the Mercurial web2py repo on March 18th. I'm assuming you've run a checkout since then. -tim On 3/28/2010 4:23 PM, mdipierro wrote: One more thing. You ask But a single process doing complex joins should not slow down all other simple selects and inserts, right? no, except for sqlite. sqlite serializes all requests because locks the db. That could explain the 0.20s if you have lots of queries per request, but not the 54s for the server. On Mar 28, 4:22 pm, mdipierromdipie...@cs.depaul.edu wrote: On Mar 28, 3:46 pm, Michael Toomimtoo...@gmail.com wrote: Any idea why there is a discrepancy between Firebug and httpserver.log? httpserver.log logs the time spend in web2py, not including the time for sending and receiving the http request/response. firebug logs the the total time, including time spend by the web server for communication. I am using postgresql. What would indicate model complexity? I have around 9 tables, but most of the requests just do single-object selects and inserts. No complex joins are in public-facing pages, but myself as an administrator periodically load a page that does big joins. But a single process doing complex joins should not slow down all other simple selects and inserts, right? In your case there are two problems (and I do not know what causes them): 1) web2py is taking 0.20seconds to process a response. That is more than 10 times what it should be. 
2) the communication between the web server and the browser takes very very long time. Is the server on localhost? If not this could be a network issue. On Mar 27, 6:48 am, mdipierromdipie...@cs.depaul.edu wrote: Mind that if you use sqlite there is no concurrency. Still these numbers are very low. Are your models very complex? On 27 Mar, 00:06, Michael Toomimtoo...@gmail.com wrote: I'm using web2py+rocket to serve jobs on mechanical turk. The server probably gets a hit per second or so by workers on mechanical turk using it. When I have no users, everything is fast. But in active use, I notice that web pages often load reay slow in my web browser, but the httpserver.log file reports only small times. For instance, I just loaded a page that httpserver.log said took 0.20 seconds, but Firebug said took 54.21 seconds. That's a big difference. Any idea what's going on? I guess I'll have to try apache? -- You received this message because you are subscribed to the Google Groups web2py-users group. To post to this group, send email to web...@googlegroups.com. To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/web2py?hl=en.
[web2py] Re: webserver slow, misreported
Yes, this is on linux! Do you recommend upgrading and trying again? mturk doesn't affect anything, I am just serving webpages that appear in iframes on the mturk website. From our perspective, I'm serving webpages. Do you have a method of logging how much time it takes to serve a page with rocket? Something that I can use instead of httpserver.log? It seems important for me to measure real-world performance, which ab does not do. My server has 768MB ram, and the only thing it does is run this web2py server. I assumed ram was not full, but did not check. I will check next time. On Mar 29, 12:10 pm, Timothy Farrell tfarr...@swgen.com wrote: On 3/29/2010 1:39 PM, Michael Toomim wrote: I was having slowness problems with cherrypy too! That's why I switched to rocket. So perhaps it's something common to cherrypy and rocket, or perhaps they are both slow in their own ways? This is using web2py from march 16th, so it's not the latest rocket. Do you think something important changed for concurrency? I'm the author of Rocket. I _know_ something important changed on March 18th. =) But that important change only really affects the *nix platform. You haven't said what you're running on. I'm not familiar with MTurk very well. Is it directly connected to your web2py setup? Does it run on Windows/Linux? You said that you were having trouble with Cherrypy too. Is Rocket better or worse than Cherrypy? The one hang-up that I can see here is that if your server is memory-limited then multiple concurrent connections will cause thrashing due to swapping. This situation would be fast with one but slow with multiple connections. We need some more information before we can help you further. But if Cherrypy wasn't cutting it then perhaps you should look into some of the native code solutions such as Apache. This sounds like something wider than just the webserver.
-tim On Mar 29, 5:56 am, Timothy Farrelltfarr...@swgen.com wrote: Perhaps a simpler set of questions: Did you have this working with Cherrypy beforehand? If so, is Rocket the only thing to have changed? The latest changes to Rocket were committed to the Mercurial web2py repo on March 18th. I'm assuming you've run a checkout since then. -tim On 3/28/2010 4:23 PM, mdipierro wrote: One more thing. You ask But a single process doing complex joins should not slow down all other simple selects and inserts, right? no, except for sqlite. sqlite serializes all requests because locks the db. That could explain the 0.20s if you have lots of queries per request, but not the 54s for the server. On Mar 28, 4:22 pm, mdipierromdipie...@cs.depaul.edu wrote: On Mar 28, 3:46 pm, Michael Toomimtoo...@gmail.com wrote: Any idea why there is a discrepancy between Firebug and httpserver.log? httpserver.log logs the time spend in web2py, not including the time for sending and receiving the http request/response. firebug logs the the total time, including time spend by the web server for communication. I am using postgresql. What would indicate model complexity? I have around 9 tables, but most of the requests just do single-object selects and inserts. No complex joins are in public-facing pages, but myself as an administrator periodically load a page that does big joins. But a single process doing complex joins should not slow down all other simple selects and inserts, right? In your case there are two problems (and I do not know what causes them): 1) web2py is taking 0.20seconds to process a response. That is more than 10 times what it should be. 2) the communication between the web server and the browser takes very very long time. Is the server on localhost? If not this could be a network issue. On Mar 27, 6:48 am, mdipierromdipie...@cs.depaul.edu wrote: Mind that if you use sqlite there is no concurrency. Still these numbers are very low. Are your models very complex? 
On 27 Mar, 00:06, Michael Toomimtoo...@gmail.com wrote: I'm using web2py+rocket to serve jobs on mechanical turk. The server probably gets a hit per second or so by workers on mechanical turk using it. When I have no users, everything is fast. But in active use, I notice that web pages often load reay slow in my web browser, but the httpserver.log file reports only small times. For instance, I just loaded a page that httpserver.log said took 0.20 seconds, but Firebug said took 54.21 seconds. That's a big difference. Any idea what's going on? I guess I'll have to try apache? -- You received this message because you are subscribed to the Google Groups web2py-users group. To post to this group, send email to web...@googlegroups.com. To unsubscribe from this group, send email to web2py+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/web2py?hl=en.
[web2py] Re: webserver slow, misreported
Any idea why there is a discrepancy between Firebug and httpserver.log? I am using postgresql. What would indicate model complexity? I have around 9 tables, but most of the requests just do single-object selects and inserts. No complex joins are in public-facing pages, but I, as an administrator, periodically load a page that does big joins. But a single process doing complex joins should not slow down all other simple selects and inserts, right? On Mar 27, 6:48 am, mdipierro mdipie...@cs.depaul.edu wrote: Mind that if you use sqlite there is no concurrency. Still these numbers are very low. Are your models very complex? On 27 Mar, 00:06, Michael Toomim too...@gmail.com wrote: I'm using web2py+rocket to serve jobs on mechanical turk. The server probably gets a hit per second or so by workers on mechanical turk using it. When I have no users, everything is fast. But in active use, I notice that web pages often load really slow in my web browser, but the httpserver.log file reports only small times. For instance, I just loaded a page that httpserver.log said took 0.20 seconds, but Firebug said took 54.21 seconds. That's a big difference. Any idea what's going on? I guess I'll have to try apache?
[web2py] webserver slow, misreported
I'm using web2py+rocket to serve jobs on mechanical turk. The server probably gets a hit per second or so by workers on mechanical turk using it. When I have no users, everything is fast. But in active use, I notice that web pages often load really slow in my web browser, but the httpserver.log file reports only small times. For instance, I just loaded a page that httpserver.log said took 0.20 seconds, but Firebug said took 54.21 seconds. That's a big difference. Any idea what's going on? I guess I'll have to try apache?
[web2py] Re: webserver slow, misreported
Actually it's handling about 5 requests per second, so there is definitely some concurrency. On Mar 26, 10:06 pm, Michael Toomim too...@gmail.com wrote: I'm using web2py+rocket to serve jobs on mechanical turk. The server probably gets a hit per second or so by workers on mechanical turk using it. When I have no users, everything is fast. But in active use, I notice that web pages often load really slow in my web browser, but the httpserver.log file reports only small times. For instance, I just loaded a page that httpserver.log said took 0.20 seconds, but Firebug said took 54.21 seconds. That's a big difference. Any idea what's going on? I guess I'll have to try apache?
[web2py] create index on postgres
I can't create an index on postgresql using executesql. Here's what happens: db.executesql('create index bq_index on bonus_queue (hitid);') ...but the index does not show up in psql. It does not return anything. It seems like the command might be blocking psql, because if I run another index command in psql: =# create index bq_index2 on bonus_queue (reason); ...it will block (not return) until I exit the web2py process that I ran the executesql command from. The postgresql docs say that this command will lock the table from writes until it completes. BUT if I just run the command from psql without trying the db.executesql(), it does what it should, returning immediately with the message CREATE INDEX, and \d bonus_queue shows the index: =# \d bonus_queue ... Indexes: bq_index2 btree (reason) Any idea how to get db.executesql() to work?
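The symptoms described (the statement seems to hang other sessions and never becomes visible, then clears when the web2py process exits) are what one would expect if the DAL connection ran the statement inside a transaction that was never committed: the CREATE INDEX holds its table lock until the transaction ends. A likely fix, hedged as a guess from these symptoms, is to call db.commit() immediately after db.executesql(...). The underlying principle can be sketched with stdlib sqlite3 standing in for Postgres: a second connection cannot see the first connection's uncommitted work.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file, standing in for the
# web2py process and the psql session in the thread above.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE bonus_queue (hitid TEXT)")
writer.commit()

writer.execute("INSERT INTO bonus_queue VALUES ('abc123')")
# The reader sees only committed state, so the row is invisible here:
before = reader.execute("SELECT count(*) FROM bonus_queue").fetchall()[0][0]

writer.commit()  # the analogue of db.commit() after db.executesql(...)
after = reader.execute("SELECT count(*) FROM bonus_queue").fetchall()[0][0]
```

So in a web2py shell or controller the pattern would be db.executesql('create index bq_index on bonus_queue (hitid);') followed by db.commit() (web2py normally commits for you at the end of a successful request, which may be why this only bites in long-running shells).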
[web2py] Helper functions to get one from database
Hi guys, I've found the following functions to be commonly useful in practice. Has anyone else written anything similar? Is there a better idiom here, or better names or interfaces for these?

def get_one(query):
    result = db(query).select()
    assert len(result) <= 1, "GAH! get_one called when there's MORE than one!"
    return result[0] if len(result) == 1 else None

def get_or_make_one(query, table, default_values):
    result = get_one(query)
    if not result:
        table.insert(**default_values)
        result = get_one(query)
    return result
[web2py] Re: no more cherrypy wsgiserver
Did you do anything special to use apachebench on the cherrypy server? When I run ab http://localhost/init/ I get an apr_socket_recv: Connection refused (111) error from apachebench. If I do the same command when running the latest hg tip of web2py (with rocket), the benchmark works. I'm trying to see if rocket will speed up my website. On Mar 12, 9:13 am, Timothy Farrell tfarr...@swgen.com wrote: The benchmarks are in. As you can see from the attached PDF, there is a strong case for Rocket. How I conducted these benchmarks: CPU: Athlon 4050e 2.1 GHz RAM: 3GB OS: Windows 7 Ultimate Python 2.6.1 Rocket 0.3.1 Cherrypy 3.1.2 I used ApacheBench to run the numbers you see. The wsgi app used was as basic as it gets:

def test_app(env, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['']

Apache (and mod_wsgi) were not particularly tuned but were included to show generally where they would end up on the scale. Don't take this as a definitive look at Apache or mod_wsgi's performance (back, you nginx/cherokee/lighty trolls! ;-). This is about a server that can be included in web2py. You'll notice some blank entries in the numbers... here's why: My original intervals were 1,2,5,10,25,50,100,250,500,1000. However, I added 6,7,8 after seeing Cherrypy's performance hit a wall. I wanted to show where that happened. I didn't see it necessary to include Rocket or mod_wsgi in those iterations since they saw no such wall. mod_wsgi does not include numbers for 500 or 1000 concurrent connections because at that point Apache started rejecting connections. This would not be an issue on a properly configured Apache. Once again, the main comparison here is between Rocket and Cherrypy's wsgiserver. If you would like the full spreadsheet, email me privately. -tim On 3/11/2010 10:19 AM, Timothy Farrell wrote: The code has changed since version 0.1. Let me re-run some benchmarks. I'll have time tomorrow.
For those curious, the basic difference is that Rocket handles a few concurrent connections as fast as wsgiserver and many concurrent connections much, much faster. It's also smaller, with cleaner code. -tim On 3/11/2010 10:08 AM, mdipierro wrote: We moved from cherrypy wsgiserver to Rocket, by Timothy Farrell. I included an older version and need to include the latest one. It needs to be tested, but let's wait until I post the latest version before we do so. Why? @Tim, you made a very convincing case to me some time ago. Can you share your benchmark with the rest of the users? Massimo (Attachment: Rocket Benchmarks.pdf)
[web2py] Re: no more cherrypy wsgiserver
I'm so excited! I was about to try moving to rocket myself, because I need the scalability and it is very useful for my app to run without apache. THANKS GUYS! On Mar 11, 8:08 am, mdipierro mdipie...@cs.depaul.edu wrote: We moved from cherrypy wsgiserver to Rocket, by Timothy Farrell. I included an older version and need to include the latest one. It needs to be tested, but let's wait until I post the latest version before we do so. Why? @Tim, you made a very convincing case to me some time ago. Can you share your benchmark with the rest of the users? Massimo
[web2py] How do None and NULL work in the DAL?
How is a database NULL entry represented in Python when using the DAL? How can you query for null rows? How can you set them? Is this the same as None? And if you create a database row without setting a value for a column, that column is set to NULL/None, right? Thank you!
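In the DAL, Python's None does map to SQL NULL in both directions: an unset column comes back as None, and you can query for it with db(db.person.email == None).select(), which the DAL translates into an IS NULL condition. Here is a minimal illustration of the underlying convention using the stdlib sqlite3 module (the table and column names are just for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, email TEXT)")

# Inserting without a value for a column (or passing None) stores SQL NULL.
conn.execute("INSERT INTO person (name) VALUES ('alice')")
conn.execute("INSERT INTO person (name, email) VALUES ('bob', 'b@x.com')")

# NULL comes back to Python as None...
row = conn.execute("SELECT email FROM person WHERE name = 'alice'").fetchone()
print(row[0] is None)  # True

# ...and at the SQL level it must be matched with IS NULL, not = NULL.
null_rows = conn.execute(
    "SELECT name FROM person WHERE email IS NULL").fetchall()
print(null_rows)  # [('alice',)]
```

The == None spelling in DAL queries is the one exception to the usual Python advice of using "is None": here it is operator overloading that builds the IS NULL clause for you.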
[web2py] Limiting to a single process
I'm running a background database processing task, and I only want to have ONE task running so I don't have to worry about race conditions. What is the best way to do this? I run this task from a cron @reboot. It runs this script:

while True:
    time.sleep(10)
    process_queue()

I'm worried that I might accidentally run two web2py processes at the same time, each starting a cron job, and then I'd have two running and they'd mess up my data. So I want this script to check whether another one is currently running, and if so, to give up. I considered making an entry in a database like background_task_running = true, but how can I guarantee that it gets set to false if my web2py crashes? I don't want to have to manually reset that field all the time. I could check whether the process is currently running on the machine with os.system('ps aux | grep background_work.py'), but that isn't cross-platform, and it won't work if the process is running on another machine. Is there a way to check if another of these processes is connected to the postgresql database?
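One crash-safe approach (a sketch of my own, not from the thread; the lock path and function name are hypothetical) is an OS-level file lock: the kernel releases it automatically when the holding process exits or crashes, so there is no stale flag to reset by hand. This is Unix-only, via fcntl:

```python
import fcntl
import os
import sys

def acquire_single_instance_lock(path="/tmp/background_work.lock"):
    """Try to take an exclusive, non-blocking lock on `path`.
    Returns the open file handle on success (keep a reference to it for the
    life of the process!), or None if another instance already holds it."""
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        handle.close()
        return None
    handle.write(str(os.getpid()))  # purely informational, for debugging
    handle.flush()
    return handle

if __name__ == "__main__":
    lock = acquire_single_instance_lock()
    if lock is None:
        sys.exit(0)  # another worker is already running; give up quietly
    # safe to run the loop here: while True: time.sleep(10); process_queue()
```

Since the lock dies with the process, a crashed web2py never leaves a stuck "running" flag. It does not help across machines, though; for that, a PostgreSQL advisory lock (pg_advisory_lock) taken on the shared database would be the analogous single-instance guard.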
[web2py:38429] Migrations broken in hg tip?
I'm using hg tip and can't add a column to a table. It figures out what to do, but doesn't alter the postgresql database. Then any commands that need to use that column fail. It was able to create the table in the first place, but cannot add a column now that the table has been created. How should I debug this? Here's more detail:

== Step 1 ==

db.py:

db.define_table('hits_log',
    db.Field('hitid', 'text'),
    db.Field('creation_time', 'datetime'))

.table file:

{'creation_time': 'TIMESTAMP', 'hitid': 'TEXT', 'id': 'SERIAL PRIMARY KEY'}

postgresql schema:

id            | integer                     | not null default nextval('hits_log_id_seq'::regclass)
hitid         | text                        |
creation_time | timestamp without time zone |
Indexes: hits_log_pkey PRIMARY KEY, btree (id)

== Step 2 ==

Now I add a column xmlbody:

db.define_table('hits_log',
    db.Field('hitid', 'text'),
    db.Field('creation_time', 'datetime'),
    db.Field('xmlbody', 'text'))

I re-run web2py with python web2py.py -a 'foo' -i lovebox.local and load a controller function. But it hasn't updated my postgresql schema (still the same three columns as above), even though the .table file is updated:

{'creation_time': 'TIMESTAMP', 'hitid': 'TEXT', 'id': 'SERIAL PRIMARY KEY', 'xmlbody': 'TEXT'}

and sql.log now has:

ALTER TABLE hits_log ADD xmlbody TEXT;
[web2py:38430] Re: How to do a complex migration?
Here's what I propose: In define_table, at this point:

if migrate:
    sql_locker.acquire()
    try:
        t._create(migrate=migrate, fake_migrate=fake_migrate)
    finally:
        sql_locker.release()

At the end, it can read the database schema and see if it contains all the columns needed in the cPickle .table file. If not, it can output something like: Sorry, database table %s needs column %s to look like %s; this needs a manual fix. Is this right? I'm not sure how to read a database table schema programmatically. Is this different for each db? Hopefully it's at least easy for postgresql. On Nov 18 2009, 11:46 am, mdipierro mdipie...@cs.depaul.edu wrote: Yes, I would. On Nov 18, 2:03 am, toomim too...@gmail.com wrote: Thanks! Would you accept a patch that makes the error messages more obvious in these situations where you need to manually edit the tables? I think ROR and Django's approaches to migrations (fully-specifying deltas vs. maintaining static schemas) are both more burdensome than web2py's system of semi-automatic for most changes (adding columns/tables) but manual for major changes. But the manual-for-major-changes part needs better error messages with suggestions for how to proceed. On Nov 14, 8:04 pm, mdipierro mdipie...@cs.depaul.edu wrote: You are changing a text column (Field) into an integer column. By changing the type, web2py assumes you want to keep the data, and yet it cannot convert text to integer. You have to do the migration in two steps: 1) remove the column (comment it out and run appadmin); 2) add the column again with the new type. In this case web2py understands you do not want to keep the data and will not attempt to do it. On Nov 14, 7:08 pm, toomim too...@gmail.com wrote: I routinely run into migration problems. I suspect this happens when I change a column's datatype, or when I remove a table from db.py and then later make a new table with the same name as the old one.
In these situations, the migrations get messed up and I get stack traces in sql.py with errors like:

ProgrammingError: column study__tmp is of type integer but expression is of type text
LINE 1: UPDATE actions SET study__tmp=study;
HINT: You will need to rewrite or cast the expression.

How can I migrate in these situations? As an aside, I love the automatic migrations when they work, and I don't mind writing sql to fix the underlying database when they don't. Could web2py just tell me what schema it expects, let me fix the database, and then let me say: OK web2py, the database is ready to go!
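Reading the live schema programmatically is indeed backend-specific. On PostgreSQL the natural query is against information_schema.columns (e.g. SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'hits_log'). Here is a self-contained sketch of the same idea using the stdlib sqlite3 module and its PRAGMA table_info (the helper name is mine, and this is only an illustration of the technique, not web2py code):

```python
import sqlite3

def table_columns(conn, table):
    """Map column name -> declared type for one table (SQLite flavor).
    On PostgreSQL the analogous lookup would query information_schema.columns."""
    rows = conn.execute("PRAGMA table_info(%s)" % table)
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk)
    return {row[1]: row[2] for row in rows}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits_log (id INTEGER PRIMARY KEY, hitid TEXT)")

cols = table_columns(conn, "hits_log")
print(cols)  # {'id': 'INTEGER', 'hitid': 'TEXT'}
```

Comparing such a mapping against the columns recorded in the .table file is exactly the check proposed above: define_table could detect a missing or mistyped column after a failed migration and print the suggested manual fix instead of a raw ProgrammingError.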