I think I've isolated the performance issue to the web.py database
module, specifically its handling of char fields in the db result
Storage object.
To narrow it down, I simplified the code and compared the web.py
database module against the same logic using MySQL-python 1.2.2
directly for the database connection.
Here are the times to extract 4000 records from the database and
render 28,003 lines of XML by iterating over the database result set.
Each version produced identical output.
With the web.py database module:
Database time: 0.17220 seconds
Render time: 176.08740 seconds <---- Here's the problem
Total time: 176.25962 seconds
With MySQL-python 1.2.2:
Database time: 0.12736 seconds
Render time: 0.07125 seconds <---- No problem here
Total time: 0.22433 seconds
All the fields in the database result set are integers, floats and
dates except one -- a char(50) field. When I eliminate the char(50)
field, the problem goes away:
With the web.py database module, eliminating the char(50) field:
Database time: 0.15156 seconds
Render time: 0.11541 seconds <---- Nice and fast now!
Total time: 0.26698 seconds
With MySQL-python 1.2.2, eliminating the char(50) field:
Database time: 0.11003 seconds
Render time: 0.04565 seconds
Total time: 0.18564 seconds
So it looks like I can work around the issue by using MySQL-python
directly instead of the new web.py database module. Or I could revert
to version 0.23, which works fine; this appears to be a regression
introduced in 0.31.
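If anyone wants to dig into where the render time actually goes, a
profiler run over just the render phase should show whether the cost
is in the per-field handling or in the string building. Here's a rough
sketch of how that could look -- the render function and the
plain-dict rows are illustrative stand-ins, not my actual handler:

```python
# Sketch only: profile the render phase in isolation.  The rows here
# are plain dicts standing in for the web.py Storage result objects.
import cProfile
import io
import pstats

def render(rows):
    # Same shape as the render loop in my test code below
    parts = []
    for row in rows:
        parts.append(' <product>\n  <id>%s</id>\n </product>\n' % row['id'])
    return '<?xml version="1.0"?>\n<root>\n%s</root>\n' % ''.join(parts)

def profile_render(rows):
    # Run render() under the profiler and return the top-ten report
    profiler = cProfile.Profile()
    profiler.enable()
    render(rows)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats(10)
    return out.getvalue()
```

The report's cumulative-time column should make it obvious which call
is eating the 176 seconds.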
For what it's worth, here's my test code that uses the web.py database
module:
class diag_webpydb:
    def GET(self):
        query = """SELECT d.id, d.item, d.store, d.tdate, i.descr
                   FROM t09_diag d
                   JOIN t07_items i
                     ON d.item = i.item
                   WHERE d.id BETWEEN 170001 AND 174000"""
        result = db.query(query)
        xml = '<?xml version="1.0"?>\n'
        xml += '<root>\n'
        for row in result:
            xml += ' <product>\n'
            xml += '  <id>%s</id>\n' % row.id
            xml += '  <item>%s</item>\n' % row.item
            xml += '  <store>%s</store>\n' % row.store
            xml += '  <date>%s</date>\n' % row.tdate
            # comment out the char(50) field to solve perf issue
            # xml += '  <descr>%s</descr>\n' % row.descr
            xml += ' </product>\n'
        xml += '</root>\n'
        return xml
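As an aside, the repeated xml += in the loop above could be replaced
with the list-append-and-join idiom hhsuper suggested below, though in
these tests the concatenation itself wasn't the bottleneck. A sketch
of that variant of the loop, with the rows faked as plain dicts for
illustration:

```python
def render_products(rows):
    # Build the document as a list of fragments and join once at the
    # end, instead of concatenating onto one ever-growing string.
    parts = ['<?xml version="1.0"?>\n', '<root>\n']
    for row in rows:
        parts.append(' <product>\n')
        parts.append('  <id>%s</id>\n' % row['id'])
        parts.append('  <item>%s</item>\n' % row['item'])
        parts.append(' </product>\n')
    parts.append('</root>\n')
    return ''.join(parts)
```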
And here's the test code using MySQL-python:
class diag_mysqldb:
    def GET(self):
        query = """SELECT d.id, d.item, d.store, d.tdate, i.descr
                   FROM t09_diag d
                   JOIN t07_items i
                     ON d.item = i.item
                   WHERE d.id BETWEEN 170001 AND 174000"""
        cursor = mydb.cursor()
        numrows = cursor.execute(query)
        xml = '<?xml version="1.0"?>\n'
        xml += '<root>\n'
        for i in range(numrows):
            row = cursor.fetchone()
            xml += ' <product>\n'
            xml += '  <id>%s</id>\n' % row[0]
            xml += '  <item>%s</item>\n' % row[1]
            xml += '  <store>%s</store>\n' % row[2]
            xml += '  <date>%s</date>\n' % row[3]
            # comment out the char(50) to compare with web.py
            # xml += '  <descr>%s</descr>\n' % row[4]
            xml += ' </product>\n'
        xml += '</root>\n'
        return xml
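For reference, the Database/Render/Total figures above are simple
wall-clock timings taken around each phase, along these lines (a
sketch; the function passed in is a placeholder for the query or
render step in the code above):

```python
import time

def timed(label, func, *args):
    # Wall-clock one phase of the request and report it
    start = time.time()
    result = func(*args)
    elapsed = time.time() - start
    print('%s time: %.5f seconds' % (label, elapsed))
    return result, elapsed
```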
On Jan 2, 7:26 am, mb <[email protected]> wrote:
> The render function didn't change between version 0.23 and 0.31, but
> the speed went from 0.16659 second to 67.98967 seconds for the same
> action.
>
> But there were changes to the database module, so I'm wondering if
> iterating over the db result object is causing the performance hit.
>
> On Jan 2, 2:35 am, hhsuper <[email protected]> wrote:
>
> > Apparently the render method's frequent string concatenation is
> > causing the problem; it should use a different approach. You can
> > simulate other languages' StringBuilder: append your strings to a
> > list, then use a join to get the output string.
>
> > On Fri, Jan 2, 2009 at 12:52 PM, mb <[email protected]> wrote:
>
> > > It looks like iterating over the db result object is what's causing
> > > the slowdown.
>
> > > With a small result set (20 records), here are the times in 0.31 to do
> > > the db query and to render xml by iterating over the result object:
> > > db: 0.01993 seconds
> > > render: 0.00087 seconds
> > > total: 0.02671 seconds
>
> > > Then with a large result set (17,142 records):
> > > db: 0.09306 seconds
> > > render: 67.98967 seconds
> > > total: 68.08277 seconds
>
> > > On version 0.23 and the large result set (17,142 records):
> > > db: 0.06552 seconds
> > > render: 0.16659 seconds
> > > total: 0.23248 seconds
>
> > > Clearly the render function is the problem. This function takes the
> > > db result object as a parameter, and iterates over it to create the
> > > xml:
>
> > > def render_xml(items):
> > >     xml = '<?xml version="1.0"?>\n'
> > >     xml += '<rootnode>\n'
> > >     for item in items:
> > >         xml += '<itemnumber>%s</itemnumber>\n' % item.id
> > >         xml += '<description>%s</description>\n' % item.descr
> > >         xml += '<quantity>%s</quantity>\n' % item.qty
> > >         ...
> > >     return xml
>
> > > dbresult = db.query(querystring)
> > > return render_xml(dbresult)
>
> > > On Jan 1, 8:04 pm, "Aaron Swartz" <[email protected]> wrote:
> > > > Can you share more details? Code? Profiler output?
>
> > > > On Jan 1, 2009 10:55 PM, "mb" <[email protected]> wrote:
>
> > > > I have a web service that queries a database and returns a fairly
> > > > large xml document based on parameters in the URL.
>
> > > > This morning, with web.py 0.23, it would return a 17,142 line response
> > > > (491k) in 200 ms.
>
> > > > This afternoon I upgraded to 0.31, and it now takes 56 seconds for the
> > > > same action.
>
> > > > For the entire 56 seconds, the cpu is pegged at 100% on the python
> > > > process with just one request, where on 0.23 I could run many
> > > > simultaneous requests and the cpu was rarely over 40%.
>
> > > > Any ideas why my performance has dropped by a factor of 300?
>
> > --
> > Su zhaohui 苏召辉