Re: [libdbi-users] libdbi+mysql speed issues
Markus Hoenicka writes: > Hi, > > I'll have to read the code again a little more thoroughly, but to the > best of my knowledge libdbi emulates MySQL's approach to retrieving > rows from result sets. In order to walk through the rows of e.g. a > PostgreSQL result set you have to retrieve the rows by index > sequentially, so libdbi has to maintain an internal pointer anyway. We > do not have to add one, so there are no extra changes. Also, libdbi > internally already mixes the cursor style and next_row style calls, > because we have to cater for database engines which use either of > these methods without exposing these differences to the libdbi user. > > As for the API change, we have extensive driver API changes between > 0.8.x and the upcoming(TM) 0.9 release anyway, think of the recent > addition of the transaction stuff. You won't be able to keep your > 0.8.x drivers once you switch to libdbi 0.9. You'll probably notice > problems only if you build from cvs regularly (and only if you update > one but not the other), but I expect those users to know what they're > doing. > Hi, I've implemented the suggested changes and checked in the updated files. If you want to speed test the code, please check out, build, and install the current cvs revisions of both libdbi and libdbi-drivers. I've updated the mysql, pgsql, and sqlite3 drivers at this time, but making the remaining drivers compile is trivial. I've added the suggested check to speed up sequential row fetching in mysql. I do see a small but significant decrease in the time required to run gmake check (11.78 vs. 12.51 s), although the test code does not test retrieving boatloads of rows specifically. I'd appreciate if someone with a nice testcase (Olivier?) could give the changes a try and report some numbers. regards, Markus -- Markus Hoenicka http://www.mhoenicka.de AQ score 38 -- Master Java SE, Java EE, Eclipse, Spring, Hibernate, JavaScript, jQuery and much more. Keep your Java skills current with LearnJavaNow - 200+ hours of step-by-step video tutorials by Java experts. SALE $49.99 this month only -- learn more at: http://p.sf.net/sfu/learnmore_122612 ___ libdbi-users mailing list libdbi-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libdbi-users
Re: [libdbi-users] libdbi+mysql speed issues
Mike Rylander was heard to say: > Markus, > > Would it be worth the ugliness of a module-global variable to bury the > value of the row index used in the previous call to > dbd_mysql.c::dbd_goto_row() > inside the MySQL driver itself, side-stepping the function signature change > for other drivers? The benefit of avoiding the API change may not be > outweighed by the potential fragility of naively tracking the state > internally, of course, since in practice folks install and upgrade to new > versions of drivers and the libdbi core at the same time, but then again, > it may. I'm a little nervous about the potential for problems when mixing > direct goto_row() (cursor style) and next_row() calls, but I haven't looked > at the code to see if there's actually an issue there... > Hi, I'll have to read the code again a little more thoroughly, but to the best of my knowledge libdbi emulates MySQL's approach to retrieving rows from result sets. In order to walk through the rows of e.g. a PostgreSQL result set you have to retrieve the rows by index sequentially, so libdbi has to maintain an internal pointer anyway. We do not have to add one, so there are no extra changes. Also, libdbi internally already mixes the cursor style and next_row style calls, because we have to cater for database engines which use either of these methods without exposing these differences to the libdbi user. As for the API change, we have extensive driver API changes between 0.8.x and the upcoming(TM) 0.9 release anyway, think of the recent addition of the transaction stuff. You won't be able to keep your 0.8.x drivers once you switch to libdbi 0.9. You'll probably notice problems only if you build from cvs regularly (and only if you update one but not the other), but I expect those users to know what they're doing. regards, Markus -- Markus Hoenicka http://www.mhoenicka.de AQ score 38 -- Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS and more. Get SQL Server skills now (including 2012) with LearnDevNow - 200+ hours of step-by-step video tutorials by Microsoft MVPs and experts. SALE $99.99 this month only - learn more at: http://p.sf.net/sfu/learnmore_122512 ___ libdbi-users mailing list libdbi-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libdbi-users
Re: [libdbi-users] libdbi+mysql speed issues
Markus, Would it be worth the ugliness of a module-global variable to bury the value of the row index used in the previous call to dbd_mysql.c::dbd_goto_row() inside the MySQL driver itself, side-stepping the function signature change for other drivers? The benefit of avoiding the API change may not be outweighed by the potential fragility of naively tracking the state internally, of course, since in practice folks install and upgrade to new versions of drivers and the libdbi core at the same time, but then again, it may. I'm a little nervous about the potential for problems when mixing direct goto_row() (cursor style) and next_row() calls, but I haven't looked at the code to see if there's actually an issue there... --miker On Tue, Jan 8, 2013 at 4:46 AM, Markus Hoenicka < markus.hoeni...@mhoenicka.de> wrote: > Olivier Doucet was heard to say: > > > Hello everyone, > > > > I'm following a quite old topic about libdbi speed issues. > > I was able to track the cause of these issues : The major problem is > > how libdbi goes from one row to another. > > > > RRDTool (the tool that used libdbi and that I was inspecting) is using > > dbi_result_next_row() function (as stated in libdbi documentation > > btw). > > > > This function moves from one row to another with function > > dbi_result_seek_row(), incrementing currentRow index each time. This > > gives a call to dbd_mysql.c::dbd_goto_row() that uses > > mysql_data_seek() each time... > > > > That's why for a query result of 34k rows (yes it happens. No it is > > not a problem in the query itself), we have tens of thousands of call > > to this function (which is very low), and this is definitely not > > needed, because as we use fetch_row(), we automatically move from one > > row to another. Seeking is just a useless task (as internal driver > > does not know where we are, and needs to start from row 0 and seek to > > the given row - where we already were). > > > > I'm absolutely not a libdbi user, and I don't know what could be done > > outside libdbi to not use dbi_result_next_row() and use directly > > RESULT->onn->driver->functions->fetch_row() directly. Is it possible ? > > > > And/or patching dbi_result.c : > > just check RESULT->currowidx near line 102 before calling doing > > goto_row() function and call it only if we are not on the good row. Am > > I right ? > > > > Hi, > > your analysis is pretty much correct. If you look at the comments in > dbd_mysql.c::dbd_goto_row(), the original author of the mysql driver > was well aware of the limitations of his implementation. The reason is > that other database APIs, e.g. PostgreSQL, allow to fetch rows from a > result set by index, whereas the MySQL API assumes that you step > through the rows sequentially. The original design of libdbi appears > to somewhat favor PostgreSQL in this respect. > > Anyway, without having thought about the issue in too much detail, one > possible solution comes to mind. We could modify the driver function > dbd_goto_row() by passing both the wanted row index rowidx and the > current row index currowidx(which libdbi keeps track of anyway). This > would allow drivers to decide whether they have to actually seek the > position. pgsql doesn't have to anyway, and mysql doesn't have to if > rowidx = currowidx+1. This API change would not mandate changes to > existing drivers as they may ignore the additional parameter and keep > working as before, but it may offer options to speed up queries in > some drivers. > > regards, > Markus > > > > -- > Markus Hoenicka > http://www.mhoenicka.de > AQ score 38 > > > > > -- > Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS > and more. Get SQL Server skills now (including 2012) with LearnDevNow - > 200+ hours of step-by-step video tutorials by Microsoft MVPs and experts. > SALE $99.99 this month only - learn more at: > http://p.sf.net/sfu/learnmore_122512 > ___ > libdbi-users mailing list > libdbi-users@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/libdbi-users > -- Mike Rylander | Director of Research and Development | Equinox Software, Inc. / Your Library's Guide to Open Source | phone: 1-877-OPEN-ILS (673-6457) | email: mi...@esilibrary.com | web: http://www.esilibrary.com -- Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS and more. Get SQL Server skills now (including 2012) with LearnDevNow - 200+ hours of step-by-step video tutorials by Microsoft MVPs and experts. SALE $99.99 this month only - learn more at: http://p.sf.net/sfu/learnmore_122512___ libdbi-users mailing list libdbi-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libdbi-users
Re: [libdbi-users] libdbi+mysql speed issues
Olivier Doucet was heard to say: > Hi Markus, > > 2013/1/8 Markus Hoenicka : >> We could modify the driver function >> dbd_goto_row() by passing both the wanted row index rowidx and the >> current row index currowidx(which libdbi keeps track of anyway). > > This is one way to fix the problem, I agree. Unfortunately my level in > C is too low to make such huge changes without breaking everything > else :) Anyone willing to create the patch for this ? > I'll commit these changes asap (may take a day or two). Can you build libdbi and libdbi-drivers from the cvs sources? You seem to have a good test case to see if these changes help. > Is there a way, outside libdbi, to fix this problem ? For example, go > over dbd_goto_row() and call fetch_row directly() ? Or maybe the > behaviour is different between database engines ? > I'm afraid you can't do that except if you bypass the abstraction layer altogether and use libmysqlclient natively. regards, Markus -- Markus Hoenicka http://www.mhoenicka.de AQ score 38 -- Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS and more. Get SQL Server skills now (including 2012) with LearnDevNow - 200+ hours of step-by-step video tutorials by Microsoft MVPs and experts. SALE $99.99 this month only - learn more at: http://p.sf.net/sfu/learnmore_122512 ___ libdbi-users mailing list libdbi-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libdbi-users
Re: [libdbi-users] libdbi+mysql speed issues
Hi Markus, 2013/1/8 Markus Hoenicka : > We could modify the driver function > dbd_goto_row() by passing both the wanted row index rowidx and the > current row index currowidx(which libdbi keeps track of anyway). This is one way to fix the problem, I agree. Unfortunately my level in C is too low to make such huge changes without breaking everything else :) Anyone willing to create the patch for this ? Is there a way, outside libdbi, to fix this problem ? For example, go over dbd_goto_row() and call fetch_row directly() ? Or maybe the behaviour is different between database engines ? Olivier -- Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS and more. Get SQL Server skills now (including 2012) with LearnDevNow - 200+ hours of step-by-step video tutorials by Microsoft MVPs and experts. SALE $99.99 this month only - learn more at: http://p.sf.net/sfu/learnmore_122512 ___ libdbi-users mailing list libdbi-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libdbi-users
Re: [libdbi-users] libdbi+mysql speed issues
Olivier Doucet was heard to say: > Hello everyone, > > I'm following a quite old topic about libdbi speed issues. > I was able to track the cause of these issues : The major problem is > how libdbi goes from one row to another. > > RRDTool (the tool that used libdbi and that I was inspecting) is using > dbi_result_next_row() function (as stated in libdbi documentation > btw). > > This function moves from one row to another with function > dbi_result_seek_row(), incrementing currentRow index each time. This > gives a call to dbd_mysql.c::dbd_goto_row() that uses > mysql_data_seek() each time... > > That's why for a query result of 34k rows (yes it happens. No it is > not a problem in the query itself), we have tens of thousands of call > to this function (which is very low), and this is definitely not > needed, because as we use fetch_row(), we automatically move from one > row to another. Seeking is just a useless task (as internal driver > does not know where we are, and needs to start from row 0 and seek to > the given row - where we already were). > > I'm absolutely not a libdbi user, and I don't know what could be done > outside libdbi to not use dbi_result_next_row() and use directly > RESULT->onn->driver->functions->fetch_row() directly. Is it possible ? > > And/or patching dbi_result.c : > just check RESULT->currowidx near line 102 before calling doing > goto_row() function and call it only if we are not on the good row. Am > I right ? > Hi, your analysis is pretty much correct. If you look at the comments in dbd_mysql.c::dbd_goto_row(), the original author of the mysql driver was well aware of the limitations of his implementation. The reason is that other database APIs, e.g. PostgreSQL, allow to fetch rows from a result set by index, whereas the MySQL API assumes that you step through the rows sequentially. The original design of libdbi appears to somewhat favor PostgreSQL in this respect. Anyway, without having thought about the issue in too much detail, one possible solution comes to mind. We could modify the driver function dbd_goto_row() by passing both the wanted row index rowidx and the current row index currowidx(which libdbi keeps track of anyway). This would allow drivers to decide whether they have to actually seek the position. pgsql doesn't have to anyway, and mysql doesn't have to if rowidx = currowidx+1. This API change would not mandate changes to existing drivers as they may ignore the additional parameter and keep working as before, but it may offer options to speed up queries in some drivers. regards, Markus -- Markus Hoenicka http://www.mhoenicka.de AQ score 38 -- Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS and more. Get SQL Server skills now (including 2012) with LearnDevNow - 200+ hours of step-by-step video tutorials by Microsoft MVPs and experts. SALE $99.99 this month only - learn more at: http://p.sf.net/sfu/learnmore_122512 ___ libdbi-users mailing list libdbi-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/libdbi-users