Re: [libdbi-users] libdbi+mysql speed issues

2013-01-08 Thread Markus Hoenicka
Olivier Doucet webmas...@ajeux.com was heard to say:

 Hello everyone,

 I'm following a quite old topic about libdbi speed issues.
 I was able to track the cause of these issues : The major problem is
 how libdbi goes from one row to another.

 RRDTool (the tool that used libdbi and that I was inspecting) is using
 dbi_result_next_row() function (as stated in libdbi documentation
 btw).

 This function moves from one row to another with function
 dbi_result_seek_row(), incrementing currentRow index each time. This
 gives a call to dbd_mysql.c::dbd_goto_row() that uses
 mysql_data_seek() each time...

 That's why for a query result of 34k rows (yes it happens. No it is
 not a problem in the query itself), we have tens of thousands of call
 to this function (which is very low), and this is definitely not
 needed, because as we use fetch_row(), we automatically move from one
 row to another. Seeking is just a useless task (as internal driver
 does not know where we are, and needs to start from row 0 and seek to
 the given row - where we already were).

 I'm absolutely not a libdbi user, and I don't know what could be done
 outside libdbi to not use dbi_result_next_row() and use directly
 RESULT-onn-driver-functions-fetch_row() directly. Is it possible ?

 And/or patching dbi_result.c :
 just check RESULT-currowidx near line 102 before calling doing
 goto_row() function and call it only if we are not on the good row. Am
 I right ?


Hi,

your analysis is pretty much correct. If you look at the comments in  
dbd_mysql.c::dbd_goto_row(), the original author of the mysql driver  
was well aware of the limitations of his implementation. The reason is  
that other database APIs, e.g. PostgreSQL, allow to fetch rows from a  
result set by index, whereas the MySQL API assumes that you step  
through the rows sequentially. The original design of libdbi appears  
to somewhat favor PostgreSQL in this respect.

Anyway, without having thought about the issue in too much detail, one  
possible solution comes to mind. We could modify the driver function  
dbd_goto_row() by passing both the wanted row index rowidx and the  
current row index currowidx(which libdbi keeps track of anyway). This  
would allow drivers to decide whether they have to actually seek the  
position. pgsql doesn't have to anyway, and mysql doesn't have to if  
rowidx = currowidx+1. This API change would not mandate changes to  
existing drivers as they may ignore the additional parameter and keep  
working as before, but it may offer options to speed up queries in  
some drivers.

regards,
Markus



-- 
Markus Hoenicka
http://www.mhoenicka.de
AQ score 38



--
Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS
and more. Get SQL Server skills now (including 2012) with LearnDevNow -
200+ hours of step-by-step video tutorials by Microsoft MVPs and experts.
SALE $99.99 this month only - learn more at:
http://p.sf.net/sfu/learnmore_122512
___
libdbi-users mailing list
libdbi-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libdbi-users


Re: [libdbi-users] libdbi+mysql speed issues

2013-01-08 Thread Olivier Doucet
Hi Markus,

2013/1/8 Markus Hoenicka markus.hoeni...@mhoenicka.de:
 We could modify the driver function
 dbd_goto_row() by passing both the wanted row index rowidx and the
 current row index currowidx(which libdbi keeps track of anyway).

This is one way to fix the problem, I agree. Unfortunately my level in
C is too low to make such huge changes without breaking everything
else :) Anyone willing to create the patch for this ?

Is there a way, outside libdbi, to fix this problem ? For example, go
over dbd_goto_row() and call fetch_row directly() ? Or maybe the
behaviour is different between database engines ?

Olivier

--
Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS
and more. Get SQL Server skills now (including 2012) with LearnDevNow -
200+ hours of step-by-step video tutorials by Microsoft MVPs and experts.
SALE $99.99 this month only - learn more at:
http://p.sf.net/sfu/learnmore_122512
___
libdbi-users mailing list
libdbi-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libdbi-users


Re: [libdbi-users] libdbi+mysql speed issues

2013-01-08 Thread Markus Hoenicka
Olivier Doucet webmas...@ajeux.com was heard to say:

 Hi Markus,

 2013/1/8 Markus Hoenicka markus.hoeni...@mhoenicka.de:
 We could modify the driver function
 dbd_goto_row() by passing both the wanted row index rowidx and the
 current row index currowidx(which libdbi keeps track of anyway).

 This is one way to fix the problem, I agree. Unfortunately my level in
 C is too low to make such huge changes without breaking everything
 else :) Anyone willing to create the patch for this ?


I'll commit these changes asap (may take a day or two). Can you build  
libdbi and libdbi-drivers from the cvs sources? You seem to have a  
good test case to see if these changes help.

 Is there a way, outside libdbi, to fix this problem ? For example, go
 over dbd_goto_row() and call fetch_row directly() ? Or maybe the
 behaviour is different between database engines ?


I'm afraid you can't do that except if you bypass the abstraction  
layer altogether and use libmysqlclient natively.

regards,
Markus


-- 
Markus Hoenicka
http://www.mhoenicka.de
AQ score 38



--
Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS
and more. Get SQL Server skills now (including 2012) with LearnDevNow -
200+ hours of step-by-step video tutorials by Microsoft MVPs and experts.
SALE $99.99 this month only - learn more at:
http://p.sf.net/sfu/learnmore_122512
___
libdbi-users mailing list
libdbi-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libdbi-users


Re: [libdbi-users] libdbi+mysql speed issues

2013-01-08 Thread Mike Rylander
Markus,

Would it be worth the ugliness of a module-global variable to bury the
value of the row index used in the previous call to dbd_mysql.c::dbd_goto_row()
inside the MySQL driver itself, side-stepping the function signature change
for other drivers?  The benefit of avoiding the API change may not be
outweighed by the potential fragility of naively tracking the state
internally, of course, since in practice folks install and upgrade to new
versions of drivers and the libdbi core at the same time, but then again,
it may.  I'm a little nervous about the potential for problems when mixing
direct goto_row() (cursor style) and next_row() calls, but I haven't looked
at the code to see if there's actually an issue there...

--miker


On Tue, Jan 8, 2013 at 4:46 AM, Markus Hoenicka 
markus.hoeni...@mhoenicka.de wrote:

 Olivier Doucet webmas...@ajeux.com was heard to say:

  Hello everyone,
 
  I'm following a quite old topic about libdbi speed issues.
  I was able to track the cause of these issues : The major problem is
  how libdbi goes from one row to another.
 
  RRDTool (the tool that used libdbi and that I was inspecting) is using
  dbi_result_next_row() function (as stated in libdbi documentation
  btw).
 
  This function moves from one row to another with function
  dbi_result_seek_row(), incrementing currentRow index each time. This
  gives a call to dbd_mysql.c::dbd_goto_row() that uses
  mysql_data_seek() each time...
 
  That's why for a query result of 34k rows (yes it happens. No it is
  not a problem in the query itself), we have tens of thousands of call
  to this function (which is very low), and this is definitely not
  needed, because as we use fetch_row(), we automatically move from one
  row to another. Seeking is just a useless task (as internal driver
  does not know where we are, and needs to start from row 0 and seek to
  the given row - where we already were).
 
  I'm absolutely not a libdbi user, and I don't know what could be done
  outside libdbi to not use dbi_result_next_row() and use directly
  RESULT-onn-driver-functions-fetch_row() directly. Is it possible ?
 
  And/or patching dbi_result.c :
  just check RESULT-currowidx near line 102 before calling doing
  goto_row() function and call it only if we are not on the good row. Am
  I right ?
 

 Hi,

 your analysis is pretty much correct. If you look at the comments in
 dbd_mysql.c::dbd_goto_row(), the original author of the mysql driver
 was well aware of the limitations of his implementation. The reason is
 that other database APIs, e.g. PostgreSQL, allow to fetch rows from a
 result set by index, whereas the MySQL API assumes that you step
 through the rows sequentially. The original design of libdbi appears
 to somewhat favor PostgreSQL in this respect.

 Anyway, without having thought about the issue in too much detail, one
 possible solution comes to mind. We could modify the driver function
 dbd_goto_row() by passing both the wanted row index rowidx and the
 current row index currowidx(which libdbi keeps track of anyway). This
 would allow drivers to decide whether they have to actually seek the
 position. pgsql doesn't have to anyway, and mysql doesn't have to if
 rowidx = currowidx+1. This API change would not mandate changes to
 existing drivers as they may ignore the additional parameter and keep
 working as before, but it may offer options to speed up queries in
 some drivers.

 regards,
 Markus



 --
 Markus Hoenicka
 http://www.mhoenicka.de
 AQ score 38




 --
 Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS
 and more. Get SQL Server skills now (including 2012) with LearnDevNow -
 200+ hours of step-by-step video tutorials by Microsoft MVPs and experts.
 SALE $99.99 this month only - learn more at:
 http://p.sf.net/sfu/learnmore_122512
 ___
 libdbi-users mailing list
 libdbi-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/libdbi-users




-- 
Mike Rylander
 | Director of Research and Development
 | Equinox Software, Inc. / Your Library's Guide to Open Source
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@esilibrary.com
 | web:  http://www.esilibrary.com
--
Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS
and more. Get SQL Server skills now (including 2012) with LearnDevNow -
200+ hours of step-by-step video tutorials by Microsoft MVPs and experts.
SALE $99.99 this month only - learn more at:
http://p.sf.net/sfu/learnmore_122512___
libdbi-users mailing list
libdbi-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libdbi-users


Re: [libdbi-users] libdbi+mysql speed issues

2013-01-08 Thread Markus Hoenicka

Mike Rylander mrylan...@gmail.com was heard to say:

 Markus,

 Would it be worth the ugliness of a module-global variable to bury the
 value of the row index used in the previous call to  
 dbd_mysql.c::dbd_goto_row()
 inside the MySQL driver itself, side-stepping the function signature change
 for other drivers?  The benefit of avoiding the API change may not be
 outweighed by the potential fragility of naively tracking the state
 internally, of course, since in practice folks install and upgrade to new
 versions of drivers and the libdbi core at the same time, but then again,
 it may.  I'm a little nervous about the potential for problems when mixing
 direct goto_row() (cursor style) and next_row() calls, but I haven't looked
 at the code to see if there's actually an issue there...


Hi,

I'll have to read the code again a little more thoroughly, but to the  
best of my knowledge libdbi emulates MySQL's approach to retrieving  
rows from result sets. In order to walk through the rows of e.g. a  
PostgreSQL result set you have to retrieve the rows by index  
sequentially, so libdbi has to maintain an internal pointer anyway. We  
do not have to add one, so there are no extra changes. Also, libdbi  
internally already mixes the cursor style and next_row style calls,  
because we have to cater for database engines which use either of  
these methods without exposing these differences to the libdbi user.

As for the API change, we have extensive driver API changes between  
0.8.x and the upcoming(TM) 0.9 release anyway, think of the recent  
addition of the transaction stuff. You won't be able to keep your  
0.8.x drivers once you switch to libdbi 0.9. You'll probably notice  
problems only if you build from cvs regularly (and only if you update  
one but not the other), but I expect those users to know what they're  
doing.

regards,
Markus

-- 
Markus Hoenicka
http://www.mhoenicka.de
AQ score 38



--
Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS
and more. Get SQL Server skills now (including 2012) with LearnDevNow -
200+ hours of step-by-step video tutorials by Microsoft MVPs and experts.
SALE $99.99 this month only - learn more at:
http://p.sf.net/sfu/learnmore_122512
___
libdbi-users mailing list
libdbi-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libdbi-users


Re: [libdbi-users] libdbi+mysql speed issues

2013-01-08 Thread markus . hoenicka
Markus Hoenicka writes:
  Hi,
  
  I'll have to read the code again a little more thoroughly, but to the  
  best of my knowledge libdbi emulates MySQL's approach to retrieving  
  rows from result sets. In order to walk through the rows of e.g. a  
  PostgreSQL result set you have to retrieve the rows by index  
  sequentially, so libdbi has to maintain an internal pointer anyway. We  
  do not have to add one, so there are no extra changes. Also, libdbi  
  internally already mixes the cursor style and next_row style calls,  
  because we have to cater for database engines which use either of  
  these methods without exposing these differences to the libdbi user.
  
  As for the API change, we have extensive driver API changes between  
  0.8.x and the upcoming(TM) 0.9 release anyway, think of the recent  
  addition of the transaction stuff. You won't be able to keep your  
  0.8.x drivers once you switch to libdbi 0.9. You'll probably notice  
  problems only if you build from cvs regularly (and only if you update  
  one but not the other), but I expect those users to know what they're  
  doing.
  

Hi,

I've implemented the suggested changes and checked in the updated
files. If you want to speed test the code, please check out, build,
and install the current cvs revisions of both libdbi and
libdbi-drivers. I've updated the mysql, pgsql, and sqlite3 drivers at
this time, but making the remaining drivers compile is trivial. I've
added the suggested check to speed up sequential row fetching in
mysql. I do see a small but significant decrease in the time required
to run gmake check (11.78 vs. 12.51 s), although the test code does
not test retrieving boatloads of rows specifically. I'd appreciate if
someone with a nice testcase (Olivier?) could give the changes a try
and report some numbers.

regards,
Markus

-- 
Markus Hoenicka
http://www.mhoenicka.de
AQ score 38

--
Master Java SE, Java EE, Eclipse, Spring, Hibernate, JavaScript, jQuery
and much more. Keep your Java skills current with LearnJavaNow -
200+ hours of step-by-step video tutorials by Java experts.
SALE $49.99 this month only -- learn more at:
http://p.sf.net/sfu/learnmore_122612 
___
libdbi-users mailing list
libdbi-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libdbi-users


Re: [libdbi-users] libdbi+mysql speed issues

2011-07-09 Thread Adam Jacob Muller
Hi Markus,

I definitely can, anecdotal evidence suggests to me that this issue does not 
occur on postgres (I use rrdtool+postgres similarly and havne't seen the 
issue), but I'll make some actual tests and report back.

-Adam


On Jul 9, 2011, at 8:39 AM, markus.hoeni...@mhoenicka.de wrote:

 Adam Jacob Muller writes:
 Hi,
 I'm developing an application that utilizes rrdtool (and the lidbdi 
 interface into rrdtool) heavily and having some performance issues that seem 
 to center around libdbi,
 
 
 Hi Adam,
 
 I can't see any obvious issues with your C code, so I assume the speed
 differences really stem from libdbi, one way or another. Unfortunately
 I currently lack the time for any deeper analysis, but I'm somewhat
 surprised about your results. libdbi has originally been developed as
 a wrapper for MySQL and PostgreSQL and should work fairly well with
 these database engines (other engines are a *lot* harder to fit). The
 only overhead that the mysql driver contains is the column type
 analysis and data conversion stuff. Some other engines have to buffer
 the data and thus create way more overhead.
 
 Did you have a chance to verify your results with any of the other
 database drivers?
 
 regards,
 Markus
 
 -- 
 Markus Hoenicka
 http://www.mhoenicka.de
 AQ score 38
 
 --
 All of the data generated in your IT infrastructure is seriously valuable.
 Why? It contains a definitive record of application performance, security 
 threats, fraudulent activity, and more. Splunk takes this data and makes 
 sense of it. IT sense. And common sense.
 http://p.sf.net/sfu/splunk-d2d-c2
 ___
 libdbi-users mailing list
 libdbi-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/libdbi-users


--
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security 
threats, fraudulent activity, and more. Splunk takes this data and makes 
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
___
libdbi-users mailing list
libdbi-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libdbi-users