Re: DBI - SCROLL CURSOR support - outline implementation

Jonathan Leffler Sun, 12 Jan 2003 22:42:55 -0800

I've mercilessly truncated the message to which I'm responding with [...] marking most omissions. It's still a fairly long response, though.

Dean Arnold wrote:
>Jonathan Leffler wrote:

Dean Arnold wrote:

1. Will this support updatable cursors? [...]

[...]Informix does not support updatable scrollable cursors [but
they should exist according to] the [JO]DBC standards.  [...]

I guess if ODBC is the LCD for most platforms, then following
whatever model it provides should be sufficient...I must admit
I'm not a big fan of ODBC, and I hope Mssr. Bunce will *never*
let DBI get as bloated with committee think as ODBC!

Whether it's the LCD or something else, ODBC is the general direction that DBI is headed.

[...] As you may be aware, JDBC uses explicit function methods
prev(), next(), first(), last(), etc. to move the cursor, and
then individual getXXX()'s to retrieve the fields. [...]
SQLFetchScroll is the ODBC function [...] I'm not clear how you
update a row via the cursor without fetching it first [...]
Since a cursor is "positioned", strictly speaking, a fetch is not needed, ie, just move the cursor to some row and apply your update/delete...and in JDBC, they add the "insert row" notion to
allow inserts thru the cursor (a bit clumsy in my opinion, but
consistent with the notion of positioning).

Well, maybe, but the traditional way of positioning a cursor is by fetching a row of data. It starts to sound more and more like a bad case of CODASYL reviviscit - currency indicators and all. However, I've not used JDBC enough to have a worthwhile opinion on its details.

And any SQL which doesn't explicitly reference a cursor doesn't use
the cursor; visibility to a cursor of the changes applied by other
apps updating the datasource is platform and/or app specific,
based on all the isolation issues you mention below. For that
matter, updates made thru a cursor may not even be visible to that cursor, depending on how the dbms/driver/app behaves.

With Informix, you can only update via non-scroll cursors, and you can't refetch a row, so you can't see the update via the cursor (though if the update moves the row from row 123 to row 6923 of the result set, you might see it again in its updated form, though that's dependent on more details than are relevant to this forum).

[...]

4. Where do attributes end and SQL begin? Rather, who wins when
it's SQL vs. DBI attribute vs.DBI method call ? Ie, if I
prepare()/execute() an "SELECT * FROM MyTable FOR UPDATE", does
the $sth implicitly behave as a scrollable if my platform
supports scrollable?

I'm not sure I understand the preamble to the last question.

I was wondering if native SQL already provides most of the
cursor functionality, and DBI's current definition is sufficient
to provide most of the scaffolding...I guess the answer is
"No", since arbitrary positioning in a result set isn't available
in any SQL dialect I'm aware of.

I guess I need to raise your awareness of Informix. It has had scroll cursor support since about 1988. Granted, those are not updatable scroll cursors, but it has support for 'next', 'previous' (or 'prior'), 'first', 'last', 'current', 'relative', 'absolute' fetching. That's why I'm interested in this -- I want to provide access to native Informix functionality via DBI, but Tim (not wholly unreasonably) won't provide such methods unless there's a semi-decent backup implementation that will work for drivers where the database does not provide the support. And the normal technique for dealing with non-updatable scroll cursors is actually simple -- you only select the primary key columns and any necessary ordering columns in the scroll cursor (which can have an ORDER BY clause, therefore), and you select rows of data in two steps: collect the PK information from the scroll cursor, and then collect the rest of the data in a second operation using a cursor FOR UPDATE with primary key access (which should be fast). One advantage of this is that it reduces the amount of data that has to be stored in the scroll cursor. One disadvantage is the double trip to the DBMS to get the full data.

Incidentally, it will be critical with scroll cursors to use $sth->finish() to release the stored data! Sorry Tim, it's the function that won't die.

My concern is API bloat: if SQL can do it within the current DBI,
then defining a new set of functions/attributes is probably not
desirable (tho there seems to be a significant audience for the
counter argument - "SQL bad, Perl good" -, hence DBIx::Recordset
and its kin). My POV derives from DBD::Teradata's support for
simple updatable cursors...but it doesn't support arbitrary positioning, so I guess it doesn't really apply.

I wasn't aware of how ODBC did it when I sent the original mailing. Tim has indicated that what would happen is that a new function, $sth->setattr($$) would be added which would be called between the $sth->prepare() and $sth->execute() to indicate that this is going to be a scrollable cursor. That makes sense. I still don't like the way that ODBC bundles the ESQL/C statements EXECUTE with DECLARE and OPEN that occurs within $sth->execute (SQLExecute), but I have to live with that. So, there would be the new $sth->fetchrow_scroll_*($$) family of functions, where the two parameters at the calling side would be the access mode ('next', 'prior', etc) and the offset (for 'relative' and 'absolute' only). Inside the implementation, there'd be three parameters, of course - the $sth would also appear. And there'd be the new $sth->setattr($$) function.

[...]

One interesting bit is 'how can DBI intercept the relevant
$dbh->get_info calls when the default implementation is in use and
provide an accurate description of the capabilities'.
I'm a little confused by the semantics of get_info in this case; if
I ask get_info() if scrolled cursors are supported, and my DBMS doesn't support them, does DBI say "unsupported", even tho its
going to try to provide an emulation, and so (as far as the app is
concerned) it is supported?

That is a relevant question, to which I don't know Tim's answer. It is a tricky issue. One way to view it is "$dbh->get_info should indicate what the DBMS and its connectivity API can do". That way, the discerning application can deduce that a client-side emulation of scroll cursors will be used, probably the version provided by DBI as a default. This is also by far the easiest to implement, which counts heavily in its favour! The alternative view, which I temporarily suggested (but now retract), says "$dbh->get_info should indicate what the DBMS and its connectivity API and DBI and its DBD::Driver can do". This then gets very tricky - in fact, I don't think it is feasible or sensible, which is why I retract the suggestion.

> In fact, is that even possible? ie, if the target datasource can't
> do arbitrary positioning, can DBI actually safely create an
> emulation (e.g, the resultset is very large, and there's no
> explicit ordering of the resultset, assuming you can't pull the
> entire resultset into the client, how do you reposition to the
> first row once you've consumed 3 bufferful's?) Or is there some
> error to be raised in the case when scrollbable cursors can't be
> safely emulated?

Memory exhaustion is always a risk - client-side especially. You can do what the server would do, and use disk as a backing store (with lots of attendant problems). You can be careful to minimize the amount of data in each row as discussed above. And you can only use scroll cursors when the result set is going to be small -- we could even impose a configurable upper limit on the number of rows that can be stored in a scroll cursor.

from your POD:
"Previously fetched rows are cached (possibly in the server,
possibly in the client). When a row is to be fetched, if it is
already in the cache, it is returned from the cache. [...]"


Can DBI resurrect that first row after I've processed 10,000,000
rows, assuming my datasource doesn't have native support for
scrollable cursors?

If no limits get in the way, then yes. But scroll cursors are really only meaningful when used with relatively small data sets -- if you get to 1000 rows, I would debate whether the users is really going to sit there stepping through that list. OK; yes, I sometimes have to poke through about six hundred feature requests, stepping back and forth, but I hate it. If it turns out that there are more rows in the result set than a thousand, I'd be off trying to refine the query somehow.

> Can you elaborate on how you envision handling that case? Do you
> expect some unique row ID, or unique set of key fields, to be
> available to be able to do that?

If an application really needs to consider 10,000,000 rows on a regular basis (which I think is itself implausible), then I would surely expect the scroll cursor to only store primary key data and to uuse the double fetch outlined above. Unless, possibly, the other data was so small that there was a signinficant penalty in the double fetch. For example, suppose the PK info was 20 bytes and the non-PK info was only 4 bytes; then I'd probably not bother with the double fetch. But a more typical scenario would be that the PK info would be 4-16 bytes and the non-PK info would be 50-1000 bytes, and then the penalty for the double fetch (especially with a client-side emulation of scroll cursors and re-fetching previously fetched data!) would be much smaller, and usually negligible.

Maybe I'm reading too much into this API, and there won't be any emulation provided, and execute_scrolled() will return undef if DBI
doesn't get a positive reading from get_info(SUPPORTS_SCROLLED) or somesuch?

No, you're not reading too much into the specification. The intention is that scroll cursors would be available via DBI regardless of whether or not the DBMS supports the functionality, and regardless of whether the driver implements the support code when the DBMS does support the functionality.

And the intention is to provide users with enough rope to hang themselves if they don't pay attention to how to use scroll cursors sensibly (which is not the same as correctly).

For example, you should not use a scroll cursor for report writing, or most processes where there isn't a human attached to the program. There is an overhead associated with scroll cursors (explicitly visible when DBI is doing an emulation, not so visible when the DBMS is managing it), and if you aren't going to use the random access functionality, then it is a waste of (DBMS or DBI) effort to use scroll cursors. But we can't stop you doing that. Further, there isn't a reliable way to detect that the application is abusing scroll cursors -- it could be that the application allows the user to jump from first to last and all stops in between, but the user driving the application is unaware of (or simply chooses not to use) the facilities and simply does next, next, next, over and over again, even though the application would willingly jump them through the list.
I note that many web-ish applications would not use scroll cursors, though they do provide a method for chunking data into pages (rows 1..20, then 21..40, 41..60, etc) - but I'd expect applications to abuse them for that purpose, simply because there isn't a reliable (portable) alternative mechanism. If the statement handle and positioning information is kept current as the chunks are returned, then this is an OK reason for using scroll cursors, especially if you support going back a page. But if you have to re-establish the result set for each chunk, then it gets expensive, and quadratically expensive as you process rows 981..1000, 1001..1020, etc.

>[...]

--
Jonathan Leffler ([EMAIL PROTECTED], [EMAIL PROTECTED]) #include <disclaimer.h>
Guardian of DBD::Informix 1.04.PC1 -- http://dbi.perl.org/

Re: DBI - SCROLL CURSOR support - outline implementation

Reply via email to