William,

You make some excellent points.  In my case I am not so concerned about the
consistency of the result set however this should certainly be a conscious
decision when designing your paging strategy.

You may see some advantage of combining your approach with the database
server side implementation of setLimit()/setOffset() for Oracle that I
mention in my "reasonably complete review [that skips some important issues
raised by William ;-)]" message that I posted a few minutes ago.

Cheers,

Scott
-- 
Scott Eade
Backstage Technologies Pty. Ltd.
Web: http://www.backstagetech.com.au


> From: William Webber <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> Date: Tue, 16 Jul 2002 12:35:38 +1000
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED], [EMAIL PROTECTED]
> Subject: Re: Fwd: Pagination strategies
> 
> Re: paginating of query results.
> 
> Another issue is that the data can change while the user is
> paginating through it.  Earlier records being deleted can
> mean that later records are skipped in paginating forward;
> records being added eariler mean that later records are seen
> on two different pages.  So the naive setLimit()/setOffset()
> approach has issues with this.
> 
> As you say, the problem with loading all the data and
> storing it in temporary storage is the memory consumption,
> which could be prohibitive.  Also, a large amount of time is
> spent converting the SQL data to Torque objects, most of
> which will never be used if the user does not go past the
> first couple of pages (which is common enough user
> behaviour).
> 
> What we've done (following a suggestion dug up by Rodney on,
> I think, an Oracle list) is write all the results matching
> the query to a "results cache" table in the database, keyed
> by a search id that the client holds as a reference.  We
> only write those fields that are actually going to be
> displayed in the results.  This is done with a single
> "INSERT INTO Cache_Table SELECT ... FROM Original_Table"
> statement, so nothing comes across the wire into javaland at
> this stage.  This result cache is then paginated through
> using setLimit()/setOffset().
> 
> You've then got the issue of how to expire the cache.  We
> handle this by having a cache manager that is an
> HttpBindingEventListener attach itself to the user's
> session, which deletes the cached search when the user logs
> out.  We then clean up any cached results that are missed as
> part of our regular maintenance scripts.
> 
> Performance-wise, using HEAP tables under MySQL, time taken
> is similar to loading all records into Torque objects
> (except, of course, the cost is moved to the MySQL process).
> The advantage is reduced memory usage.  (The other
> consideration for us was the requirement that the user be
> able to re-sort the order of results at any stage, and we
> preferred to have the database do this for us than have to
> re-sort in Java.)
> 
> The other approach we considered was to retrieve the list of
> IDs of records meeting the search criteria, then as we
> paginate through results only fetch the IDs on the current
> "page".  This reduces the initial start-up hit of either
> loading all records into Java, or into a cache table in the
> database, and it also deals with the issue of new records
> being added during pagination (they won't turn up in the
> result list).  But it didn't solve the issue of records
> being deleted during pagination.  (For other reasons, we
> no longer physically delete records, only mark them as
> "deleted", so this would no longer be an issue for us.)
> 
> In general, I'd suggest:
> 
> 1.) if you're not too worried about results consistency, use
> the naive setLimit()/setOffset() method.  This is the
> easiest, the most efficient (I think), and the least likely
> to bring down your server through out-of-memory problems.
> 
> 2.) if results consistency matters, but you're not too
> worried about running out of memory, load all the results
> into Java and store them in the user's temporary storage.
> 
> 3.) if results consistency matters AND you're worried about
> running out of memory, but processing time is not a huge
> concern, write the results to a cache table.
> 
> 4.) if you're worried about results consistency, memory
> usage, and processing time, and you're willing to put a lot
> of effort in (there are quite a few gotchas), then breath
> deep, and think about retrieving a list of IDs and working
> with them.
> 
> Hope this is of some use.  I would be interested to hear of
> other approaches, too.
> 
> William


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Reply via email to