Jesse and I were ranting about giving the GeoTools DataStore API the ability 
to be asked for a page of data/features, rather than the whole dataset.
That would immensely improve (and actually make possible) some common 
scenarios like presenting a result set in tabular form.

On the client side (uDig in this case, but it could be JSF, Swing, whatever), 
it is easy to implement, for example, a lazy List, as long as the underlying 
data API allows for pagination.

Doing so in GeoTools would actually be easy!

No API change would be needed beyond adding two fields to Query:
fromIndex and pageSize. OrderBy is already present in the Filter spec.
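
As a rough illustration of the idea (not actual GeoTools code), the two extra 
fields could travel in a parameter object like the one below. PagedQuery, 
fetchPage and the plain String ids are hypothetical stand-ins invented for 
this sketch; the real Query class and the Filter sort-by are not used here. 
When no ordering is given, the sketch falls back to ordering by id so that 
pages stay stable between calls.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PagedQuerySketch {

    /** Hypothetical stand-in for Query carrying the two extra fields. */
    public static final class PagedQuery {
        public final int fromIndex;
        public final int pageSize;
        public final Comparator<String> orderBy; // null means "not specified"

        public PagedQuery(int fromIndex, int pageSize, Comparator<String> orderBy) {
            if (fromIndex < 0 || pageSize <= 0) {
                throw new IllegalArgumentException("fromIndex must be >= 0 and pageSize > 0");
            }
            this.fromIndex = fromIndex;
            this.pageSize = pageSize;
            this.orderBy = orderBy;
        }
    }

    /**
     * Returns one page of ids from an in-memory list (a stand-in for a data store).
     * Without an explicit ordering the ids are sorted in their natural order,
     * so the same index always maps to the same item.
     */
    public static List<String> fetchPage(List<String> allIds, PagedQuery query) {
        Comparator<String> order =
                query.orderBy != null ? query.orderBy : Comparator.<String>naturalOrder();
        List<String> sorted = new ArrayList<>(allIds);
        sorted.sort(order);
        // Clamp both ends so a page past the end is simply empty.
        int from = Math.min(query.fromIndex, sorted.size());
        int to = Math.min(from + query.pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

A backend that can sort and slice natively (an RDBMS, for instance) would 
push the ordering and the page bounds down instead of sorting in memory as 
this sketch does.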

Here is the conversation; I hope to get some comments.

Gabriel
-----------------------
[18:40:00] Gabriel Roldán was thinking about what would be needed on the 
geotools front in order to make TableView lazily loaded. On the uDig front it 
is easy; the problem would be the insufficient geotools Data API
[18:40:50] … I thought the DataStore API could be augmented in a similar 
fashion to the way the catalog queries are done, aka, you can request data by 
"pages"
[18:40:56] Jesse Eichar I think Jody envisioned using the FeatureList API
[18:41:01] … you can get all the fids.
[18:41:16] … then using the FeatureList API load up the features by fid
[18:42:27] Gabriel Roldán I mean, on the uDig side you can use FeatureList for 
sure. The lazy loading strategy would be transparent for uDig, but still the 
datastore api should be more friendly
[18:42:35] … getting all the fids could still be a pain
[18:43:02] Jesse Eichar I see.
[18:43:12] Gabriel Roldán an approach that's proven to work is getting the 
feature/element count and then requesting pages of a user-defined size
[18:43:42] … (proven to work == I'm doing it on the catalog implementation)
[18:43:54] Jesse Eichar I can't think of any problems with that right off.
[18:44:25] … Where would that fit in?
[18:44:34] … We have a feature collection
[18:44:38] … feature list..
[18:44:54] … would you add a method to feature store? 
[18:45:06] … New type of query?
[18:45:10] Gabriel Roldán you should treat FeatureList as a normal list, use 
the get(int) or iterator() as normal
[18:45:38] … the featurelist impl uses a paging strategy to retrieve content 
on the back
[18:46:19] … provided that you pass it the Query and the list size in the 
constructor, for example
[18:46:58] Jesse Eichar Makes sense.  Would you add a new method in 
FeatureSource?  getFeatures(Query, pageSize)
[18:46:57] … ?
[18:47:18] … or make a new type of Query that has that information.
[18:47:29] Gabriel Roldán getFeatures(Query, startIndex, pageSize)
[18:48:26] Jesse Eichar I think I prefer that too.  But could be hard to get 
it to fly because it will eventually require negotiation with GeoAPI.
[18:49:11] Gabriel Roldán Well, could encapsulate it in Query as well, after 
all Query _is_ a parameter object
[18:50:14] … and I guess something like that could certainly be in future 
versions of the WFS spec at least, since they already recognized the problem 
and designed it for catalog 2.0
[18:50:57] Jesse Eichar I didn't know that.
[18:53:52] Gabriel Roldán now, implementing getFeatures(Query, from, size) or 
whatever would have its implications. Some RDBMS-backed datastores could 
manage it easily, I guess
[18:54:32] Jesse Eichar It does.  
[18:54:34] … WFS for example
[18:55:16] … The problem is that things aren't inherently ordered
[18:55:46] … so index 3 could (at least theoretically) be a different item 
between calls.
[18:56:31] … WFS 1.1 has some sort-by functionality, I think, but 1.0 
doesn't
[18:57:49] … For that one it makes sense to get all fids in the query.  
Shapefile and other file-based ones I think will be ok.
[18:58:13] Gabriel Roldán just 1'
[18:58:39] Jesse Eichar sure
[19:00:34] Gabriel Roldán sorry, had a phone call
[19:00:42] Jesse Eichar np
[19:00:44] Gabriel Roldán you're completely right
[19:01:10] … so a requirement would be an order being explicitly set in the 
Query
[19:01:37] Jesse Eichar For WFS we could obtain all fids and manage the paging 
on the client, at least until 1.1. 
[19:01:39] Gabriel Roldán what I'm doing in catalog is ordering by ID if the 
Query has no orderBy
[19:02:05] Jesse Eichar we can order by fid if not specified.
[19:02:14] … seems reasonable.
[19:02:19] Gabriel Roldán problem with fids is that getting two million fids 
could still be quite killer
[19:03:07] Jesse Eichar I know it.  I'm open to suggestions...  
[19:03:09] Gabriel Roldán which raises another concern I was thinking about
[19:03:41] … I know we've defined feature ids to be String so as to be 
friendly with the WFS spec
[19:03:58] … still, it doesn't make much sense on the pure java side of things
[19:04:10] … I would like to see FID as an interface
[19:04:33] … so implementors could optimize as needed, instead of creating 
millions of strings by prepending the feature type name, etc
[19:05:07] … but that's another concern, I tend to ramble :P
[19:05:16] Jesse Eichar :D You best jump on the FM discussion with that.  I 
don't think that's going to happen too soon.
[19:05:34] Gabriel Roldán yeah, I guess so
[19:06:11] Jesse Eichar But back to the point.  I'm completely in agreement 
with you on the Paging requirement.
[19:06:30] … I'd be happy to do some of the implementations for you.
[19:06:49] Gabriel Roldán cool, that kind of strategy works just great for 
presenting huge amounts of tabular data in other domains
[19:06:56] … so it should work for us too
[19:07:19] Jesse Eichar I think it has to be done.  It's impossible to deal 
with this amount of data otherwise.
[19:07:47] Gabriel Roldán I like the idea that the FeatureSource interface 
doesn't need to be touched
[19:07:53] … just Query
[19:08:19] Jesse Eichar It'll be much quicker and easier to get it integrated 
with geotools that way.  Pretty clean too.  
[19:08:19] Gabriel Roldán we already have order by in Filter, so just from and 
page size are needed
[19:09:12] … note that FeatureCollection.size() still should return the whole 
query size (aka, hits), and not the page size
[19:09:31] Jesse Eichar Yes.
[19:09:47] Gabriel Roldán in that case I'm wondering what's the easiest way of 
knowing when you're done fetching content
[19:09:51] Jesse Eichar and get() can get any feature, not just those in the 
current page (for FeatureList)
[19:09:57] Gabriel Roldán other than by requiring client code to use a counter
[19:10:02] … sure
[19:10:32] … I did that for presenting catalog results using Java Server Faces 
and it works great
[19:10:57] … a custom list impl that queries the required page of data if the 
index isn't on the current page
[19:11:38] Jesse Eichar Do you cache the fetched features so you don't have to 
get them more than once?  or get a fresh copy each time?
[19:12:36] Gabriel Roldán in the catalog case I fetch the whole page into 
memory. In our case I guess we could be even smarter and maintain the 
streamed nature of stuff even on the pages
[19:13:12] … not sure if I'm explaining myself well enough
[19:13:25] Jesse Eichar You're doing fine
[19:16:02] Gabriel Roldán cool, do you mind if I post this to the list?
[19:16:15] Jesse Eichar Not at all
[19:16:26] Gabriel Roldán better said, do you think it is something that's 
worth posting?
[19:16:37] Jesse Eichar haha
[19:16:44] … Yeah I think it should be.
[19:16:56] … People will have comments for sure.
[19:17:34] Gabriel Roldán nice, forgot to jump on the geoserver irc meeting, 
ugh
[19:17:49] Jesse Eichar shoot, me too
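
For reference, the custom list discussed in the conversation could look 
roughly like the sketch below. It is only an illustration under stated 
assumptions: LazyPagedList and PageLoader are hypothetical names, the loader 
stands in for whatever paged request the data API would end up offering, and 
the total size (the "hits") is assumed to be known up front. As in the 
catalog case, only the current page is held in memory; size() reports the 
whole query size and get(int) can reach any index.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LazyPagedList<T> extends AbstractList<T> {

    /** Hypothetical callback standing in for a paged data store request. */
    public interface PageLoader<T> {
        List<T> loadPage(int fromIndex, int pageSize);
    }

    private final PageLoader<T> loader;
    private final int totalSize;
    private final int pageSize;

    // Start index of the page currently held in memory (-1: nothing loaded yet).
    private int currentPageStart = -1;
    private List<T> currentPage = Collections.emptyList();

    public LazyPagedList(PageLoader<T> loader, int totalSize, int pageSize) {
        if (totalSize < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("totalSize must be >= 0 and pageSize > 0");
        }
        this.loader = loader;
        this.totalSize = totalSize;
        this.pageSize = pageSize;
    }

    /** Reports the size of the whole result (the hits), not of one page. */
    @Override
    public int size() {
        return totalSize;
    }

    /** Loads the page containing the index if it is not the current one. */
    @Override
    public T get(int index) {
        if (index < 0 || index >= totalSize) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int pageStart = (index / pageSize) * pageSize;
        if (pageStart != currentPageStart) {
            currentPage = new ArrayList<>(loader.loadPage(pageStart, pageSize));
            currentPageStart = pageStart;
        }
        return currentPage.get(index - pageStart);
    }
}
```

Because the class extends AbstractList, iterator() walks the pages in order 
through get(int), so client code can treat it as a normal list without 
keeping its own counter.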

-- 
Gabriel Roldán ([EMAIL PROTECTED])
Axios Engineering (http://www.axios.es)
Tel. +34 944 41 63 84
Fax. +34 944 41 64 90

_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel