Sounds like a good idea to me. Additional thoughts and comments inline.

Andrea Aime wrote:
> Hi,
> I've been looking at the directory data store again and
> the more I delve into it, the more I believe a rewrite
> is the only way to go.
> 
> The current one tries to delegate to secondary data stores,
> relies on those datastore factories to implement
> a specific interface, does not support namespaces (crucial
> to have GeoServer use the datastore), assumes one feature type
> per child datastore, has serious caching issues, does not
> deal properly with datastore disposal and... well... shall
> I go on?? ;-)
> 
> So I've been thinking about DirectoryDataStore v2.
> 
> What it should do:
> - given a directory, find all the feature types stored inside it, in
>    whatever format
> - the directory might contain files that do store more
>    than one feature type and datastores that might catch more than one
>    file (shapefile, property ds), that should be handled gracefully
> - eventually handle recursive scan (but only as an option, it might be
>    expensive)
> - support proper namespace setting
> - the user should not be concerned with the native datastore serving a
>    certain file, yet it could be useful to have access to it on occasion
> 
> How:
> - take a directory, a namespace (eventually a recursion flag)
> - scan all the files in the specified directory
> - get rid of all the file data store assumptions, just look for
>    datastores that can open a certain URL (the current file) with a
>    certain namespace and load all the feature types that are inside of
>    them
> 
How will this work? How can one ask a generic DataStoreFactorySpi whether
it can handle a given file? I thought that was the entire reason for
FileDataStoreFactorySpi.
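
To make the question concrete, here is a rough sketch of how I read the
proposal: build a generic parameter map with the file URL and the namespace,
then ask each factory in turn whether it accepts it. The StoreFactory
interface, the FactoryProbe class and the "url"/"namespace" keys are
hypothetical stand-ins for illustration, not the real GeoTools types, and
the extension check is only a placeholder for whatever a factory would
really do.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a datastore factory (not the GeoTools interface).
interface StoreFactory {
    String name();
    boolean canProcess(Map<String, Object> params);
}

final class FactoryProbe {
    // Builds a generic parameter map and returns the first factory accepting it.
    static Optional<StoreFactory> findFactory(
            List<StoreFactory> factories, URI file, String namespace) {
        Map<String, Object> params = new HashMap<>();
        params.put("url", file);            // assumed parameter key
        params.put("namespace", namespace); // assumed parameter key
        for (StoreFactory factory : factories) {
            if (factory.canProcess(params)) {
                return Optional.of(factory);
            }
        }
        return Optional.empty();
    }

    // Toy factory that accepts a URL purely by file extension.
    static StoreFactory byExtension(String label, String extension) {
        return new StoreFactory() {
            public String name() {
                return label;
            }
            public boolean canProcess(Map<String, Object> params) {
                Object url = params.get("url");
                return url != null
                        && url.toString().toLowerCase().endsWith(extension);
            }
        };
    }
}
```

A file that no factory accepts simply yields an empty Optional, so the
directory scan could skip it without failing.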
> Issues:
> - a certain datastore can open multiple files (shapefile, property ds),
>    we want to avoid keeping duplicate datastores around
> - a directory (or worse, a tree) can hold a massive amount of feature
>    types, there are legitimate scalability/memory consumption concerns.
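
One possible way to avoid duplicate datastores for multi-file formats is
to group the scanned files under a shared key before opening anything.
The sketch below groups by base name, which is only an assumption that
fits shapefile sidecars (.shp, .dbf, .shx); it would not cover a store
that claims a whole directory. FileGrouping and its methods are
hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FileGrouping {
    // Strips the extension so roads.shp, roads.dbf and roads.shx share a key.
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    // Groups file names by base name, keeping scan order inside each group.
    static Map<String, List<String>> groupByBaseName(List<String> fileNames) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String fileName : fileNames) {
            groups.computeIfAbsent(baseName(fileName), k -> new ArrayList<>())
                  .add(fileName);
        }
        return groups;
    }
}
```

Each group would then be handed to a single datastore instead of opening
one per file.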
> 
> Using a lightweight (soft reference based) cache has issues with
> datastore disposal, as the datastore we're trying to dispose might be
> the holder of a resource that might be in use by a reader or a feature
> source, closing it might kill the current user...
> This one is hard actually, the api does not give us any clue on
> whether a datastore generated object is still being used or not...
> To avoid it we'd have to keep strong references to all datastores
> that have returned a feature source, a reader or a writer at
> least once. Maybe we can add a custom api to this datastore to
> force some resource release (stop gap measure for the lack of a better
> way).
> 
Yeah, tricky. It seems to me that what we lack is a dispose method on
FeatureSource. With one, we could easily track which delegate DataStores
are still in use and avoid kicking them out of the cache. I seem to
remember dispose on FeatureSource being proposed in the past, perhaps by
Jesse? I could be way off.
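
A minimal sketch of the tracking I have in mind, assuming every handed-out
object could report back when it is disposed. Store and
RefCountedStoreCache are hypothetical classes, not existing GeoTools API:
acquire() would be called when a feature source, reader or writer is
created, and release() from its dispose.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a delegate datastore.
final class Store {
    final String id;
    boolean disposed = false;

    Store(String id) {
        this.id = id;
    }

    void dispose() {
        disposed = true;
    }
}

final class RefCountedStoreCache {
    private final Map<String, Store> stores = new HashMap<>();
    private final Map<String, Integer> inUse = new HashMap<>();

    // Returns the cached store (creating it if needed) and counts one user.
    synchronized Store acquire(String id) {
        Store store = stores.computeIfAbsent(id, Store::new);
        inUse.merge(id, 1, Integer::sum);
        return store;
    }

    // Drops one user; the count entry disappears when it reaches zero.
    synchronized void release(String id) {
        inUse.computeIfPresent(id, (key, count) -> count > 1 ? count - 1 : null);
    }

    // Disposes and removes the store only if nobody is using it.
    synchronized boolean evictIfIdle(String id) {
        if (inUse.containsKey(id)) {
            return false;
        }
        Store store = stores.remove(id);
        if (store != null) {
            store.dispose();
        }
        return store != null;
    }
}
```

The cache can then evict idle stores under memory pressure without pulling
a datastore out from under an active reader.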
> Suggestions, reactions?
> Cheers
> Andrea
> 


-- 
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

_______________________________________________
Geotools-devel mailing list
Geotools-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel