Sunburned,

  Just curious - how many parcels are in your county?  How many MB is the
file, and how long does it take to load, if it loads at all?  I've loaded
the data from several Texas counties into just 512 MB in a fairly short
time.

regards,
Larry

On 6/6/07, Sunburned Surveyor <[EMAIL PROTECTED]> wrote:
> Martin wrote: "It sounds to me like you are trying to build the same
> functionality that currently exists in spatially-enabled DBs - that is,
> fast access to ranges of features."
>
> Sort of, but not exactly. I'm really creating something that
> accomplishes the same thing as the GeoTools indexed Shapefile reader,
> but with a slightly different approach. Instead of using the Shapefile
> itself to store the data, I read the Shapefile and separate it into
> "chunks". Each chunk is either a set of attributes for a Feature or a
> Feature geometry. (There's a rough sketch of what I mean after the list
> of reasons below.) There are a few reasons why I took this approach
> instead of using either (A) the GeoTools indexed Shapefile reader or
> (B) an embedded database:
>
> [1] I hope to eventually break away from the limitations of ESRI's
> Shapefile specification. To do this I need a system that supports
> Shapefiles, but that has its own internal storage format. I was going
> to use an XML format for this initially, but after talking with some
> of the OpenJUMP developers I realized a binary data format would be
> much quicker. So I'm going to store the features inside the
> FeatureCache using a simplified form of BOFF (Binary Open Feature
> Format). This will allow me to kill two birds with one stone: I get my
> fast binary storage format, and I've taken a serious step towards a
> file format that can replace Shapefiles. (I know that is an ambitious
> goal for a closet developer like myself, but I always dream bigger
> than my britches.) :]
>
> [2] I want to support other data formats in the future, not just
> Shapefiles or BOFF. That's why I couldn't use the GeoTools indexed
> Shapefile reader. The FeatureCache will be able to import Features
> from any data source that supports Single-Feature-Access.
>
> [3] I want to avoid the complexity of JDBC that must come with the use
> of an embedded database. I realize I lose some of the rich feature set
> offered by an RDBMS with this move, but for me the extra work isn't
> worth what is gained. (Besides, I think some other teams, at least
> Sigle, are working on this technique.)
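>
> To make the "chunk" idea a little more concrete, here is a very rough
> sketch of the kind of storage class I have in mind. Every name in it is
> a placeholder (the real thing will write BOFF records, not raw byte
> arrays), so please read it as an illustration rather than the actual
> design:
>
> import java.io.IOException;
> import java.io.RandomAccessFile;
> import java.util.ArrayList;
> import java.util.List;
>
> // Sketch only: each feature is split into an attribute chunk and a
> // geometry chunk stored in a binary cache file; just these small
> // offset lists stay in RAM.
> public class ChunkStore {
>
>     private final RandomAccessFile cacheFile;
>     private final List<Long> attributeOffsets = new ArrayList<Long>();
>     private final List<Long> geometryOffsets = new ArrayList<Long>();
>
>     public ChunkStore(String path) throws IOException {
>         this.cacheFile = new RandomAccessFile(path, "rw");
>     }
>
>     // Append one feature's two chunks to the cache file and remember
>     // where they landed.
>     public void addFeature(byte[] attributeChunk, byte[] geometryChunk)
>             throws IOException {
>         cacheFile.seek(cacheFile.length());
>
>         attributeOffsets.add(cacheFile.getFilePointer());
>         cacheFile.writeInt(attributeChunk.length);
>         cacheFile.write(attributeChunk);
>
>         geometryOffsets.add(cacheFile.getFilePointer());
>         cacheFile.writeInt(geometryChunk.length);
>         cacheFile.write(geometryChunk);
>     }
>
>     // Read back just the geometry chunk of one feature, leaving its
>     // attributes on disk.
>     public byte[] readGeometry(int featureIndex) throws IOException {
>         cacheFile.seek(geometryOffsets.get(featureIndex));
>         byte[] chunk = new byte[cacheFile.readInt()];
>         cacheFile.readFully(chunk);
>         return chunk;
>     }
> }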
>
> Martin wrote: "That's fine - it's appealing to have a lightweight,
> free FeatureCache implementation, which is perfectly matched to the job
> of storing JUMP Features (unlike most SDBs, which all have their own
> idiosyncrasies of storage model)."
>
> That's my goal with the FeatureCache.
>
> Martin wrote: "My point about the DataStore is that you need to expose
> your great new Cache to JUMP in some way.  The DataStore API is
> designed to do just
> this - it let's the JUMP renderer query a range of features from a
> datastore in a streaming fashion."
>
> I guess that I am taking a totally different approach to the
> FeatureCache design. (Maybe I need to rethink this.) I'm not even
> messing with the DataStore API. I'll be exposing the FeatureCache to
> OpenJUMP as an implementation of the FeatureCollection interface.
> Internally, OpenJUMP won't be able to tell the difference between a
> FeatureCollection backed by my FeatureCache and a normal "in-memory"
> FeatureCollection. (Except that it will be a little slower. I'm trying to
> work in a buffer to help with this. The user will be able to control
> the size of the buffer depending on their computer's RAM and their
> need for speed.) I will need to expose some UI components so that
> users can manage FeatureCaches, but other than that they'll act just
> like a normal FeatureCollection.
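>
> Here is roughly how I picture the buffer working. (The names are
> invented for the example, and the real class would implement OpenJUMP's
> FeatureCollection interface rather than stand alone like this.)
>
> import java.util.LinkedHashMap;
> import java.util.Map;
>
> // Sketch only: keeps the most recently used features in RAM and falls
> // back to the on-disk cache for everything else. A LinkedHashMap in
> // access order gives simple least-recently-used behaviour.
> public class BufferedFeatureCache {
>
>     private final Map<Integer, Object> buffer;
>
>     public BufferedFeatureCache(final int bufferSize) {
>         this.buffer = new LinkedHashMap<Integer, Object>(bufferSize, 0.75f, true) {
>             protected boolean removeEldestEntry(Map.Entry<Integer, Object> eldest) {
>                 return size() > bufferSize;
>             }
>         };
>     }
>
>     // Return the feature from the buffer, reading it from disk only on
>     // a miss. (Object stands in for OpenJUMP's Feature here.)
>     public Object getFeature(int featureId) {
>         Object feature = buffer.get(featureId);
>         if (feature == null) {
>             feature = readFeatureFromDisk(featureId);
>             buffer.put(featureId, feature);
>         }
>         return feature;
>     }
>
>     private Object readFeatureFromDisk(int featureId) {
>         // placeholder for reading the feature's chunks from the cache file
>         return null;
>     }
> }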
>
> Martin wrote: "This is a good discussion - I'll be interested to hear other's
> comments."
>
> Yup. You know me. All talk and no real work. :] (Just ask Stefan...)
> All joking aside, I've laid out the class diagram for the FeatureCache
> and have started coding the low-level interfaces. I just need to
> decide on the source of my Shapefile code and then I'll start cranking
> away on this thing. It has been my pet peeve in OpenJUMP for a long
> time.
>
> Martin wrote: "My main goal is to have a single API/framework for accessing
> large datastores. That way, all additional work benefits everyone."
>
> This is a good goal. I'll do whatever I can to avoid introducing a new
> DataStore API to OpenJUMP. When I implement Single-Feature-Access to
> Shapefiles using GeoTools or Deegree I will try to make this available
> under the DataStore API.
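>
> For clarity, the kind of single-feature contract I have in mind looks
> roughly like this. (None of these names exist in OpenJUMP, GeoTools or
> Deegree; they are just placeholders for the example.)
>
> import java.io.IOException;
>
> // Sketch of a minimal single-feature-access contract.
> public interface SingleFeatureSource {
>
>     // Number of features in the underlying file or table.
>     int getFeatureCount() throws IOException;
>
>     // Read exactly one feature (attributes and geometry) without
>     // loading the rest of the file. (Object stands in for Feature.)
>     Object readFeature(int index) throws IOException;
>
>     // Release file handles once the cache has pulled in what it needs.
>     void close() throws IOException;
> }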
>
> I'm really looking for a good way to read in a Shapefile of all the
> parcels in my County without OpenJUMP choking. Even Autodesk Map has a
> hard time with that task! :]
>
> The Sunburned Surveyor
>
>
> On 6/6/07, Martin Davis <[EMAIL PROTECTED]> wrote:
> >
> >
> > Sunburned Surveyor wrote:
> > >
> > > I guess I envision the FeatureCache as more of an internal data
> > > manager than I do a data source. The idea is that you suck in features
> > > from any of OpenJUMP's existing DataStores and store them in the
> > > FeatureCache. You'll only need to do this for Layers that will contain
> > > lots of features and might possibly max out your RAM.
> > >
> > > If you have suggestions on how I can implement this particular aspect
> > > of the FeatureCache I'd like to hear them. At this point I plan on
> > > using the GeoTools or Deegree ESRI Shapefile code so that I can access
> > > one feature at a time from a Shapefile. (I can't do this with the
> > > current Shapefile driver.) I can do this in a new Shapefile driver that
> > > is written and used just in my FeatureCache, or I can rewrite the
> > > existing Shapefile DataStore to provide this access.
> > >
> > > In the long run, I'd really like to see a DataStore interface that
> > > allowed single-feature-access. For example, at some point I may want
> > > to read in huge DXF files as well.
> > It sounds to me like you are trying to build the same functionality that
> > currently exists in spatially-enabled DBs - that is, fast access to
> > ranges of features.  That's fine - it's appealing to have a lightweight,
> > free FeatureCache implementation, which is perfectly matched to the job
> > of storing JUMP Features (unlike most SDBs, which all have their own
> > idiosyncrasies of storage model).  If it were me, I might investigate the
> > possibility of using an existing Java DB, with possibly some spatial
> > enhancements. (You yourself suggested this in a different context...).
> > You're going to develop most of the same functionality anyway, and
> > you'll never get the maturity of the existing DBs.
> >
> > My point about the DataStore is that you need to expose your great new
> > Cache to JUMP in some way.  The DataStore API is designed to do just
> > this - it lets the JUMP renderer query a range of features from a
> > datastore in a streaming fashion.  I think you should target your
> > FeatureCache to be simply another kind of DataStore.  That way, you get
> > all the existing functionality of DataStores - streaming, interruptible
> > drawing, querying by range, connection lifecycle, UI, etc.  Otherwise,
> > you're going to have to build all this yourself, in yet another new
> > framework.  If you need more functionality exposed, this should become
> > enhancements to the DS framework, since other use cases will likely need
> > this as well.
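> >
> > (To make "streaming" concrete: something along these lines, with all of
> > the names invented for the example. The renderer pulls features one at
> > a time instead of getting the whole layer at once, so it can stop early
> > when the user interrupts a draw.)
> >
> > import java.util.Iterator;
> >
> > // Illustration only -- not the actual DataStore API.
> > public interface StreamingFeatureStore {
> >
> >     // Iterate over just the features whose bounds intersect the given
> >     // envelope, producing them one at a time so the whole layer never
> >     // has to sit in memory.
> >     Iterator<Object> queryByEnvelope(double minX, double minY,
> >                                      double maxX, double maxY);
> > }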
> >
> > Now, one baby step in this direction is to expose the GT indexed
> > Shapefile as a DataStore.  This should be easy to do, and right away we
> > will have the ability to view large shapefiles.
> >
> > If you want DXF as well, then you have a choice to make.  DXF
> > doesn't have any indexing AFAIK, so you'll either have to build this, or
> > move the data into some other format which does.
> >
> > This is a good discussion - I'll be interested to hear others'
> > comments.  My main goal is to have a single API/framework for accessing
> > large datastores. That way, all additional work benefits everyone.
> >
> > HTH - M
> >
> > >
> > > I'm not sure what the best way is to achieve this. If you have
> > > suggestions, please let me know.
> > >
> > > Thanks again for all of your help.
> > >
> > > Landon
> > >
> > >
> > > On 6/6/07, Martin Davis <[EMAIL PROTECTED]> wrote:
> > >> First of all, I don't think you (or JUMP) have to commit to a single
> > >> library for different pieces of functionality.  I think the Feature
> > >> model question is separate from the issue of which Shapefile I/O
> > >> implementation to use.  So if you like the deegree one, just use that.
> > >> I don't really know which library is likely to be better quality or more
> > >> stable.  I guess for Shapefiles I'd probably vote for using GT, but not
> > >> really for any hard reason.  But I agree with your comment about
> > >> deegree being more supportive of OJ - we probably want to encourage
> > >> that!
> > >>
> > >> Another point in favour of deegree is that the GT change process seems
> > >> very cumbersome and slow.  I've always preferred a smaller group to work
> > >> with, so change can be made faster.
> > >>
> > >> The important thing is to isolate these various components behind
> > >> well-defined interfaces in JUMP, so that they can be swapped out if
> > >> need be.
> > >>
> > >> By the way, are you looking at using the DataStore framework as the API
> > >> to your "Shapefile cache"?  It doesn't make sense to use the existing
> > >> DataSource Reader, since it only reads an entire collection at a time.
> > >> Or are you making yet another API? In that case, I'd be concerned, since
> > >> OJ would then have three different data access APIs.
> > >>
> > >> M
> > >>
> > >> Sunburned Surveyor wrote:
> > >> > Martin,
> > >> >
> > >> > After poking around a little in the Javadoc for the Deegree project I
> > >> > see that it has everything I need for single feature access to an ESRI
> > >> > Shapefile. The library also seems a little more stable than GeoTools,
> > >> > and I think the folks at Deegree are more supportive of the work in
> > >> > OpenJUMP than the folks at GeoTools. (I don't even think any of the
> > >> > GeoTools folks are subscribed to the JPP mailing list, except for you
> > >> > of course.) :]
> > >> >
> > >> > Do you have any thoughts on which library would be the best to use? I
> > >> > was pretty set on getting more familiar with GeoTools, but Markus made
> > >> > some good arguments in his e-mail.
> > >> >
> > >> > What would be the best for OpenJUMP in the long run? What would be the
> > >> > best for the open source Java GIS community in the long run?
> > >> >
> > >> > I know you are busy, and I appreciate the time you take to read and
> > >> > respond to my e-mail.
> > >> >
> > >> > Landon
> > >> >
> > >> > P.S. - I might also bounce this off of David at Vivid Solutions to see
> > >> > if he has any comments.
> > >>
> > >> --
> > >> Martin Davis
> > >> Senior Technical Architect
> > >> Refractions Research, Inc.
> > >> (250) 383-3022
> > >>
> > >>
> >
> > --
> > Martin Davis
> > Senior Technical Architect
> > Refractions Research, Inc.
> > (250) 383-3022
> >
> >
>


-- 
http://amusingprogrammer.blogspot.com/

