Regarding index "formats", one option to consider is MapDB. With it you can 
create *HUGE* hashmaps and other collections backed by disk or off-heap memory 
storage. I used it extensively a year ago to process OSM planet data. I believe 
the H2 guys were/are using it too. 

http://www.mapdb.org/
MapDB is free software under the Apache License.
Peter



      From: Jody Garnett <jody.garn...@gmail.com>
 To: A Huarte <ahuart...@yahoo.es> 
Cc: Ian Turton <ijtur...@gmail.com>; Geotools-Devel list 
<geotools-devel@lists.sourceforge.net>
 Sent: Wednesday, December 2, 2015 2:12 PM
 Subject: Re: [Geotools-devel] Voting for 
"Implement-a-pure-java-Dbase-indexing-to-optimize-shapefile-access" proposal
   
Thanks Andrea / Alvaro for working through this.
Alvaro, I need to take some time with your proposal from a design perspective 
and make sure we are not making GeoTools unduly complicated. I do like the 
approach being taken, but we will need to document it for others.
Aside: I am a bit sad that the index format you have chosen cannot be produced 
by an open source tool; taking it to an unsupported module is a great move. 
Ian made noises in yesterday's meeting about researching a better index format. 
If he can find one, would you be interested in working on it with him?
--Jody Garnett


On 2 December 2015 at 03:22, A Huarte <ahuart...@yahoo.es> wrote:

Hi Andrea, I have updated the proposed pull request following your advice.
The shapefile module no longer contains any CDX index code; it is now 
implemented in a new unsupported shapefile-cdx module.
https://github.com/geotools/geotools/pull/1056
Best regards,
Alvaro

     From: Jody Garnett <jody.garn...@gmail.com>
 To: Andrea Aime <andrea.a...@geo-solutions.it> 
Cc: Geotools-Devel list <geotools-devel@lists.sourceforge.net>; A Huarte 
<ahuart...@yahoo.es>
 Sent: Sunday, November 29, 2015 7:49 PM
 Subject: Re: [Geotools-devel] Voting for 
"Implement-a-pure-java-Dbase-indexing-to-optimize-shapefile-access" proposal
   
Thank you Andrea, that is a sensible -1 vote with a viable workaround for 
Alvaro.
Alvaro, is there another index format available with an open source 
implementation (so we could create one)? It is hard when working with DBF (a 
file format from the 1980s) to find any modern custodian. I think Excel and 
Access can import the format, but I doubt they support an index.
Reading OGR docs:

Currently the OGR Shapefile driver only supports attribute indexes for looking 
up specific values in a unique key column. To create an attribute index for a 
column issue an SQL command of the form "CREATE INDEX ON tablename USING 
fieldname". To drop the attribute indexes issue a command of the form "DROP 
INDEX ON tablename". The attribute index will accelerate WHERE clause searches 
of the form "fieldname = value". The attribute index is actually stored as a 
mapinfo format index and is not compatible with any other shapefile 
applications.

So some kind of mapinfo format index.
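To make the OGR note concrete: an attribute index for "fieldname = value" searches is, conceptually, a lookup table from a column value to the record numbers that hold it. The sketch below is a generic in-memory illustration with made-up names (EqualityIndexSketch, lookup); it does not reproduce the MapInfo index format or OGR's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical names, not the MapInfo or CDX on-disk format.
class EqualityIndexSketch {

    // Maps each distinct column value to the record numbers containing it.
    private final Map<String, List<Integer>> recordsByValue = new HashMap<>();

    /** Builds the index by scanning the column once, record by record. */
    EqualityIndexSketch(List<String> columnValues) {
        for (int record = 0; record < columnValues.size(); record++) {
            recordsByValue
                    .computeIfAbsent(columnValues.get(record), k -> new ArrayList<>())
                    .add(record);
        }
    }

    /** Answers "fieldname = value" without scanning every record. */
    List<Integer> lookup(String value) {
        return recordsByValue.getOrDefault(value, Collections.emptyList());
    }

    public static void main(String[] args) {
        EqualityIndexSketch index =
                new EqualityIndexSketch(Arrays.asList("A1", "B7", "A1"));
        System.out.println(index.lookup("A1")); // prints [0, 2]
    }
}
```

A file-based index stores the same kind of mapping on disk, so it survives between runs and does not need to fit in memory.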
--Jody Garnett
On 29 November 2015 at 10:36, Andrea Aime <andrea.a...@geo-solutions.it> wrote:



Hi,
regarding this proposal:
https://github.com/geotools/geotools/wiki/Implement-a-pure-java-Dbase-indexing-to-optimize-shapefile-access
After reviewing the proposal and checking the current pull request, I'm going 
to cast a -1, fair and square.
I'm quite happy to see work on indexing the DBF files, but the current 
implementation choice is not acceptable for a few reasons:
* It bakes a large-ish amount of new untested code into a core supported 
  module that's used by many users
* The current implementation causes evident performance regressions if the 
  CDX file is not there
* The chosen format, CDX, has been created by Microsoft for Visual FoxPro, 
  and can only be created by Visual FoxPro
* Visual FoxPro has been completely abandoned by Microsoft, the last release 
  of it being dated 2007 
  https://msdn.microsoft.com/it-it/vfoxpro/bb190225.aspx
* The current implementation of the CDX support cannot create an index, just 
  use an existing one, so you either own a Visual FoxPro license and can run 
  Windows, or you're toast
Long story short, the implementation is of interest to a niche within a niche: 
the subset of users that cannot use a proper spatial database, nor H2 with 
spatial extensions, nor GeoPackage, but demand usage of shapefiles, and still 
own an 8-year-old license of Visual FoxPro to create the CDX files.
That said, I don't want to reject the code, just make it manageable so that 
only the few interested parties can be affected by its presence, and without 
burdening myself (as the shapefile module maintainer) with code that in the 
current state only helps a minuscule portion of the user base.
Alvaro already made modifications to the shapefile store to support generic 
attribute indexing, with interfaces to support other index types (e.g., MDX). 
To make the contribution acceptable, all that's required is to make it actually 
pluggable, via SPI, so that the CDX index implementation can reside in another 
module, and then to create a gt-shapefile-cdx unsupported module that contains 
the CDX indexing support. That module will be maintained by Alvaro himself, so 
that when that jar is plugged in, CDX files will be used for attribute indexing.
This will also lower the bar for entry, as an unsupported module does not have 
particular requirements for code quality or testing level: it's there for 
everybody to try at their own risk.
Alvaro, in order to make the plugin system work you'll need to write a factory 
class for each index type and a finder helping to locate the factories. The 
latest extension point we added is for projection handlers in the referencing 
subsystem; you can find the commit introducing it here:
https://github.com/geotools/geotools/commit/d206957bc4d8de1f7d33e06501aa3b9c95496d7d

For reference, the moving parts of the SPI system used by ProjectionHandler 
are:
* The object being created, in this case, an implementation of the 
  ProjectionHandler interface: 
  https://github.com/geotools/geotools/blob/master/modules/library/render/src/main/java/org/geotools/renderer/crs/ProjectionHandler.java
* The factory that creates it: 
  https://github.com/geotools/geotools/blob/master/modules/library/render/src/main/java/org/geotools/renderer/crs/ProjectionHandlerFactory.java
* The finder that can be invoked to locate all the factories: 
  https://github.com/geotools/geotools/blob/master/modules/library/render/src/main/java/org/geotools/renderer/crs/ProjectionHandlerFinder.java
* The registration files listing which factories are available, in META-INF, 
  using the same name as the factory itself: 
  https://github.com/geotools/geotools/blob/master/modules/library/render/src/main/resources/META-INF/services/org.geotools.renderer.crs.ProjectionHandlerFactory
You already have the interface and implementation, and will have to add the 
three other bits (for a single implementation it should be quick).
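As a rough illustration of the factory, finder and registration pieces, here is a minimal sketch that uses the JDK's java.util.ServiceLoader as a stand-in for the lookup the finder performs. Every name in it (IndexSpiSketch, AttributeIndexFactory, CdxIndexFactory, availableFactories, find) is hypothetical and not taken from the pull request or from the ProjectionHandler code, and the real finder may well rely on a different registry mechanism.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.ServiceLoader;

// Illustration only: every name below is hypothetical, not GeoTools API.
// In a real plugin the interface and the implementation would be public
// top-level types living in separate modules.
class IndexSpiSketch {

    /** Extension point: one factory per attribute index type (CDX, MDX, ...). */
    public interface AttributeIndexFactory {
        String formatName();

        boolean canHandle(String indexFileName);
    }

    /** The implementation that would live in the unsupported module. */
    public static class CdxIndexFactory implements AttributeIndexFactory {
        @Override
        public String formatName() {
            return "CDX";
        }

        @Override
        public boolean canHandle(String indexFileName) {
            return indexFileName.toLowerCase(Locale.ROOT).endsWith(".cdx");
        }
    }

    /**
     * Finder: lists every factory registered in a META-INF/services file
     * named after the binary name of the factory interface, where each
     * line of that file names one implementation class.
     */
    static List<AttributeIndexFactory> availableFactories() {
        List<AttributeIndexFactory> found = new ArrayList<>();
        for (AttributeIndexFactory factory
                : ServiceLoader.load(AttributeIndexFactory.class)) {
            found.add(factory);
        }
        return found;
    }

    /** Picks the first factory that accepts the given index file, if any. */
    static Optional<AttributeIndexFactory> find(
            String indexFileName, List<AttributeIndexFactory> factories) {
        return factories.stream()
                .filter(f -> f.canHandle(indexFileName))
                .findFirst();
    }

    public static void main(String[] args) {
        // This sketch ships no registration file, so the factory is passed
        // explicitly instead of being discovered by availableFactories().
        List<AttributeIndexFactory> factories =
                Arrays.asList(new CdxIndexFactory());
        System.out.println(find("roads.cdx", factories)
                .map(AttributeIndexFactory::formatName).orElse("none")); // CDX
        System.out.println(find("roads.mdx", factories)
                .map(AttributeIndexFactory::formatName).orElse("none")); // none
    }
}
```

With this shape the core module only depends on the factory interface and the finder, so dropping the plugin jar (with its registration file) on the classpath is what enables the extra index type.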
The new unsupported module will give the CDX code an occasion to be tested by 
the interested user base without affecting the general user population, and 
will greatly expedite the inclusion in the code base (you will receive a 
detailed review only of the changes in gt-shapefile, to make sure there are no 
functional and performance regressions for the index-less case).
Cheers
Andrea
_______________________________________________
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel



  

