Hi Martin,

+1 on the approach you used for Geotk and what is proposed for SIS. Would it be possible to have it work "in memory", without having to write out to disk, for a use case like Hadoop?
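To make the question concrete, here is a minimal sketch of what "in memory" could mean at the JDBC level. It assumes the in-memory URL forms of the two embedded engines mentioned in this thread (`jdbc:derby:memory:<name>;create=true` for Derby, `jdbc:hsqldb:mem:<name>` for HSQLDB). The class and method names are hypothetical and are not part of Geotk or SIS, and the EPSG creation scripts are not run here; the sketch only opens a connection when a driver is available.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper, for illustration only; not a Geotk or SIS class. */
public class InMemoryEpsgSketch {

    /** Derby URL using the "memory" subprotocol, so nothing is written to disk. */
    static String derbyUrl(String name) {
        return "jdbc:derby:memory:" + name + ";create=true";
    }

    /** HSQLDB URL for a database that lives only inside the current JVM. */
    static String hsqlUrl(String name) {
        return "jdbc:hsqldb:mem:" + name;
    }

    /**
     * Tries to open a connection and returns null when it fails,
     * for example because no matching JDBC driver is on the classpath.
     */
    static Connection tryConnect(String url) {
        try {
            return DriverManager.getConnection(url);
        } catch (SQLException e) {
            return null;
        }
    }

    public static void main(String[] args) throws SQLException {
        // "epsg" is an arbitrary database name chosen for this sketch.
        for (String url : new String[] {derbyUrl("epsg"), hsqlUrl("epsg")}) {
            Connection c = tryConnect(url);
            if (c == null) {
                System.out.println("Could not connect to " + url);
            } else {
                System.out.println("Connected to " + url);
                // The EPSG SQL scripts embedded in the JAR would be executed here.
                c.close();
            }
        }
    }
}
```

An in-memory database of this kind exists only inside the JVM that created it, so in a Hadoop job I would expect each task JVM to rebuild it from the embedded scripts on first use.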

Thanks,
Travis

On Thu, Aug 29, 2013 at 7:53 PM, Martin Desruisseaux <[email protected]> wrote:

> Hello Adam
>
> On 30/08/13 00:55, Adam Estrada wrote:
>
>> Thanks a lot, Martin. Where do you envision the database of identifier
>> codes living? I know in GDAL, we typically read from a directory full of
>> CSVs [1] that hold several thousand (not sure of the exact number right
>> now) codes along with their transformations.
>
> A lot (maybe most) of that information is derived from the EPSG database
> [1]. GDAL extracted some information from the EPSG tables as CSV files.
> Indeed, the first row of some files contains the EPSG column names. The
> EPSG database contains definitions for about 5000 Coordinate Reference
> Systems.
>
> In Geotk - and what is proposed for SIS - we do not use such CSV files.
> Instead, we use a real EPSG database. The EPSG SQL scripts for creating the
> database are embedded in the JAR file (we are allowed to redistribute
> them), and the database is created the first time that the library is used.
> The database engine is the user's choice - it would be Derby by default (an
> Apache project), but it also works on HSQL, PostgreSQL and MS-Access.
>
> In Geotk, information not related to EPSG (for example projection names
> used by ESRI) was hard-coded in Java. For SIS, I would like to store it
> in the database too. The drawback is that a database would soon become
> somewhat mandatory for many SIS usages. However, I think that a database
> could hardly be avoided anyway for most medium or advanced usages, and this
> can be made transparent for the user if we default to some embedded
> database like Derby or HSQL.
>
> What do you think?
>
> Martin
>
>
> [1] http://www.epsg.org/ - click on "geodetic dataset"
