Hi Vitali,

Some comments.

H2GIS is an improvement and refactoring of the H2spatial extension, used and developed since 2005 at CNRS. The first release was presented during the gvSIG Days in 2006 (https://halshs.archives-ouvertes.fr/halshs-01145771). In fact, the H2 spatial extension was developed to support hydrological spatial analysis methods during my thesis (2002-2005). The first architecture of H2 spatial was very, very chaotic :-(.

In July 2006, we discovered the approach of Chris Holmes (The Open Planning Project), available for Derby and HSQL (http://old.geoserver.org/SpatialDBBox.html). The second architecture (2006-2010) follows this approach. A custom BLOB data type was used to store geometries in the H2 database.

In 2011, we decided to contact Thomas to talk with him about a spatial index in H2. Thomas was (as usual) very receptive to our needs and added an R-tree index storage to H2. Since 2011, we have collaborated with the H2 community on the geometry type, and a new extension called H2GIS was born.

H2GIS is used in my team to process huge datasets and to build advanced spatial analyses or simulations like noisemap (http://noisemap.orbisgis.org/). For example, we are able to process billions of noise sources located on road networks. So yes, the H2 database is very robust and efficient.

You said « Altogether very likely I will do refactoring of geodb, hatbox and GeoTools to work with GEOMETRY type... »

I'm working on an extension to connect H2GIS with GeoServer using the GeoTools datastore model (https://github.com/ebocher/geoserver-h2gis, thanks to the GeoTools community). Maybe a good option for you... That way you can take advantage of all H2GIS functionalities (ST_ConstrainedDelaunay, network analysis, shapefile table access...).

My colleague Nicolas Fortin will definitely answer the technical points (memory usage, data type…).

Best regards,
Erwan

On Monday, May 4, 2015 at 16:07:52 UTC+2, Vitali wrote:
>
> On Monday, May 4, 2015 at 3:32:39 PM UTC+3, Thomas Mueller wrote:
>>
>> Hi,
>>
>> > But already for many years the spatial support was provided by a
>> > combination of geodb + hatbox libraries and integration in GeoTools world
>>
>> Yes. However, those don't use the built-in R tree. Do they use an
>> external R tree?
>
> Hatbox provides an R-Tree. It is based on H2 infrastructure (an auxiliary
> table is created where the nodes are stored). What would a "built-in"
> R-Tree mean?
>
>> > All this was done on BLOB type where a geometry WKB is stored.
>>
>> A small BLOB is stored inline, so it might not be that bad.
>>
>> > any access of BLOB value makes a copy of it
>
> Yes, that is what I meant. In certain scenarios it still causes significant
> performance issues, as I experienced: millions of temporary LOB entries
> when you have just tens of thousands of spatial records and a not very
> optimized spatial query.
>
>> Access is making a copy of the reference of a large BLOB.
>>
>> > Isn't 2Gb the limit for binary types?
>>
>> In reality the problem is the memory usage (heap memory).
>
> That should not be a problem. Typically in a GIS application the biggest
> result sets extracted from the database are not held or cached for long;
> they are used to render spatial features and any references are released
> in the JVM immediately. Whether it's BLOB or BINARY, it is loaded into
> memory anyway to parse the Geometry from WKB. Maybe with VARBINARY a bit
> more data is kept in memory for a short period than with BLOBs. Maybe I
> would consider an approach in ValueGeometry where the bytes are kept just
> until the geometry is requested, then the Geometry is lazily parsed and the
> bytes are released. So at any point in time either the bytes are held or
> the Geometry object is. From bytes to Geometry, and from Geometry to bytes
> when necessary.
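
[Editor's sketch] The lazy idea in the paragraph above (hold the WKB bytes only until the geometry is first requested, then hold only the parsed object) could look roughly like the following. This is a minimal illustration under stated assumptions, not H2's actual ValueGeometry: the class name LazyGeometryHolder and its methods are hypothetical, the parser and serializer are passed in as plain functions where a real implementation would presumably use a JTS WKB reader and writer, and a String stands in for a JTS Geometry so the example has no dependencies.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;
import java.util.function.Function;

/**
 * Hypothetical holder (NOT H2's ValueGeometry): at any time it keeps either
 * the serialized bytes or the parsed object, never both.
 */
final class LazyGeometryHolder<G> {
    private final Function<byte[], G> parser;     // assumption: e.g. a WKB reader
    private final Function<G, byte[]> serializer; // assumption: e.g. a WKB writer
    private byte[] bytes; // non-null only while the object has not been parsed
    private G geometry;   // non-null only after the first getGeometry() call

    LazyGeometryHolder(byte[] bytes, Function<byte[], G> parser,
                       Function<G, byte[]> serializer) {
        this.bytes = bytes.clone();
        this.parser = parser;
        this.serializer = serializer;
    }

    /** Parses on first access, then releases the byte array. */
    synchronized G getGeometry() {
        if (geometry == null) {
            geometry = Objects.requireNonNull(parser.apply(bytes));
            bytes = null; // bytes are no longer needed once the object exists
        }
        return geometry;
    }

    /** Returns the bytes, re-serializing from the object if they were released. */
    synchronized byte[] getBytes() {
        return bytes != null ? bytes.clone() : serializer.apply(geometry);
    }

    /** True while the serialized form is still held in memory. */
    synchronized boolean holdsBytes() {
        return bytes != null;
    }

    public static void main(String[] args) {
        // A String stands in for a JTS Geometry in this dependency-free demo.
        byte[] wkbLike = "POINT (1 2)".getBytes(StandardCharsets.UTF_8);
        LazyGeometryHolder<String> holder = new LazyGeometryHolder<>(
                wkbLike,
                b -> new String(b, StandardCharsets.UTF_8),
                s -> s.getBytes(StandardCharsets.UTF_8));
        System.out.println(holder.holdsBytes());  // bytes still held
        System.out.println(holder.getGeometry()); // triggers the lazy parse
        System.out.println(holder.holdsBytes());  // bytes released after parsing
    }
}
```

The trade-off in this sketch is that a later request for the bytes pays for a re-serialization, which only makes sense if, as the message argues, GIS clients almost always want the geometry object rather than the raw bytes.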
>
> In SELECT scenarios the bytes are needed until the Geometry object is
> created, and then it is used outside of the result set or locally during
> command execution. I am not sure how relevant this sounds in the scope of
> the whole database infrastructure.
> Am I right that the local result set data structure is not returned to the
> caller until it is fully composed? Then if the result set is huge, all
> bytes are kept in memory anyway until the result set is delivered and the
> client starts to request Geometry objects, at which point the bytes would
> be cleaned...
>
>
> Vitali.
>
>
>> Regards,
>> Thomas
>>
>>
>> On Sun, May 3, 2015 at 9:51 PM, Vitali <[email protected]> wrote:
>>
>>> Hello.
>>>
>>> I would like to share some observations. Recently H2 got a Geometry
>>> type, the logic around it seems to be growing, and some extra tiers like
>>> H2GIS are under development. Altogether this looks like the future of
>>> spatial support in H2. But already for many years the spatial support was
>>> provided by a combination of geodb + hatbox libraries and integration in
>>> GeoTools world (as an H2 data store interface for storing/managing
>>> spatial features with geometries).
>>> All this was done on BLOB type where a geometry WKB is stored.
>>>
>>> BLOB became completely useless as a type for handling the WKB of
>>> geometries, because of the change that any access of BLOB value makes a
>>> copy of it. The HATBOX and GEODB libs, based on the JTS library, provide
>>> functions to work with WKB. But any call of these functions reads the
>>> BLOB value, which makes a copy in memory. Some non-optimized spatial
>>> conflation operations (having polynomial complexity, e.g. applying
>>> spatial predicates between every combination of input geometries from 2
>>> tables) now have catastrophic performance and memory consumption.
>>> Cases where old H2 just worked 10 seconds performing some kind of
>>> spatial operation between 2 layers (tables) now run for 2 hours, produce
>>> a 3Gb database file (instead of 400Mb normally) and finally end with an
>>> out-of-memory error. And there are long cleanups of temporary LOB
>>> storage on app start, app close, and transaction commit after such
>>> operations.
>>>
>>> I understand the real reasons for this BLOB copying approach. But the
>>> conclusion is that BLOB is not the right type for geometries. In a
>>> typical GIS (like uDig) thousands of records are extracted every second
>>> for multiple layers during rendering, and other types of requests need
>>> geometries too. Now BLOB has become inefficient.
>>>
>>> Altogether very likely I will do refactoring of geodb, hatbox and
>>> GeoTools to work with GEOMETRY type, which is basically a VARBINARY kind
>>> of type, which means the WKB is just read into memory. But that is what a
>>> GIS app usually needs: to get a geometry almost every time data is read.
>>> Also, because the JTS geometry is lazily cached in ValueGeometry, various
>>> logic in H2 (like custom spatial functions called multiple times)
>>> benefits. I think the H2GIS toolkit more or less uses this approach
>>> already.
>>>
>>> The only concern is whether there are any limitations for cases like a
>>> "lake boundary" that consists of hundreds of thousands of vertices.
>>> Isn't 2Gb the limit for binary types? Then it's fine. But how do the
>>> older PageStore and the modern MVStore handle this type? Any performance
>>> issues?
>>>
>>> Vitali.
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "H2 Database" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/h2-database.
>>> For more options, visit https://groups.google.com/d/optout.
