Hi Jukka,

Wow, this is a huge table. How much memory does your JVM have?
Indeed, PostGISDataSource already uses batches for bulk uploads (which explains why your data does get into the database) but not for queries. Uploads are immediately followed by a new query (to get the database feature ids). That's where you get the OOM exception (in the stack trace you can see reloadDataFromDataStore, then executeQuery).

At the end of the loading phase, I think we can have as many as 3 datasets in memory:
- the old FeatureCollection
- the new FeatureCollection, nearly complete (if we remove the old collection before getting the new one, we risk losing everything)
- the raw ResultSet from the database, which is not yet entirely converted to the new FeatureCollection

I think the only place where we can save a bit of memory by using batches (as you suggested) is in the transformation from the ResultSet to the new FeatureCollection (but we will still have to hold more than twice the dataset size).
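
To illustrate the idea of not holding the whole raw ResultSet at once, here is a minimal sketch that assumes the PostgreSQL JDBC driver; it is not what OpenJUMP currently does. By default that driver collects all rows of a query before returning, but it can fetch through a cursor when autocommit is off, the ResultSet is forward-only, and a positive fetch size is set. The class and method names below (CursorFetchSketch, streamingStatement) are hypothetical.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorFetchSketch {
    /**
     * Prepares a statement that the PostgreSQL JDBC driver can serve
     * through a cursor, fetching about fetchSize rows per round trip.
     * Hypothetical helper, not part of OpenJUMP.
     */
    static Statement streamingStatement(Connection conn, int fetchSize) {
        try {
            // Cursor-based fetching requires autocommit to be off ...
            conn.setAutoCommit(false);
            // ... a forward-only, read-only ResultSet ...
            Statement st = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            // ... and a positive fetch size.
            st.setFetchSize(fetchSize);
            return st;
        } catch (SQLException e) {
            throw new IllegalStateException("could not set up cursor-based fetch", e);
        }
    }
}
```

With such a statement, only the current batch of rows would sit in the driver while features are converted, so the third dataset in the list above would shrink to roughly one batch. The old and new FeatureCollections would still both be in memory.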

Michaël


On 03/11/2016 at 17:05, Rahkonen Jukka (MML) wrote:

Hi,

I tried to write a large layer with 91 million vertices into PostGIS with the Save to PostGIS (new) driver. It took a few hours and ended with an out-of-memory error:

java.lang.OutOfMemoryError: Java heap space
    at com.vividsolutions.jts.io.WKBReader.hexToBytes(WKBReader.java:70)
    at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesValueConverterFactory$WKBGeometryValueConverter.getValue(SpatialDatabasesValueConverterFactory.java:96)
    at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesResultSetConverter.getFeature(SpatialDatabasesResultSetConverter.java:52)
    at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesFeatureInputStream.getFeature(SpatialDatabasesFeatureInputStream.java:99)
    at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesFeatureInputStream.readNext(SpatialDatabasesFeatureInputStream.java:95)
    at com.vividsolutions.jump.io.BaseFeatureInputStream.hasNext(BaseFeatureInputStream.java:31)
    at org.openjump.core.ui.plugin.datastore.postgis2.PostGISDataStoreDataSource.createFeatureCollection(PostGISDataStoreDataSource.java:93)
    at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource.access$000(WritableDataStoreDataSource.java:38)
    at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource$1.executeQuery(WritableDataStoreDataSource.java:160)
    at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource$1.executeQuery(WritableDataStoreDataSource.java:171)
    at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource.reloadDataFromDataStore(WritableDataStoreDataSource.java:543)
    at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource.access$500(WritableDataStoreDataSource.java:38)
    at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource$1.executeUpdate(WritableDataStoreDataSource.java:228)
    at com.vividsolutions.jump.workbench.datasource.AbstractSaveDatasetAsPlugIn.run(AbstractSaveDatasetAsPlugIn.java:28)
    at com.vividsolutions.jump.workbench.ui.task.TaskMonitorManager$TaskWrapper.run(TaskMonitorManager.java:152)
    at java.lang.Thread.run(Unknown Source)

However, there is a new table in the database and it may even contain all the data. I could not check that because OpenJUMP also runs out of memory when I try to read the table with Run Datastore Query. I can load the same data from JML, so it feels like OpenJUMP puts all the data from the SQL query into memory and then converts it to OpenJUMP features, which requires perhaps twice as much memory. I admit that this dataset is quite big, but I wonder if reading data from PostGIS could be made more memory-friendly, perhaps by reading the data in chunks with repeated paged requests “limit 10000 offset xxx”.
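
The paged reading suggested above could look roughly like the following sketch. It is only an illustration: PagedReader, pageQuery and readInPages are hypothetical names, and the page fetcher is passed in as a function so that the loop does not depend on any OpenJUMP or JDBC class. The ORDER BY on a unique key column is an assumption added here, because without a stable order the pages of a LIMIT/OFFSET query are not guaranteed to be disjoint.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class PagedReader {
    /** Builds the SQL for one page; keyColumn is assumed to be unique. */
    static String pageQuery(String table, String keyColumn, int limit, long offset) {
        return "SELECT * FROM " + table + " ORDER BY " + keyColumn
                + " LIMIT " + limit + " OFFSET " + offset;
    }

    /**
     * Pulls rows page by page and hands each row to the sink, so only one
     * page is held at a time. fetchPage receives (limit, offset).
     * Returns the total number of rows read.
     */
    static <T> long readInPages(BiFunction<Integer, Long, List<T>> fetchPage,
                                int pageSize, Consumer<T> sink) {
        long offset = 0;
        while (true) {
            List<T> page = fetchPage.apply(pageSize, offset);
            for (T row : page) {
                sink.accept(row);
            }
            offset += page.size();
            // A short (or empty) page means the table is exhausted.
            if (page.size() < pageSize) {
                return offset;
            }
        }
    }
}
```

One caveat with this design is that the server still has to skip over all rows before the offset, so later pages of a very large table get progressively slower.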

-Jukka Rahkonen-



------------------------------------------------------------------------------
Developer Access Program for Intel Xeon Phi Processors
Access to Intel Xeon Phi processor-based developer platforms.
With one year of Intel Parallel Studio XE.
Training and support from Colfax.
Order your platform today. http://sdm.link/xeonphi


_______________________________________________
Jump-pilot-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel
