Hi Jukka,
Wow, this is a huge table. How much memory have you allocated to your JVM?
Indeed, PostGISDataSource already uses batches for bulk uploads (which
explains why your data does reach the database), but not for queries.
The upload is immediately followed by a new query (to get the database
feature ids), and that is where you get the OutOfMemoryError (in the
stack trace you can see reloadDataFromDataStore, then executeQuery).
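To make the pattern concrete, here is a very simplified sketch in plain
JDBC of "batched upload, then one big reload query" as I understand it.
This is not the actual OpenJUMP code; the connection URL, table, column
names and the geometriesToUpload() helper are all invented for the sketch.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Collections;

public class UploadThenReloadSketch {

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "password")) {

            // Upload phase: batched INSERTs, memory use stays bounded.
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO my_table (geom) VALUES (ST_GeomFromWKB(?))")) {
                for (byte[] wkb : geometriesToUpload()) {   // hypothetical helper
                    ps.setBytes(1, wkb);
                    ps.addBatch();
                }
                ps.executeBatch();
            }

            // Reload phase: one query whose whole result is converted to
            // features kept in memory -- with 91 million vertices this is
            // where the heap fills up.
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT gid, geom FROM my_table")) {
                while (rs.next()) {
                    // build a Feature from the row and add it to the new FeatureCollection
                }
            }
        }
    }

    // Placeholder so the sketch compiles; the real data comes from the layer being saved.
    private static Iterable<byte[]> geometriesToUpload() {
        return Collections.emptyList();
    }
}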
At the end of the loading phase, I think we can have as many as three
datasets in memory:
- the old FeatureCollection
- the new FeatureCollection, nearly complete (if we removed the old
collection before getting the new one, we would risk losing everything)
- the raw ResultSet from the database, which is not yet entirely
converted to the new FeatureCollection
I think the only place where we can save some memory by using batches
(as you suggested) is in the transformation from the ResultSet to the
new FeatureCollection (but we would still have to handle more than
twice the dataset size).
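A minimal sketch of what that could look like, assuming the PostgreSQL
JDBC driver: with autocommit off and a fetch size set, the driver reads
the result through a cursor in chunks of "fetch size" rows instead of
buffering the whole ResultSet on the client, which would at least remove
the raw ResultSet copy from the list above. Connection URL, table and
column names below are invented.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchSketch {

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "password")) {
            conn.setAutoCommit(false);      // required for cursor-based fetching
            try (Statement st = conn.createStatement()) {
                st.setFetchSize(10000);     // stream 10000 rows at a time
                try (ResultSet rs = st.executeQuery("SELECT gid, geom FROM my_table")) {
                    while (rs.next()) {
                        // convert one row at a time into a Feature and add it
                        // to the new FeatureCollection
                    }
                }
            }
            conn.commit();
        }
    }
}

It does not help with the old/new FeatureCollection duplication, though.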
Michaël
On 03/11/2016 at 17:05, Rahkonen Jukka (MML) wrote:
Hi,
I tried to write a large layer with 91 million vertices into PostGIS
with the Save to PostGIS (new) driver. It took a few hours and ended
with an Out of memory error:
java.lang.OutOfMemoryError: Java heap space
at com.vividsolutions.jts.io.WKBReader.hexToBytes(WKBReader.java:70)
at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesValueConverterFactory$WKBGeometryValueConverter.getValue(SpatialDatabasesValueConverterFactory.java:96)
at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesResultSetConverter.getFeature(SpatialDatabasesResultSetConverter.java:52)
at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesFeatureInputStream.getFeature(SpatialDatabasesFeatureInputStream.java:99)
at com.vividsolutions.jump.datastore.spatialdatabases.SpatialDatabasesFeatureInputStream.readNext(SpatialDatabasesFeatureInputStream.java:95)
at com.vividsolutions.jump.io.BaseFeatureInputStream.hasNext(BaseFeatureInputStream.java:31)
at org.openjump.core.ui.plugin.datastore.postgis2.PostGISDataStoreDataSource.createFeatureCollection(PostGISDataStoreDataSource.java:93)
at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource.access$000(WritableDataStoreDataSource.java:38)
at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource$1.executeQuery(WritableDataStoreDataSource.java:160)
at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource$1.executeQuery(WritableDataStoreDataSource.java:171)
at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource.reloadDataFromDataStore(WritableDataStoreDataSource.java:543)
at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource.access$500(WritableDataStoreDataSource.java:38)
at org.openjump.core.ui.plugin.datastore.WritableDataStoreDataSource$1.executeUpdate(WritableDataStoreDataSource.java:228)
at com.vividsolutions.jump.workbench.datasource.AbstractSaveDatasetAsPlugIn.run(AbstractSaveDatasetAsPlugIn.java:28)
at com.vividsolutions.jump.workbench.ui.task.TaskMonitorManager$TaskWrapper.run(TaskMonitorManager.java:152)
at java.lang.Thread.run(Unknown Source)
However, there is a new table in the database and it may even contain
all the data. I could not check that because OpenJUMP also runs out of
memory when I try to read the table with Run Datastore Query. I can
load the same data from JML, so it feels like OpenJUMP puts all the
data from the SQL query into memory and then converts it to OpenJUMP
features, which perhaps requires twice the memory. I admit that this
dataset is quite big, but I wonder if reading data from PostGIS could
be made more memory-efficient, perhaps by reading the data in chunks
with repeated paged requests (“limit 10000 offset xxx”).
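For illustration, a rough sketch of what such paged reads could look
like in plain JDBC (connection URL, table and column names are invented
here). Note that large OFFSETs get progressively slower, so paging on
the primary key ("WHERE gid > ?") is usually cheaper than repeating
“limit 10000 offset xxx”.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PagedReadSketch {

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                "SELECT gid, geom FROM my_table WHERE gid > ? ORDER BY gid LIMIT 10000")) {
            long lastGid = 0;
            while (true) {
                ps.setLong(1, lastGid);
                int rowsInPage = 0;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastGid = rs.getLong("gid");
                        rowsInPage++;
                        // convert the row into a feature here
                    }
                }
                if (rowsInPage == 0) {
                    break;      // no more pages
                }
            }
        }
    }
}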
-Jukka Rahkonen-
_______________________________________________
Jump-pilot-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jump-pilot-devel