Hello again,

I've tried creating a transaction, and I have good news and bad news.
On the one hand, performance has really improved (most noticeably for
small shapefiles).
On the other hand, if the output is a bit larger (~2 MB is enough), I
can observe performance degrading as the file grows:
- The first 4000 features take 25 seconds to be written
- The next 4000 features take 75 seconds to be written
- The next 4000 features take 123 seconds
- The remaining features (about 3000) take 119 seconds

Note that all the features are rectangles (so the size/complexity is
the same for all of them).

In summary, it now takes more than 5 minutes to create a 2 MB SHP (14896
features) using a ShapefileDataStore and a single transaction, which
is the same time I got using the MemoryStore (writing it to disk
afterwards).
While this is faster than the initial 63 minutes, I still think it is
far too long to create a 2 MB SHP.
If I wanted to create a shapefile with 10 million records, it would take
more than 2 days (even ignoring the performance degradation).
I think I would need to use the low-level GeoTools classes to solve this...
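
The per-batch rates and the 10 million record estimate can be checked with a small calculation. This is only a sketch: the class name ThroughputEstimate is made up for illustration, and the last batch is assumed to be 2896 features (14896 minus the three batches of 4000), since the timing list only says "about 3000".

```java
/** Rough throughput check for the timings reported in this message. */
final class ThroughputEstimate {
    // Batch sizes; the last value is an assumption (14896 - 3 * 4000).
    static final int[] FEATURES = {4000, 4000, 4000, 2896};
    // Seconds needed to write each batch, as measured.
    static final int[] SECONDS = {25, 75, 123, 119};

    /** Features written per second for one batch. */
    static double rate(int features, int seconds) {
        return (double) features / seconds;
    }

    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    /** Days needed for targetFeatures at the overall average rate (no further degradation). */
    static double projectedDays(long targetFeatures) {
        double overallRate = rate(sum(FEATURES), sum(SECONDS));
        return targetFeatures / overallRate / 86_400.0;
    }

    public static void main(String[] args) {
        for (int i = 0; i < FEATURES.length; i++) {
            System.out.printf("batch %d: %.1f features/s%n", i + 1, rate(FEATURES[i], SECONDS[i]));
        }
        System.out.printf("10 million features: %.2f days%n", projectedDays(10_000_000L));
    }
}
```

With these numbers the rate falls from 160 features/s in the first batch to under 25 features/s in the last one, and the flat-rate projection comes out at roughly 2.7 days.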

For the record, the code now looks like this:
create method (object initialization):
        transaction = new DefaultTransaction("addFeature");
        transactionFeatures = 0;
        store.setTransaction(transaction);
        m_featureBuilder = new SimpleFeatureBuilder(store.getSchema());


addFeature method (which is called repeatedly):
        FeatureCollection<SimpleFeatureType, SimpleFeature> collection =
                FeatureCollections.newCollection();
        m_featureBuilder.add(geom);  // cached feature builder
        m_featureBuilder.addAll(values);
        SimpleFeature feat = m_featureBuilder.buildFeature(null);
        collection.add(feat);
        store.addFeatures(collection);

postProcess method (called just once after finishing adding features):
        transaction.commit();
        transaction.close();
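
The addFeature method above builds a one-element collection and calls store.addFeatures for every feature. A possible variation is to buffer features and pass them on in larger batches. The sketch below shows only the buffering pattern in plain Java: BatchingWriter and its sink are hypothetical names, not GeoTools classes, and whether this removes the slowdown reported here is untested. In the GeoTools case the sink would presumably wrap the batch in a FeatureCollection and call store.addFeatures once per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Buffers items and hands them to a sink in fixed-size batches. */
final class BatchingWriter<T> {
    private final int batchSize;
    // Hypothetical sink: stands in for one bulk write call per batch.
    private final Consumer<List<T>> sink;
    private final List<T> buffer;
    private int flushCount = 0;

    BatchingWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Called once per item; writes out only when the buffer is full. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Writes any buffered items; call once more after the last add. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer)); // copy, so the sink may keep the list
        buffer.clear();
        flushCount++;
    }

    int flushCount() {
        return flushCount;
    }
}

final class BatchingWriterDemo {
    public static void main(String[] args) {
        int[] written = {0};
        BatchingWriter<Integer> writer =
                new BatchingWriter<>(1000, batch -> written[0] += batch.size());
        for (int i = 0; i < 14896; i++) {
            writer.add(i);
        }
        writer.flush(); // the equivalent of postProcess, before committing
        System.out.println(written[0] + " items in " + writer.flushCount() + " batches");
    }
}
```

The batch size of 1000 is arbitrary. The final flush would have to run in postProcess before transaction.commit(), otherwise the last partial batch would be lost.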


Thanks for all your guidance,
Best regards,

César


On 22 December 2009 at 12:11, Andrea Aime <[email protected]> wrote:
> César Martínez Izquierdo wrote:
>
>> That sounds more interesting to me. I'm going to try this, but I'd
>> like to know the consequences of it.
>> Does it mean that features are not written to disk until
>> the transaction is complete?
>> This is important for me, as I'll sometimes create shapefiles > 1 GB,
>> so I shouldn't do this in memory.
>> If that is really the effect, I guess I could create a transaction
>> every 1000 features or so.
>
> They are not kept in memory: we create a separate shapefile for the
> transaction and copy it back when the transaction is committed.
> That is why your code is so slow: you don't set an explicit
> transaction, so an auto-commit one is created every time you call
> addFeatures, meaning the shapefile is fully copied twice for
> each of those calls.
>
> Cheers
> Andrea
>
>
> --
> Andrea Aime
> OpenGeo - http://opengeo.org
> Expert service straight from the developers.
>



-- 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   César Martínez Izquierdo
   GIS developer
   -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
   ETC-LUSI: http://etc-lusi.eionet.europa.eu/
   Universitat Autònoma de Barcelona (SPAIN)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

_______________________________________________
Geotools-gt2-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-gt2-users
