Thanks Jody,

So, I came up with this code, which gets an append writer and doesn't
use a transaction. Can you confirm that this is what you meant?

    ...
    try {
      writer = shpDataStore.getFeatureWriterAppend(
          shpDataStore.getTypeNames()[0], null);

      while (jsonIt.hasNext()) {

        SimpleFeature feature = jsonIt.next();
        addFeature(feature, writer, featureStore);
      }
    } finally {
      if (writer != null) {
        writer.close();
      }
    }
    ...

  /**
   * Copied over from {@link ContentFeatureStore} as a way of writing features
   * directly into a {@link FeatureWriter}
   */
  private static FeatureId addFeature(SimpleFeature feature,
      FeatureWriter<SimpleFeatureType, SimpleFeature> writer,
      SimpleFeatureStore featureStore) throws IOException {

    SimpleFeature toWrite = writer.next();
    for (int i = 0; i < toWrite.getType().getAttributeCount(); i++) {
      String name = toWrite.getType().getDescriptor(i).getLocalName();
      toWrite.setAttribute(name, feature.getAttribute(name));
    }

    // copy over the user data
    if (feature.getUserData().size() > 0) {
      toWrite.getUserData().putAll(feature.getUserData());
    }

    // pass through the fid if the user asked so
    boolean useExisting = Boolean.TRUE.equals(feature.getUserData().get(
        Hints.USE_PROVIDED_FID));
    if (featureStore.getQueryCapabilities().isUseProvidedFIDSupported()
        && useExisting) {
      ((FeatureIdImpl) toWrite.getIdentifier()).setID(feature.getID());
    }

    // perform the write
    writer.write();

    // copy any metadata from the feature that was actually written
    feature.getUserData().putAll(toWrite.getUserData());

    // add the id to the set of inserted
    FeatureId id = toWrite.getIdentifier();
    return id;
  }
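
A rough way to see why the single append writer should win: the toy model
below counts how many records get written in total. It assumes, following
Jody's description quoted below, that every commit writes out a complete new
file. The class and method names are made up for illustration, and none of
this is GeoTools code.

```java
final class ShapefileWriteCost {

    /**
     * Toy model (an assumption, not GeoTools code): every commit rewrites
     * the whole file, i.e. all records stored so far plus the new batch.
     */
    static long recordsWrittenWithCommitPerBatch(int total, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long written = 0;
        int stored = 0;
        while (stored < total) {
            stored += Math.min(batchSize, total - stored);
            written += stored; // the commit writes out a complete new file
        }
        return written;
    }

    /** Streaming through one append writer touches each record once. */
    static long recordsWrittenWithSingleAppendWriter(int total) {
        return total;
    }

    public static void main(String[] args) {
        int total = 10_000;
        System.out.println("commit per 1000: "
            + recordsWrittenWithCommitPerBatch(total, 1000));
        System.out.println("commit per 1:    "
            + recordsWrittenWithCommitPerBatch(total, 1));
        System.out.println("single writer:   "
            + recordsWrittenWithSingleAppendWriter(total));
    }
}
```

Under that assumption, 10,000 features committed 1000 at a time cost 55,000
record writes, against 10,000 for a single writer that is kept open.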

On Fri, Sep 6, 2013 at 12:12 PM, Jody Garnett <[email protected]> wrote:
> It is more that shapefile does not offer a database session, so we are
> faking it to make the editing story easier for desktop clients.
>
> Using AUTO_COMMIT is a terrible idea as it will involve writing out your
> file many times (i.e. each time you add a feature).
>
> I tried to indicate a better way in my email, and in the docs, but it is not
> coming through.
>
>
> On Fri, Sep 6, 2013 at 11:29 AM, William Voorsluys <[email protected]>
> wrote:
>>
>> Hi Jody,
>>
>> Did you mean to send this reply to the list?
>>
>> It seems clear that transactions are not meant to be used efficiently
>> on shapefiles. I'm settling on using AUTO_COMMIT and writing one feature
>> at a time using a writer. Do you mean there is no more efficient way of
>> writing features in bulk to the file, say 1000 at a time?
>> It seems that getting an append writer is the greatest bottleneck in
>> the operation.
>>
>> Will
>>
>> On Fri, Sep 6, 2013 at 1:28 AM, Jody Garnett <[email protected]>
>> wrote:
>> > I really need a better way to communicate this one, or a special case
>> > when the shapefile is empty or something.
>> >
>> > The goal here is to use an append feature writer, directly, and write
>> > the content out as you go in a streaming fashion.
>> >
>> > This is what ShapefileDataStore does internally when you call
>> > transaction.commit(). It goes through the changes that it has collected
>> > in memory and writes out a new file. It then renames the old file out of
>> > the way, renames the new file into the correct place, and deletes the
>> > old file.
>> >
>> >
>> >
>> > On Thu, Sep 5, 2013 at 8:15 PM, William Voorsluys
>> > <[email protected]>
>> > wrote:
>> >>
>> >> Dear All,
>> >>
>> >> I've been trying a few solutions to efficiently convert GeoJSON into a
>> >> shapefile without having to store all features in memory. I'm using
>> >> GeoTools 9.2.
>> >>
>> >> The problem is not so much in how to stream the JSON but how to
>> >> efficiently write the features into the shapefile. I use
>> >> FeatureJSON#streamFeatureCollection to obtain an iterator. After some
>> >> googling, I found 3 different ways of writing a shapefile, namely:
>> >>
>> >> 1. Repeatedly calling FeatureStore#addFeatures with a collection
>> >> containing say 1000 features, within a transaction.
>> >>       -----
>> >>       ListFeatureCollection coll =
>> >>           new ListFeatureCollection(type, features);
>> >>       Transaction transaction = new DefaultTransaction("create");
>> >>       featureStore.setTransaction(transaction);
>> >>       try {
>> >>         featureStore.addFeatures(coll);
>> >>         transaction.commit();
>> >>       } catch (IOException e) {
>> >>         transaction.rollback();
>> >>         throw new IllegalStateException(
>> >>             "Could not write some features to shapefile. "
>> >>                 + "Aborting process", e);
>> >>       } finally {
>> >>         transaction.close();
>> >>       }
>> >>       -----
>> >>
>> >>
>> >> This option is extremely slow. By profiling a few runs, I noticed that
>> >> about 50% of CPU time is spent on the method
>> >> ContentFeatureStore#getWriterAppend, presumably in order to reach the
>> >> end of the file before each transaction commit.
>> >>
>> >> 2. Obtaining an append writer directly from ShapefileDataStore, and
>> >> writing 1000 features at a time within a transaction.
>> >>
>> >> This option suffers from the same problems as number one.
>> >>
>> >> 3. Obtaining a feature writer from ShapefileDataStore, and writing one
>> >> feature at a time using Transaction.AUTO_COMMIT.
>> >>
>> >>      -----
>> >>      FeatureWriter<SimpleFeatureType, SimpleFeature> writer =
>> >>          shpDataStore.getFeatureWriter(
>> >>              shpDataStore.getTypeNames()[0], Transaction.AUTO_COMMIT);
>> >>
>> >>      while (jsonIt.hasNext()) {
>> >>        SimpleFeature feature = jsonIt.next();
>> >>        SimpleFeature toWrite = writer.next();
>> >>        for (int i = 0; i < toWrite.getType().getAttributeCount(); i++) {
>> >>          String name = toWrite.getType().getDescriptor(i).getLocalName();
>> >>          toWrite.setAttribute(name, feature.getAttribute(name));
>> >>        }
>> >>        writer.write();
>> >>      }
>> >>      writer.close();
>> >>      -----
>> >>
>> >>
>> >> Option 3 is the fastest, but I feel there should be a way of efficiently
>> >> adding a greater number of features at a time to the shapefile within
>> >> a transaction. On the other hand, a previous comment on this list
>> >> noted:
>> >>
>> >> > The above would work for mid-sized data transfers; for massive ones
>> >> > against databases it's better to adopt some sort of batching to avoid
>> >> > having a single transaction with one million inserts, e.g., insert
>> >> > 1000, commit the transaction, insert another 1000, and so on.
>> >> > This would work better against databases and against WFS servers,
>> >> > but not against shapefiles, which instead work better with the
>> >> > massive insert...
>> >> > to each his own.
>> >>
>> >> Does this mean that the most efficient way of writing to a shapefile
>> >> is having all features in memory, rather than being able to append
>> >> features?
>> >> I would appreciate it if someone could suggest a better way of
>> >> achieving this or point me to any documentation that would help.
>> >>
>> >> Best regards,
>> >>
>> >> Will
>> >>
>> >>
>> >>
>> >> ------------------------------------------------------------------------------
>> >> Learn the latest--Visual Studio 2012, SharePoint 2013, SQL 2012, more!
>> >> Discover the easy way to master current and previous Microsoft
>> >> technologies
>> >> and advance your career. Get an incredible 1,500+ hours of step-by-step
>> >> tutorial videos with LearnDevNow. Subscribe today and save!
>> >>
>> >>
>> >> http://pubads.g.doubleclick.net/gampad/clk?id=58041391&iu=/4140/ostg.clktrk
>> >> _______________________________________________
>> >> GeoTools-GT2-Users mailing list
>> >> [email protected]
>> >> https://lists.sourceforge.net/lists/listinfo/geotools-gt2-users
>> >
>> >
>
>
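
For reference, the rewrite-and-swap sequence Jody describes for
transaction.commit() (write a new file, rename the old one out of the way,
rename the new one into place, delete the old one) can be sketched with plain
java.nio.file calls. This is only an illustration of the pattern, not the
GeoTools implementation. The class name, the ".bak" suffix and the use of text
lines as "records" are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

final class SwapInRewrite {

    /** Replaces the content of target by writing a new file and swapping it in. */
    static void rewrite(Path target, List<String> lines) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // hypothetical names for the new file and the moved-aside old file
        Path fresh = Files.createTempFile(dir, "rewrite", ".tmp");
        Path backup = dir.resolve(target.getFileName() + ".bak");

        // 1. write out the complete new file
        Files.write(fresh, lines, StandardCharsets.UTF_8);

        // 2. rename the old file out of the way
        if (Files.exists(target)) {
            Files.move(target, backup, StandardCopyOption.REPLACE_EXISTING);
        }

        // 3. rename the new file into the correct place
        Files.move(fresh, target);

        // 4. delete the old file
        Files.deleteIfExists(backup);
    }
}
```

Every call writes all the lines again, which is why repeating this once per
batch gets more expensive as the file grows.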

_______________________________________________
GeoTools-Devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel
