Bryce L Nordgren wrote:
The representation of gridded data in ISO19123 is a minimum of 175 times
larger than the data it represents.  (No joke: that is conservative.)  The
call "DiscreteGridPointCoverage.list().toArray()" on a 1000x1000 grid of
single-band byte data (0.95 MB) seems innocent enough, but will inspire the
creation of nine million objects consuming at least ~173 MB.  The problem,
of course, gets worse with large grids, and exists in addition to the
efficiencies or inefficiencies of any backing implementations. This factor
of >=175 will always exist if the representation specified in 19123 is
adhered to.

--- (Note: I just noticed that you explain exactly what I say below in a later paragraph, so it sounds like we are in agreement.) ---

Not exactly. ISO 19123 (or at least the GeoAPI interfaces derived from ISO 19123) does not impose any constraint on how the data should be represented. It only specifies what the accessor methods should be. The internal data representation is left entirely to implementors.

In the specific case of "DiscreteGridPointCoverage.list().toArray()", it is true that such a call will generate a huge number of objects. We should put a warning in the "list()" javadoc saying that users should never invoke "toArray()" on the returned list (actually, we may encourage implementors to throw UnsupportedOperationException in that case). But "DiscreteGridPointCoverage.list()" alone (without the ".toArray()" part) can be very cheap. It should return a wrapper (not the usual java.util.ArrayList implementation) around the internal data of the underlying DiscreteGridPointCoverage. A GridPointValuePair will be created on the fly when "List.get(i)" is invoked, and discarded as soon as the garbage collector can reclaim it.
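
The wrapper idea described above could look roughly like the following. This is only a sketch under stated assumptions: PointValue is a hypothetical stand-in for GridPointValuePair, LazyGridList is an invented name, and the backing store is assumed to be a flat, row-major byte array rather than a JAI image.

```java
import java.util.AbstractList;

/** Hypothetical stand-in for GridPointValuePair: one grid point and its sample value. */
final class PointValue {
    final int column;
    final int row;
    final int value;

    PointValue(int column, int row, int value) {
        this.column = column;
        this.row = row;
        this.value = value;
    }
}

/** Read-only list view over a row-major byte raster. No PointValue is ever stored. */
final class LazyGridList extends AbstractList<PointValue> {
    private final byte[] samples; // assumed backing data, shared and not copied
    private final int width;      // number of columns in the grid

    LazyGridList(byte[] samples, int width) {
        if (width <= 0 || samples.length % width != 0) {
            throw new IllegalArgumentException("samples.length must be a multiple of width");
        }
        this.samples = samples;
        this.width = width;
    }

    @Override
    public int size() {
        return samples.length;
    }

    /** Creates the pair on the fly; it becomes garbage as soon as the caller drops it. */
    @Override
    public PointValue get(int index) {
        if (index < 0 || index >= samples.length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return new PointValue(index % width, index / width, samples[index] & 0xFF);
    }

    /** Refuses to materialize every element at once. */
    @Override
    public Object[] toArray() {
        throw new UnsupportedOperationException("toArray() would create one object per grid point");
    }

    @Override
    public <T> T[] toArray(T[] a) {
        throw new UnsupportedOperationException("toArray() would create one object per grid point");
    }
}
```

With such a view, "new LazyGridList(samples, 1000).get(i)" allocates a single small object per call, and the memory cost of the list itself is independent of the grid size.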

In the specific case of the Geotools implementation, I do not see the ISO 19123 work as a new implementation which would deprecate the old one. Actually, all of the ISO 19123 work can be seen, at a first stage, as "cosmetic" changes: we will change accessor methods in GridCoverage2D, but I expect that all the internal mechanisms and the underlying data storage will stay exactly the same (i.e. JAI images). The main difference is that GridCoverage2D would become just one specific case implementing one of the ISO 19123 subclasses (quadrilateral grid, footprint == pixel, etc.), and ISO 19123 will give us a lot more freedom to implement more elaborate cases in some future version (e.g. hexagonal grids, more complex footprints, etc.).



Suggested Mitigations:
======================
1]  I propose that any Set<GeometryValuePair> returned by any gridded
coverage throw UnsupportedOperationException for the toArray() call, which
is a mandatory feature of the Collection interface.  This contractual
violation should be specified in GeoAPI space, as all clients should expect
this restriction, and no implementation can get around it.  Further, the
iterator() method shall return an iterator which dynamically creates the
objects of the Set, and NO BACKING STORE is permitted.

Yes, exactly. We could say that in a note in the javadoc, but I don't think we should word it as a mandatory contract. I would rather word it as a recommendation (warning users that most implementations are likely to behave that way), and let implementors do whatever they want. After all, some implementors may have reasons to allow calls to "toArray()" for small lists (debugging purposes, etc.).


3]  We must "pollute" the GeoAPI interfaces with method signatures which
enable/encourage object re-use (e.g., iterator.next(current)) .  The
creation and destruction of 9 million objects when iterating over the grid
is likely to impact efficiency, ergo, a means to avoid it should be
proffered.

I agree that we may need some way to allow object reuse here. I don't think it pollutes the API in this case, since there is a real need for object reuse.


4] We will want to provide an interface for Set implementations with no
backing store.  Implementations of this Set employ specialized iterators
(also specified by GeoAPI interfaces) to create these objects dynamically,
on demand, and NEVER all at once.  Implementations of this Set interface
shall be guaranteed to produce their elements on demand and never store
them in something like an array.

But why a new interface for that? Aren't the current Java Collection interfaces sufficient (except maybe for an Iterator.next(Object) method)? I agree that we may need a new Iterator interface extending java.util.Iterator, but shouldn't it be the only one?
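
To make the "single extra Iterator interface" idea concrete, here is one possible shape for it. Everything named below (ReusableIterator, MutablePointValue, GridIterator, GridSums) is hypothetical and not part of GeoAPI; the sketch assumes Java 8 or later and a flat, row-major byte array as backing data.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical mutable holder that an iterator can refill on each step. */
final class MutablePointValue {
    int column;
    int row;
    int value;
}

/** Hypothetical extension of java.util.Iterator that supports object reuse. */
interface ReusableIterator<E> extends Iterator<E> {
    /** Fills and returns {@code reuse} when it is non-null, otherwise returns a new element. */
    E next(E reuse);
}

/** Iterates over a row-major byte raster without storing any element. */
final class GridIterator implements ReusableIterator<MutablePointValue> {
    private final byte[] samples;
    private final int width;
    private int index;

    GridIterator(byte[] samples, int width) {
        if (width <= 0 || samples.length % width != 0) {
            throw new IllegalArgumentException("samples.length must be a multiple of width");
        }
        this.samples = samples;
        this.width = width;
    }

    @Override
    public boolean hasNext() {
        return index < samples.length;
    }

    /** Standard contract: a fresh object on every call. */
    @Override
    public MutablePointValue next() {
        return next(null);
    }

    @Override
    public MutablePointValue next(MutablePointValue reuse) {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        MutablePointValue p = (reuse != null) ? reuse : new MutablePointValue();
        p.column = index % width;
        p.row = index / width;
        p.value = samples[index] & 0xFF;
        index++;
        return p;
    }
}

/** Usage example: sums all samples while allocating a single holder object. */
final class GridSums {
    private GridSums() {}

    static long sum(byte[] samples, int width) {
        MutablePointValue current = new MutablePointValue();
        long total = 0;
        for (GridIterator it = new GridIterator(samples, width); it.hasNext();) {
            current = it.next(current);
            total += current.value;
        }
        return total;
    }
}
```

Clients that only know java.util.Iterator still work through next(), while clients aware of the extension can iterate over the whole grid with one allocation.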


6]  Additionally, it becomes imperative to provide implementations (say, of
interpolators) which never cause the creation of GridCells or GridPoints.
This may require the creation of implementation-specific accelerator
objects (e.g., recognizing that the DiscreteGridPointCoverage is backed by
RenderedImage, and knowing how to obtain the RenderedImage object,
translate the request into a JAI operation.)

The old Coverage interface already provided an indirect way for that:

  Coverage.getRenderableImage(0,1).createDefaultRendering();

In the specific case of the GridCoverage2D implementation, for example, it directly returns the underlying RenderedImage. However, I agree that the above may not be sufficient. For example, it doesn't say whether the returned RenderedImage is the backing store or a dynamically created image. But does the user need to know that?
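
As an illustration of item 6, an interpolator can read samples straight from a RenderedImage without creating any grid-point object. The sketch below uses only standard Java2D calls (RenderedImage.getData() and Raster.getSample()); the class name RasterInterpolator is hypothetical, coordinates are assumed to be pixel coordinates, and only band 0 is read. Note that getData() may copy the image, so a real implementation would more likely work tile by tile.

```java
import java.awt.image.Raster;
import java.awt.image.RenderedImage;

/** Hypothetical helper: bilinear interpolation directly on image samples. */
final class RasterInterpolator {
    private RasterInterpolator() {}

    /** Interpolates band 0 at pixel coordinates (x, y), which must lie inside the image. */
    static double bilinear(RenderedImage image, double x, double y) {
        Raster raster = image.getData(); // may be a copy of the pixel data
        int maxX = raster.getMinX() + raster.getWidth() - 1;
        int maxY = raster.getMinY() + raster.getHeight() - 1;
        if (x < raster.getMinX() || x > maxX || y < raster.getMinY() || y > maxY) {
            throw new IllegalArgumentException("point outside image bounds");
        }
        int x0 = (int) Math.floor(x);
        int y0 = (int) Math.floor(y);
        int x1 = Math.min(x0 + 1, maxX); // clamp at the right edge
        int y1 = Math.min(y0 + 1, maxY); // clamp at the bottom edge
        double fx = x - x0;
        double fy = y - y0;

        double v00 = raster.getSample(x0, y0, 0);
        double v10 = raster.getSample(x1, y0, 0);
        double v01 = raster.getSample(x0, y1, 0);
        double v11 = raster.getSample(x1, y1, 0);

        double top = v00 + fx * (v10 - v00);
        double bottom = v01 + fx * (v11 - v01);
        return top + fy * (bottom - top);
    }
}
```

Only primitive values are produced per request, so the cost does not depend on how the coverage would represent its points through the ISO 19123 accessors.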


7] The 19123 interfaces should be regarded as low-traffic command and
control links between implementation and client.  Gridded data transport
across these interfaces should be kept to a minimum.  This of course has
implications for renderers as well as data processing code.

Agreed.


I would like more brainpower on this one.  Therefore I'd like to ask those
who have an interest in gridded data to start becoming familiar with the
new 19123 framework.  How do we make it usable?  Do we need to design a
Tiling scheme at the 19123 level?  Is a lazy creation scheme enough?  Will
that "Proposed Addition to 19123" help?  Are there other "gotchas" which
could trigger the ISO19123 representation of the entire grid and tank a
system?

We need some way to provide interoperability with Java2D, but I am not sure that we need tiling at the interface level; tiles are managed by JAI. I will try to select some methods from the legacy OGC 01-004 that are still of some use for the ISO 19123 interfaces (exactly as the GeoAPI interfaces derived from ISO 19111 kept some OGC 01-009 methods), and generate new javadoc for discussion purposes. I will also look at the proposed extensions in the ISO 19123 primer :)

        Martin.

