I don't think 19123 had a reference implementation. :)

Synopsis:
==========
The representation of gridded data in ISO 19123 is at least 175 times larger than the data it represents. (No joke: that is conservative.) The call "DiscreteGridPointCoverage.list().toArray()" on a 1000x1000 grid of single-band byte data (0.95 MB) seems innocent enough, but it triggers the creation of nine million objects consuming at least ~173 MB. The problem gets worse with larger grids, and it comes on top of any efficiencies or inefficiencies of the backing implementation. This factor of >= 175 will always exist if the representation specified in 19123 is adhered to.
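The arithmetic behind the synopsis figures can be checked with a few lines of Java. This is only a sketch: it assumes "MB" means 2^20 bytes (which is what makes 1,000,000 bytes come out as 0.95), and it takes the nine million objects and the ~173 MB lower bound directly from the text above. The class and method names are made up for illustration and are not part of GeoAPI.

```java
class CoverageOverhead {
    // Figures quoted in the synopsis; nothing here is measured.
    static final long CELLS = 1000L * 1000L;     // 1000x1000 grid
    static final long RAW_BYTES = CELLS;         // single-band byte data: one byte per cell
    static final long OBJECTS = 9_000_000L;      // objects created by list().toArray()
    static final double OBJECT_MB = 173.0;       // quoted lower bound on heap use

    static final double BYTES_PER_MB = 1024.0 * 1024.0; // assumption: MB = 2^20 bytes

    static double objectsPerCell()  { return (double) OBJECTS / CELLS; }
    static double rawMegabytes()    { return RAW_BYTES / BYTES_PER_MB; }
    static double expansionFactor() { return OBJECT_MB / rawMegabytes(); }
    static double bytesPerObject()  { return OBJECT_MB * BYTES_PER_MB / OBJECTS; }

    public static void main(String[] args) {
        System.out.printf("raw data:         %.2f MB%n", rawMegabytes());
        System.out.printf("objects per cell: %.1f%n", objectsPerCell());
        System.out.printf("expansion factor: %.1f%n", expansionFactor());
        System.out.printf("bytes per object: %.1f%n", bytesPerObject());
    }
}
```

With those inputs the ratio lands a little above 180, consistent with the ">= 175" claim, and the average works out to roughly 20 bytes per object at nine objects per grid cell.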

Suggested Mitigations:
======================
1] I propose that any Set<GeometryValuePair> returned by any gridded coverage throw UnsupportedOperationException for the toArray() call, which is a mandatory method of the Collection interface. This contractual violation should be specified in GeoAPI space, as all clients should expect this restriction, and no implementation can get around it. Further, the iterator() method shall return an iterator which dynamically creates the objects of the Set, and NO BACKING STORE is permitted.

2] We must carefully control how much of the grid is represented by this expensive framework at any one time. (Design tiling at the 19123 level? Object pooling? Model-View pattern?)

3] We must "pollute" the GeoAPI interfaces with method signatures which enable and encourage object re-use (e.g., iterator.next(current)). The creation and destruction of 9 million objects when iterating over the grid is likely to hurt efficiency, so a means to avoid it should be offered.

4] We will want to provide an interface for Set implementations with no backing store. Implementations of this Set employ specialized iterators (also specified by GeoAPI interfaces) to create these objects dynamically, on demand, and NEVER all at once. Implementations of this Set interface shall be guaranteed to produce their elements on demand and never store them in something like an array.

5] It becomes imperative to offer the client an implementation with a lazy object creation scheme.

6] Additionally, it becomes imperative to provide implementations (say, of interpolators) which never cause the creation of GridCells or GridPoints. This may require implementation-specific accelerator objects (e.g., one that recognizes that the DiscreteGridPointCoverage is backed by a RenderedImage, knows how to obtain the RenderedImage object, and translates the request into a JAI operation).

7] The 19123 interfaces should be regarded as low-traffic command and control links between implementation and client. Gridded data transport across these interfaces should be kept to a minimum. This of course has implications for renderers as well as data processing code.

8] This observation makes it critical to scrutinize the suggestion in the section "Proposed Addition to ISO 19123" (p. 30) of the ISO 19123 Primer. This may represent a level of abstraction at which we can realize a significant amount of code re-use without bloating the JVM with millions of objects. If it's not good enough, it can be changed.

Request:
==========
This efficiency issue is the most critical issue facing the implementation and use of 19123. Everything else is just details. I would like more brainpower on this one.

Therefore I'd like to ask those who have an interest in gridded data to start becoming familiar with the new 19123 framework. How do we make it usable? Do we need to design a tiling scheme at the 19123 level? Is a lazy creation scheme enough? Will that "Proposed Addition to 19123" help? Are there other "gotchas" which could trigger the ISO 19123 representation of the entire grid and tank a system?

You have time. I'm going to be fleshing out the interfaces themselves first of all. I will then turn my attention to mitigating the bloat in the implementation. I want to start guiding your attention to this part of the problem so that it receives the attention it deserves.

References:
============
The GeoAPI pending javadocs contain Martin's hard work coding the 19123 interfaces:
http://geoapi.sourceforge.net/pending/javadoc/

ISO 19123 Primer:
http://docs.codehaus.org/download/attachments/31212/ISO19123+Primer.pdf

Bryce
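
To make mitigations 1, 3, and 4 above concrete, here is a minimal Java sketch of a Set with no backing store of element objects. Everything in it is hypothetical: GridValue stands in for GeometryValuePair, and ReusingIterator and LazyGridSet are invented names, not GeoAPI interfaces. It assumes the raw samples live in a row-major byte array, and it leaves out equals()/hashCode() on the elements, so contains() would not behave usefully as written.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

class LazyGridSetDemo {
    /** Sums all samples while allocating a single element object. */
    static long sum(LazyGridSet set) {
        GridValue scratch = new GridValue();
        long total = 0;
        for (ReusingIterator it = set.iterator(); it.hasNext(); ) {
            total += it.next(scratch).value;   // refills scratch, no new object
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] raster = new byte[1000 * 1000];
        Arrays.fill(raster, (byte) 1);
        System.out.println(sum(new LazyGridSet(raster, 1000, 1000)));
    }
}

/** Hypothetical stand-in for a GeometryValuePair; mutable so it can be reused. */
final class GridValue {
    int col, row;
    byte value;

    GridValue set(int col, int row, byte value) {
        this.col = col;
        this.row = row;
        this.value = value;
        return this;
    }
}

/** Hypothetical iterator that can refill a caller-supplied object (mitigation 3). */
interface ReusingIterator extends Iterator<GridValue> {
    GridValue next(GridValue reuse);
}

/** Hypothetical Set view over raw samples: elements are built on demand, never stored. */
final class LazyGridSet extends AbstractSet<GridValue> {
    private final byte[] raster;   // raw samples, row-major; not a store of GridValue objects
    private final int width;

    LazyGridSet(byte[] raster, int width, int height) {
        if (raster.length != (long) width * height) {
            throw new IllegalArgumentException("raster size does not match width*height");
        }
        this.raster = raster;
        this.width = width;
    }

    @Override public int size() { return raster.length; }

    @Override public ReusingIterator iterator() {
        return new ReusingIterator() {
            private int index = 0;

            @Override public boolean hasNext() { return index < raster.length; }

            // Plain Iterator contract: one fresh object per call.
            @Override public GridValue next() { return next(new GridValue()); }

            @Override public GridValue next(GridValue reuse) {
                if (!hasNext()) throw new NoSuchElementException();
                int i = index++;
                return reuse.set(i % width, i / width, raster[i]);
            }
        };
    }

    // Mitigation 1: refuse to materialize every element at once.
    @Override public Object[] toArray() {
        throw new UnsupportedOperationException("grid sets cannot be materialized");
    }

    @Override public <T> T[] toArray(T[] a) {
        throw new UnsupportedOperationException("grid sets cannot be materialized");
    }
}
```

The plain next() still allocates one object per element, so clients that care about the nine-million-object churn would use next(reuse); whether that signature belongs in GeoAPI is exactly the "pollution" question raised in item 3.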
