Re: [Geotools-devel] Ease of Use vs Reality tradeoffs and the direction of FeatureCollection

Andrea Aime Tue, 02 Jan 2007 01:43:29 -0800

Jody Garnett ha scritto:

...


> I think I am understanding you, I just have a different focus then you.
> 
>  From my perspective as long as you are programming against readers and 
> writers, and not something like
> FeatureCollection I cannot set you up to be optimized. FeatureStore 
> right now has methods (such as addFeatures) to do "bulk" operations, 
> these are the same methods that GeoServer uses - and represent the best 
> "optimization" we got.

Here we are on the same line, I agree with the principle

> I would like to see this as a single discrete object: with the list I 
> made above (FeatureType, Features, Summary) boiled down to a single class.

Hum, FeatureType does not seem to be boiled down effectively, the 
FeatureType of a FeatureCollection simply does not make sense to me?
I think I would expect getFeatureType returned the feature type of 
features contained in the collection, but that's not what happens 
(because the current implementation says FeatureCollection IsA Feature,
something I don't believe it's right despite what OGC may say).

>> I'm not questioning the details, I'm questioning the design.
> With respect to the specifics of "are we a Collection" we got a couple 
> points against:
> - Collection does not seem to fit on a mathematical level (is this true?)

Not my concern. Collection does not fit because our implementation does
not compare performance and usage wise to a normal collection.
Is there a mathematical definition of Collection btw?

> - Iterator hides the possibility for exceptions (this seems to be your 
> major concern?)

Indeed it is, but that's all of it. See later.

> - unclear how to either fail early, or fail late (probably as a 
> consequence of the above)

What's worse, unclear if you're reading or writing!!! This is tragicomic.

> We also have a benefit:
> - accessible to new users
> 
> I agree with these concerns, and the benefit, but they are details that 
> do not effect my design one way or the other.

I feel the design is wrong because it's supposed ease of use it's just
apparent, and it's real usage patterns are much more complex than what
it's apparent.
This reminds me of some GNOME discussion, where UI designers insisted 
that GNOME should not try to imitate Windows, because if you do, users
start pretending its like Windows in all respect.
FeatureCollection tries to give you the illusion of a collection when 
it's not:
* you cannot expect the same performance characteristics, so the way
   to write your algorithms must be profoundly different to scale.
   FeatureCollection hides this, if you program like with a normal
   collection you'll get bad scalability and bad performance even on
   small data sets at the same time;
* you cannot program the same way. An idiomatic for loop is not
   something you can do, because you have to close the iterator. So
   you're left with a while loop, just like with FeatureIterator
* you cannot pass the collection around to external algorithms without
   extra care (the purge thing, which btw breaks other iterators you
   may have open against the collection as well)

These are design problems, that will lead to bad applications.
In my opinion, something that vaguely looks like a collection may be
interesting, but it should not be a Collection: this avoids you having
to implement all the extra methods, and does not raise expectations we
cannot match.

I thought about it a little more on the mantainance perspective too.
When you take code that used FeatureResults and replace 
FeatureCollection all seems fine, because
you already got in place all the exception handling and closing, so
you just do a replace.
The issue is when you write new code: this is where you'll see
people starting to misuse the API without noticing.

Ah, I've taken a look at the implementation (FeatureWriterIterator, 
FeatureReaderIterator), and I've seen something
good and something bad at the same time.
Good: iterators do close things properly when they get to the end of the
iteration, and before they do throw exceptions.
Bad: all exceptions become NoSuchElementException, which is even worse 
than not declaring the exceptions, because you cannot distinguish
anymore between bad data (IllegalArgumentException)  and real world IO
issues (IOException of some kind).
Also, no exception chaining is used, a detail, but one that makes
debugging a lot harder.

> Andrea do you have any suggestions on how reality and ease of use can be 
> balanced; FeatureReader really does not appear in the GeoAPI 
> FeatureCollection so it is not going to be around for you to fall back 
> on.  It does support invalid data so some of the pain is taken away.

My suggestion is simple: do not hide IOException, never. This is fake 
ease of use. Do not hide the two/more orders of magnitude performance 
difference behind an API that has been designed to deal with in-memory
stuff.
GeoAPI FeatureCollection, as it stands now, it's a little better than
ours because:
* iterator() explicitly returns FeatureIterator(), which has a close
   method for early iteration bail out.
* all methods of FeatureIterator() do throw a BackingStoreException,
   which while still being a runtime exception (still bad idea imho)
   does at least state clearly that you're dealing with I/O and other
   causes in its javadoc, clearly states that there is a cause (implies
   chaining) and more important can be the root of an exception hierarchy
   that provides you good information about what happened. Reference, as
   examples, Spring and Hibernate root exceptions: 
http://static.springframework.org/spring/docs/2.0.x/api/org/springframework/dao/DataAccessException.html
http://www.hibernate.org/hib_docs/v3/api/org/hibernate/HibernateException.html
(and yes, I know these are RuntimeExceptions, I don't like that, but at 
least all code states explicitly it's throwing them, and they do allow 
for real world exception handling).
* I see it implements Iterable... from bad to worse? If you do a 
for(Feature: collection) loop, how in the world do you close the 
iteration should you break out of the loop before it ends? Bzzzz!
Blooper! Another sign Collection is a bad match for data access imho.

FeatureCollection designed like this may be something you can use to
hold one-to-many associations just like Hibernate does (because if 
you're bound to get an exception, you will get it only during the
first collection access), but not something you can use to handle
results of a query. For  example, does Hibernate give you fake 
collections as a result when you do queries?
Nope, it does not. When it returns List, it's a real, in memory
List. Otherwise, it returns you something else, like ScrollableResults.
If you look at the reference manual, you'll see that they do not even 
feel like documenting ScrollableResults and Iterators, because you're
supposed to use paged access instead (which may throw an exception
when you ask for the Page, but not when you really access it).

In all these respects, the old FeatureResults interface was light years
ahead, and could have been extended to provide all the extra 
optimizations you need, and writing capabities as well.
Converting FeatureResults to FeatureCollection in Geoserver would not be 
a QA activity, quite the opposite in my opinion.

Cheers
Andrea



-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Geotools-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel

Re: [Geotools-devel] Ease of Use vs Reality tradeoffs and the direction of FeatureCollection

Reply via email to