Re: [Geoserver-users] Feature chaining vs denormalized tables

Rini Angreani Fri, 01 Oct 2010 01:15:26 -0700

Hi Andrea,

I also thought that the main reason to use feature chaining, like you said,
is to avoid managing large denormalised views. The table/view gets really
big when you have deeply nested features (sub-features that also have
sub-features), especially when multiple multi-valued properties are involved
(in sub-features). However, according to Ben, it wasn't possible to map
multiple multi-valued properties using grouping alone (without feature
chaining). Sorry, I don't fully remember the limitations of grouping in the
old app-schema. Anyway, the whole grouping mechanism has now been removed
(it remains only in 1.6), so if multiple multi-valued properties are
involved, feature chaining must be used for 2.0 and beyond.
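
To illustrate why the denormalised view grows so quickly, here is a minimal
sketch. It assumes the view is a plain join of independent multi-valued
properties, so one feature needs one row per combination of values. The
function name and the sample values are hypothetical and not part of
app-schema.

```python
from itertools import product

def denormalised_rows(feature_id, *multi_valued):
    """Return the rows a fully denormalised view would hold for one feature.

    Assumption: each multi-valued property is joined independently, so the
    view contains the Cartesian product of their values.
    """
    return [(feature_id, *combo) for combo in product(*multi_valued)]

# Hypothetical feature with two multi-valued properties.
names = ["n1", "n2", "n3"]
compositions = ["c1", "c2", "c3", "c4"]

rows = denormalised_rows("f.1", names, compositions)
print(len(rows))  # 3 * 4 = 12 rows for a single feature
```

Each extra multi-valued property multiplies the row count again, which is the
growth that chaining to separate tables avoids.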


The other thing feature chaining allows is polymorphism, that is, when a
complex attribute has an abstract type and the sub-type is determined at
run time. Usually, this is determined using a CQL function in feature chaining
(i.e. if the value of a column is x -> feature chain to table A, otherwise
feature chain to table B).
Basically, chaining is needed here because multiple tables are involved. The
tables could even come from different data sources, like Gabriel said.
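
The column-value rule above can be sketched roughly as follows. This is plain
Python rather than an app-schema mapping or a CQL function; the tables, the
"kind" column and the function names are all hypothetical and only show the
dispatch idea.

```python
# Hypothetical stand-ins for two sub-feature tables, keyed by foreign key.
TABLE_A = {1: {"type": "TypeA", "label": "from table A"}}
TABLE_B = {1: {"type": "TypeB", "label": "from table B"}}

def chain_target(row):
    """Pick the table to chain to from a column value (the 'x' rule above)."""
    return TABLE_A if row["kind"] == "x" else TABLE_B

def resolve(row):
    """Build the parent feature with its run-time-selected sub-feature."""
    sub = chain_target(row)[row["sub_id"]]
    return {"id": row["id"], "sub": sub}

rows = [
    {"id": "f.1", "kind": "x", "sub_id": 1},
    {"id": "f.2", "kind": "y", "sub_id": 1},
]
features = [resolve(r) for r in rows]
for f in features:
    print(f["id"], f["sub"]["type"])
```

The concrete sub-type is only known once the row has been read, which is why
a single denormalised view cannot express it directly.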

I played around with the alternative solution you proposed a while ago, i.e.
querying by fids, but when the dataset got really large, we ran out of
memory pretty quickly (having to store the fids somewhere).
The only current plan to improve performance is to possibly halve the time
it takes, since the queries are run twice (the first time to get the size, i.e.
the numberOfFeatures printed in the header, and the second to actually encode
the features). This makes sense for simple features, because they're streamed,
so getting numberOfFeatures before encoding is straightforward. Anyway,
even once this is fixed, it still won't be fast enough.
There is a plan for improving the performance properly (at the database join
level or using Hibernate), but not in the near future.
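
For reference, the difference between per-feature chaining (N+1 queries) and
the fid-based alternative from the quoted message can be sketched like this.
It is only an illustration using Python's sqlite3 module with an in-memory
database; the parent/child tables and function names are hypothetical and
this is not how app-schema is implemented.

```python
import sqlite3

def make_db():
    """Create a tiny hypothetical parent/child schema in memory."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, value TEXT);
        INSERT INTO parent VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO child VALUES (10, 1, 'x'), (11, 1, 'y'), (12, 3, 'z');
    """)
    return db

def chained(db):
    """N+1: one query for the parents, then one query per parent."""
    queries = 1
    features = {}
    parents = db.execute("SELECT id, name FROM parent ORDER BY id").fetchall()
    for pid, name in parents:
        kids = db.execute(
            "SELECT value FROM child WHERE parent_id = ? ORDER BY id", (pid,)
        ).fetchall()
        queries += 1
        features[pid] = (name, [v for (v,) in kids])
    return features, queries

def batched(db):
    """1 (ids) + 1 (topmost data) + 1 (sub-features), whatever N is."""
    # The whole id list is held in memory here; this is the part that
    # becomes a problem on very large datasets.
    ids = [pid for (pid,) in db.execute("SELECT id FROM parent ORDER BY id")]
    marks = ",".join("?" * len(ids))
    features = {
        pid: (name, [])
        for pid, name in db.execute(
            f"SELECT id, name FROM parent WHERE id IN ({marks})", ids
        )
    }
    for pid, value in db.execute(
        f"SELECT parent_id, value FROM child WHERE parent_id IN ({marks}) "
        "ORDER BY parent_id, id",
        ids,
    ):
        features[pid][1].append(value)
    return features, 3

db = make_db()
print(chained(db)[1], batched(db)[1])  # query counts for 3 parents
```

Both functions build the same features; only the number of round trips and
the memory needed for the id list differ.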

Cheers
Rini

 



Andrea Aime-5 wrote:
> 
> Hi,
> I'm looking into the application schema support and I'm wondering about
> the differences between using feature chaining vs using denormalized
> tables/views to create complex features.
> 
> As far as I understand feature chaining is sort of a more natural model
> but has severe performance issues (N+1 query problem) that appear not to
> make it viable for large data sets (and large results).
> 
> From the documentation it is clear that the denormalized approach works
> fine for multivalued feature attributes but I have difficulties
> understanding exactly when it starts to break and forces the user to
> switch to chaining.
> 
> Of course there is the issue of managing a large ugly table. But I'm
> wondering, can it handle the case of a feature with a collection of
> sub-features?
> What if the sub-features have in turn multi-valued attributes or
> attributes that are in turn collections?
> Also, how does one state the identifier of the sub-features, assuming
> there is no interest in publishing them directly? From what I can see it
> may be possible to use a client property "gml:id", but will it be
> recognized as the way to perform the grouping?
> (sorry for the apparently silly questions, I'm preparing to use app-schema
> but atm I still don't have the input data handy to make tests)
> 
> As a final note, is there any plan to try alternative strategies to
> improve the feature chaining approach?
> For example, joining could be performed by first getting just the list
> of fids of the top features, and then issuing a query to the sub-feature
> type using those ids (and ordering the result by that same field), turning
> the N+1 problem into a 1(ids)+1(topmost data)+1(subfeatures) problem
> instead.
> In case there is filtering on the subfeatures the thing could be
> reversed: get the list of top level feature ids from the sub-features
> applying the filter, and then start back from the previous approach adding
> the ids in the main data filter.
> Of course database level joining would be preferable, but it would not
> spare us from having generic (and possibly decently performing) strategies
> to handle the heterogeneous source data case.
> 
> Cheers
> Andrea
> 
> 
> -----------------------------------------------------
> Ing. Andrea Aime
> Senior Software Engineer
> 
> GeoSolutions S.A.S.
> Via Poggio alle Viti 1187
> 55054  Massarosa (LU)
> Italy
> 
> phone: 
> fax:     +39 0584962313
> 
> http://www.geo-solutions.it
> http://geo-solutions.blogspot.com/
> http://www.linkedin.com/in/andreaaime
> http://twitter.com/geowolf
> 
> -----------------------------------------------------
> 
> ------------------------------------------------------------------------------
> Start uncovering the many advantages of virtual appliances
> and start using them to simplify application deployment and
> accelerate your shift to cloud computing.
> http://p.sf.net/sfu/novell-sfdev2dev
> _______________________________________________
> Geoserver-users mailing list
> Geoserver-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/geoserver-users
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Feature-chaining-vs-denormalized-tables-tp29837803p29855608.html
Sent from the GeoServer - User mailing list archive at Nabble.com.


