Hi all,
I will try to keep it short :)

As you may know, Apache Solr is a well-known, scalable solution for performing text-based searches on documents [1].

Use case: let's say we have a relational database (e.g. PostgreSQL) with a schema that contains a fair number of related tables that contain several million rows. Some of these tables contain text columns (like description, name, comments, report, etc.), and users typically query the schema by issuing a query that performs a LIKE search on those text fields.

Although relational databases have their own technologies and methods to deal with text searches, a possible solution to improve text search performance, and often a mandatory one once we reach a certain amount of data, is to use Apache Solr.

A possible workflow to handle this use case could be:

    1. index the necessary columns in Apache Solr using the Data Import Handler [2]

    2. perform the text searches against Apache Solr

    3. using the results from Apache Solr, build the entities from the relational database applying the other filters
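
To make step 2 a bit more concrete, here is a rough Python sketch of asking Solr only for the ids of the matching documents through its standard /select handler. The host, the core name (stations), the field name and the two helper names are assumptions made up for illustration, and escaping of special characters in the search text is not handled:

```python
from urllib.parse import urlencode


def solr_id_query_url(base_url, core, field, text, rows=1000):
    """Build a Solr /select URL that returns only the ids of matching documents."""
    params = {
        "q": f'{field}:"{text}"',  # phrase query on the indexed text field
        "fl": "id",                # only return the id field
        "rows": rows,              # maximum number of documents to return
        "wt": "json",              # ask for a JSON response
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


def ids_from_solr_response(payload):
    """Extract the ids from an already parsed Solr JSON response."""
    return [doc["id"] for doc in payload["response"]["docs"]]


# Hypothetical host and core, no request is actually sent here.
url = solr_id_query_url("http://localhost:8983/solr", "stations", "description", "storm")
print(url)
```

The ids returned this way are what step 3 would feed back into the relational query.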

So, my idea is to extend App-Schema to allow us to do something similar for WFS GetFeature queries hitting complex features.

In the mappings we would declare that a certain field is indexed somewhere (the implementation will not be specific to Apache Solr), something like this:

   <AttributeMapping>
      <targetAttribute>st:description</targetAttribute>
      <index>
        <store>solr-stations</store>
        <source>stations</source>
        <attribute>description</attribute>
        <id>id</id>
        <filters>...</filters>
      </index>
   </AttributeMapping>
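
Just to illustrate what information such a mapping carries, here is a small Python sketch that reads the index declaration into a plain dictionary. The element names come from the example above; the function name and the dictionary keys are hypothetical, and the filters element is left out because its content is not defined yet:

```python
import xml.etree.ElementTree as ET

# Same mapping as above, without the still undefined <filters> element.
MAPPING = """
<AttributeMapping>
  <targetAttribute>st:description</targetAttribute>
  <index>
    <store>solr-stations</store>
    <source>stations</source>
    <attribute>description</attribute>
    <id>id</id>
  </index>
</AttributeMapping>
"""


def read_index_config(xml_text):
    """Return the index declaration of an attribute mapping, or None if absent."""
    root = ET.fromstring(xml_text)
    index = root.find("index")
    if index is None:
        return None  # attribute is not indexed anywhere
    return {
        "target": root.findtext("targetAttribute"),
        "store": index.findtext("store"),
        "source": index.findtext("source"),
        "attribute": index.findtext("attribute"),
        "id": index.findtext("id"),
    }


config = read_index_config(MAPPING)
print(config)
```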

Then, when building the SQL query to send to the relational database, App-Schema could decide (when possible) to use the configured index to speed up some parts of the filter.

We would basically retrieve from the index the ids of the entities that match a particular filter and rewrite the SQL query using those ids in an IN clause. Pagination will be used for long lists of ids.
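
A minimal Python sketch of that rewrite, assuming that "pagination" means splitting the id list into fixed-size pages and issuing one query per page. The function names, the page size and the "?" placeholder style are assumptions for illustration, not what App-Schema would actually generate:

```python
def chunked(ids, size):
    """Yield consecutive slices of ids holding at most `size` elements."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def in_clause_queries(base_sql, id_column, ids, page_size=1000):
    """Yield one (sql, params) pair per page of ids, using bound parameters."""
    for page in chunked(ids, page_size):
        placeholders = ", ".join(["?"] * len(page))
        sql = f"{base_sql} WHERE {id_column} IN ({placeholders})"
        yield sql, list(page)


# Ids as they could come back from the index, paged two at a time.
queries = list(in_clause_queries("SELECT * FROM stations", "id", [1, 2, 3], page_size=2))
for sql, params in queries:
    print(sql, params)
```

Binding the ids as parameters rather than concatenating them keeps the generated SQL safe even when the ids are strings.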

This would allow someone publishing complex features to speed up certain queries by indexing the appropriate fields in Apache Solr and editing the App-Schema mappings accordingly.

This will require creating a few extension points in App-Schema, mainly in the DataAccessMappingFeatureIterator. The idea is to allow different stores to plug in their index support ... and of course avoid jeopardizing the App-Schema core :)
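
To give a rough idea of the shape such an extension point could take, here is a small sketch. It is written in Python only for brevity (the real thing would live in the Java code base), and every name in it (IndexStore, matching_ids, the registry functions, the in-memory store) is hypothetical:

```python
from abc import ABC, abstractmethod


class IndexStore(ABC):
    """Hypothetical extension point: anything able to resolve a text filter to ids."""

    @abstractmethod
    def matching_ids(self, source, attribute, text):
        """Return the ids of the entities whose attribute matches the text."""


class InMemoryIndexStore(IndexStore):
    """Toy implementation standing in for a real index such as Apache Solr."""

    def __init__(self, documents):
        self._documents = documents  # {source: [{"id": ..., attribute: ...}, ...]}

    def matching_ids(self, source, attribute, text):
        needle = text.lower()
        return [
            doc["id"]
            for doc in self._documents.get(source, [])
            if needle in str(doc.get(attribute, "")).lower()
        ]


_REGISTRY = {}


def register_store(name, store):
    """Make a store available under the name used in the mappings."""
    _REGISTRY[name] = store


def lookup_store(name):
    """Return the registered store, or None so the caller can fall back to plain SQL."""
    return _REGISTRY.get(name)


register_store("solr-stations", InMemoryIndexStore({
    "stations": [
        {"id": 1, "description": "Coastal storm station"},
        {"id": 2, "description": "Mountain station"},
    ]
}))
print(lookup_store("solr-stations").matching_ids("stations", "description", "storm"))
```

Returning None for an unknown store is one way to keep the core untouched: when no index is plugged in, the existing SQL path is used as it is today.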

Any comments on this are welcome :)

Regards,

[1] http://lucene.apache.org/solr/
[2] https://lucene.apache.org/solr/guide/6_6/uploading-structured-data-store-data-with-the-data-import-handler.html

--
Regards,
Nuno Oliveira

Nuno Miguel Carvalho Oliveira
@nmcoliveira
Software Engineer

GeoSolutions S.A.S.
Via di Montramito 3/A
55054  Massarosa (LU)
Italy
phone: +39 0584 962313
fax:      +39 0584 1660272

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


_______________________________________________
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel
