Hi Marco,

Thanks for your quick reaction,

In this mail I’ll concentrate on the Filtering issue, as it is the most 
blocking for us.  The schemaless plugin is interesting and I’ll check it out. 
Fixing the filtering bug causes the dates to partially work.  So I’ll see what 
happens after the core bug is fixed.

I’ve investigated the code wednesday evening, and this is what I came up with:

A lot of the unit tests simply don’t run when no database is attached.  In 
fact, I see a lot of tests without @Test annotation, so maybe they’re from the 
Junit 3 era and never work anymore.  If you have an easy way to run them, e.g. 
on a geoserver build machine,  I’d be interested

Because of this, I tried to create junit tests that does not require mongodb.   
  To make this possible, I had to make 2 dirty changes in the src/main/java 
code:

  1.  In MongoDataStore, declare the method isMongoVersionLessThan2_6  
protected instead of private
  2.  Add a class MongoSchemaMemoryStore, comparable to MongoSchemaFileStore, 
backed by a ConcurrentHashMap.  The MongoDB schemaless plugin might have made 
this obsolete.

I’d like your idea about both dirty changes.

At this point, the problematic cases become testable.  I created some infra and 
a checkSplitter method that checks if the splitter creates the correct output:

public class FilterSplitterTest {
    private MongoFeatureSource source;
    @Before
    public void setup() {
        MongoDataStore ds =
                new MongoDataStore("mongodb://dummy/dum", "mem:") {
                    @Override
                    protected boolean isMongoVersionLessThan2_6(MongoClientURI 
dataStoreClientURI) {
                        return false;
                    }
                };
        ContentEntry entry = new ContentEntry(ds, new NameImpl("dummyEntry"));
        this.source = new MongoFeatureSource(entry, null, null);
    }

    @Test
    public void testProperty() throws Exception {
        checkSplitter("a=1", "a=1", "INCLUDE");
    }

    @Test
    public void testAndMongoOnly() throws Exception {
        checkSplitter("a=1 AND b=1", "a=1 AND b=1", "INCLUDE");
    }

    @Test
    public void testAndBoth() throws Exception {
               //FIXME: CQL.toFilter(before) has wrong date format
        checkSplitter("a=1 AND b BEFORE 1980-09-03T00:00:00", "a=1", "c before 
1980-09-03T00:00:00");
    }

    @Test
    public void testJSonSelect() throws Exception {
               //FIXME this makes no sense, but nothing I try makes more sense
        //checkSplitter("a=jsonSelect('a')", "a=1 AND b=1", "INCLUDE");

        //FIXME need a working example of a function.
        //CQL.toFilter("strConcat(NAME, 'suffix')"); does not work even if it 
javadoc claims it should
    }

    /**
     * @param beforeSplit  The input to the splitter
     * @param toMongo The part of the filter given to mongodb
     * @param toPostprocess The part of the filter done by postprocessing
     * @throws CQLException
     */
    private void checkSplitter(String beforeSplit, String toMongo, String 
toPostprocess)
            throws CQLException {
        //FIXME getCountInternal ignores the toPostprocess, is this correct?
        Filter[] split = source.splitFilter(CQL.toFilter(beforeSplit));
        Assert.assertEquals(CQL.toFilter(toMongo), split[0]);
        Assert.assertEquals(CQL.toFilter(toPostprocess), split[1]);
    }
}

These tests demonstrate some interesting point:

  *   testProperty/ testAndMongoOnly trigger the bug as they should
  *   testAndBoth does not trigger the bug.  As it happens, the AND filter can 
paper over part of the impact if the stacks are not balanced.
  *   Bonus bug?  testAndBoth demonstrates that BeforeImpl.toString() uses the 
default java date format and not ISO.
  *   Bonus bug? testJSonSelect: As the JSonSelect patch is the root cause of 
all these troubles, I try to write a test that finds out if this feature is not 
damaged by my bugfix.  Unfortunately, I found out that I can’t.  Every usage  I 
try crashes with a ClassCastException while parsing.  Even the strConCat demo 
mentioned in the CQL Javadoc crashes.  I saw these same crashes on our dev 
geoserver instance.  But maybe it’s just me not knowing what CQL is supposed to 
do.

If you can comment on the 2 bonus bugs and the supposed way to use the 
JSonSelect function, I would be grateful.

All of this is preparation for the real bugfix.    I agree with your assessment 
of the bug, but with some caveats:

  *   There is a class MongoFilterSplitter that duplicates the defective 
filter, with slightly different but also faulty behaviour.  I propose to delete 
it.
  *   MongoFeatureSource.splitFilter is the core of the bug.  Both 
BinaryComparisonOperator and PropertyIsLike seem impacted, and as you say, 
should defer to their parent implementation.
  *   But:  What if e.g. there is a JSonSelectFunction  but no Literal. The 
behaviour from super.visit…() needs to be verified for this function, it might 
not be as intended.  But I need clarification on the 2nd bonus bug to write 
some unit tests that demonstrate correct behaviour before

If I suppose optimistically that you agree with all the above and fill in the 
2nd bonus bug, I can provide a PR.  Do you want 1 PR containing everything, or 
would you prefer me to split it up in multiple PRs for ease of code review?

Hans


From: Marco Volpini <marco.volp...@geosolutionsgroup.com>
Sent: vrijdag 8 juli 2022 9:14
To: Hans Yperman <hans.yper...@vliz.be>
Cc: geotools-devel@lists.sourceforge.net
Subject: Re: [Geotools-devel] geotools mongodb bugs in filtering

Dear Hans,
all improvements and bug fixes are welcome on both GeoTools and GeoServer. You 
can either provide the bug fix/improvement by yourself by opening a pull 
request on the GeoServer or Geotools repo, either by having someone else doing 
it for you (see the commercial support  page<https://geoserver.org/support/>).
In terms of help that can be provided to fix the issues that you have found, I 
can give you some code pointers (see below) and review the pull requests once 
they are opened.
Please see my replies below to your points:


·        MongoDB is schemaless, but it seems geoserver needs 2 schema 
definitions: 1 defining the attributes for layers, a 2nd specific for mongodb 
cached on disk and invisible if you don't know about its existence.  If an 
attribute is added to mongodb, we'll need to delete/adapt the second, and 
restart the geoserver.
This is due the fact that GeoTools and GeoServer are schema driven based on the 
gml standard that always needs a schema upfront in order to properly handle the 
data. Moreover the gt mongodb data store manages SimpleFeatures. Simple Feature 
specs doesn't allow nested objects and arrays to be encoded. The addition 
internal JSON schema is indeed also used to flattenize any nested structure to 
make it SimpleFeature compatible. However if you are interested in serving 
feature only as WFS GeoJSON and as WMS you might think to try the MongoDB 
schemaless 
plugin<https://docs.geoserver.org/latest/en/user/community/schemaless-features/schemaless-mongo/index.html>
 that instead will serve mongodb document as features as they are in the db, 
without flattening properties and without the need to provide a schema upfront. 
Mind that GML output will not be available in this case.


·        The mongodb password needs to be hardcoded in the URL.  This means 
geoserver can't hide/encrypt it.  It is readable for everyone who can see the 
datastore.
Yes currently the mongodb pwd can be only provided in the connection string 
input field in plain text, although GeoServer allows to externalize the 
connection string using environment 
properties<https://docs.geoserver.org/master/en/user/datadirectory/configtemplate.html>.
 Feel free to improve the mongodb connection string configuration of the 
MongoDB DataStore. Here some code pointers that you might find usefull: 
MongoDataStoreFactory<https://github.com/geotools/geotools/blob/1fed12f88ab0c9d4ef92aa9cf73af8f94d5c1291/modules/plugin/mongodb/src/main/java/org/geotools/data/mongodb/MongoDataStoreFactory.java#L30>
 and 
MongoDataStore<https://github.com/geotools/geotools/blob/1e036c73a4e420beeeb86c381c797a898a68c8a7/modules/plugin/mongodb/src/main/java/org/geotools/data/mongodb/MongoDataStore.java#L74>.

·        I can't get Filtering to work. e.g.  I add cql_filter=datatype%3D'C' 
which adds a filter on an attribute datatype='C'.   Depending on the attribute, 
I get the whole mongodb collection or nothing.  I debugged it to a function 
splitFilter that basically just throws my filter away.  Issue GEOT-5911 has the 
same conclusions as I have and a partial fix, but it seems dead.

This indeed is a bug. I've reproduced it with the mongo db data store (while on 
the mongodb schemaless plugin the same filter works fine). I gave just a quick 
look so I'm not 100% sure but to fix it I believe that at the end of this visit 
method in the PostPreFilter 
Splitter<https://github.com/geotools/geotools/blob/1e036c73a4e420beeeb86c381c797a898a68c8a7/modules/plugin/mongodb/src/main/java/org/geotools/data/mongodb/MongoFeatureSource.java#L340>
 in an else statement a call to the superclass visit(BinaryComparisonOperator 
filter) should be added.

·         I have the impression filtering on dates/times is either not 
implemented (throw new UnsupportedOperationException() ) or done by geoserver 
instead of mongo (costing me the mongodb indexes).  Can't test this because of 
the issue above so I might be wrong here.

Currently you can filter by dates/times only if the date time value is stored 
as a string in ISO format on MongoDB. In a cql filter then you can specify a 
time filter using >,<,>=,<= operators like in the following example: 
cql_filter= mydateproperty >'2021-03-15T09:54:59.000Z'. MongoDB Date type is 
not currently supported as it is not supported the translation of cql temporal 
operators to mongodb filters.

Regards,

Marco Volpini

==
GeoServer Professional Services from the experts!

Visit http://bit.ly/gs-services-us for more information.
==

Marco Volpini

Software Engineer

GeoSolutions Group
phone: +39 0584 962313

fax:     +39 0584 1660272


https://www.geosolutionsgroup.com/

http://twitter.com/geosolutions_it

-------------------------------------------------------


Con riferimento alla normativa sul trattamento dei dati personali (Reg. UE 
2016/679 - Regolamento generale sulla protezione dei dati “GDPR”), si precisa 
che ogni circostanza inerente alla presente email (il suo contenuto, gli 
eventuali allegati, etc.) è un dato la cui conoscenza è riservata al/i solo/i 
destinatario/i indicati dallo scrivente. Se il messaggio Le è giunto per 
errore, è tenuta/o a cancellarlo, ogni altra operazione è illecita. Le sarei 
comunque grato se potesse darmene notizia.

This email is intended only for the person or entity to which it is addressed 
and may contain information that is privileged, confidential or otherwise 
protected from disclosure. We remind that - as provided by European Regulation 
2016/679 “GDPR” - copying, dissemination or use of this e-mail or the 
information herein by anyone other than the intended recipient is prohibited. 
If you have received this email by mistake, please notify us immediately by 
telephone or e-mail.



On Thu, Jul 7, 2022 at 5:04 PM Hans Yperman 
<hans.yper...@vliz.be<mailto:hans.yper...@vliz.be>> wrote:
Hello everybody, and especially Marco Volpini ,

We are testing the geoserver/geotools/mongodb combination.  We have a number of 
filtering defects that block us from continuing our tests.

There are a few issues open for this, e.g., GEOT-5911 GEOT-6570 GEOT-6348 
GEOT-6810 .   We debugged the underlying cause, and we think we have enough 
knowledge to fix this bug.

We want to know your position as maintainer:
* Do we agree that there is a problem related to filtering worth fixing.
* Can you provide necessary help(e.g., guidance)  to fix this, if resources 
available.
* Are you willing to accept a pull request , after aligning with your coding 
standards?

Unfortunately, we already sent 2 messages to the geoserver and geotools user 
list and did not receive any reaction on either.

See:

https://sourceforge.net/p/geoserver/mailman/geoserver-users/thread/DB9PR05MB898397B5B85AAF69FA48213A9CB59%40DB9PR05MB8983.eurprd05.prod.outlook.com/#msg37671813

https://sourceforge.net/p/geotools/mailman/geotools-gt2-users/thread/DB9PR05MB89834EE49A41E3744010B51D9CBA9%40DB9PR05MB8983.eurprd05.prod.outlook.com/#msg37674928

Kind regards,

Hans
_______________________________________________
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net<mailto:GeoTools-Devel@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/geotools-devel
_______________________________________________
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel

Reply via email to