Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs. FULLTEXT_CONFIG
Have you looked at the Lucene documentation? Neo4j doesn't touch any of that; it lets Lucene do the sorting. Do your values have lots of words in them (snippets of text)? You could probably copy-paste the exception message and search for it.

2010/10/29 Konstanze.Lorenz <konstanze.lor...@fh-zwickau.de>:

> Sorry for the brackets. In my test, they weren't there! Thank you for fixing the problem. It works much better.
>
> Now I have an additional question: sorting a result. Is it possible to sort fulltext results when the values contain space characters? Currently I get an exception saying that the field has too many tokens to be sorted. I expected the sort to look at the whole field value in the query result (not only its tokens) and sort by that.
>
> -----Original Message-----
> From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Mattias Persson
> Sent: Tuesday, 26 October 2010 13:34
> To: Neo4j user discussions
> Subject: Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs. FULLTEXT_CONFIG
>
> I just found the problem... fulltext defaults to being case-insensitive (by converting added values as well as string queries to lower case). There's a quirk in the Lucene QueryParser: you must explicitly set whether range/wildcard queries have their terms converted to lower case, and that setting ignores what the analyzer has to say about it, which I feel is a poor design decision in Lucene. As a consequence, if you specify your own custom analyzer class in the configuration, you must also set the to_lower_case parameter to match how that analyzer behaves; otherwise you cannot expect to get correct results back. Anyway, range/wildcard terms are now lower-cased correctly.
>
> Your queries will work with the latest SNAPSHOT. However, your second query doesn't look like a proper Lucene query. Maybe you meant Arn* without the brackets?
>
> The main difference between exact and fulltext is that a fulltext index tokenizes your values into words and indexes each word individually (and also, by default, converts them to lower case).
>
> 2010/10/26 Konstanze.Lorenz <konstanze.lor...@fh-zwickau.de>:
>
> > Hello,
> >
> > I'm giving the new LuceneIndexProvider a trial and trying to become acquainted with EXACT_CONFIG and FULLTEXT_CONFIG. Currently, I do not understand some differences between their query results. For example:
> >
> >     String nameArnold = "Arnold Aronson";
> >     String key = "name";
> >     Node nodeArnold = neo.createNode();
> >     nodeArnold.setProperty(key, nameArnold);
> >     index.add(nodeArnold, key, nameArnold);
> >
> >     index.query(key, "[A TO Z]");  // (1)
> >     index.query(key, "[Arn*]");    // (2)
> >
> > These queries work only with EXACT. FULLTEXT returns no matches. But a RangeQuery (1) for String would be quite interesting with FULLTEXT. It should return at least the same matches as EXACT. Furthermore, the Query Parser Syntax of Lucene should be enabled in FULLTEXT (2).
> >
> > So here is the question: Am I not seeing the trick to use them similarly, or are these configurations really as different as they seem to be?

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
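As an illustration of the behaviour discussed in this thread, here is a minimal toy model in plain Java. It does not use the Neo4j or Lucene APIs; the class and method names (IndexModeSketch, exactTerms, fulltextTerms, prefixMatches) are hypothetical and exist only for this sketch. It assumes that "exact" keeps the whole value as one term, that "fulltext" splits on whitespace and lower-cases each word, and that a wildcard term such as Arn* is compared against the indexed terms without passing through the analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model only: this does NOT call Neo4j or Lucene. All names are hypothetical.
class IndexModeSketch {

    // "Exact" mode (assumed): the whole value is kept as a single, unchanged term.
    static List<String> exactTerms(String value) {
        List<String> terms = new ArrayList<String>();
        terms.add(value);
        return terms;
    }

    // "Fulltext" mode (assumed): split on whitespace, optionally lower-case each word.
    static List<String> fulltextTerms(String value, boolean toLowerCase) {
        List<String> terms = new ArrayList<String>();
        for (String word : value.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // an empty input yields one empty token; skip it
            }
            terms.add(toLowerCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return terms;
    }

    // Prefix (wildcard) match. The query term is used as typed unless
    // lowerCaseQueryTerm is set, which models lower-casing of wildcard terms.
    static boolean prefixMatches(List<String> terms, String prefix, boolean lowerCaseQueryTerm) {
        String p = lowerCaseQueryTerm ? prefix.toLowerCase(Locale.ROOT) : prefix;
        for (String term : terms) {
            if (term.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String name = "Arnold Aronson";
        List<String> exact = exactTerms(name);
        List<String> fulltext = fulltextTerms(name, true);

        System.out.println("exact terms:    " + exact);
        System.out.println("fulltext terms: " + fulltext);

        // Arn* against the exact index: the single term starts with "Arn".
        System.out.println(prefixMatches(exact, "Arn", false));
        // Arn* against lower-cased fulltext terms, query term left as typed.
        System.out.println(prefixMatches(fulltext, "Arn", false));
        // Same query once the wildcard term is lower-cased as well.
        System.out.println(prefixMatches(fulltext, "Arn", true));
    }
}
```

In this model the second check fails only because the indexed words were lower-cased while the wildcard term was not, which is the mismatch the reply describes. It also shows why an exact index cannot match a prefix of the second word: there is only one term, and it starts with "Arnold".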
Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs FULLTEXT_CONFIG
Oh yeah, that's right... you can control it via the to_lower_case property in the config. For example, you could create your own fulltext config like this:

    Map<String, String> caseSensitiveFulltextConfig =
            new HashMap<String, String>( FULLTEXT_CONFIG );
    caseSensitiveFulltextConfig.put( "to_lower_case", "false" );

When type=fulltext (as in the FULLTEXT_CONFIG map), to_lower_case defaults to true. All this is quite experimental, so sorry for the inconvenience.

2010/8/27 Balazs E. Pataki <pat...@dsd.sztaki.hu>:

> Thanks for the clarification. Indeed, wildcards work in both modes; however, in FULLTEXT mode only lowercase search strings are allowed, while in EXACT mode the search is case sensitive.
>
> Regards,
> ---
> balazs
>
> On 8/27/10 1:12 PM, Mattias Persson wrote:
>
> > Hi Balazs,
> >
> > maybe the names aren't that great... but EXACT means that it indexes your data as it is, without chopping it up, whereas FULLTEXT chops the data up into words and indexes every word separately. Both support wildcards, as Lucene supports wildcards in both of those modes.
> >
> > 2010/8/27 Balazs E. Pataki <pat...@dsd.sztaki.hu>:
> >
> > > Hi,
> > >
> > > could someone please explain to me when to use EXACT_CONFIG and when to use FULLTEXT_CONFIG with the nodeIndex() of the LuceneIndexProvider? It seems to me that one cannot execute wildcard searches in FULLTEXT_CONFIG mode; it only works when EXACT_CONFIG is used. But what, then, is actually exact about this config?
> > >
> > > Thanks for any hints in advance!
> > > ---
> > > balazs

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
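The configuration override described in this thread can be sketched as a self-contained Java program. This is only an approximation under stated assumptions: the FULLTEXT_CONFIG map below is a stand-in whose contents (a single type=fulltext entry) are assumed rather than taken from the library, and the helper lowerCases is hypothetical. The thread only states that to_lower_case defaults to true when type=fulltext; treating every other type as false is a further assumption of this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch only: FULLTEXT_CONFIG here is a stand-in, not the library constant.
class FulltextConfigSketch {

    // Assumed contents; the thread only says this map has type=fulltext.
    static final Map<String, String> FULLTEXT_CONFIG = standInFulltextConfig();

    private static Map<String, String> standInFulltextConfig() {
        Map<String, String> config = new HashMap<String, String>();
        config.put("type", "fulltext");
        return Collections.unmodifiableMap(config);
    }

    // Copy the base config and override to_lower_case, as in the message above.
    static Map<String, String> caseSensitive(Map<String, String> base) {
        Map<String, String> copy = new HashMap<String, String>(base);
        copy.put("to_lower_case", "false");
        return copy;
    }

    // Hypothetical helper: an explicit to_lower_case wins; otherwise the value
    // defaults to true for type=fulltext (false for other types is assumed).
    static boolean lowerCases(Map<String, String> config) {
        String explicit = config.get("to_lower_case");
        if (explicit != null) {
            return Boolean.parseBoolean(explicit);
        }
        return "fulltext".equals(config.get("type"));
    }

    public static void main(String[] args) {
        Map<String, String> caseSensitiveFulltextConfig = caseSensitive(FULLTEXT_CONFIG);
        System.out.println("default fulltext lower-cases: " + lowerCases(FULLTEXT_CONFIG));
        System.out.println("override lower-cases:         " + lowerCases(caseSensitiveFulltextConfig));
    }
}
```

Copying into a new HashMap, as the original snippet does, leaves the shared base map untouched, so other indexes that use the default fulltext configuration keep their case-insensitive behaviour.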