Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs. FULLTEXT_CONFIG

2010-10-31 Thread Mattias Persson
Have you looked at the Lucene documentation? Neo4j doesn't touch any of
that; it lets Lucene do the sorting. Do your values have lots of words in
them (snippets of text)? You could probably copy-paste the exception message
and google it!
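
For what it's worth, here is a rough sketch of the usual way around this kind of sort error. As far as I know, Lucene can only sort on a field that has a single term per document, and a tokenized fulltext value has one term per word. A common workaround is to keep an untokenized copy of the value just for sorting. The code below is plain Java and only models the idea; SortKeySketch, Doc and searchSorted are made-up names, not part of the Lucene or Neo4j API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

/** Toy model: search on tokens, sort on the whole value. Not the Lucene API. */
final class SortKeySketch {

    /** One indexed entry (hypothetical type). */
    static final class Doc {
        final String sortKey;      // untokenized copy, exactly one per entry
        final List<String> tokens; // lower-cased words used for matching

        Doc(String value) {
            this.sortKey = value;
            this.tokens = Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+"));
        }
    }

    /** Finds entries containing the word, then orders them by the whole value. */
    static List<String> searchSorted(List<Doc> docs, String word) {
        String needle = word.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (d.tokens.contains(needle)) {
                hits.add(d.sortKey);
            }
        }
        hits.sort(Comparator.naturalOrder());
        return hits;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("Arnold Zimmer"),
                new Doc("Arnold Aronson"),
                new Doc("Bob Aronson"));
        System.out.println(searchSorted(docs, "arnold"));
    }
}
```

With a real index that would presumably mean adding the same value twice: once under the fulltext key for searching and once under a separate, exact key that is only used for sorting.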

-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs. FULLTEXT_CONFIG

2010-10-29 Thread Konstanze . Lorenz
Sorry for the brackets. In my test, they weren't there!
Thank you for fixing the problem. It works much better.
Now I have an additional question: sorting a result.
Is it possible to sort fulltext results when the values contain space
characters? Currently I get an exception saying that the field has too many
tokens to be sorted.
I was expecting the sort to use the whole field value of each query result
(not only its tokens).



Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs. FULLTEXT_CONFIG

2010-10-26 Thread Mattias Persson
I just found the problem... fulltext defaults to being case-insensitive (by
converting added values as well as string queries to lower case). There's a
quirk in the Lucene QueryParser: you must explicitly set whether or not
range/wildcard queries should have their terms converted to lower case, so
the parser ignores what the analyzer has to say about it, which I feel is a
poor design decision in Lucene. As a result, if you specify your own custom
analyzer class in the configuration you must also set the to_lower_case
parameter to match how the analyzer is implemented, otherwise you cannot
expect to get the correct results back. Anyway, range/wildcard terms are
lower-cased more correctly now.

Your queries will work with the latest SNAPSHOT, however your second query
doesn't look like a proper Lucene query. Maybe you meant Arn* w/o the
brackets?

The main difference between exact and fulltext is that a fulltext index
tokenizes your values into words and indexes each word individually (and,
by default, also converts them to lower case).
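
To make the difference concrete, here is a minimal sketch in plain Java. It is a toy model of the two modes, not the Neo4j or Lucene implementation: the class ExactVsFulltextSketch and its methods are invented for illustration, and the lowerCaseQuery flag only mimics the effect of lower-casing wildcard terms at query time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of exact vs. fulltext indexing. Not the Neo4j or Lucene API. */
final class ExactVsFulltextSketch {

    // term -> original values whose indexing produced that term
    private final Map<String, List<String>> terms = new TreeMap<>();
    private final boolean fulltext;

    ExactVsFulltextSketch(boolean fulltext) {
        this.fulltext = fulltext;
    }

    /** Exact: one term, the value as is. Fulltext: one lower-cased term per word. */
    void add(String value) {
        if (fulltext) {
            for (String word : value.toLowerCase(Locale.ROOT).split("\\s+")) {
                terms.computeIfAbsent(word, k -> new ArrayList<>()).add(value);
            }
        } else {
            terms.computeIfAbsent(value, k -> new ArrayList<>()).add(value);
        }
    }

    /** Prefix query such as Arn*; lowerCaseQuery lower-cases the prefix first. */
    List<String> prefixQuery(String prefix, boolean lowerCaseQuery) {
        String p = lowerCaseQuery ? prefix.toLowerCase(Locale.ROOT) : prefix;
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : terms.entrySet()) {
            if (e.getKey().startsWith(p)) {
                for (String value : e.getValue()) {
                    if (!hits.contains(value)) {
                        hits.add(value); // report each value once
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ExactVsFulltextSketch exact = new ExactVsFulltextSketch(false);
        ExactVsFulltextSketch fulltext = new ExactVsFulltextSketch(true);
        exact.add("Arnold Aronson");
        fulltext.add("Arnold Aronson");

        System.out.println(exact.prefixQuery("Arn", false));
        System.out.println(fulltext.prefixQuery("Arn", false));
        System.out.println(fulltext.prefixQuery("Arn", true));
    }
}
```

In this model the fulltext index only holds the terms arnold and aronson, so a prefix that keeps its capital A cannot match anything until the query term is lower-cased as well.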

2010/10/26 Konstanze.Lorenz konstanze.lor...@fh-zwickau.de

 Hello,
 I'm giving the new LuceneIndexProvider a trial and trying to become
 acquainted with EXACT_CONFIG and FULLTEXT_CONFIG. Currently, I do not
 understand some differences between their query results.
 For example:
 String nameArnold = "Arnold Aronson";
 String key = "name";

 Node nodeArnold = neo.createNode();
 nodeArnold.setProperty(key, nameArnold);
 index.add(nodeArnold, key, nameArnold);
 --
 index.query(key, "[A TO Z]"); //(1)
 index.query(key, "[Arn*]"); //(2)

 These queries work only with EXACT. FULLTEXT returns no matches. But a
 RangeQuery (1) for String would be quite interesting with FULLTEXT. It
 should return the same matches as EXACT at least. Furthermore, the Query
 Parser Syntax of Lucene should be enabled in FULLTEXT (2).
 So here is the question: Am I not seeing the trick to use them similarly, or
 are these configurations as different as they seem to be?


Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs FULLTEXT_CONFIG

2010-08-27 Thread Mattias Persson
Oh yeah, that's right... you can control it via the to_lower_case property in
the config. For example, you could do your own fulltext config like this:

Map<String, String> caseSensitiveFulltextConfig = new HashMap<String,
String>( FULLTEXT_CONFIG );
caseSensitiveFulltextConfig.put( "to_lower_case", "false" );

When type=fulltext (as in the FULLTEXT_CONFIG map), to_lower_case defaults to
true. All this is quite experimental, so sorry for the inconvenience.
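
A small self-contained sketch of how such a default could be resolved, in case it helps to see it spelled out. Everything here is a stand-in: the FULLTEXT_CONFIG map below is a hypothetical copy with only the type entry, lowerCases is an invented helper, and treating non-fulltext configs as not lower-casing is an assumption, since only the fulltext default is stated above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the to_lower_case default. Not the provider's real code. */
final class LowerCaseConfigSketch {

    // Hypothetical stand-in for the provider's FULLTEXT_CONFIG map.
    static final Map<String, String> FULLTEXT_CONFIG =
            Collections.singletonMap("type", "fulltext");

    /** An explicit to_lower_case wins; otherwise fulltext lower-cases by default. */
    static boolean lowerCases(Map<String, String> config) {
        String explicit = config.get("to_lower_case");
        if (explicit != null) {
            return Boolean.parseBoolean(explicit);
        }
        // Assumption: any type other than fulltext does not lower-case.
        return "fulltext".equals(config.get("type"));
    }

    public static void main(String[] args) {
        Map<String, String> caseSensitive = new HashMap<>(FULLTEXT_CONFIG);
        caseSensitive.put("to_lower_case", "false");

        System.out.println(lowerCases(FULLTEXT_CONFIG));
        System.out.println(lowerCases(caseSensitive));
    }
}
```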


2010/8/27 Balazs E. Pataki pat...@dsd.sztaki.hu

 Thanks for the clarification.

 Indeed, wildcards work in both modes, however in FULLTEXT mode it only
 allows lowercase search strings, while in EXACT mode the search is case
 sensitive.

 Regards,
 ---
 balazs

 On 8/27/10 1:12 PM, Mattias Persson wrote:
  Hi Balazs,
 
  maybe the names aren't that great... but EXACT means that it indexes your
  data as it is without chopping it up, whereas FULLTEXT chops up the data
  into words and indexes every word separately.
 
  Both support wildcards, as Lucene supports wildcards for both those modes.
 
  2010/8/27 Balazs E. Pataki pat...@dsd.sztaki.hu
 
  Hi,
 
  could someone please explain to me when to use EXACT_CONFIG and when
  FULLTEXT_CONFIG when using the nodeIndex() of the LuceneIndexProvider?
 
  It seems to me that one cannot execute wildcard searches in
  FULLTEXT_CONFIG mode, it only works when EXACT_CONFIG is used. But what
  is actually exact then about this config?
 
  Thanks for any hints in advance!
  ---
  balazs