>>Maybe we could do something similar to declare that a given field uses Trie*,
>>and with what datatype.

With the current implementation you can at least test for the presence of a 
field called:
     [fieldName]#trie

This tells you that some form of trie is used, and it could be extended to
include the precision step value, e.g.
     [fieldName]#trie_8
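
As a rough sketch of how an application might read such markers back, the
helper below scans a list of field names for the "#trie" / "#trie_<step>"
suffix. The class and method names (TrieMarkers, scan) are hypothetical and
not part of Lucene; in a real application the field names would come from the
index reader rather than a hard-coded list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Lucene API: it only interprets the
// "[fieldName]#trie" and "[fieldName]#trie_<step>" naming convention.
final class TrieMarkers {
    static final String MARKER = "#trie";
    static final int UNKNOWN_STEP = -1;

    // Maps each base field name to its precision step,
    // or UNKNOWN_STEP when the marker carries no step value.
    static Map<String, Integer> scan(Iterable<String> fieldNames) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String name : fieldNames) {
            int idx = name.lastIndexOf(MARKER);
            if (idx <= 0) {
                continue; // no marker, or marker with an empty base name
            }
            String base = name.substring(0, idx);
            String suffix = name.substring(idx + MARKER.length());
            if (suffix.isEmpty()) {
                result.put(base, UNKNOWN_STEP);
            } else if (suffix.startsWith("_")) {
                try {
                    int step = Integer.parseInt(suffix.substring(1));
                    if (step > 0) {
                        result.put(base, step);
                    }
                } catch (NumberFormatException e) {
                    // malformed step value: treat the field as unmarked
                }
            }
        }
        return result;
    }
}
```

Given the names "title", "price#trie_8" and "date#trie", scan would report
"price" with step 8 and "date" with an unknown step, and ignore "title".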

But overall, I can't help but feel we will struggle to offer facilities like
this while Lucene indexes lack a formal schema.
Solr obviously includes some form of index definition, my Lucene-based apps
tend to use a custom config, and Luke could certainly benefit from some form of
definition stored with the index.
Time for some standardised index metadata?

This trie/parser issue is an example of a broader issue for me.

Mark



----- Original Message ----
From: Michael McCandless <luc...@mikemccandless.com>
To: java-user@lucene.apache.org
Sent: Monday, 9 March, 2009 13:10:32
Subject: Re: Lucene 2.9


Uwe Schindler wrote:

>> Or perhaps we should move Trie* into core Lucene, and then build a
>> real (ootb) integration with QueryParser.
> 
> The problem is that the query parser does not know if a field is encoded as
> trie or is just a normal text token. Furthermore, the new trie API does not
> differentiate between dates, doubles, longs (same for 32bit) because every
> trie field is identical (it is the application's task to keep track on the
> encoding when indexing and searching, TrieRange only supports the conversion
> of these data types to sortable integers), but the "datatype" itself is not
> stored in index. Solr has support for this in its "schema", but for Lucene
> all fields are identical. For the query parser there is no possibility to
> differentiate between a long, double or date.

Could we add APIs to QueryParser so the application can state the disposition
toward certain fields?

EG QueryParser now tries to guess whether a range query's upper/lower bound
should be parsed as dates, and there are methods exposed to set the resolution
on a per-field basis.  Maybe we could do something similar to declare that a
given field uses Trie*, and with what datatype.
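
One way to picture the proposal above, purely as an illustration: the
application keeps a per-field registry of datatypes, and a query parser
consults it when converting range bounds. Everything here (FieldTypeRegistry,
Kind, declare, toSortableLong) is a hypothetical name, not an existing
QueryParser or Trie* API, and representing dates as epoch days is an
assumption made only for this sketch.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-field datatype registry; not part of Lucene.
final class FieldTypeRegistry {
    enum Kind { LONG, DOUBLE, DATE }

    private final Map<String, Kind> kinds = new HashMap<>();

    // The application declares which fields are trie-encoded, and as what.
    void declare(String field, Kind kind) {
        kinds.put(field, kind);
    }

    boolean isDeclared(String field) {
        return kinds.containsKey(field);
    }

    // Converts the text of a range bound into a long whose natural
    // ordering matches the ordering of the declared datatype.
    long toSortableLong(String field, String text) {
        Kind kind = kinds.get(field);
        if (kind == null) {
            throw new IllegalArgumentException("field not declared: " + field);
        }
        switch (kind) {
            case LONG:
                return Long.parseLong(text);
            case DOUBLE:
                return doubleToSortableLong(Double.parseDouble(text));
            case DATE:
                // Assumption: ISO dates (yyyy-MM-dd) stored as days since 1970-01-01.
                return LocalDate.parse(text).toEpochDay();
            default:
                throw new AssertionError(kind);
        }
    }

    // IEEE 754 bit trick: flipping the non-sign bits of negative values
    // makes signed long comparison agree with double comparison.
    static long doubleToSortableLong(double value) {
        long bits = Double.doubleToLongBits(value);
        return bits < 0 ? bits ^ 0x7fffffffffffffffL : bits;
    }
}
```

A parser holding such a registry could check isDeclared(field) before
deciding whether to build a trie range query or an ordinary term range.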

Just thinking aloud really... but since we haven't yet released Trie*, now (for
2.9) is a good time to think hard about how we expose/integrate it... and making
it easier to use ootb seems important.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org




