>>Maybe we could do something similar to declare that agiven field uses Trie*, >>and with what datatype.
With the current implementation you can at least test for the presence of a field called: [fieldName]#trie ..which tells you some form of trie is used but could be extended to include precision step value e.g. [fieldName]#trie_8 But - overall, I can't help but feel we will struggle to offer facilities like this when there is a lack of a formal schema for Lucene indexes. Solr obviously includes some form of index definition - my Lucene-based apps tend to use a custom config and Luke could certainly benefit from some form of definition stored with the index. Time for some standardised index metadata? This trie/parser issue is an example of a broader issue for me. Mark ----- Original Message ---- From: Michael McCandless <luc...@mikemccandless.com> To: java-user@lucene.apache.org Sent: Monday, 9 March, 2009 13:10:32 Subject: Re: Lucene 2.9 Uwe Schindler wrote: >> Or perhaps we should move Trie* into core Lucene, and then build a >> real (ootb) integration with QueryParser. > > The problem is that the query parser does not know if a field is encoded as > trie or is just a normal text token. Furthermore, the new trie API does not > differentiate between dates, doubles, longs (same for 32bit) because every > trie field is identical (it is the application's task to keep track on the > encoding when indexing and searching, TrieRange only supports the conversion > of these data types to sortable integers), but the "datatype" itself is not > stored in index. Solr has support for this in its "schema", but for Lucene > all fields are identical. For the query parser there is no possibility to > differentiate between a long, double or date. Could we add APIs to QueryParser so the application can state the disposition toward certain fields? EG QueryParser now tries to guess whether a range query's upper/lower bound should be parsed as dates, and there are methods exposed to set the resolution on a per-field basis. Maybe we could do something similar to declare that a given field uses Trie*, and with what datatype. Just thinking aloud really... but since we haven't yet released Trie*, now (for 2.9) is a good time to think hard about how we expose/integrate it... and making it easier to use ootb seems important. Mike --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org