What about if we define an id field (like in solr)?
Last time I floated the idea of supporting primary keys as a core concept in Lucene (in the context of helping doc updates, not linking), there were objections along the lines of "lucene shouldn't try to be a database".

On 8 Nov 2010, at 20:47, Ryan McKinley <[email protected]> wrote:

> On Mon, Nov 8, 2010 at 2:52 PM, mark harwood <[email protected]> wrote:
>> I came to the conclusion that the transient meaning of document ids is too deeply ingrained in Lucene's design to use them to underpin any reliable linking.
>
> What about if we define an id field (like in solr)? Whatever does the traversal would need to make a Map<id,docID>, but that is still better than needing to do a query for each link.
>
>> While it might work for relatively static indexes, any index with a reasonable number of updates or deletes will invalidate any stored document references in ways which are very hard to track. Lucene's compaction shuffles IDs without taking care to preserve identity, unlike graph DBs like Neo4j (see "recycling IDs" here: http://goo.gl/5UbJi )
>
> oh ya -- and it is even more awkward since each subreader often reuses the same docId
>
> ryan

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
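
To make the Map<id,docID> suggestion from the thread concrete, here is a minimal sketch in plain Java. It does not use any Lucene API: the list position stands in for the transient docID, the string stands in for the value of an application-defined id field, and the class and method names (IdFieldLinks, buildIdToDocId, compact) are hypothetical, chosen only for illustration. It assumes id values are unique. The point it tries to show is that links stored as id values still resolve after docIDs are renumbered, as long as the map is rebuilt for the new snapshot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index: the position in the list plays the role
// of the transient docID, the string is the value of the "id" field.
final class IdFieldLinks {

    // Build the id -> docID lookup once per index snapshot, so following a
    // link is a map lookup rather than a query per link.
    static Map<String, Integer> buildIdToDocId(List<String> idFieldByDocId) {
        Map<String, Integer> idToDocId = new HashMap<>();
        for (int docId = 0; docId < idFieldByDocId.size(); docId++) {
            idToDocId.put(idFieldByDocId.get(docId), docId);
        }
        return idToDocId;
    }

    // Simulate compaction (an assumption about its effect, not Lucene code):
    // the deleted document is dropped and the survivors are renumbered.
    static List<String> compact(List<String> idFieldByDocId, String deletedId) {
        List<String> compacted = new ArrayList<>();
        for (String id : idFieldByDocId) {
            if (!id.equals(deletedId)) {
                compacted.add(id);
            }
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("a", "b", "c");
        Map<String, Integer> before = buildIdToDocId(index);

        // Delete "b" and compact: "c" moves from docID 2 to docID 1.
        List<String> compacted = compact(index, "b");
        Map<String, Integer> after = buildIdToDocId(compacted);

        // A link stored as the id value "c" resolves in both snapshots,
        // whereas a link stored as the raw docID 2 would now be out of range.
        System.out.println("c before compaction: docID " + before.get("c"));
        System.out.println("c after compaction:  docID " + after.get("c"));
    }
}
```

The trade-off is the one raised in the thread: the map has to be rebuilt whenever the index changes, but that is one pass per snapshot instead of one query per link.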
