What about if we define an id field (like in Solr)?

Last time I floated the idea of supporting primary keys as a core concept in
Lucene (in the context of helping doc updates, not linking) there were
objections along the lines of "Lucene shouldn't try to be a database".


On 8 Nov 2010, at 20:47, Ryan McKinley <[email protected]> wrote:

On Mon, Nov 8, 2010 at 2:52 PM, mark harwood <[email protected]> wrote:
I came to the conclusion that the transient meaning of document ids is too
deeply ingrained in Lucene's design to use them to underpin any reliable
linking.

What about if we define an id field (like in Solr)?

Whatever does the traversal would need to make a Map<id,docID>, but
that is still better than needing to do a query for each link.
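
A minimal sketch of the Map<id,docID> idea, in plain Java with no Lucene calls. The storedIds array is a hypothetical stand-in for reading the id field of each document from the current reader (null marks a deleted document), and the class and method names are made up for illustration. The map is only valid for the reader it was built from and would need rebuilding after a reopen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IdFieldMap {
    // storedIds[docID] is the application-level id of that document,
    // or null if the document is deleted (assumption for this sketch).
    static Map<String, Integer> buildIdToDocId(String[] storedIds) {
        Map<String, Integer> idToDocId = new HashMap<>();
        for (int docId = 0; docId < storedIds.length; docId++) {
            String id = storedIds[docId];
            if (id != null) {
                idToDocId.put(id, docId);
            }
        }
        return idToDocId;
    }

    // Resolve links expressed as id values to current docIDs with one
    // map lookup each, instead of one query per link. Unknown ids are skipped.
    static List<Integer> resolveLinks(Map<String, Integer> idToDocId, String[] linkedIds) {
        List<Integer> docIds = new ArrayList<>();
        for (String id : linkedIds) {
            Integer docId = idToDocId.get(id);
            if (docId != null) {
                docIds.add(docId);
            }
        }
        return docIds;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = buildIdToDocId(new String[] {"a", null, "c"});
        System.out.println(resolveLinks(map, new String[] {"c", "missing", "a"}));
    }
}
```

The links themselves are stored as id values, so they survive docID renumbering; only the map has to be rebuilt.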


While it might work for relatively static indexes, any index with a reasonable
number of updates or deletes will invalidate any stored document references in
ways which are very hard to track. Lucene's compaction shuffles IDs without
taking care to preserve identity, unlike graph DBs like Neo4j (see "recycling
IDs" here: http://goo.gl/5UbJi )


oh ya -- and it is even more awkward since each subreader often reuses
the same docId
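
To illustrate the subreader point: each segment numbers its documents from 0, so a local docId only becomes unique once the segment's base offset is added. The sketch below is plain Java with made-up names (SegmentDocIds, docBases, toGlobal) and assumed segment sizes; it is not Lucene's own code, only the arithmetic behind the usual docBase offset.

```java
class SegmentDocIds {
    // docBases[i] = total number of documents in all segments before segment i.
    static int[] docBases(int[] segmentSizes) {
        int[] bases = new int[segmentSizes.length];
        int running = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            bases[i] = running;
            running += segmentSizes[i];
        }
        return bases;
    }

    // A local docId is only meaningful together with its segment.
    static int toGlobal(int[] docBases, int segment, int localDocId) {
        return docBases[segment] + localDocId;
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {3, 2, 4}); // hypothetical segment sizes
        // Local docId 0 exists in every segment but maps to a different global id.
        for (int seg = 0; seg < bases.length; seg++) {
            System.out.println("segment " + seg + ", local 0 -> global " + toGlobal(bases, seg, 0));
        }
    }
}
```

Any Map<id,docID> built per subreader would therefore need either the segment alongside the docId or the base already added in.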

ryan

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]





