Hi,
The question is: why would one want to do this inside the codecs? As Robert said, you can add a DocValues field to do all of this. We already refactored document norms to be DocValues fields, so the IDs can be handled the same way: as a variable-width DocValues field. It would survive merging, is randomly accessible, and can be used to link to the outside. Inside Lucene you would always need an integer ID to reference docs across several documents. There is no need to change anything inside Lucene to get those IDs.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]

From: Grant Ingersoll [mailto:[email protected]]
Sent: Wednesday, July 10, 2013 1:11 PM
To: [email protected]
Subject: Re: Refactoring Lucene to Variable-Width DocIds

On Jul 10, 2013, at 2:37 AM, Uwe Schindler <[email protected]> wrote:

> The internal integers are only to be used *inside* the Lucene API and are not stable at all.

I think what Ed is getting at is: what if you threw out those assumptions, and the _internal_ ids were instead variable width (and perhaps stable)? Could you then forgo the mapping that everyone is talking about?

-Grant
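
As a rough illustration of the DocValues suggestion above, here is a toy model in plain Java. It is not Lucene code and uses no Lucene API; the names StableIdSketch, Segment and merge are made up for this sketch. It only assumes what the mail states: internal integer doc IDs may be renumbered on merge, while a per-document field holding an external ID travels with the document and can be read by random access.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical names, not the Lucene API).
// A segment keeps one external ID per document in a column that is
// addressed by the document's position, i.e. its internal integer doc ID.
class StableIdSketch {

    static final class Segment {
        private final List<String> externalIds = new ArrayList<>(); // per-document column
        private final List<Boolean> live = new ArrayList<>();       // false = deleted

        /** Adds a document and returns its internal doc ID within this segment. */
        int add(String externalId) {
            externalIds.add(externalId);
            live.add(Boolean.TRUE);
            return externalIds.size() - 1;
        }

        void delete(int docId) {
            live.set(docId, Boolean.FALSE);
        }

        /** Random access: internal doc ID -> external ID. */
        String externalId(int docId) {
            return externalIds.get(docId);
        }

        boolean isLive(int docId) {
            return live.get(docId);
        }

        int size() {
            return externalIds.size();
        }
    }

    /** Merges segments and drops deleted docs, so internal IDs are renumbered. */
    static Segment merge(Segment... segments) {
        Segment merged = new Segment();
        for (Segment s : segments) {
            for (int doc = 0; doc < s.size(); doc++) {
                if (s.isLive(doc)) {
                    merged.add(s.externalId(doc)); // the external ID is copied unchanged
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Segment a = new Segment();
        int deleted = a.add("user-17");
        int kept = a.add("user-42");   // internal doc ID 1 in segment a
        Segment b = new Segment();
        b.add("user-99");

        a.delete(deleted);
        Segment merged = merge(a, b);

        // After the merge the same document sits at internal doc ID 0,
        // but the external ID stored with it is still "user-42".
        System.out.println("before merge: doc " + kept + " -> " + a.externalId(kept));
        System.out.println("after merge:  doc 0 -> " + merged.externalId(0));
    }
}
```

In this model the integer is only a position inside one segment, so it shifts when deleted documents are dropped, while the value stored with the document is what you would hand to the outside.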
