Thank you, Peter.
Actually, I am already using the method you suggested. I was just
thinking that a separate field for record identification adds overhead,
since the doc_id is (if I am not mistaken) the smallest and fastest
possible way to retrieve records.
Regards,
Alex
On 2014-1-14, 6:18 PM, Peter Karman wrote:
> On 1/14/14 3:03 AM, Aleksandar Radovanovic wrote:
>> Hi there,
>> I was wondering whether it is possible to get the doc_id during the
>> indexing process, or can I simply assume that doc_id starts from 0
>> and increments with each record added?
>
> Even if you could, I would not recommend that approach for solving
> your problem. The doc_id is an internal implementation detail.
> Instead, why not assign a unique term (like a URI) to each document in
> your index, and reference that externally?
>
> You could also, post indexing, iterate over the Lexicons in an index
> and create a new index based on your keyword identification. Note that
> 'keyword' might be a misnomer depending on what Analysis classes you
> apply to your documents: i.e., you might have phrases, etc., not just
> single terms.
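
To illustrate the suggestion of referencing documents by a unique term
rather than by doc_id, here is a minimal, library-agnostic sketch in
Python. It is not the search library's API: ToyIndex, its methods, and
the example.org URIs are all hypothetical names made up for this
example. The toy index renumbers its internal ids when a document is
removed, which is an assumption chosen only to show why an internal id
may not be safe to store externally.

```python
class ToyIndex:
    """Toy in-memory index (hypothetical, not a real library's API)."""

    def __init__(self):
        # A document's position in this list stands in for the internal doc_id.
        self._docs = []

    def add_doc(self, uri, content):
        # The URI is the externally visible key, so it must be unique.
        if self.doc_id_for(uri) is not None:
            raise ValueError(f"duplicate uri: {uri}")
        self._docs.append({"uri": uri, "content": content})

    def delete_by_uri(self, uri):
        # Dropping a document shifts every later position, so internal ids change.
        self._docs = [d for d in self._docs if d["uri"] != uri]

    def doc_id_for(self, uri):
        # Linear scan keeps the sketch short; a real engine would use its term index.
        for doc_id, doc in enumerate(self._docs):
            if doc["uri"] == uri:
                return doc_id
        return None

    def fetch_by_uri(self, uri):
        doc_id = self.doc_id_for(uri)
        return None if doc_id is None else self._docs[doc_id]


index = ToyIndex()
index.add_doc("http://example.org/a", "first record")
index.add_doc("http://example.org/b", "second record")
index.add_doc("http://example.org/c", "third record")

id_before = index.doc_id_for("http://example.org/c")
index.delete_by_uri("http://example.org/a")
id_after = index.doc_id_for("http://example.org/c")

print(id_before, id_after)  # prints "2 1": the internal id moved
print(index.fetch_by_uri("http://example.org/c")["content"])  # prints "third record"
```

The URI lookup still finds the same record after the internal id has
changed, which is the property an external reference needs. The cost is
one extra field per document, which is the overhead discussed above.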