By 'order', I mean that I'm adding the documents in the big index
sorted
by date (in order to increase the sorting process). I wanna preserve
this sorting after the merging process.
I'm not using the internal lucene ID in the code field. The code
field
contains my own IDs. I was asking, if I can do the merge using my own
IDs (the code field), and not the lucene internal IDs, for example:
luceneID_0, code_x, title_x, content_x, language_x, date_x
luceneID_1, code_y, title_y, content_y, language_y, date_y
luceneID_0, code_y, cluster_y
luceneID_1, code_x, cluster_x
Will the prevous index structure procude an unconsistent merged
index?
I wanna achieve the following merged index:
luceneID_0, code_x, title_x, content_x, language_x, date_x, cluster_x
luceneID_1, code_y, title_y, content_y, language_y, date_y, cluster_y
Thanks
Otis Gospodnetic wrote:
Albert,
--- Albert Vila <[EMAIL PROTECTED]> wrote:
Thanks Otis, but I can merge two indexes with different fields?
Yes. Documents with different Fields can be stored in the same
index.
Not every Document has to have all fields, and it can even have a
completely different set of Fields.
My big index has this fields, code, title, content, language and
date. I add the new documents incrementally.
The clustering index only contains the fields code, and cluster.
Merging
the big index with the clustering one will preserve the order of
the
big one?
I don't fully understand what you mean by 'order'. If you are
asking
whether internal document Ids will remain the same, the answer is
negative. If you have deleted some documents, there will be gaps in
document Id sequence, which Lucene will fill, thus re-assigning
internal document Ids.
For example, if I have the following indexes:
Big index
code_1, title_1, content_1, language_1, date_1
code_2, title_2, content_2, language_2, date_2
....
Clustering index
code_1, cluster_1
code_2, cluster_2
....
then the new merged index will be:
Merged index
code_1, title_1, content_1, language_1, date_1, cluster_1
code_2, title_2, content_2, language_2, date_2, cluster_2
....
If I can do that then fine, but I think the merging process uses
the
lucene internal ID to match the documents. I wanna use the code
field
to
do that matching, is that possible?. I cannot be sure the lucene
internal ID's are the same for the same codes in both indexes.
Are you storing the internal Lucene Document Id in the 'code' field?
If you are, I suggest you change your application to use its own set
of
unique Ids to serve as 'primary keys' in your indices.
Otis
Thanks again,
Albert
Otis Gospodnetic wrote:
(re-directing to lucene-user list)
Albert,
If I understand your question correctly... You could run a query
like
the one you gave on both indices, but if one of them contains
documents
that have only one of those fields (cluster), then there will
never
be
any matches in the second index.
However, why not leave your big index along, add documents to a
new,
smaller index, and then merge them periodically. I may be off
with
this; it sounds like this is what you want to do, but I'm not
certain I
understood you fully.
Otis
--- Albert Vila <[EMAIL PROTECTED]> wrote:
Hi all,
I was wondering If I can search using the MultiSearcher over two
diferent indexes at the same time (with diferent fields).
I've got one big index, with the code, title, content, language,
etc
fields (new documents are added incrementally). Now, I have to
introduce
a clustering field. The problem is that I have to update the
whole
index
each time the clusters change, and I have no enought time to do
it
(I
wanna check for new clusters every 10 minuts and I spent 25
minutes
to
reindex the whole index).
A query example could be: language:0 and title:java and cluster:0
Can I leave the big index whitout any changes and create a new
index
with only the following fields, code and cluster, and perform the
searches using this two indexes? I think I cannot do that without
changing the code. It would need a postprocess, matching all
returning
codes from index 1 with index 2.
Anyone have a solution for this problem? I would appreciate that.
--
Albert Vila
Director de proyectos I+D
http://www.imente.com
902 933 242
[iMente “La información con más beneficios”]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail:
[EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
Albert Vila
Director de proyectos I+D
http://www.imente.com
902 933 242
[iMente “La información con más beneficios”]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]