OK, sorry for not explaining my problem clearly earlier. We have around 5
fields in each document: ID, ISBN, author, title, and the category
this book falls under. (You are right about point 3; we are indeed storing
multiple genres against the book, which means 1 book, 1 doc.)
doc.add(new Fie
Hi Anish,
So am I getting something wrong here? You said "I have created a search
index on book Id , title ,and author from a database of books which fall
under various categories." so those are 3 fields, right?
1. How do you filter the doc types (as in the genres) at search time? Do you
even need
Hi Zhangchi,
Thanks for your reply.
We have about 3 million records (different ISBNs) in the database and
slightly more documents than that, and we wouldn't want to do the deduping at
indexing time, because one book (one ISBN) can be available under 2 or
more categories (like fiction, comics &
Hi Ian,
Thanks for your reply. We had actually done what you had suggested first,
and it wasn't working, so I was hoping for some sample code. But then we
found out that the field name on which we wanted the duplicate filter to be
applied was not actually indexed while adding it into the document
I think you should check the index first, using lukeall to see whether there
are duplicate books.
On Thu, 04 Mar 2010 20:43:26 +0800, ani...@ekkitab
wrote:
Hi there, could someone help me with the usage of DuplicateFilters? Here is
my problem:
I have created a search index on book
If the field you want to use for deduping is ISBN, create a
DuplicateFilter using whatever your ISBN field name is as the field
name and pass that to one of the search methods that takes a filter.
If your index is large I'd be worried about performance and would look
at deduping at indexing time i
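
For the indexing-time route, here is a minimal sketch of what the deduping could look like before anything is handed to the index writer. It uses only plain Java collections and no Lucene calls. The `Row` and `Book` classes, the `mergeByIsbn` method, and the sample values are all hypothetical, made up for illustration; the assumption is that the database returns one row per (ISBN, category) pair and that you want one merged record per ISBN carrying every category.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DedupeByIsbn {

    /** Hypothetical shape of one database row: a book listed under one category. */
    static final class Row {
        final String isbn;
        final String title;
        final String author;
        final String category;

        Row(String isbn, String title, String author, String category) {
            this.isbn = isbn;
            this.title = title;
            this.author = author;
            this.category = category;
        }
    }

    /** One merged record per ISBN, holding every category seen for that book. */
    static final class Book {
        final String isbn;
        final String title;
        final String author;
        final Set<String> categories = new LinkedHashSet<>();

        Book(String isbn, String title, String author) {
            this.isbn = isbn;
            this.title = title;
            this.author = author;
        }
    }

    /** Groups rows by ISBN so each book would become exactly one document. */
    static Map<String, Book> mergeByIsbn(List<Row> rows) {
        Map<String, Book> byIsbn = new LinkedHashMap<>();
        for (Row r : rows) {
            // The first row for an ISBN creates the record; later rows only add categories.
            Book b = byIsbn.computeIfAbsent(r.isbn, k -> new Book(r.isbn, r.title, r.author));
            b.categories.add(r.category);
        }
        return byIsbn;
    }

    public static void main(String[] args) {
        // Placeholder data, not real ISBNs.
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("isbn-1", "Book A", "Author X", "fiction"));
        rows.add(new Row("isbn-1", "Book A", "Author X", "comics"));
        rows.add(new Row("isbn-2", "Book B", "Author Y", "fiction"));

        for (Book b : mergeByIsbn(rows).values()) {
            System.out.println(b.isbn + " -> " + b.categories);
        }
    }
}
```

Each merged `Book` would then be turned into a single document, with the category field added once per entry in `categories`. Whether this is practical for about 3 million records depends on how the rows are read; sorting the query by ISBN and merging adjacent rows would avoid holding the whole map in memory.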