I am guessing your Enterprise system deletes/updates rows in the RDBMS,
and your SOLR instance indexes that data. In addition, your front-end
interacts with both SOLR and the RDBMS. At the front-end level, when a
search sent to SOLR returns primary keys, you can check the database for
those primary keys before sending output to end users.
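A minimal sketch of that front-end check, with an in-memory sqlite table standing in for the RDBMS; `solr_hits`, `filter_stale_hits`, and the `courses` table are hypothetical names for illustration, not anything from your system:

```python
import sqlite3

def filter_stale_hits(conn, solr_hits):
    """Keep only the Solr-returned primary keys that still exist in the DB."""
    placeholders = ",".join("?" * len(solr_hits))
    rows = conn.execute(
        f"SELECT id FROM courses WHERE id IN ({placeholders})", solr_hits
    )
    live = {r[0] for r in rows}
    # Preserve Solr's ranking order while dropping deleted records.
    return [pk for pk in solr_hits if pk in live]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO courses VALUES (?, ?)",
                 [(1, "Algebra"), (3, "Chemistry")])
# Solr still returns id 2, which the other system has already deleted.
print(filter_stale_hits(conn, [1, 2, 3]))  # -> [1, 3]
```

The IN-clause lookup is one round trip per search, so the extra latency is usually small compared to the search itself.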
To remove records from the index, the best-performing approach is to run
Master-Slave SOLR instances: remove data from the Master SOLR, then
commit/synchronize with the Slave nightly (when traffic is lowest). SOLR
won't be fully in sync with the database, but you can always retrieve PKs
from SOLR, check the database for those PKs, and 'filter' the output...
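Since you don't keep a deletion history, the nightly purge on the Master can instead diff the IDs the index holds against the IDs still in the table. A hedged sketch of that diff, again using an in-memory sqlite table as a stand-in; `indexed_ids` is a placeholder for the full ID list you would page out of SOLR (e.g. a query returning only the key field):

```python
import sqlite3

def ids_to_purge(conn, indexed_ids):
    """IDs present in the index but gone from the table: deleted records."""
    db_ids = {row[0] for row in conn.execute("SELECT id FROM courses")}
    return sorted(set(indexed_ids) - db_ids)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY)")
# The other Enterprise system has deleted ids 2 and 5 from the table.
conn.executemany("INSERT INTO courses VALUES (?)", [(1,), (3,), (4,)])
print(ids_to_purge(conn, [1, 2, 3, 4, 5]))  # -> [2, 5]
```

Each returned ID would then be deleted from the Master SOLR (delete-by-id), followed by a single commit before the nightly sync.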
--
Thanks,
Fuad Efendi
416-993-2060(cell)
Tokenizer Inc.
==============
http://www.linkedin.com/in/liferay
Quoting sundar shankar <[EMAIL PROTECTED]>:
Hi,
We have an index of courses (about 4 million docs in prod) and
we have a nightly job that picks up newly added courses and updates
the index accordingly. Another Enterprise system shares the same
table and can delete data from it as well.
I just want to know the best practice for finding deleted
records and removing them from my index. Unfortunately for us,
we don't maintain a history of the deleted records, and that's a big
bane.
Please advise on what might be the best way to implement this?
-Sundar