A common approach (for web search engines) is to use HBase [1] as a "document repository". Each document indexed in Solr has an entry (a row, identified by the document URL) in the HBase table. This works well when you deal with a large data collection, since it scales better than a SQL database. The trade-off is that it is slightly slower than a local database.

[1] http://hadoop.apache.org/hbase/
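
To make the lookup pattern concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the DocumentRepository class is a hypothetical in-memory stand-in for the HBase table (no HBase client is used), the field names "source" and "content" are invented for the example, and the list of hits simply imitates the IDs that a Solr query would return.

```python
from typing import Dict, Iterable, List, Optional


class DocumentRepository:
    """Hypothetical in-memory stand-in for the HBase table.

    One row per document, keyed by the document URL.
    """

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, str]] = {}

    def put(self, url: str, source: str, content: str) -> None:
        # The row key is the document URL, the same value used as the Solr ID.
        self._rows[url] = {"source": source, "content": content}

    def get(self, url: str) -> Optional[Dict[str, str]]:
        # Returns None when no row exists for this URL.
        return self._rows.get(url)


def fetch_documents(
    repo: DocumentRepository, solr_ids: Iterable[str]
) -> List[Dict[str, str]]:
    """Resolve IDs returned by a search to full documents, skipping missing rows."""
    docs = []
    for doc_id in solr_ids:
        row = repo.get(doc_id)
        if row is not None:
            docs.append({"id": doc_id, **row})
    return docs


repo = DocumentRepository()
repo.put("http://example.org/a", "crawler", "Full text of page A")
repo.put("http://example.org/b", "feed", "Full text of page B")

# Assumed to be the IDs that came back from a Solr search.
hits = ["http://example.org/b", "http://example.org/missing"]
results = fetch_documents(repo, hits)
```

The search index only needs to return the ID; the full content is then read from the repository by key. In a real setup the put and get methods would be backed by the HBase table (or a MySQL table) instead of a dictionary.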
--
Renaud Delbru

roberto wrote:
Hello,

We are indexing information from different sources, so we would like to
centralize the content so that I can retrieve it using the ID
provided by Solr.

Has anyone done something like this, and do you have any advice? I am
thinking of storing the information in a database like MySQL.

Thanks,
