Hi all, I am trying to figure out a way to implement the following use case with Lucene/Solr.
In order to support simple incremental updates on the master, I need to index and store a UID field on a collection of 300 million documents (my UID is a 32-byte sequence). On the slaves, during normal searching, I do not need the UID indexed, only stored.

The problem is that my term dictionary blows up with the sheer number of unique IDs. The number of unique terms in this collection, excluding the UID, is less than 7 million. I can tolerate the resource hit on the updater (big hardware, on-disk index...). This is a master-slave setup where the searchers run from a RAM disk, and holding 300 million * 32 bytes (give or take prefix compression), plus pointers to postings, plus the postings themselves, is something I would really love to avoid, as it is significant compared to the really small documents I have.

Cutting to the chase: how can I have an indexed UID field and then, when done with indexing, do one of the following?

1) Load a "searchable" index into RAM from such an on-disk index, without that one field?
2) Create two indices kept in sync on docIDs, one containing only the indexed UID?
3) Somehow transform the index with the indexed UID by dropping the UID field while preserving docIDs? A kind of smart index-editing tool.

Or is there something else already available that I do not know about?

Preserving docIDs is crucial, as I need support for lovely incremental updates (like in the Solr master-slave update). Also, the stored field should remain!

I am not looking for "use an MMAPed index and let the OS deal with it" advice...

I do not mind doing it with the flex branch (4.0), not being in a hurry.

Thanks in advance,
Eks
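
For scale, here is a rough back-of-envelope sketch of the raw UID term bytes alone. It assumes exactly one unique 32-byte UID per document and deliberately ignores prefix compression, pointers to postings, and the postings themselves, so it is only an illustration of the order of magnitude, not a measurement of a real index. The function name is made up for this example.

```python
# Back-of-envelope estimate of the raw UID term bytes only.
# Assumptions (illustrative): 300 million documents, one unique
# 32-byte UID each, no prefix compression, and nothing counted for
# postings pointers or postings.

def raw_uid_bytes(num_docs: int, uid_len_bytes: int) -> int:
    """Total bytes needed to hold every UID term uncompressed."""
    return num_docs * uid_len_bytes

NUM_DOCS = 300_000_000  # "300Mio" documents in the collection
UID_LEN = 32            # bytes per UID

total = raw_uid_bytes(NUM_DOCS, UID_LEN)
print(f"{total} bytes = {total / 10**9:.1f} GB (decimal)")
```

Any real term dictionary will differ from this figure depending on how well the UIDs share prefixes and on the index format in use.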