How about splitting the 32-byte field into, for example, 16 subfields of 2 bytes each? Any direct query on that field would then need to be transformed into a boolean query requiring all 16 subfield terms. Would that work?
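
A minimal sketch of that idea, in plain Java with no Lucene dependency. The class name `UidSubfields`, the field names `uid_0` to `uid_15`, and the hex encoding of each 2-byte chunk are assumptions for illustration only. Each subfield can hold at most 2^16 = 65,536 distinct terms, so the 16 subfields together contribute at most 1,048,576 terms to the term dictionary, however many documents there are. The `+field:term` form is the required-clause syntax of the classic Lucene query parser; building the equivalent BooleanQuery programmatically is left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a 32-byte UID into 16 two-byte subfield terms.
final class UidSubfields {
    static final int UID_BYTES = 32;
    static final int CHUNK_BYTES = 2;

    // Returns one "field:term" string per chunk, e.g. "uid_0:0001".
    // The chunk position is part of the field name, so the same two bytes
    // at different offsets produce different terms.
    static List<String> split(byte[] uid) {
        if (uid.length != UID_BYTES) {
            throw new IllegalArgumentException("expected a " + UID_BYTES + "-byte UID");
        }
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < UID_BYTES; i += CHUNK_BYTES) {
            String hex = String.format("%02x%02x", uid[i] & 0xff, uid[i + 1] & 0xff);
            terms.add("uid_" + (i / CHUNK_BYTES) + ":" + hex);
        }
        return terms;
    }

    // Rewrites a direct UID lookup as a conjunction in which
    // all 16 subfield terms are required.
    static String toConjunction(byte[] uid) {
        List<String> clauses = new ArrayList<>();
        for (String term : split(uid)) {
            clauses.add("+" + term);
        }
        return String.join(" ", clauses);
    }
}
```

At index time each document would get the 16 subfield terms from `split`; at update time the lookup for an existing UID would use the conjunction from `toConjunction`.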

Regards,
Paul Elschot

On Thursday 21 October 2010 21:44:34, eks dev wrote:
> Hi All,
> I am trying to figure out a way to implement the following use case with
> lucene/solr.
>
> In order to support simple incremental updates (master), I need to index and
> store a UID field on a collection of 300 million documents (my UID is a
> 32-byte sequence). But I do not need it indexed (only stored) during normal
> searching (slaves).
>
> The problem is that my term dictionary gets blown up by the sheer number of
> unique IDs. The number of unique terms in this collection, excluding the UID,
> is less than 7 million.
> I can tolerate the resource hit on the Updater (big hardware, on-disk index...).
>
> This is a master-slave setup, where searchers run from a RAMDisk, and having
> 300 million * 32 bytes (give or take prefix compression) plus pointers to
> postings and the postings themselves is something I would really love to
> avoid, as this is significant compared to the really small documents I have.
>
> Cutting to the chase:
> How can I have an indexed UID field and, when done with indexing, do one of
> the following?
> 1) Load a "searchable" index into RAM from such an index on disk, without
>    that one field.
> 2) Create 2 indices in sync on docIDs, one containing only the indexed UID.
> 3) Somehow transform the index with the indexed UID by dropping the UID
>    field, preserving docIDs. A kind of smart index-editing tool.
>
> Is there something else already there that I do not know of?
>
> Preserving docIDs is crucial, as I need support for lovely incremental
> updates (like in solr master-slave update). Also, the stored field should
> remain!
> I am not looking for "use an MMAPed index and let the OS deal with it"
> advice...
> I do not mind doing it with the flex branch 4.0, not being in a hurry.
>
> Thanks in advance,
> Eks

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org