Hey,

In Lucene 3.1 we introduced PayloadProcessorProvider (PPP), which lets you rewrite the payloads of terms during a merge. The main scenario is merging indexes, where you want to rewrite/remap the payloads of the incoming indexes, but you can certainly also use it to rewrite the payloads of a term within a single index.
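To make "remap" concrete, here is a minimal standalone sketch of the kind of rewrite I have in mind. It deliberately does not use any Lucene classes. It assumes, purely for illustration, that each payload is a 4-byte big-endian int ordinal that has to be translated from the source index's numbering to the target's. The class name PayloadRemapSketch, the ordinal map and the encode/decode helpers are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Standalone illustration only: no Lucene classes are used here.
final class PayloadRemapSketch {

    // Hypothetical mapping from ordinals in the source index to ordinals in the target index.
    private final Map<Integer, Integer> sourceToTarget;

    PayloadRemapSketch(Map<Integer, Integer> sourceToTarget) {
        this.sourceToTarget = sourceToTarget;
    }

    // Rewrites the payload found in payload[start .. start+length).
    // Assumption: an ordinal payload is exactly 4 bytes, big-endian.
    byte[] remap(byte[] payload, int start, int length) {
        if (length != 4) {
            // Not an ordinal payload: return an unchanged copy.
            byte[] copy = new byte[length];
            System.arraycopy(payload, start, copy, 0, length);
            return copy;
        }
        int oldOrdinal = ByteBuffer.wrap(payload, start, length).getInt();
        Integer mapped = sourceToTarget.get(oldOrdinal);
        // Ordinals without a mapping are kept as they are.
        int newOrdinal = (mapped != null) ? mapped : oldOrdinal;
        return encode(newOrdinal);
    }

    static byte[] encode(int ordinal) {
        return ByteBuffer.allocate(4).putInt(ordinal).array();
    }

    static int decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getInt();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> mapping = new HashMap<>();
        mapping.put(3, 17); // source ordinal 3 becomes target ordinal 17
        PayloadRemapSketch sketch = new PayloadRemapSketch(mapping);

        System.out.println(decode(sketch.remap(encode(3), 0, 4))); // mapped ordinal
        System.out.println(decode(sketch.remap(encode(5), 0, 4))); // unmapped, kept as-is
    }
}
```

In a real setup this logic would sit behind whatever the PPP hands back for a given directory and term. The sketch only shows the byte-level rewrite, not the Lucene wiring.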
When we worked on it, we thought of two ways a user can rewrite payloads when merging indexes:

1) Set PPP on the target IW and call addIndexes(IndexReader). The PPP is applied to the incoming directories only.
2) Set PPP on the source IW, call IW.optimize(), then call targetIW.addIndexes(Directory).

The latter is better. In both cases the incoming segments are rewritten anyway, but in the first case you might also run into merging segments of the target index, which is something you might want to avoid (that was the purpose of optimizing addIndexes(Directory)).

But it turns out the latter is not so easy to achieve. If the source index has only one segment (in my case, ~100% of the time), calling optimize() does nothing, because the MP thinks the index is already optimized and returns no MergeSpec.

To overcome this, I wrote a ForceOptimizeMP, which extends LogMP and forces an optimize even if there is only one segment. Another option is to set noCFSRatio to 1.0 and flip the useCompoundFile flag (i.e. if the source is compound, create non-compound, and vice versa). That can work too, but I don't think it's very good, because the source index gets changed from compound to non-compound (or vice versa), which is something the app didn't want.

So I think the first workaround (ForceOptimizeMP) is better, but I wanted to ask if someone knows of a better way to achieve this?

Shai