> OK, I think I see what you're up to. Might be pretty viable
> for me as well.
> Can you talk about anything in your mappings.txt files that
> is an important part of the solution?
It is not important; I just copied it. Also, the HTML strip char filter does not have a mappings parameter. That was a copy-paste mistake.

> Also, isn't there another piece? Don't you need to force it
> to return the whole document, rather than its usual context chunks?

Yes, you are right. &hl.fragsize=0 is needed.

> We have another requirement I forgot to mention, about wanting to
> associate a sequence number with each hit, but I imagine I can deal
> with that by putting some sort of identifiable char sequence in a
> custom prefix for the highlighting, then replacing that with a
> sequence number in postprocessing.
>
> I'm also wondering about the performance of this approach with large
> documents, vs. something like what Ludovic is talking about, where you
> would just get positions back from Solr, and fetch the document
> separately from a filestore.

Highlighting large documents takes time. Storing termVectors can speed it up. I don't know how the two approaches compare on performance. Perhaps someone familiar with highlighting can answer this.
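
For what it's worth, the sequence-number idea could look roughly like the sketch below. It is only a sketch under assumptions: the marker strings, the field name "content", and the output markup are all made up for illustration, and it assumes the standard highlighter, where hl.simple.pre and hl.simple.post set the strings wrapped around each hit. It builds the request parameters (including hl.fragsize=0) and then replaces each marker with a numbered tag.

```python
import re
from urllib.parse import urlencode

# Hypothetical markers; any sequence unlikely to occur in the documents works.
PRE = "@@HL_START@@"
POST = "@@HL_END@@"


def build_params(query, field):
    """Build a query string asking Solr to highlight the whole field."""
    return urlencode({
        "q": query,
        "hl": "true",
        "hl.fl": field,
        "hl.fragsize": 0,        # 0 = return the whole field, not fragments
        "hl.simple.pre": PRE,    # custom prefix placed before each hit
        "hl.simple.post": POST,  # custom suffix placed after each hit
    })


def number_hits(highlighted):
    """Replace each marker pair with a numbered span; return (text, hit count)."""
    counter = 0

    def repl(match):
        nonlocal counter
        counter += 1
        return f'<span class="hit" id="hit-{counter}">'

    text = re.sub(re.escape(PRE), repl, highlighted)
    return text.replace(POST, "</span>"), counter
```

You would pass build_params("foo", "content") as the query string of the select request, then run number_hits over the highlighted field value from the response. The numbering is per document, in the order the hits appear in the text.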