I wonder how the "field collapsing" patch holds up on an index that contains 3 
million documents. That is probably larger than your EAD-only index, but I'm 
thinking about combining EAD in one index with many, many other documents (as 
with a library catalog).  Might be fine, might not.

(Even without field collapsing, my Solr index is really straining against the 
numerous facets I'm making it calculate and the dismax queries involving a 
dozen or more fields. I plan to reduce my fields, reduce my facets if 
possible, and most importantly give my Solr a LOT more RAM than it has now. 
Complex queries with complex faceting on a several-million-doc index require 
giving Solr a LOT more RAM for caches etc. than we initially expected. I throw 
this in as a note to anyone else in the planning stages.) 

I've been brainstorming other weird ways to do this. This one is totally wacky 
and possibly a bad idea, but I'll throw it out there anyway. What if you 
indexed the entire EAD as just one document, BUT threw the entire EAD in a 
stored field, and used Solr highlighting on that field?  NOT to show the 
highlighter results to the user, but to sort of trick the highlighter, using 
hl.fragmenter/fragmentsBuilder (possibly with a custom component in a jar), 
into telling you _which_ sub-sections of the EAD matched. Your software could 
then display the matching sub-sections (possibly with direct links to display) 
in the search results, under the actual document hit. 
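
To make that a bit more concrete, here is a rough Python sketch of the 
client-side half of the idea. It is only an illustration under assumptions, 
not something I have run against a real index: the field name "ead_text", the 
section ids, and the list of section start offsets (which you would have to 
record yourself at index time) are all made up, and it assumes the highlighter 
wraps matches in its default <em> tags. It uses the stock highlighter 
parameters rather than a custom fragmenter, and just maps each returned 
snippet back to the sub-section it falls in.

```python
import re
from bisect import bisect_right
from urllib.parse import urlencode


def highlight_params(query, field="ead_text", max_snippets=20):
    """Build a Solr query string that asks for highlight fragments on one
    stored field. The field name "ead_text" is hypothetical."""
    return urlencode({
        "q": query,
        "wt": "json",
        "hl": "true",
        "hl.fl": field,
        "hl.snippets": max_snippets,
        "hl.fragsize": 100,
    })


def matching_sections(stored_text, section_starts, snippets):
    """Map highlighter snippets back to the sub-sections they came from.

    stored_text    -- the full text held in the stored field
    section_starts -- list of (start_offset, section_id), sorted by offset;
                      assumed to have been recorded at index time
    snippets       -- highlight fragments, with matches wrapped in <em>...</em>
    """
    offsets = [start for start, _ in section_starts]
    hits = []
    for snippet in snippets:
        # Strip the highlight tags so the snippet can be located in the text.
        plain = re.sub(r"</?em>", "", snippet)
        pos = stored_text.find(plain)
        if pos < 0:
            continue  # snippet not found verbatim; skip it
        # Shift to the first highlighted term, in case the fragment itself
        # starts in the previous sub-section.
        pos += max(snippet.find("<em>"), 0)
        idx = bisect_right(offsets, pos) - 1
        if idx >= 0:
            section_id = section_starts[idx][1]
            if section_id not in hits:
                hits.append(section_id)
    return hits
```

One known weakness: a snippet whose text occurs more than once in the EAD is 
attributed to its first occurrence only. A custom fragmenter that reports 
offsets directly would avoid that, at the cost of maintaining the jar.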

Possibly a really screwy idea, just throwing it out there. Solr highlighting 
can be a performance problem on very large stored documents too; I'm not sure 
if typical EAD is 'very large' for these purposes, or if it's something that 
can be solved by throwing enough RAM at caches. But I guess something about the 
field collapsing patch makes me nervous: the comments about its performance 
being uncertain on very large result sets, or just nervousness about applying 
a patch to Solr and counting on someone else to keep it working against Solr 
master as it develops. 

Jonathan
