By examining solr.log, I can see that Nutch is using the /update request handler rather than /update/extract. So this may be a more appropriate question for the Nutch mailing list. OTOH, y'all may know the answer off the top of your head.
Will Nutch boost text occurring in h1, h2, etc. more heavily than text in a normal paragraph? Can this weighting be tuned without writing a plugin? Or, in practice, is writing a plugin usually necessary to get the flexibility people need?

I wanted to call this post *Anatomy of a small scale search engine*, but lacked the nerve ;)

Thanks, all and many,

Dan Davis, Systems/Applications Architect
National Library of Medicine
