Thanks Nick, I'll try it over there.

Thanks and Regards,
Swapna.
-----Original Message-----
From: Nick Burch [mailto:[email protected]]
Sent: Tuesday, December 20, 2011 10:26 AM
To: [email protected]
Subject: RE: Capture and map div tags

On Tue, 20 Dec 2011, Swapna Vuppala wrote:
> Can someone please suggest the method to capture the content within the
> "div" tag of a particular class?

You'll likely have more luck asking on the SOLR list, as this looks to be
a SOLR-specific query and not Tika related.

Nick

> From: Swapna Vuppala [mailto:[email protected]]
> Sent: Thursday, December 15, 2011 12:30 PM
> To: [email protected]
> Subject: Capture and map div tags
>
> Hi,
>
> I understand that we can specify parameters in ExtractingRequestHandler in
> solrconfig.xml to capture HTML tags of a particular type and map them to
> desired solr fields, like something below.
>
> <str name="capture">div</str>
> <str name="fmap.div">mysolrfield</str>
>
> The above setting will capture content in "div" tags and copy it to the solr
> field "mysolrfield".
>
> What I am interested in is capturing div tags with a particular class name
> into a solr field. When extracting content from outlook messages, I would
> like the content within <div class="message-body"> to go into one solr field
> and the content within <div class="attachment-entry"> to go into another
> solr field.
>
> Can someone please let me know how to achieve this?
>
> Thanks and Regards,
> Swapna.
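
The "capture" setting quoted above is keyed on the tag name ("div"), not on the class attribute. One possible workaround, offered only as a sketch and not as a Solr or Tika feature, is to split the XHTML by class on the client side before indexing. The example below assumes the XHTML for the message is already available as a string, and the field names "body_field" and "attachment_field" are hypothetical placeholders. It uses only Python's standard html.parser module.

```python
from html.parser import HTMLParser


class DivClassExtractor(HTMLParser):
    """Collect the text inside <div> elements, grouped by class name."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.captured = {name: [] for name in self.wanted}
        self._active = None   # class name of the div currently being captured
        self._depth = 0       # how many divs deep we are inside that div
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._active is not None:
            # A nested div inside a captured div: track depth only.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self._active = name
                self._depth = 1
                self._buffer = []
                break

    def handle_endtag(self, tag):
        if tag != "div" or self._active is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace before storing the text.
            text = " ".join("".join(self._buffer).split())
            self.captured[self._active].append(text)
            self._active = None

    def handle_data(self, data):
        if self._active is not None:
            self._buffer.append(data)


def extract_div_fields(xhtml, field_by_class):
    """Map div class names to (hypothetical) target field names."""
    parser = DivClassExtractor(field_by_class)
    parser.feed(xhtml)
    parser.close()
    return {field_by_class[cls]: texts for cls, texts in parser.captured.items()}


if __name__ == "__main__":
    sample = (
        '<html><body>'
        '<div class="message-body">Hello <b>world</b> <div>nested</div></div>'
        '<div class="attachment-entry">report.pdf</div>'
        '</body></html>'
    )
    # Field names here are placeholders, not fields from any real schema.
    fields = extract_div_fields(
        sample,
        {"message-body": "body_field", "attachment-entry": "attachment_field"},
    )
    print(fields)
```

The resulting dictionary could then be sent to Solr as ordinary document fields through whatever update path is already in use. Whether this can be done inside ExtractingRequestHandler itself is a question for the SOLR list, as Nick suggests.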
