Thank you for your help, Jack. I just wanted to know if there is any ready-made solution for this, because I really don't know how to extract meta information.
Awaiting your reply.
Thank you

On Tue, Feb 19, 2013 at 12:48 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> Use the standard update handler and pass the entire HTML page as literal
> text in a Solr XML document for the field that has the HTML strip filter,
> but be sure to escape the HTML (angle brackets, ampersands, etc.) syntax.
>
> You'll have to process meta information yourself.
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: Divyanand Tiwari
> Sent: Monday, February 18, 2013 10:52 PM
> To: solr-user@lucene.apache.org
> Subject: Re: How can i instruct the Solr/ Solr Cell to output the original
> HTML document which was fed to it.?
>
> Thank you for replying sir !!!
>
> I have two queries related with this -
>
> 1) So in this case which request handler I have to use because
> 'ExtractingRequestHandler' by default strips the html content and the
> default handler 'UpdateRequestHandler' does not accept the HTML contents.
>
> 2) How can I 'Extract' & 'Index' META information in the HTML document
> separately.
>
> Awaiting your reply....
> Thank you!!!

--
Regards,
Divyanand Tiwari
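
For illustration only, here is a minimal Python sketch of the approach Jack describes above: pull the META tags out of the page yourself, then build a Solr XML add document in which the raw HTML is escaped and passed as literal field text. It uses only the Python standard library. The field names ("id", "content", and the "meta_" prefix) and the function and class names are assumptions made up for this example and are not from the thread; they would have to match whatever your own schema defines.

```python
import re
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class MetaExtractor(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def build_solr_add_xml(doc_id, html):
    """Return a Solr <add> XML document for one HTML page.

    Field names here are hypothetical: "content" stands for the field
    whose analyzer has the HTML strip filter, and each META tag goes
    into a "meta_<name>" field.
    """
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()

    fields = [("id", doc_id), ("content", html)]
    for name, value in sorted(parser.meta.items()):
        # Keep field names to letters, digits and underscores.
        fields.append(("meta_" + re.sub(r"\W", "_", name), value))

    lines = ["<add>", "  <doc>"]
    for name, value in fields:
        # escape() turns &, < and > into entities, so the HTML
        # arrives in Solr as literal text rather than as XML markup.
        lines.append("    <field name=%s>%s</field>"
                     % (quoteattr(name), escape(value)))
    lines += ["  </doc>", "</add>"]
    return "\n".join(lines)
```

The resulting string would then be sent to the standard update handler (typically an HTTP POST to the core's /update path with an XML content type; the exact URL depends on your installation).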