I have crawled a set of URLs using Nutch 2.2.1, with the data stored in MySQL, and indexed it using Solr. Now I want to display the search results on the front end (using 'ajax-solr'), but I am not sure how to show a snippet below the title the way Google does.
When the Nutch crawler crawls a site, it grabs all the text on each page, including the banner, navigation, and so on, and stores it in a field called 'text' (earlier it was called 'content'). If I use that 'text' field as the snippet on the search results page, it looks odd. The snippet comes out something like this:
*Publications [Jump to the main content of this page] Home Publications Home Author's Corner All Publications Advanced Search Site Map Search Online Publications Ordering printed copies. Electronic Mailing List : Keep informed about our new publications. Technical Help : Problems or questions with our site?*
As the sample above shows, the snippet contains the banner text, navigation such as '[Jump to the main content of this page]', and a lot of other unnecessary information, rather than a description of the page. I have to crawl sites with an unknown or poor structure over which I have no control. How can I display a proper snippet with less garbage in the search results (something similar to the snippet in a Google search result)?

