Re: Generic xsl parser plugin

2014-09-25 Thread Albin Vigier
Hello everybody, I'm just wondering if it is possible to fetch specific metadata with an existing Nutch plugin. Let's take an example. I want to extract some metadata from "div" or "td" tags with specific ids in HTML pages and name the resulting fields the way I like (this is done at parse time). Then, a
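
To make the request above concrete, here is a minimal sketch of the idea: a small rule table maps a metadata field name to a tag and id, and each match becomes a named metadata value. This is not Nutch code and does not use any Nutch API. It uses only Python's standard library, the rule table, field names, and sample page are hypothetical, and it assumes the page is well-formed XHTML (real-world HTML would first need a tolerant parser).

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: metadata field name -> (tag, id) to look up.
RULES = {
    "product_title": ("div", "title"),
    "product_price": ("td", "price"),
}


def extract_metadata(xhtml, rules):
    """Return {field: text} for every rule whose element exists in the page.

    Assumes `xhtml` is well-formed XML; ET.fromstring raises ParseError otherwise.
    """
    root = ET.fromstring(xhtml)
    metadata = {}
    for field, (tag, element_id) in rules.items():
        # ElementTree supports this limited XPath subset: descendant search
        # combined with an attribute-equality predicate.
        node = root.find(f".//{tag}[@id='{element_id}']")
        if node is not None:
            # Join all nested text so inline markup such as <b> is flattened.
            metadata[field] = "".join(node.itertext()).strip()
    return metadata


# Hypothetical sample page, used only for illustration.
SAMPLE = """<html><body>
<div id="title">Blue <b>Widget</b></div>
<table><tr><td id="price"> 19.90 </td></tr></table>
</body></html>"""

result = extract_metadata(SAMPLE, RULES)
```

Keeping the rules in a separate table (or an external file) means new fields can be added without touching the extraction code.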

Re: Generic xsl parser plugin

2014-09-26 Thread Albin Vigier
>>>>> http://www.atlantbh.com/precise-data-extraction-with-apache-nutch/ >>>>>>> Is the nutch community going to use this? >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Sep 25, 2014

Re: Generic xsl parser plugin

2014-09-27 Thread Albin Vigier
d in an > external file is definitely a good idea and would make a great contribution > to the project. In a nutshell, you haven't missed anything and that wheel > definitely needs inventing ;-) > > Best > > Julien > > > On 25 September 2014 09:24, Albin Vigi

Re: [jira] [Updated] (NUTCH-1644) Should have a parser that uses xpath

2014-11-05 Thread Albin Vigier
Hello Sebastian, I'll look at the xjb failure. I'm so glad to see that it will be integrated into Ivy! For the examples part, I normally added some commented tests in the tests folders. I'll also look into providing a conf if one doesn't already exist. I'll keep you posted. Thanks, Albin On Mon, Nov 3, 2