I looked at HtmlParseFilter. I think that's exactly what I need, but for other file types as well, not just HTML. Is there any reason why this behaviour was implemented only for HTML files?
I'm thinking of extending this implementation so that it would be available for other types. Any advice on that?

On Wed, Mar 7, 2012 at 11:14 PM, Ferdy Galema <[email protected]> wrote:

> Hi,
>
> Do you mean running multiple parsers in a single parse action? That is
> currently only possible for html types. Take a look at HtmlParseFilter for
> that. You can chain multiple parsers for a single url, in addition to
> regular html parsing. For other types it's not possible.
>
> If this is about running a parse implementation on all urls regardless of
> mimetype, you have to change the parser mappings in parse-plugins.xml
> and the parser's plugin.xml. But again there is only support for running
> one Parser on a single document.
>
> Ferdy.
>
> On Wed, Mar 7, 2012 at 2:34 PM, [email protected] <[email protected]> wrote:
>
> > Hi
> > I've looked at nutch's code in ParseUtil and it seems that it was designed
> > so only one parser is eventually activated on a single url.
> > What's the reason for this?
> > What should I do if I want, in addition to the existing parsers, to add a
> > parser that will get a certain field out of the url, and run this behaviour
> > on all the urls?
> > Do I have to add this code to all the parsers?
> >
> > thanks.
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/Multiple-parsers-tp3806721p3806721.html
> > Sent from the Nutch - User mailing list archive at Nabble.com.
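
To make the idea in this thread concrete, here is a minimal sketch of the pattern being asked about: one primary parser is picked per mimetype, and then a chain of filters runs on every document regardless of type. It deliberately uses no Nutch classes. `PrimaryParser`, `ParseFilter`, `UrlHostFilter`, `parserFor` and the `host` field are all hypothetical names invented for illustration, not the Nutch API, and a real extension would presumably have to hook into ParseUtil and the plugin system instead.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParseChainSketch {

    /** Hypothetical stand-in for a per-mimetype parser (NOT the Nutch Parser interface). */
    interface PrimaryParser {
        Map<String, String> parse(String url, String mimeType);
    }

    /** Hypothetical filter that runs after the primary parser, for every mimetype. */
    interface ParseFilter {
        void filter(String url, Map<String, String> metadata);
    }

    /** Example filter: derives a field (here, the host) from the URL itself. */
    static class UrlHostFilter implements ParseFilter {
        @Override
        public void filter(String url, Map<String, String> metadata) {
            String host = URI.create(url).getHost();
            if (host != null) {
                metadata.put("host", host);
            }
        }
    }

    /** Filters applied to all documents, in order (assumed to be configured once). */
    static final List<ParseFilter> FILTERS = new ArrayList<>();
    static {
        FILTERS.add(new UrlHostFilter());
    }

    /** Picks exactly one primary parser per mimetype; the parsers here are dummies. */
    static PrimaryParser parserFor(String mimeType) {
        return (url, mime) -> {
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("parser", "text/html".equals(mime) ? "html" : "generic");
            return metadata;
        };
    }

    /** Runs the single primary parser, then every filter, whatever the mimetype. */
    static Map<String, String> parse(String url, String mimeType) {
        Map<String, String> metadata = parserFor(mimeType).parse(url, mimeType);
        for (ParseFilter f : FILTERS) {
            f.filter(url, metadata);
        }
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println(parse("http://example.com/docs/report.pdf", "application/pdf"));
        System.out.println(parse("http://example.com/index.html", "text/html"));
    }
}
```

With this shape, the URL-derived field is written once in a filter rather than copied into each parser, which is the duplication the original question wanted to avoid.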

