Thanks for the quick reply.

Actually, I needed a plugin for ATOM feed parsing, so while searching the
source I found FeedParser, but it was giving compilation errors. Later I
tried the Tika parser and was able to parse the ATOM feed, though I am not
sure whether I am missing something. Basically, the Tika parser extracted
the URLs and created new entries in the database, and when I later ran the
fetch job again I was able to fetch those URLs.
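
To illustrate what extracting URLs from an ATOM feed involves, here is a
minimal sketch. It uses only the JDK's built-in XML parser, not ROME, Tika or
any Nutch API. The class and method names (AtomLinkSketch, extractEntryLinks)
and the sample feed are made up for illustration, and the sketch does not
write anything to the crawl database.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch, Tika or ROME.
class AtomLinkSketch {
    // Namespace defined by the Atom syndication format (RFC 4287).
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Returns the href of every <link> element found inside an <entry>.
    static List<String> extractEntryLinks(String atomXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the *NS lookups below
        Document doc = factory.newDocumentBuilder().parse(
            new ByteArrayInputStream(atomXml.getBytes(StandardCharsets.UTF_8)));

        List<String> links = new ArrayList<>();
        NodeList entries = doc.getElementsByTagNameNS(ATOM_NS, "entry");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            NodeList linkNodes = entry.getElementsByTagNameNS(ATOM_NS, "link");
            for (int j = 0; j < linkNodes.getLength(); j++) {
                // getAttribute returns "" when the attribute is absent
                String href = ((Element) linkNodes.item(j)).getAttribute("href");
                if (!href.isEmpty()) {
                    links.add(href);
                }
            }
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample feed with two entries.
        String feed =
            "<feed xmlns=\"http://www.w3.org/2005/Atom\">"
          + "<title>Sample</title>"
          + "<entry><title>A</title><link href=\"http://example.com/a\"/></entry>"
          + "<entry><title>B</title><link href=\"http://example.com/b\"/></entry>"
          + "</feed>";
        for (String url : extractEntryLinks(feed)) {
            System.out.println(url);
        }
    }
}
```

A feed library such as ROME handles far more than this (RSS variants,
relative links, dates, content types), so the sketch only shows the basic
idea of turning feed entries into candidate URLs.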

So the question is: does FeedParser provide any additional functionality
that is missing in the Tika parser? As far as I know, the Tika parser uses
ROME, which is a well-known library for parsing feeds.

Regards,
Anand.

On 1 March 2013 03:38, kiran chitturi <[email protected]> wrote:

> Lewis,
>
> On the same note, I found that the following plugins need to be ported
> when I tried to build 2.x with Eclipse:
>
> i)   Feed
> ii)  parse-swf
> iii) parse-ext
> iv) parse-zip
> v) parse-metatags (I wrote a patch for this earlier, NUTCH-1478)
>
> The above plugins need to be ported to build 2.x successfully with plugins.
>
>
>
> On Thu, Feb 28, 2013 at 4:58 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
> > Honestly, I think we should get this fixed.
> > Can someone please explain to me why we don't build every plugin within
> > Nutch 2.x?
> > I think we should.
> >
> >
> > On Thu, Feb 28, 2013 at 12:58 PM, kiran chitturi
> > <[email protected]> wrote:
> >
> > > This is a problem with the feed plugin. It is not yet ported to 2.x.
> > >
> > > The FeedIndexingFilter class extends the IndexingFilter, whose interface
> > > and method changed from 1.x to 2.x.
> > >
> > > I fixed a similar problem in parse-metatags, which extends the ParseFilter
> > > interface.
> > >
> > > [NUTCH-874] was opened for these issues, but we still do not know
> > > which plugins need to be ported due to the API changes.
> > >
> > > https://issues.apache.org/jira/browse/NUTCH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> > >
> > >
> > >
> > > On Thu, Feb 28, 2013 at 3:26 PM, Lewis John Mcgibbney <
> > > [email protected]> wrote:
> > >
> > > > This shouldn't be happening, but we are aware (the Jira instance
> > > > reflects this) that there are some existing compatibility issues with
> > > > Nutch 2.x HEAD.
> > > > IIRC Kiran had a patch integrated which dealt with some of these
> > > > issues.
> > > > What I have to ask is what JDK are you using? I use 1.6.0_25 (I really
> > > > need to upgrade) on my laptop and we run the Apache Nutch nightly
> > > > builds for both 1.x trunk and 2.x branch on the latest 1.7 version of
> > > > Java.
> > > > Unless I have broken my code whilst writing some patches, my code
> > > > compiles flawlessly locally and as a project we do not have regular
> > > > compiler issues with our development nightly builds.
> > > >
> > > > On Wed, Feb 27, 2013 at 10:15 PM, Anand Bhagwat <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > > I want to use the FeedParser plugin which comes as part of the Nutch
> > > > > 2.1 distribution. When I try to build it, it gives compilation
> > > > > errors. I think it is using some classes from Nutch 1.6 which are
> > > > > not available. Any suggestions as to how I can resolve this issue?
> > > > >
> > > > >     [javac] /home/adminibm/Documents/workspace-sts-3.1.0.RELEASE/nutch2/src/plugin/feed/src/java/org/apache/nutch/indexer/feed/FeedIndexingFilter.java:28: cannot find symbol
> > > > >     [javac] symbol  : class CrawlDatum
> > > > >     [javac] location: package org.apache.nutch.crawl
> > > > >     [javac] import org.apache.nutch.crawl.CrawlDatum;
> > > > >     [javac]                              ^
> > > > >     [javac] /home/adminibm/Documents/workspace-sts-3.1.0.RELEASE/nutch2/src/plugin/feed/src/java/org/apache/nutch/indexer/feed/FeedIndexingFilter.java:29: cannot find symbol
> > > > >     [javac] symbol  : class Inlinks
> > > > >     [javac] location: package org.apache.nutch.crawl
> > > > >     [javac] import org.apache.nutch.crawl.Inlinks;
> > > > >     [javac]                              ^
> > > > >     [javac] /home/adminibm/Documents/workspace-sts-3.1.0.RELEASE/nutch2/src/plugin/feed/src/java/org/apache/nutch/indexer/feed/FeedIndexingFilter.java:36: cannot find symbol
> > > > >     [javac] symbol  : class ParseData
> > > > >     [javac] location: package org.apache.nutch.parse
> > > > >     [javac] import org.apache.nutch.parse.ParseData;
> > > > >     [javac]                              ^
> > > > >
> > > > > Thanks,
> > > > > Anand.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Lewis*
> > > >
> > >
> > >
> > >
> > > --
> > > Kiran Chitturi
> > >
> >
> >
> >
> > --
> > *Lewis*
> >
>
>
>
> --
> Kiran Chitturi
>
