Hi,

I still could not upgrade Nutch 2.1 to Tika 1.2 :(
I copied the files as described in the NUTCH-1433 patch and then ran "ant runtime",
but the build fails with many errors!

========================================
.... part of:
    [javac] /home/bayu/Downloads/solr/apache-nutch-2.1/src/java/org/apache/nutch/util/PrefixStringMatcher.java:50: warning: [rawtypes] found raw type: Iterator
    [javac]     Iterator iter= prefixes.iterator();
    [javac]     ^
    [javac]   missing type arguments for generic class Iterator<E>
    [javac]   where E is a type-variable:
    [javac]     E extends Object declared in interface Iterator
    [javac] /home/bayu/Downloads/solr/apache-nutch-2.1/src/java/org/apache/nutch/util/SuffixStringMatcher.java:44: warning: [rawtypes] found raw type: Collection
    [javac]   public SuffixStringMatcher(Collection suffixes) {
    [javac]                              ^
    [javac]   missing type arguments for generic class Collection<E>
    [javac]   where E is a type-variable:
    [javac]     E extends Object declared in interface Collection
    [javac] /home/bayu/Downloads/solr/apache-nutch-2.1/src/java/org/apache/nutch/util/SuffixStringMatcher.java:46: warning: [rawtypes] found raw type: Iterator
    [javac]     Iterator iter= suffixes.iterator();
    [javac]     ^
    [javac]   missing type arguments for generic class Iterator<E>
    [javac]   where E is a type-variable:
    [javac]     E extends Object declared in interface Iterator
    [javac] /home/bayu/Downloads/solr/apache-nutch-2.1/src/java/org/apache/nutch/util/ToolUtil.java:48: warning: [unchecked] unchecked cast
    [javac]     Map<String,Object> jobs = (Map<String,Object>)results.get(Nutch.STAT_JOBS);
    [javac]                                                              ^
    [javac]   required: Map<String,Object>
    [javac]   found:    Object
    [javac] 100 errors
    [javac] 52 warnings

BUILD FAILED
/home/bayu/Downloads/solr/apache-nutch-2.1/build.xml:97: Compile failed; see the compiler error output for details.

Total time: 18 seconds
========================================
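
Note that the part pasted above contains only [rawtypes] and [unchecked]
warnings, which normally do not fail a javac build by themselves; the
100 errors are elsewhere in the output. To find them, here is a minimal
sketch, assuming the complete "ant runtime" output was saved to a file and
that javac marks each error with ": error:" (as recent JDKs do). The class
name JavacErrorFilter and the log file argument are only illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class JavacErrorFilter {

    // Keep only the lines that javac flags as errors and drop warnings
    // and their follow-up detail lines.
    // Assumption: error lines contain the marker ": error:".
    static List<String> errorLines(List<String> logLines) {
        List<String> errors = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains(": error:")) {
                errors.add(line.trim());
            }
        }
        return errors;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java JavacErrorFilter <saved-build-log>");
            return;
        }
        // args[0] is a hypothetical file holding the saved build output.
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (String error : errorLines(lines)) {
            System.out.println(error);
        }
    }
}
```

The first few error lines are usually the useful ones (for example a
missing class or jar after the upgrade); later errors are often just
consequences of the first.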

Can anyone give me a hint?

In the meantime I switched to the Nutch 1.6 binary, which works well,
but I am still curious to try the latest release, Nutch 2.1.

Thanks in advance!

On Sun, Dec 30, 2012 at 1:46 PM, Bayu Widyasanyata
<[email protected]> wrote:

> Hi,
>
> Thank you for the suggestions.
> I was trying to upgrade Tika to 1.2 as described in
> https://issues.apache.org/jira/browse/NUTCH-1433
>
> I will try your suggestions and/or the Tika upgrade.
>
> On Sun, Dec 30, 2012 at 6:07 AM, Dave Meikle <[email protected]> wrote:
> > Hi,
> >
> > Tika should parse those formats, so unless there is something peculiar
> > with all your files or setup, have you tried the following?
> >
> > - Checking the size of the files to see whether they exceed the configured limits
> > - Using the nutch parsechecker command to test individual files
> >
> > Cheers,
> > Dave
> >
> > On 25 Dec 2012, at 01:34, Bayu Widyasanyata <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> ==Update==
> >>
> >> Checking hadoop.log, I found some interesting information: the parsing
> >> did not complete successfully.
> >>
> >> ...
> >> 2012-12-25 08:15:09,480 INFO  parse.ParserJob - Parsing
> >> http://localhost/sapi/Akhirat%20Lebih%20Utama%20Daripada%20Dunia.odt
> >> 2012-12-25 08:15:09,480 INFO  parse.ParserFactory - The parsing
> >> plugins: [org.apache.nutch.parse.tika.TikaParser] are enabled via the
> >> plugin.includes system property, and all claim to support the content
> >> type application/vnd.oasis.opendocument.text, but they are not mapped
> >> to it  in the parse-plugins.xml file
> >> 2012-12-25 08:15:09,517 WARN  parse.ParseUtil - Unable to successfully
> >> parse content
> >> http://localhost/sapi/Akhirat%20Lebih%20Utama%20Daripada%20Dunia.odt
> >> of type application/vnd.oasis.opendocument.text
> >> 2012-12-25 08:15:09,520 INFO  parse.ParserJob - Parsing
> >> http://localhost/sapi/Akhirat%20Lebih%20Utama%20Daripada%20Dunia.pdf
> >> 2012-12-25 08:15:09,521 INFO  parse.ParserFactory - The parsing
> >> plugins: [org.apache.nutch.parse.tika.TikaParser] are enabled via the
> >> plugin.includes system property, and all claim to support the content
> >> type application/pdf, but they are not mapped to it  in the
> >> parse-plugins.xml file
> >> 2012-12-25 08:15:09,545 WARN  parse.ParseUtil - Unable to successfully
> >> parse content
> >> http://localhost/sapi/Akhirat%20Lebih%20Utama%20Daripada%20Dunia.pdf
> >> of type application/pdf
> >> 2012-12-25 08:15:09,551 INFO  parse.ParserJob - Parsing
> >> http://localhost/sapi/Akhirat_Lebih_Utama_Daripada_Dunia.odt
> >> 2012-12-25 08:15:09,560 WARN  parse.ParseUtil - Unable to successfully
> >> parse content
> >> http://localhost/sapi/Akhirat_Lebih_Utama_Daripada_Dunia.odt
> >> of type application/vnd.oasis.opendocument.text
> >> 2012-12-25 08:15:09,563 INFO  parse.ParserJob - Parsing
> >> http://localhost/sapi/nospasi_Akhirat_Lebih_Utama_Daripada_Dunia.pdf
> >> 2012-12-25 08:15:09,590 WARN  parse.ParseUtil - Unable to successfully
> >> parse content
> >> http://localhost/sapi/nospasi_Akhirat_Lebih_Utama_Daripada_Dunia.pdf
> >> of type application/pdf
> >> 2012-12-25 08:15:09,597 INFO  parse.ParserJob - Parsing
> >> http://localhost/sapi/spasi%20Akhirat%20Lebih%20Utama%20Daripada%20Dunia.pdf
> >> 2012-12-25 08:15:09,652 WARN  parse.ParseUtil - Unable to successfully
> >> parse content
> >> http://localhost/sapi/spasi%20Akhirat%20Lebih%20Utama%20Daripada%20Dunia.pdf
> >> of type application/pdf
> >> ...
> >>
> >> I checked the parse-plugins.xml file and found no plugins mapped to the
> >> types application/pdf and application/vnd.oasis.opendocument.text.
> >> I know that parse-tika handles PDF files, so why do these errors still
> >> occur?
> >>
> >> Are there any documents or links that explain, in a simple way, how to
> >> install and activate the plugins for the formats listed at [1] in the
> >> Nutch parser?
> >>
> >> [1] http://tika.apache.org/1.2/formats.html#Portable_Document_Format
> >>
> >> Thanks,
> >>
> >> --
> >> wassalam,
> >> [bayu]
>
>
>
> --
> wassalam,
> [bayu]
>



-- 
wassalam,
[bayu]
