Fixed typo. Changed "parse-tike" to "parse-tika". Zero effect.
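
For anyone else chasing a typo like this one, here is a rough Python sketch that lists every plugin id in a parse-plugin.xml that has no matching <alias name="..."> entry. The function name and the trimmed sample config are made up for illustration, and I am only assuming that an id without an alias is worth a second look; I have not checked how Nutch itself resolves such ids.

```python
import xml.etree.ElementTree as ET


def unaliased_plugin_ids(xml_text):
    """Return (mimeType name, plugin id) pairs whose id has no <alias> entry.

    Hypothetical helper for illustration; it is not part of Nutch.
    """
    root = ET.fromstring(xml_text)
    # Collect every alias name declared under <aliases>.
    aliases = {alias.get("name") for alias in root.iter("alias")}
    missing = []
    for mime in root.iter("mimeType"):
        for plugin in mime.iter("plugin"):
            if plugin.get("id") not in aliases:
                missing.append((mime.get("name"), plugin.get("id")))
    return missing


# Trimmed-down sample in the same shape as the config quoted below.
SAMPLE = """<parse-plugins>
  <mimeType name="*">
    <plugin id="parse-tika" />
  </mimeType>
  <mimeType name="text/html">
    <plugin id="parse-tike" />
  </mimeType>
  <aliases>
    <alias name="parse-tika"
           extension-id="org.apache.nutch.parse.tika.TikaParser" />
  </aliases>
</parse-plugins>"""

if __name__ == "__main__":
    for mime_name, plugin_id in unaliased_plugin_ids(SAMPLE):
        print(f"{mime_name}: plugin id '{plugin_id}' has no alias")
```

To run it against a real file, read the file contents and pass them to the function in place of SAMPLE. A clean result only rules out this one kind of mistake; it says nothing about why zero URLs get parsed.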


On 10/14/15, 12:24 PM, "Drulea, Sherban" <[email protected]> wrote:

>No luck.
>
>I changed my parse-plugin.xml and still zero URLs parsed:
>
>parse-plugin.xml
>
>--------------------------------------
><?xml version="1.0" encoding="UTF-8"?>
>
><parse-plugins>
>
>  <!--  by default if the mimeType is set to *, or
>        if it can't be determined, use parse-tika -->
>       <mimeType name="*">
>         <plugin id="parse-tika" />
>       </mimeType>
>
>       <mimeType name="text/html">
>               <plugin id="parse-tike" />
>       </mimeType>
>
>        <mimeType name="application/xhtml+xml">
>               <plugin id="parse-tika" />
>       </mimeType>
>
>       <mimeType name="application/rss+xml">
>           <plugin id="parse-tika" />
>           <plugin id="feed" />
>       </mimeType>
>
>       <mimeType name="application/x-bzip2">
>               <!--  try and parse it with the zip parser -->
>               <plugin id="parse-zip" />
>       </mimeType>
>
>       <mimeType name="application/x-gzip">
>               <!--  try and parse it with the zip parser -->
>               <plugin id="parse-zip" />
>       </mimeType>
>
>       <mimeType name="application/x-javascript">
>               <plugin id="parse-js" />
>       </mimeType>
>
>       <mimeType name="application/x-shockwave-flash">
>               <plugin id="parse-swf" />
>       </mimeType>
>
>       <mimeType name="application/zip">
>               <plugin id="parse-zip" />
>       </mimeType>
>
>       <mimeType name="text/xml">
>               <plugin id="parse-tika" />
>               <plugin id="feed" />
>       </mimeType>
>
>       <!-- Types for parse-ext plugin: required for unit tests to pass. -->
>
>       <mimeType name="application/vnd.nutch.example.cat">
>               <plugin id="parse-ext" />
>       </mimeType>
>
>       <mimeType name="application/vnd.nutch.example.md5sum">
>               <plugin id="parse-ext" />
>       </mimeType>
>
>       <!--  alias mappings for parse-xxx names to the actual extension implementation
>       ids described in each plugin's plugin.xml file -->
>       <aliases>
>               <alias name="parse-html"
>                       extension-id="org.apache.nutch.parse.html.HtmlParser" />
>               <alias name="parse-tika"
>                       extension-id="org.apache.nutch.parse.tika.TikaParser" />
>               <alias name="parse-ext" extension-id="ExtParser" />
>               <alias name="parse-js" extension-id="JSParser" />
>               <alias name="feed"
>                       extension-id="org.apache.nutch.parse.feed.FeedParser" />
>               <alias name="parse-swf"
>                       extension-id="org.apache.nutch.parse.swf.SWFParser" />
>               <alias name="parse-zip"
>                       extension-id="org.apache.nutch.parse.zip.ZipParser" />
>       </aliases>
>       
></parse-plugins>
>
>
>
>On 10/12/15, 8:34 PM, "cuongcm.inews" <[email protected]> wrote:
>
>>Have you tried changing parse-plugin.xml?
>><mimeType name="text/html">
>>      <plugin id="parse-tika" />
>></mimeType>
>>it worked for me :)
>>
>>
>>
>>--
>>View this message in context:
>>http://lucene.472066.n3.nabble.com/nutch-2-3-1-doesn-t-crawl-tp4232374p4234192.html
>>Sent from the Nutch - User mailing list archive at Nabble.com.
>
>
