No luck.
I changed my parse-plugin.xml and still zero URLs parsed:
parse-plugin.xml
--------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<parse-plugins>
<!-- by default if the mimeType is set to *, or
if it can't be determined, use parse-tika -->
<mimeType name="*">
<plugin id="parse-tika" />
</mimeType>
<mimeType name="text/html">
<plugin id="parse-tike" />
</mimeType>
<mimeType name="application/xhtml+xml">
<plugin id="parse-tika" />
</mimeType>
<mimeType name="application/rss+xml">
<plugin id="parse-tika" />
<plugin id="feed" />
</mimeType>
<mimeType name="application/x-bzip2">
<!-- try and parse it with the zip parser -->
<plugin id="parse-zip" />
</mimeType>
<mimeType name="application/x-gzip">
<!-- try and parse it with the zip parser -->
<plugin id="parse-zip" />
</mimeType>
<mimeType name="application/x-javascript">
<plugin id="parse-js" />
</mimeType>
<mimeType name="application/x-shockwave-flash">
<plugin id="parse-swf" />
</mimeType>
<mimeType name="application/zip">
<plugin id="parse-zip" />
</mimeType>
<mimeType name="text/xml">
<plugin id="parse-tika" />
<plugin id="feed" />
</mimeType>
<!-- Types for parse-ext plugin: required for unit tests to pass.
-->
<mimeType name="application/vnd.nutch.example.cat">
<plugin id="parse-ext" />
</mimeType>
<mimeType name="application/vnd.nutch.example.md5sum">
<plugin id="parse-ext" />
</mimeType>
<!-- alias mappings for parse-xxx names to the actual extension
implementation
ids described in each plugin's plugin.xml file -->
<aliases>
<alias name="parse-html"
extension-id="org.apache.nutch.parse.html.HtmlParser" />
<alias name="parse-tika"
extension-id="org.apache.nutch.parse.tika.TikaParser" />
<alias name="parse-ext" extension-id="ExtParser" />
<alias name="parse-js" extension-id="JSParser" />
<alias name="feed"
extension-id="org.apache.nutch.parse.feed.FeedParser" />
<alias name="parse-swf"
extension-id="org.apache.nutch.parse.swf.SWFParser" />
<alias name="parse-zip"
extension-id="org.apache.nutch.parse.zip.ZipParser" />
</aliases>
</parse-plugins>
On 10/12/15, 8:34 PM, "cuongcm.inews" <[email protected]> wrote:
>Have you try change parse-plugin.xml
><mimeType name="text/html">
> <plugin id="parse-tika" />
></mimeType>
>it worked for me :)
>
>
>
>--
>View this message in context:
>http://lucene.472066.n3.nabble.com/nutch-2-3-1-doesn-t-crawl-tp4232374p423
>4192.html
>Sent from the Nutch - User mailing list archive at Nabble.com.
__________________________________________________________________________
This email message is for the sole use of the intended recipient(s) and
may contain confidential information. Any unauthorized review, use,
disclosure or distribution is prohibited. If you are not the intended
recipient, please contact the sender by reply email and destroy all copies
of the original message.