Fixed the typo: changed "parse-tike" to "parse-tika" in the text/html entry. No effect.
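
For what it's worth, a typo like this can be caught mechanically. Below is a minimal sketch (not part of Nutch) that lists every plugin id used in a mimeType entry with no matching alias name. It assumes every id is meant to resolve through the aliases table, which may not hold for every setup; the function name unknown_plugin_ids and the SAMPLE snippet are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET


def unknown_plugin_ids(xml_text):
    """Return (mimeType name, plugin id) pairs whose id has no <alias> entry.

    Assumption: every plugin id should appear as an alias name. This is a
    sanity check for typos, not a statement of how Nutch resolves ids.
    """
    root = ET.fromstring(xml_text)
    aliases = {alias.get("name") for alias in root.iter("alias")}
    missing = []
    for mime in root.iter("mimeType"):
        for plugin in mime.iter("plugin"):
            plugin_id = plugin.get("id")
            if plugin_id not in aliases:
                missing.append((mime.get("name"), plugin_id))
    return missing


# Hypothetical cut-down config reproducing the "parse-tike" typo.
SAMPLE = """<parse-plugins>
  <mimeType name="text/html">
    <plugin id="parse-tike" />
  </mimeType>
  <mimeType name="*">
    <plugin id="parse-tika" />
  </mimeType>
  <aliases>
    <alias name="parse-tika"
           extension-id="org.apache.nutch.parse.tika.TikaParser" />
  </aliases>
</parse-plugins>"""

if __name__ == "__main__":
    for mime_name, plugin_id in unknown_plugin_ids(SAMPLE):
        print(f"{mime_name}: plugin id {plugin_id!r} has no alias")
```

An empty result only means the ids and aliases line up; it says nothing about whether the parser plugins themselves are enabled or loaded.
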
On 10/14/15, 12:24 PM, "Drulea, Sherban" <[email protected]> wrote:

>No luck.
>
>I changed my parse-plugin.xml and still zero URLs parsed:
>
>parse-plugin.xml
>--------------------------------------
><?xml version="1.0" encoding="UTF-8"?>
>
><parse-plugins>
>
>  <!-- by default if the mimeType is set to *, or
>       if it can't be determined, use parse-tika -->
>  <mimeType name="*">
>    <plugin id="parse-tika" />
>  </mimeType>
>
>  <mimeType name="text/html">
>    <plugin id="parse-tike" />
>  </mimeType>
>
>  <mimeType name="application/xhtml+xml">
>    <plugin id="parse-tika" />
>  </mimeType>
>
>  <mimeType name="application/rss+xml">
>    <plugin id="parse-tika" />
>    <plugin id="feed" />
>  </mimeType>
>
>  <mimeType name="application/x-bzip2">
>    <!-- try and parse it with the zip parser -->
>    <plugin id="parse-zip" />
>  </mimeType>
>
>  <mimeType name="application/x-gzip">
>    <!-- try and parse it with the zip parser -->
>    <plugin id="parse-zip" />
>  </mimeType>
>
>  <mimeType name="application/x-javascript">
>    <plugin id="parse-js" />
>  </mimeType>
>
>  <mimeType name="application/x-shockwave-flash">
>    <plugin id="parse-swf" />
>  </mimeType>
>
>  <mimeType name="application/zip">
>    <plugin id="parse-zip" />
>  </mimeType>
>
>  <mimeType name="text/xml">
>    <plugin id="parse-tika" />
>    <plugin id="feed" />
>  </mimeType>
>
>  <!-- Types for parse-ext plugin: required for unit tests to pass. -->
>
>  <mimeType name="application/vnd.nutch.example.cat">
>    <plugin id="parse-ext" />
>  </mimeType>
>
>  <mimeType name="application/vnd.nutch.example.md5sum">
>    <plugin id="parse-ext" />
>  </mimeType>
>
>  <!-- alias mappings for parse-xxx names to the actual extension
>       implementation ids described in each plugin's plugin.xml file -->
>  <aliases>
>    <alias name="parse-html"
>           extension-id="org.apache.nutch.parse.html.HtmlParser" />
>    <alias name="parse-tika"
>           extension-id="org.apache.nutch.parse.tika.TikaParser" />
>    <alias name="parse-ext" extension-id="ExtParser" />
>    <alias name="parse-js" extension-id="JSParser" />
>    <alias name="feed"
>           extension-id="org.apache.nutch.parse.feed.FeedParser" />
>    <alias name="parse-swf"
>           extension-id="org.apache.nutch.parse.swf.SWFParser" />
>    <alias name="parse-zip"
>           extension-id="org.apache.nutch.parse.zip.ZipParser" />
>  </aliases>
>
></parse-plugins>
>
>
>On 10/12/15, 8:34 PM, "cuongcm.inews" <[email protected]> wrote:
>
>>Have you tried changing parse-plugin.xml:
>><mimeType name="text/html">
>>  <plugin id="parse-tika" />
>></mimeType>
>>It worked for me :)
>>
>>--
>>View this message in context:
>>http://lucene.472066.n3.nabble.com/nutch-2-3-1-doesn-t-crawl-tp4232374p4234192.html
>>Sent from the Nutch - User mailing list archive at Nabble.com.

