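Found the issue! The extension id defined in plugin.xml didn't match the id
that parse-plugins.xml uses for the mimeType name="application/xhtml+xml" entry.

I.e., the parts marked with asterisks below (bold in the original message)
should match: the plugin id under the mimeType needs an alias, and that alias's
extension-id has to be the same id that plugin.xml declares.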
plugin.xml:
<?xml version="1.0" encoding="UTF-8"?>
<plugin
   id="food"
   name="Food Parser."
   version="1.0.0"
   provider-name="amrut">

   <runtime>
      <library name="food.jar">
         <export name="*"/>
      </library>
   </runtime>

   <requires>
      <import plugin="nutch-extensionpoints"/>
   </requires>

   <extension id="com.amrut.parser.TDRParser"
              name="TDR Parser"
              point="org.apache.nutch.parse.Parser">

*
    <implementation id="com.amrut.parser.TDRParser"
         class="com.amrut.parser.TDRParser">
        <parameter name="contentType" value="application/xhtml+xml"/>
      </implementation>
*
   </extension>
</plugin>

parse-plugins.xml:

<?xml version="1.0" encoding="UTF-8"?>
<parse-plugins>
....
        <mimeType name="application/xhtml+xml">
*               <plugin id="food" />*
        </mimeType>
....
        
        <aliases>
*               <alias name="food"
                        extension-id="com.amrut.parser.TDRParser" />*
                <alias name="parse-tika" 
                        extension-id="org.apache.nutch.parse.tika.TikaParser" />
                <alias name="parse-ext" extension-id="ExtParser" />
                <alias name="parse-html"
                        extension-id="org.apache.nutch.parse.html.HtmlParser" />
                <alias name="parse-js" extension-id="JSParser" />
                <alias name="feed"
                        extension-id="org.apache.nutch.parse.feed.FeedParser" />
                <alias name="parse-swf"
                        extension-id="org.apache.nutch.parse.swf.SWFParser" />
                <alias name="parse-zip"
                        extension-id="org.apache.nutch.parse.zip.ZipParser" />
        </aliases>
</parse-plugins>
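
For anyone hitting the same problem, the match between the two files can be checked with a small script. This is only a sketch in Python and not part of Nutch. It assumes the alias's extension-id has to equal an implementation id in plugin.xml, which is what fixed it in my case. The function name find_mismatches and the trimmed sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two files from this post (illustrative only).
PLUGIN_XML = """
<plugin id="food" name="Food Parser." version="1.0.0" provider-name="amrut">
   <extension id="com.amrut.parser.TDRParser"
              name="TDR Parser"
              point="org.apache.nutch.parse.Parser">
      <implementation id="com.amrut.parser.TDRParser"
                      class="com.amrut.parser.TDRParser">
         <parameter name="contentType" value="application/xhtml+xml"/>
      </implementation>
   </extension>
</plugin>
"""

PARSE_PLUGINS_XML = """
<parse-plugins>
   <mimeType name="application/xhtml+xml">
      <plugin id="food" />
   </mimeType>
   <aliases>
      <alias name="food"
             extension-id="com.amrut.parser.TDRParser" />
   </aliases>
</parse-plugins>
"""


def find_mismatches(plugin_xml, parse_plugins_xml, mime_type):
    """Return a list of human-readable problems; empty means the ids line up.

    Assumption (not taken from the Nutch source): the alias's extension-id
    must equal the id of an <implementation> element in plugin.xml.
    """
    plugin_root = ET.fromstring(plugin_xml)
    plugin_id = plugin_root.get("id")
    impl_ids = {impl.get("id") for impl in plugin_root.iter("implementation")}

    pp_root = ET.fromstring(parse_plugins_xml)
    aliases = {a.get("name"): a.get("extension-id") for a in pp_root.iter("alias")}

    problems = []
    for mime in pp_root.iter("mimeType"):
        if mime.get("name") != mime_type:
            continue
        for ref in mime.iter("plugin"):
            if ref.get("id") != plugin_id:
                continue  # entry belongs to a different plugin
            ext_id = aliases.get(plugin_id)
            if ext_id is None:
                problems.append("no alias for plugin id '%s'" % plugin_id)
            elif ext_id not in impl_ids:
                problems.append(
                    "alias extension-id '%s' not found in plugin.xml (has: %s)"
                    % (ext_id, sorted(impl_ids))
                )
    return problems


problems = find_mismatches(PLUGIN_XML, PARSE_PLUGINS_XML, "application/xhtml+xml")
print(problems if problems else "ids are consistent")
```

To check real files, read them in binary mode and pass the bytes to the function, since the standard-library parser accepts bytes as well as strings.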

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Configuration-issue-Custom-parser-not-being-recognised-tp3179819p3190290.html
Sent from the Nutch - User mailing list archive at Nabble.com.
