[ http://issues.apache.org/jira/browse/NUTCH-140?page=all ]
     
Jerome Charron closed NUTCH-140:
--------------------------------

    Fix Version: 0.8-dev
     Resolution: Fixed

I have committed the patch provided by Chris with some modifications:
(http://svn.apache.org/viewcvs.cgi?rev=379403&view=rev)

* Some minor code reformatting
* An extension id can be used directly in the parse-plugins.xml file without any 
alias definition (will help in a transitional phase when we get an admin GUI)
* The API provides the ability to retrieve a parser from its extension-id or 
its alias (getParserByExtensionId)
* Remove the deprecated methods.
* Make use of the new APIs in parse-mp3 and parse-rtf
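
As a rough illustration of the second and third points above (this is not the
committed code): an id found in parse-plugins.xml is treated as an alias when
one is defined, and as an extension id otherwise. The class AliasResolver and
its methods are hypothetical names chosen for this sketch; they are not part
of the Nutch API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not part of Nutch: resolves an id from parse-plugins.xml. */
public class AliasResolver {
    // alias name -> extension id
    private final Map<String, String> aliases = new HashMap<>();

    /** Registers an alias for an extension id. */
    public void addAlias(String alias, String extensionId) {
        aliases.put(alias, extensionId);
    }

    /** Returns the aliased extension id, or the id itself when no alias is defined. */
    public String resolve(String id) {
        return aliases.getOrDefault(id, id);
    }

    public static void main(String[] args) {
        AliasResolver resolver = new AliasResolver();
        resolver.addAlias("parse-html", "org.apache.nutch.parse.html.HtmlParser");
        // An aliased id is mapped to its extension id.
        System.out.println(resolver.resolve("parse-html"));
        // An id without an alias is assumed to already be an extension id.
        System.out.println(resolver.resolve("my.other.html.Parser"));
    }
}
```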

Thanks, Chris.


> Add alias capability in parse-plugins.xml file that allows 
> mimeType->extensionId mapping
> ----------------------------------------------------------------------------------------
>
>          Key: NUTCH-140
>          URL: http://issues.apache.org/jira/browse/NUTCH-140
>      Project: Nutch
>         Type: Improvement
>   Components: fetcher
>  Environment:  Power Mac OS X 10.4, Dual Processor G5 2.0 GHz, 1.5 GB RAM, 
> although the issue is independent of environment
>     Reporter: Chris A. Mattmann
>     Assignee: Chris A. Mattmann
>     Priority: Minor
>      Fix For: 0.8-dev
>  Attachments: NUTCH-140.20051502.patch.txt
>
>  Jerome and I have been talking about an idea to address the current issue 
> raised by Stefan G. about having a mapping of mimeType->list of pluginIds 
> rather than mimeType->list of extensionIds in the parse-plugins.xml file. 
> We've come up with the following proposed update that would seemingly fix 
> this problem.
>   We propose to have the concept of "aliases" in the parse-plugins.xml file, 
> defined at the end of the file, something like:
>  <parse-plugins>
>     ....
>    <mimeType name="text/html">
>       <plugin id="parse-html"/>
>    </mimeType>
>     .....
>   
>    <aliases>
>    <alias name="parse-html" extension-point="org.apache.nutch.parse.html.HtmlParser"/>
>    ....
>    <alias name="parse-html2" extension-point="my.other.html.Parser"/>
>    
>    ....
>    </aliases>
> </parse-plugins>
> What do you guys think? This approach would be flexible enough to allow the 
> mapping of extensionIds to mimeTypes, but without impacting the current 
> "pluginId" concept.
> Comments welcome. 
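
To make the proposed layout concrete, here is a minimal sketch of reading such
a file with the JDK's DOM parser. It assumes the element and attribute names
exactly as written in the proposal above (mimeType/name, plugin/id,
alias/name, alias/extension-point); the class ParsePluginsSketch and its
methods are hypothetical and do not reflect how Nutch itself loads the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader for the proposed parse-plugins.xml layout. */
public class ParsePluginsSketch {

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse parse-plugins XML", e);
        }
    }

    /** Reads the aliases section: alias name -> value of extension-point. */
    public static Map<String, String> readAliases(String xml) {
        Map<String, String> aliases = new LinkedHashMap<>();
        NodeList nodes = parse(xml).getElementsByTagName("alias");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element alias = (Element) nodes.item(i);
            aliases.put(alias.getAttribute("name"), alias.getAttribute("extension-point"));
        }
        return aliases;
    }

    /** Reads the mimeType entries: mime type -> ordered list of plugin ids. */
    public static Map<String, List<String>> readMimeTypes(String xml) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList mimeNodes = parse(xml).getElementsByTagName("mimeType");
        for (int i = 0; i < mimeNodes.getLength(); i++) {
            Element mime = (Element) mimeNodes.item(i);
            List<String> ids = new ArrayList<>();
            NodeList plugins = mime.getElementsByTagName("plugin");
            for (int j = 0; j < plugins.getLength(); j++) {
                ids.add(((Element) plugins.item(j)).getAttribute("id"));
            }
            result.put(mime.getAttribute("name"), ids);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<parse-plugins>"
            + "<mimeType name=\"text/html\"><plugin id=\"parse-html\"/></mimeType>"
            + "<aliases><alias name=\"parse-html\""
            + " extension-point=\"org.apache.nutch.parse.html.HtmlParser\"/></aliases>"
            + "</parse-plugins>";
        System.out.println(readMimeTypes(xml));
        System.out.println(readAliases(xml));
    }
}
```

Keeping the two tables separate mirrors the proposal: the mimeType entries
keep referring to short ids, and only the aliases section knows which
extension each id stands for.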

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
