Update of /cvsroot/nutch/nutch/src/plugin/parse-rtf
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10432/src/plugin/parse-rtf

Added Files:
        README.txt build.xml plugin.xml 
Log Message:
Added plugin parse-rtf, contributed by Andy Hedges.


--- NEW FILE: build.xml ---
<?xml version="1.0"?>

<project name="parse-rtf" default="jar">

  <import file="../build-plugin.xml"/>

  <!-- for junit test -->
  <mkdir dir="${build.test}/data"/>
  <copy file="sample/test.rtf" todir="${build.test}/data"/>
</project>

--- NEW FILE: README.txt ---
Prereqs: JDK 1.4+ and javacc version 3.2+

This document describes how to create the rtf-parser.jar file used by Nutch.

Source files are contained in:

http://www.cobase.cs.ucla.edu/pub/javacc/rtf_parser_src.jar

Create a new directory containing the following files from that archive:

        LICENCE
        RTFParser.jj
        RTFParserDelegate.java

cd into this new directory and create a src directory
        
        $mkdir src
        
copy RTFParser.jj and RTFParserDelegate.java into this src directory

        $cp RTFParser.jj RTFParserDelegate.java src/
        
now cd into this src directory and generate the javacc classes for the parser
and then cd out again

        $cd src
        $javacc RTFParser.jj
        $cd ..
        
now compile all the source and generated files

        $javac -d . src/*.java
        
(optional) remove the generated source

        $rm -rf src # (optional)
        
finally create the jar archive of all the salient files

        $jar -cvf rtf-parser.jar com/ LICENCE RTFParser*
        
--Andy Hedges

Credits:

Thanks to Eric Friedman for writing this javacc grammar file.
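
The manual build steps above can also be collected into one script. What
follows is only a sketch, assuming it is run from the directory that holds
LICENCE, RTFParser.jj and RTFParserDelegate.java, and that javacc, javac and
jar are on the PATH. The DRY_RUN switch and the function names are
illustrative additions, not part of the original instructions. By default
the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the README steps as one script (assumption: run from the
# directory containing LICENCE, RTFParser.jj and RTFParserDelegate.java).
# DRY_RUN is a hypothetical switch: 1 (default) prints, 0 executes.
set -e

run() {
    # Print the command line; execute it in a child shell unless dry-running.
    echo "+ $1"
    [ "${DRY_RUN:-1}" = "1" ] || sh -c "$1"
}

build_rtf_parser_jar() {
    run 'mkdir src'
    run 'cp RTFParser.jj RTFParserDelegate.java src/'
    # Generate the parser classes inside src; the cd only affects the child shell.
    run 'cd src && javacc RTFParser.jj'
    # Compile hand-written and generated sources into the current directory.
    run 'javac -d . src/*.java'
    # Optional: remove the copied and generated sources.
    run 'rm -rf src'
    run 'jar -cvf rtf-parser.jar com/ LICENCE RTFParser*'
}

build_rtf_parser_jar
```

Run it with DRY_RUN=0 in the environment to actually execute the steps.
Because of "set -e", the script stops at the first command that fails.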


--- NEW FILE: plugin.xml ---
<?xml version = '1.0' encoding = 'UTF-8'?>
<plugin version="1.0.0" provider-name="nutch.org" id="parse-rtf" name="RTF Parse Plug-in" >
  <extension-point id="net.nutch.parse.Parser" name="Nutch Content Parser" />
  <runtime>
    <library name="parse-rtf.jar" >
      <export name="*" />
    </library>
    <library name="rtf-parser.jar"/>
  </runtime>
  <extension point="net.nutch.parse.Parser" id="net.nutch.parse.rtf" name="RTFParse" >
    <implementation class="net.nutch.parse.rtf.RTFParseFactory" pathSuffix="rtf"
                    id="net.nutch.parse.rtf.RTFParseFactory" contentType="application/rtf" />
  </extension>
</plugin>



_______________________________________________
Nutch-cvs mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-cvs
