David
Please tell us more about your own Analyzer :)
First, I think we should establish what NutchDocumentAnalyzer should focus on and
what it should not (can anyone explain?).
BTW: I like the idea of an AnalyzerFactory that maintains/caches all analyzers.
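
To make the AnalyzerFactory idea concrete, here is a minimal sketch of a factory that creates an analyzer by class name on first use and then hands back the cached instance. Everything in it is an assumption for illustration: AnalyzerStub stands in for Lucene's Analyzer so the snippet compiles on its own, and AnalyzerFactory, DefaultAnalyzerStub and AnalyzerFactoryDemo are hypothetical names, not existing Nutch classes. It also uses current Java library calls rather than what Nutch builds against today.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for Lucene's Analyzer so this sketch compiles alone.
interface AnalyzerStub {
    String name();
}

// Hypothetical default implementation, playing the role of NutchDocumentAnalyzer.
class DefaultAnalyzerStub implements AnalyzerStub {
    public String name() {
        return "default";
    }
}

// Hypothetical factory: one analyzer instance per class name, created lazily.
final class AnalyzerFactory {
    private static final Map<String, AnalyzerStub> CACHE = new ConcurrentHashMap<>();

    private AnalyzerFactory() {
    }

    // Returns the cached analyzer for className, instantiating it on first request.
    static AnalyzerStub get(String className) {
        return CACHE.computeIfAbsent(className, AnalyzerFactory::instantiate);
    }

    // Loads the class reflectively and calls its no-argument constructor.
    private static AnalyzerStub instantiate(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return (AnalyzerStub) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException("Cannot create analyzer: " + className, e);
        }
    }
}

class AnalyzerFactoryDemo {
    public static void main(String[] args) {
        // In Nutch the class name would come from configuration instead.
        String className = DefaultAnalyzerStub.class.getName();
        AnalyzerStub first = AnalyzerFactory.get(className);
        AnalyzerStub second = AnalyzerFactory.get(className);
        System.out.println(first == second); // same cached instance both times
    }
}
```

With something like this, IndexSegment would ask the factory for the configured class name rather than calling Class.forName itself, and a bad class name would fail with one clear error.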
/Jack
======= At 2005-04-12, 12:34:37 you wrote: =======
>Hi all,
>I have found a need to do document analysis other than that which is
>provided by the NutchDocumentAnalyzer class. I have written my own
>Analyzer class, and I need to plug it into the Nutch framework. What
>I've done is the following, and I'd like to suggest that it be made part
>of the main Nutch development stream. I don't know what the "correct"
>procedure is for submitting such changes, so please everyone forgive me
>if this list isn't a good place.
>
>In IndexSegment.java, replace the line that creates the IndexWriter
>object with:
>
>String analyzerClass = NutchConf.get("indexer.document.analyzer",
>"net.nutch.analysis.NutchDocumentAnalyzer");
>IndexWriter writer = new IndexWriter(
> localOutput,
> (Analyzer) Class.forName( analyzerClass ).newInstance(),
> true );
>
>Then add an appropriate entry in nutch-site.xml / nutch-default.xml.
>The default entry would be something like
>
><property>
>  <name>indexer.document.analyzer</name>
>  <value>net.nutch.analysis.NutchDocumentAnalyzer</value>
>  <description>Class used by IndexSegment to analyze documents</description>
></property>
>
>Hope this can be considered.
>
>Regards,
>David.
= = = = = = = = = = = = = = = = = = = =