Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

fr . jurain Thu, 24 Mar 2011 02:47:58 -0700

Hi David, 
thanks for your advice I'll keep it in mind.

Best regards, 
      François Jurain.



> Message du 22/03/11 à 17h40
> De : "David Causse" <[email protected]>
> A : [email protected]
> Copie à : 
> Objet : Re: Wanted: a directory of quick-and-(not too)dirty analyzers for 
> multi-language RDF.
> 
> 
> On Tue, Mar 22, 2011 at 04:15:53PM +0100, [email protected] wrote:
> > The only thing I need is the middle layer: a Java component extending 
> > Lucene, that'd pull a plausible Analyzer out of its magic hat, for every 
> > ISO 639-1 language tag however unlikely that turns up in the RDF input.
> > Not just an analyzer, mark: a *plausible* one. I mean one that'll 
> > generate usable indexes right out of the box in most cases; I cannot 
> > afford to bring the system back into dev & study the arcanes of 
> > automatic indexing in just every language we're working with. So 
> > lucene-contrib + covering the rest with StandardAnalyzer/English 
> > stopwords is not an option.
> 
> Hi,
> 
> maybe you could have a look at java.text.* and specifically BreakIterator
> (Thai analyzer use it) it could be better than STDAnalyzer for a fallback.
> Don't forget that if you use multiple analyzers at index time you'll
> have to use multiple analyzers at query time (tricky part of the
> process).
> 
> Regards.
> 
> -- 
> David Causse
> Spotter
> http://www.spotter.com/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
> 
> 

____________________________________________________

  Retrouvez les 10 conseils pour économiser votre carburant sur Voila :  
http://actu.voila.fr/evenementiel/LeDossierEcologie/l-eco-conduite/




---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: Wanted: a directory of quick-and-(not too)dirty analyzers for multi-language RDF.

Reply via email to