Hi all,
I have just created a simple index and search routine as a start with
Lucene. I have created a new index with 8 pages using the
StandardAnalyzer.
A given search returns 6 hits.
0. doc id = 1 : highest score : exact match 4 times in text.
1. doc id = 2 : next ranked score
-Original Message-
From: James Berrettini [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 18, 2003 16:53
To: 'Lucene User listserv'
Subject: Advice on Stop words
Hi,
I'm in the middle of a project to improve the Lucene
functionality that we've embedded in our
Hi all,
I have a problem with indexing and then searching docs written in non-Latin
languages and encoded in UTF-8 (Russian, for example).
I have a web application, with a simple form to search in the contents of
the docs.
When I submit the form, I encode the query term in utf-8 with
encodeURI(String)
Have you verified that your form inputs are getting to your query objects without the
String being mangled due to encoding problems?
I'm getting Japanese in UTF-8 and use the technique described at
http://w6.metronet.com/~wjm/tomcat/2001/Aug/msg00230.html to get the data from the
browser to
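
One common way a query string gets mangled is that the servlet container decodes the UTF-8 bytes sent by the browser as ISO-8859-1. Below is a minimal sketch of undoing that, assuming this is what happened. The class and method names are hypothetical, and this is not necessarily the technique described at the link above. It uses java.nio.charset.StandardCharsets, which needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public class QueryRecode {
    // Hypothetical helper. Assumption: the incoming string was produced by
    // decoding UTF-8 bytes as ISO-8859-1, so its chars map 1:1 back to bytes.
    public static String latin1ToUtf8(String mangled) {
        byte[] raw = mangled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A Russian query term, written with escapes to keep the source ASCII.
        String original = "\u043f\u043e\u0438\u0441\u043a";

        // Simulate the mangling: UTF-8 bytes wrongly decoded as ISO-8859-1.
        String mangled = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        String repaired = latin1ToUtf8(mangled);
        System.out.println(repaired.equals(original)); // prints "true"
    }
}
```

If the string was already decoded correctly, applying this helper would damage it, so it is worth logging the term before it is handed to the query parser to see which case applies.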
There are a bunch of other issues... I should have qualified that. There really aren't
any issues with the Lucene core to support Japanese, just other issues in my app that
uses Lucene and working with my content providers to ensure consistent use of
encodings, etc.
I have found what I think
I am having problems with the
lucene-sandbox/contributions/XML-Indexing-Demo. I get the following
error when I index my XML documents with the SAX parser in Java 1.4.1
java.lang.StringIndexOutOfBoundsException: String index out of range: 200
at
Doesn't that look like an error in Crimson?
If I were you I'd use Xerces instead, I always had a better feeling
about Xerces, and I think that demo code doesn't have anything
Crimson-specific hard-coded in it.
Otis
--- David Kendig [EMAIL PROTECTED] wrote:
I am having problems with the
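
When swapping one SAX parser for another, it can help to confirm which implementation the JVM actually picks up, since JAXP selects it from system properties and the classpath. Below is a minimal sketch using only the standard JAXP API. The class and method names are hypothetical.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class WhichSaxParser {
    // Returns the class name of the XMLReader that JAXP hands out by default.
    public static String parserClassName() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        return parser.getXMLReader().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // The printed name depends on the JVM and classpath in use.
        System.out.println(parserClassName());
    }
}
```

If Xerces is on the classpath, it can usually be selected by setting the system property javax.xml.parsers.SAXParserFactory to org.apache.xerces.jaxp.SAXParserFactoryImpl. Whether that takes effect in a given setup is something the output above should show.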
Bummer, I get the same thing with Xerces. I do not suspect the XML file
itself since it is from a separate app that has been operational for
over a year. Does anyone maintain the sandbox contributions?
Dave
Traceback (innermost last):
File ./indexTest.py, line 22, in ?
Just a hunch, but worth trying...
When you implement the methods of the parser, try wrapping (try-catch) the one that
handles the parsed characters. I had this problem when some of the elements
weren't returning values: it will be valid XML with no/null values, but when you try to
get
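
The suggestion above can be sketched as a SAX handler whose characters() callback is guarded with try-catch. This is an illustrative sketch, not the sandbox demo's code. The GuardedHandler class and its extractText helper are hypothetical names, and the guard is meant as a diagnostic aid rather than a fix for the underlying exception.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class GuardedHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        try {
            // Empty elements never reach this callback, so nothing is appended for them.
            text.append(ch, start, length);
        } catch (RuntimeException e) {
            // Report and keep parsing instead of aborting the whole document.
            System.err.println("Skipping character data: " + e);
        }
    }

    public String text() {
        return text.toString();
    }

    // Hypothetical helper: parses an XML string and returns its character data.
    public static String extractText(String xml) throws Exception {
        GuardedHandler handler = new GuardedHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.text();
    }

    public static void main(String[] args) throws Exception {
        // <body/> is valid XML with no value; only "Lucene" is collected.
        System.out.println(extractText("<doc><title>Lucene</title><body/></doc>"));
    }
}
```

Logging the exception together with start and length should show whether the failure comes from the parser or from the handler's own string handling.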