both files (QueryParser.jj and FastCharStream.java) can be found here:
http://cvs.apache.org/viewcvs/jakarta-lucene/src/java/org/apache/lucene/queryParser/
and both should be put in your
jakarta-lucene../src/java/org/apache/lucene/queryParser/ directory.

then you will have to recompile; i used "ant clean jar", which gives you a new
jar file in bin/

regards,
karl oie

-----Original Message-----
From: Philipp Chudinov [mailto:[EMAIL PROTECTED]]
Sent: 28 November 2001 12:40
To: [EMAIL PROTECTED]
Subject: Re: scandinavian characters. //from Lucene users maillist

Hi, Karl!

I've faced the same problem (trying to search Russian documents). Now I am
trying to repeat your steps. I've included the new version of QueryParser.jj,
but could you describe how to "include the FastCharStream.java class to
compile": where is it, and where should I include it?

Thanks.
Philipp

----- Original Message -----
From: "Karl Øie" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, November 27, 2001 8:34 PM
Subject: RE: scandinavian characters.

> found a fix to the problem;
>
> the "QueryParser.jj" in rc2 does not accept unicode, version 1.6 in cvs
> does, so i replaced the file with the newest one from cvs and also had to
> include the FastCharStream.java class to compile.
>
> then i just had to force-convert the querystring that came from the browser
> to utf-8 and it worked (guess the browser sent the string as ascii!!!). i'm
> so happy and thanks to you both jonas and david!!
>
> String query = this.request.getParameter( "query" );
> if( query!=null ) {
>     query = new String( query.getBytes(), "UTF-8" );
> }
>
> regards,
> karl øie / gan media
>
> -----Original Message-----
> From: Jonas Bechlund [mailto:[EMAIL PROTECTED]]
> Sent: 27 November 2001 13:52
> To: 'Lucene Users List'
> Subject: RE: scandinavian characters.
>
> Hi Karl,
>
> It is a little bit tricky - but when you get the idea it is not that bad...
>
> I had the same problem with the Danish characters. I made changes to the
> TOKEN definition in the "Token Definitions" section of the file
> "QueryParser.jj" and that actually solved the problem. One minor detail is
> that you have to rebuild the jar file with ANT.
> (See build.txt for instructions)
>
> I guess that solves your problem,
> Regards,
> / Jonas
>
> -----Original Message-----
> From: Karl Øie [mailto:[EMAIL PROTECTED]]
> Sent: 27 November 2001 13:01
> To: Lucene Users List
> Subject: RE: scandinavian characters.
>
> there must be something seriously broken with the query parser code.
>
> if a query starts with ø/æ/å then an exception in the query parser occurs.
>
> org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 1. Encountered: "\u00c3" (195), after : ""
>     at org.apache.lucene.queryParser.QueryParserTokenManager.getNextToken(Unknown Source)
>     at org.apache.lucene.queryParser.QueryParser.jj_ntk(Unknown Source)
>     at org.apache.lucene.queryParser.QueryParser.Modifiers(Unknown Source)
>     at org.apache.lucene.queryParser.QueryParser.Query(Unknown Source)
>     at org.apache.lucene.queryParser.QueryParser.parse(Unknown Source)
>     at org.apache.lucene.queryParser.QueryParser.parse(Unknown Source)
>
> but if the query contains ø/æ/å then it is translated wrongly into the
> swedish/german ä regardless of what character it was.
>
> if someone could point me to where to start I could try to find the problem
> because I guess it is erroneous unicode translation...
>
> regards,
> karl
>
> > no it's even stranger than that, i have decoded the querystring, the
> > problem is that it seems like something is changed on the way in. if i
> > search for "fjøs" i get the swedish "fjÄ", where ø is changed to Ä and
> > 's' is removed.
> >
> > is the querystring translated somewhere?
> >
> > regards,
> > karl øie
> >
> > -----Original Message-----
> > From: David Bonilla [mailto:[EMAIL PROTECTED]]
> > Sent: 27 November 2001 10:43
> > To: Lucene Users List; [EMAIL PROTECTED]
> > Subject: Re: scandinavian characters.
> >
> > Hi Karl !!!
> >
> > I'm Spanish and I have a lot of problems programming with our non-English
> > characters.
> > I use LUCENE with Spanish accents and it works fine...
> >
> > Have you tried to use the java.net.URLEncoder and java.net.URLDecoder
> > with your fields to index?
> >
> > Best Regards from Spain !
> > __________________________
> > David Bonilla Fuertes
> > THE BIT BANG NETWORK
> > http://www.bit-bang.com
> > Profesor Waksman, 8, 6º B
> > 28036 Madrid
> > SPAIN
> > Tel.: (+34) 914 577 747
> > Móvil: 656 62 83 92
> > Fax: (+34) 914 586 176
> > __________________________

--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
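
A note on the re-decoding step quoted in this thread: `query.getBytes()` with no argument uses the JVM's default charset, so what that snippet does depends on the platform it runs on. The sketch below is one possible way to make the same repair explicit. It assumes (nothing in the thread confirms this) that the servlet container decoded the browser's UTF-8 bytes as ISO-8859-1. That would be consistent with the lexer error on "\u00c3" (195), since ø, æ and å all begin with byte 0xC3 in UTF-8. The names `QueryRedecode` and `redecodeAsUtf8` are hypothetical and are not part of Lucene or the servlet API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class QueryRedecode {

    // Undo a wrong ISO-8859-1 decode of bytes that were really UTF-8.
    // ASSUMPTION: the container decoded the request parameter as ISO-8859-1.
    static String redecodeAsUtf8(String misdecoded) {
        if (misdecoded == null) {
            return null;
        }
        // ISO-8859-1 maps each char in 0..255 to exactly one byte, so this
        // recovers the raw bytes that were originally sent.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "fjøs", written with an escape so the source file encoding is irrelevant.
        String sent = "fj\u00f8s";

        // Simulate the assumed failure: UTF-8 bytes read back as ISO-8859-1.
        String seenByServlet = new String(
                sent.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        // The garbled string is one char longer, and ø has become \u00c3 \u00b8.
        System.out.println("garbled length:  " + seenByServlet.length());
        System.out.println("third char code: " + (int) seenByServlet.charAt(2));

        // Re-decoding restores the original query string.
        System.out.println("restored: " + redecodeAsUtf8(seenByServlet).equals(sent));
    }
}
```

In a servlet this would be applied to the value returned by `request.getParameter("query")` before it is handed to the query parser. If the container already decodes parameters as UTF-8, the extra step would corrupt the string rather than repair it, so it is worth checking which case applies first.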
