Hi, I am using the CJK Tokenizer to search Japanese documents. I can search Japanese documents that are plain text files, but I cannot search Microsoft Word or Excel files with Japanese content.
Can you tell me how I can search Japanese content in Microsoft Word, Excel and PowerPoint files?

Thanks,
Ankur

-----Original Message-----
From: Ankur Goel [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 04, 2004 1:36 AM
To: 'Lucene Users List'
Subject: RE: Boolean Phrase Query question

Thanks Erik for the solution. I have to keep the fileName field, as I have to give the end user the facility to search on file name as well. That's why I am using Text for the file name too.

"By using true on the finalQuery.add calls, you have said that both fields must have the word "temp" in them. Is that what you meant? Or did you mean an OR type of query?"

I need an OR type of query. I mean the word can be in the file name or in the contents of the file. But I am not able to do this. Can you tell me how to do it?

Regards,
Ankur

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Sunday, April 04, 2004 1:27 AM
To: Lucene Users List
Subject: Re: Boolean Phrase Query question

On Apr 3, 2004, at 12:13 PM, Ankur Goel wrote:
>
> Hi,
> I have to provide a functionality which provides search on both file
> name and contents of the file.
>
> For indexing I use the following code:
>
> org.apache.lucene.document.Document doc =
>     new org.apache.lucene.document.Document();
> doc.add(Field.Keyword("fileId", "" + document.getFileId()));
> doc.add(Field.Text("fileName", fileName));
> doc.add(Field.Text("contents", new FileReader(new File(fileName))));

I'm not sure what you plan on doing with the fileName field, but you probably want to use a Keyword field for it. And you may want to glue the file name and contents together into a single field to facilitate searches that span both. (Be sure to put a space in between if you do this.)

> For searching a text say "temp" I use the following code to look both
> in file name and contents of the file:
>
> BooleanQuery finalQuery = new BooleanQuery();
> Query titleQuery = QueryParser.parse("temp", "fileName", analyzer);
> Query mainQuery = QueryParser.parse("temp", "contents", analyzer);
>
> finalQuery.add(titleQuery, true, false);
> finalQuery.add(mainQuery, true, false);
>
> Hits hits = is.search(finalQuery);

By using true on the finalQuery.add calls, you have said that both fields must have the word "temp" in them. Is that what you meant? Or did you mean an OR type of query?

Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
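
Editor's note on the OR question raised in the thread: the add call used there takes the arguments (query, required, prohibited). In that Lucene 1.x API, passing false, false marks a clause as optional, and a BooleanQuery made up only of optional clauses matches when at least one of them matches, which is the OR behaviour asked for. The plain-Java sketch below only models that rule so it can be run without Lucene. ClauseSketch, its add and matches methods, the doc helper and the word splitting are all hypothetical stand-ins for illustration, not Lucene classes or a Lucene analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of required/optional/prohibited clauses. NOT Lucene code.
class ClauseSketch {
    // One clause: a single term looked up in a single field.
    private static final class Clause {
        final String field;
        final String term;
        final boolean required;
        final boolean prohibited;

        Clause(String field, String term, boolean required, boolean prohibited) {
            this.field = field;
            this.term = term;
            this.required = required;
            this.prohibited = prohibited;
        }
    }

    private final List<Clause> clauses = new ArrayList<>();

    // Same flag order as the add(query, required, prohibited) calls in the thread.
    void add(String field, String term, boolean required, boolean prohibited) {
        clauses.add(new Clause(field, term, required, prohibited));
    }

    // A document here is just field name -> set of lower-cased words.
    boolean matches(Map<String, Set<String>> doc) {
        boolean sawRequired = false;
        boolean optionalHit = false;
        for (Clause c : clauses) {
            Set<String> words = doc.get(c.field);
            boolean hit = words != null && words.contains(c.term);
            if (c.prohibited) {
                if (hit) return false;        // a prohibited term rules the document out
            } else if (c.required) {
                sawRequired = true;
                if (!hit) return false;       // every required clause must match (AND)
            } else if (hit) {
                optionalHit = true;           // optional clauses: any one is enough (OR)
            }
        }
        return sawRequired || optionalHit;
    }

    // Simplified tokenization for the sketch: lower-case, split on non-word characters.
    static Map<String, Set<String>> doc(String fileName, String contents) {
        Map<String, Set<String>> d = new HashMap<>();
        d.put("fileName", new HashSet<>(Arrays.asList(fileName.toLowerCase().split("\\W+"))));
        d.put("contents", new HashSet<>(Arrays.asList(contents.toLowerCase().split("\\W+"))));
        return d;
    }

    public static void main(String[] args) {
        // "temp" appears in the file name only, not in the contents.
        Map<String, Set<String>> d = doc("temp.txt", "quarterly report");

        ClauseSketch bothRequired = new ClauseSketch();   // flags as in the original post
        bothRequired.add("fileName", "temp", true, false);
        bothRequired.add("contents", "temp", true, false);

        ClauseSketch bothOptional = new ClauseSketch();   // flags for an OR query
        bothOptional.add("fileName", "temp", false, false);
        bothOptional.add("contents", "temp", false, false);

        System.out.println("required/required matches: " + bothRequired.matches(d)); // false
        System.out.println("optional/optional matches: " + bothOptional.matches(d)); // true
    }
}
```

Applied to the code quoted above, this would correspond to calling finalQuery.add(titleQuery, false, false) and finalQuery.add(mainQuery, false, false), assuming the same Lucene 1.x BooleanQuery API.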
