RE: Range Query
Hi guys,
Apologies; please correct me if I am wrong. With reference to http://issues.apache.org/eyebrowse/ReadMsg?listId=30msgNo=7103, I will have to re-index all of my 1 million subindexes with the price field padded with leading '0's to a standard width, and then use the modified code while searching to run the range query. [Is there any other way to handle this during the search process only?]
Please advise further. :( Thanks in advance.
-Original Message-
From: Chuck Williams [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 20, 2004 8:06 PM
To: Lucene Users List
Subject: RE: Range Query
Karthik, It is all spelled out in a Lucene HowTo here: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields Have fun with it, Chuck
- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Range Query
On Oct 21, 2004, at 3:32 AM, Karthik N S wrote:
> I will have to re-index all of my 1 million subindexes with the price field padded with leading '0's to a standard width, and then use the modified code while searching to run the range query. [Is there any other way to handle this during the search process only?]
Reindexing is the wisest choice. There are very complicated ways of doing it only during searching, but I don't think they are worth the effort. You could write your own custom Query subclass that walks all the terms and selects the ones in the range. But keep in mind that what you have indexed is not ordered in an efficient manner for searching, so reindexing is recommended.
Erik
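To illustrate what "walk all the terms and select the ones in the range" could mean, here is a minimal sketch in plain Java. It does not use any Lucene classes: the list of strings is only a stand-in for the terms of the price field, and the class and method names (NumericTermScan, termsInRange) are hypothetical. It also shows why this approach is costly, since every term has to be visited and parsed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NumericTermScan {
    /**
     * Visits every term, parses it as a number, and keeps the terms whose
     * numeric value lies in [low, high]. Non-numeric terms are skipped.
     */
    static List<String> termsInRange(List<String> terms, double low, double high) {
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            try {
                double value = Double.parseDouble(term);
                if (value >= low && value <= high) {
                    matches.add(term);
                }
            } catch (NumberFormatException e) {
                // Not a number: ignore this term.
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-in for the unpadded values already in the index (assumption).
        List<String> indexed = Arrays.asList("10", "25.25", "50.00", "6", "100");
        System.out.println(termsInRange(indexed, 10.00, 50.60)); // prints [10, 25.25, 50.00]
    }
}
```

The matching terms would then still have to be turned into a query, which is part of why reindexing with padded values is the simpler route.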
RE: Range Query
Hi Jonathan,
"When searching I also pad the query term"??? When exactly are you handling this [during the indexing process as well, or only during the search process]? Please be specific. [If time permits, could you please send me sample code for this?] :)
Thanks in advance.
-Original Message-
From: Jonathan Hager [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 20, 2004 3:31 AM
To: Lucene Users List
Subject: Re: Range Query
When searching I also pad the query term.
RE: Range Query
Karthik,
It is all spelled out in a Lucene HowTo here: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields
Have fun with it,
Chuck
-Original Message-
From: Karthik N S [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 20, 2004 12:15 AM
To: Lucene Users List; Jonathan Hager
Subject: RE: Range Query
Hi Jonathan, "When searching I also pad the query term"??? When exactly are you handling this [during the indexing process as well, or only during the search process]?
Range Query
Hi guys,
Apologies. I have a field of type Text, 'ItemPrice', which I use to store prices as numbers such as 10, 25.25, 50.00.
If I try to find the items between two prices, for example
    Contents:shoes +ItemPrice:[10.00 TO 50.60]
I get results outside the requested range. [This may be because the query compares the ASCII values instead of the numeric values.]
Am I missing something in the query syntax, or is this the wrong way to construct the query? Please, somebody advise me ASAP. :(
Thanks in advance.
WITH WARM REGARDS
HAVE A NICE DAY
[ N.S.KARTHIK]
RE: Range Query
Range queries use a lexicographic (dictionary) order. So, assuming all your values are positive, you need to ensure that the integer part of each number has a fixed number of digits (pad with leading 0's). The fractional part should be fine, although 1.0 will follow 1. If you have negative numbers, you need to pad an extra 0 on the left of the positives, start the negatives with -, and invert the magnitude of the negatives (so they go in the other order).
Your actual example below should work as is, except that 10 will not be in the range, since 10.00 is strictly after 10. However, this won't work without the padding if you have any prices whose integer part is not exactly two digits long (e.g., 10 is before 6, but after 06).
Chuck
-Original Message-
From: Karthik N S [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 19, 2004 12:05 AM
To: LUCENE
Subject: Range Query
I have a Field Type Text 'ItemPrice', using it to store prices as numbers such as 10, 25.25, 50.00. If I try to find the range between 2 prices, e.g. Contents:shoes +ItemPrice:[10.00 TO 50.60], I get results other than the range that has been executed.
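The ordering described above can be checked with plain String.compareTo, which compares character by character in the same dictionary fashion. This is a small sketch, not Lucene code; the class name and the pad helper are hypothetical, and it covers only non-negative integers.

```java
import java.util.Locale;

class LexicographicOrderDemo {
    /** Left-pads a non-negative integer with zeros to the given width. */
    static String pad(int value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    public static void main(String[] args) {
        // Unpadded: "10" sorts before "6" because '1' < '6'.
        System.out.println("10".compareTo("6") < 0);     // prints true
        // ... but after "06" because '1' > '0'.
        System.out.println("10".compareTo("06") > 0);    // prints true
        // "10" is a prefix of "10.00", so it sorts strictly before it.
        System.out.println("10".compareTo("10.00") < 0); // prints true
        // Padded to a fixed width, string order agrees with numeric order.
        System.out.println(pad(6, 3).compareTo(pad(10, 3)) < 0); // prints true
    }
}
```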
Re: Range Query
That is exactly right. It is searching the ASCII. To solve it I pad my price using a method like this:
    /**
     * Pads the Price so that all prices are the same number of characters and
     * can be compared lexicographically.
     * @param price the price to pad; may be null
     * @return the padded price, or null if price is null
     */
    public static String formatPriceAsString(Double price) {
        if (price == null) {
            return null;
        }
        return PRICE_FORMATTER.format(price.doubleValue());
    }
where PRICE_FORMATTER contains enough digits for your largest number.
    private static final DecimalFormat PRICE_FORMATTER = new DecimalFormat("000.00");
When searching I also pad the query term. I looked into hooking into QueryParser, but since the lower/upper prices for my application are different inputs, I chose to handle them without hooking into the QueryParser.
Jonathan
On Tue, 19 Oct 2004 12:35:06 +0530, Karthik N S [EMAIL PROTECTED] wrote:
> I get results other than the range that has been executed [This may be due to the query parsing the ASCII values instead of numeric values]
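For anyone who wants to try the padding outside of an index, here is a self-contained sketch of the same idea applied to both the stored values and the two query bounds. The class name, the inRange helper, and the use of Locale.ROOT symbols (so the decimal separator is always '.') are my additions, not part of the message above.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class PricePadding {
    // Three integer digits: enough for prices below 1000 (assumption).
    private static final DecimalFormat PRICE_FORMATTER =
        new DecimalFormat("000.00", DecimalFormatSymbols.getInstance(Locale.ROOT));

    /** Pads a price; synchronized because DecimalFormat is not thread-safe. */
    static synchronized String formatPrice(double price) {
        return PRICE_FORMATTER.format(price);
    }

    /** Inclusive lexicographic range check, as a stand-in for a range query. */
    static boolean inRange(String term, String low, String high) {
        return low.compareTo(term) <= 0 && term.compareTo(high) <= 0;
    }

    public static void main(String[] args) {
        String low = formatPrice(10.00);   // "010.00"
        String high = formatPrice(50.60);  // "050.60"
        System.out.println(inRange("100", "10.00", "50.60"));        // prints true (wrong hit, unpadded)
        System.out.println(inRange(formatPrice(100.0), low, high));  // prints false
        System.out.println(inRange(formatPrice(25.25), low, high));  // prints true
    }
}
```

The same formatter has to be used at index time and at search time, otherwise the bounds and the stored terms are not comparable.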
range query problems
Hi,
I'm having a problem with a range query. I have a field in my documents called adzer. In at least one of those documents, the value is: "-93" (without the quotes). I know this because if I create a search string like so: "adzer: \\-93" (again, without the quotes), I get results.
However, if I create a range query that I would expect to find that value, I get nothing. The range query string is: "adzer:[# TO 0]" (minus the quotes). As far as I can tell, this query string should find any value in the adzer field that starts with a -. The Unicode value for # comes before the Unicode value for -, and the Unicode value for - comes before the Unicode value for 0. Creating a sample program with the mentioned Strings and using the compareTo function seems to confirm this. But Lucene seems to disagree.
Am I missing something here? I've been banging my head on this all day, and any help would be greatly appreciated.
Derek
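The compareTo check mentioned above can be reproduced with a few lines of plain Java. This is only a sketch of the string comparison (class and method names are hypothetical); it says nothing about how the query parser treats the characters in the query string.

```java
class DashRangeCheck {
    /** True if lower <= term <= upper in String.compareTo order. */
    static boolean inRange(String lower, String term, String upper) {
        return lower.compareTo(term) <= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // '#' is U+0023, '-' is U+002D, '0' is U+0030.
        System.out.println(inRange("#", "-93", "0")); // prints true
    }
}
```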
Re: range query problems
On Friday 17 September 2004 19:37, Derek Baker wrote:
> However, if I create a range query that I would expect to find that value, I get nothing. The range query string is: "adzer:[# TO 0]" (minus the quotes). As far as I can tell, this query string should find any value in the adzer fields that starts with a -.
Did you try building that query manually? Maybe even starting from null instead of #.
Regards
Daniel
-- http://www.danielnaber.de
Re: range query problems
Thanks for your reply. If I do it manually:
    Term term1 = new Term("adzer", "#");
    Term term2 = new Term("adzer", "0");
    Query myQuery = new RangeQuery(term1, term2, true);
    hits = searcher.search(myQuery);
I still get nothing. If I make the first term in the new RangeQuery call null:
    Query myQuery = new RangeQuery(null, term2, true);
I get nothing. If, however, I make the second term in the new RangeQuery call null:
    Query myQuery = new RangeQuery(term1, null, true);
I get the results I expect. Seems very strange.
Derek
Daniel Naber wrote:
> Did you try building that query manually? Maybe even starting from null instead of #.
Re: range query problems
Ah, but if I escape the 0 in the term constructor:
    Term term2 = new Term("adzer", "\\0");
it works. And then it works for a dash as well. It seems that to pass a search string to a QueryParser, the 0 has to be escaped doubly:
    searchString = adzer: [# TO 0];
Just escaping with a double backslash does not work. I still wonder, though, whether that is the desired behavior. The page on the Lucene web site does not say that either 0 or - is a special character that needs to be escaped.
Thanks for your time and for pointing me in the right direction.
Derek
RE: Range query problem
A description of how to search numerical fields is available on the wiki: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields
sv
On Thu, 26 Aug 2004, Alex Kiselevski wrote:
> Thanks, I'll try it
Range query problem
Hello,
I have a strange problem with the range query PERIOD:[1 TO 9]. It works only if the second parameter is equal to or less than 9. If it's greater than 9, it finds no documents.
Thanks in advance
Alex Kiselevsky
Speech Technology
Tel:972-9-776-43-46
RD, Amdocs - Israel
Mobile: 972-53-63 50 38
mailto:[EMAIL PROTECTED]
The information contained in this message is proprietary of Amdocs, protected from disclosure, and may be privileged. The information is intended to be conveyed only to the designated recipient(s) of the message. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, use, distribution or copying of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by replying to the message and deleting it from your computer. Thank you.
Re: Range query problem
On Thursday 26 August 2004 11:02, Alex Kiselevski wrote:
> I have a strange problem with range query PERIOD:[1 TO 9] It works only if the second parameter is equal to or less than 9. If it's greater than 9, it finds no documents.
You have to store your numbers so that they will appear in the right order when sorted lexicographically, e.g. save 1 as 01 if you save numbers up to 99, or as 0001 if you save numbers up to . You also have to use this format for searching, I think.
Regards
Daniel
-- http://www.danielnaber.de
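A rough way to see the effect of the padding is to put the values into a sorted set of strings and take a sub-range. This is only a sketch: the TreeSet stands in for the sorted terms of the PERIOD field, it is not meant to reproduce Lucene's exact behaviour for PERIOD:[1 TO 10], and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.NavigableSet;
import java.util.TreeSet;

class PeriodRangeDemo {
    /** Builds the terms "1".."max", optionally zero-padded to two digits. */
    static TreeSet<String> terms(int max, boolean padded) {
        TreeSet<String> result = new TreeSet<>();
        for (int i = 1; i <= max; i++) {
            result.add(padded ? String.format(Locale.ROOT, "%02d", i) : Integer.toString(i));
        }
        return result;
    }

    /** Inclusive lexicographic range; empty if the bounds are in the wrong order. */
    static NavigableSet<String> range(TreeSet<String> terms, String low, String high) {
        if (low.compareTo(high) > 0) {
            return new TreeSet<>(); // subSet would throw IllegalArgumentException
        }
        return terms.subSet(low, true, high, true);
    }

    public static void main(String[] args) {
        // Unpadded: only "1" and "10" lie between "1" and "10".
        System.out.println(range(terms(12, false), "1", "10"));        // prints [1, 10]
        // Padded: all ten values are found.
        System.out.println(range(terms(12, true), "01", "10").size()); // prints 10
    }
}
```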
RE: Range query problem
Thanks, I'll try it
-Original Message-
From: Daniel Naber [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 26, 2004 12:59 PM
To: Lucene Users List
Subject: Re: Range query problem
You have to store your numbers so that they will appear in the right order when sorted lexicographically, e.g. save 1 as 01 if you save numbers up to 99, or as 0001 if you save numbers up to . You also have to use this format for searching I think.
Re: date range query problem(Help me)
If you'd provide a succinct JUnit test case (using RAMDirectory and hard-coded values being indexed) I'd be happy to have a look. As it is, this is too convoluted for me to follow.
Erik
On Jun 14, 2004, at 12:40 AM, Sumit Mishra wrote:
> Hi,
> My requirement is to fetch the results within a date range. I am filtering the query to retrieve the dates which fall between these two years:
>     bqr.add(QueryParser.parse("[" + 1978 + " TO " + 2000 + "]", "fullhead", new StandardAnalyzer()), true, false);
> I search on the text within the <fullhead></fullhead> tag. This query retrieves the records between these two dates, but it also selects dates from the years 2001, 2002, etc. Could you please help me find out what I am doing wrong? Here is some sample data:
> <chron id="t0104415" sorthead="8250" subject="cine" area="USA">
> <fullhead>1990</fullhead>
> <p>The gangster film <i>GoodFellas</i>, co&hyphen;written and directed by Martin Scorsese, is released in the USA. It stars Ray Liotta, Robert De Niro, Joe Pesci, Lorraine Bracco, and Paul Sorvino.</p>
> </chron>
> <chron id="t0068037" sorthead="8630" subject="trea" area="USA, USSR">
> <fullhead>31 July 1991</fullhead>
> <p>The US president George Bush and the Soviet leader Mikhail Gorbachev sign the Strategic Arms Reduction Treaty (START) to reduce their arsenals of long&hyphen;range nuclear weapons by a third.</p>
> </chron>
> <chron id="t0141562" sorthead="00016450" subject="stru" area="UK">
> <fullhead>1 January 2000</fullhead>
> <p>The Millennium Dome in Greenwich, London, England, opens to the public, and is scheduled to remain open throughout 2000. Some 12,500 people visit it on the opening day.</p>
> </chron>
> <chron id="t0141561" sorthead="00016460" subject="life">
> <fullhead>1 January 2000</fullhead>
> <p>The new millennium is celebrated across the world, with fireworks, street parties, ceremonies, and speeches. The millennium bug does not appear to make a large impact, and despite fears of acts of extremism and terrorism, the global celebration passes peacefully.</p>
> </chron>
Re: date range query problem
On Jun 11, 2004, at 8:00 AM, Sumit Mishra wrote:
> Hi, My requirement is to fetch the results within a date range. I am filtering the query to retrieve the dates which fall between these two years:
>     bqr.add(QueryParser.parse("[" + 1978 + " TO " + 2000 + "]", "fullhead", new StandardAnalyzer()), true, false);
Sorry for the short reply, but I'm leaving momentarily to go present Lucene (and Ant and Tapestry) at the Research Triangle Software Symposium.
Two things:
- Don't build a QueryParser string in code - simply construct a RangeQuery directly. This eliminates a lot of variables from the equation that may be getting in the way.
- Know what terms are being emitted during analysis. Try the utility pointed to here (http://wiki.apache.org/jakarta-lucene/AnalysisParalysis) with your data.
Erik
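As an illustration of why the emitted terms matter, here is a sketch of one possible (unconfirmed) cause of the extra hits. It assumes the analyzer splits a value such as "20 March 2001" into the separate terms "20", "march" and "2001"; the whitespace split below is only a crude stand-in for that, the sample dates are made up, and the class name is hypothetical. Under that assumption, a day-of-month term like "2" or "20" falls lexicographically between "1978" and "2000", so a 2001 document can match.

```java
import java.util.Locale;

class DateTokenRangeDemo {
    /** Inclusive lexicographic range check on a single term. */
    static boolean inRange(String term, String low, String high) {
        return low.compareTo(term) <= 0 && term.compareTo(high) <= 0;
    }

    /** True if any lowercased, whitespace-separated token of text is in range. */
    static boolean anyTokenInRange(String text, String low, String high) {
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (inRange(token, low, high)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(anyTokenInRange("31 July 1991", "1978", "2000"));  // prints true (via "1991")
        System.out.println(anyTokenInRange("20 March 2001", "1978", "2000")); // prints true (via "20")
        System.out.println(anyTokenInRange("31 May 2001", "1978", "2000"));   // prints false
    }
}
```

If this is what is happening, indexing the year (or a sortable date string) in its own untokenized field would avoid the stray matches.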
Re: Range Query Sombody HELP please
On Thursday 03 June 2004 07:10, Karthik N S wrote:
> Hey Ype, the range query
>     +button +shirt +filename:[b10181_p100 TO b10181_p200]
> did not work for me, but the other way around,
>     +(button OR shirt) +filename:[b10181_p100 TO b10181_p200]
> gave me 2 hits, each page containing only one of the terms button / shirt, but not both. I found from the HTML files that both words are present in more than 2 files. Are there any other possibilities for getting both words?
Your index contains book pages as Lucene documents. In this case you need to index larger parts of the books as Lucene documents in order to retrieve books with multiple subjects on different pages.
Kind regards,
Ype
Range Query Sombody HELP please
Hey Ype/Erick,
Thanks in advance for helping me with the range queries. I was finally able to trace the faulty process within my code and close it. I still have 3 small questions.
1) While creating the range query, is it possible for Lucene to do something like
    +(button AND shirt) +filename:[b10181_p100 TO b10181_p200]
[Do you think this will work?] It does not return hits containing both terms; it only returns hits with either one of them, shirt or button.
2) When the indexer starts indexing, does it proceed in alphabetical order, or in some other way?
3) The Keyword field type does not accept file names when indexing [try indexing file names and then searching on them; the search will definitely return 0 hits, lucene1.3-final version]:
    doc.add(Field.Text("filename", file.getName()))      will return hits
    doc.add(Field.Keyword("filename", file.getName()))   will not return hits
Why???
With regards,
Karthik
> On Monday 31 May 2004 13:47, Karthik N S wrote:
> > Hey Ype... 1) I switched off the multi-search scenario. 2) Changing the field type from Text to Keyword fails when I search on the filename field, so I have kept it as Text.
> Just make sure the file name is indexed as you show it, i.e. the underscore should be in the indexed term. The best way to do that is to index the filename as keyword. Check the output of the analyzer, or use luke to see what is in the index for the filename field.
> > D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles
> > Search Keyword : b10181_p388
> > Source path [ E:/po/ ] : e:/indexer3/b10181
> > Query: ['b10181_p388'] in Folder e:/indexer3/b10181/b10181_indx_
> > Found document(s) that matched : 'b10181_p388' no of hits :'1' in query Field :'filename'
> > File Name : B10181_P388
> > 3) A search for the range between the 2 file names B10181_P702 to B01081_P355 still returns 0 hits [included a space before the 2nd '+']
> > D:\JAVA\lucene\src\demo>java org.lucene.src.indexer.search.SearchFiles
> > Search Keyword : +button +filename:[b10181_p702 TO b10181_p355]
> Could you try this: +button +filename:[b10181_p355 TO b10181_p702] ? If this does not work, please narrow your problem down to a java test program of 10-20 lines, and post the code.
> Regards, Ype
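The reason for swapping the two file names can be checked with plain string comparison: "b10181_p702" sorts after "b10181_p355", so a range written as [b10181_p702 TO b10181_p355] has its lower bound above its upper bound. A small sketch follows; the class and the ordered helper are hypothetical and only put two bounds into lexicographic order before a range is built.

```java
class RangeBoundsOrder {
    /** Returns the two bounds as {lower, upper} in String.compareTo order. */
    static String[] ordered(String a, String b) {
        return a.compareTo(b) <= 0 ? new String[] {a, b} : new String[] {b, a};
    }

    public static void main(String[] args) {
        String[] bounds = ordered("b10181_p702", "b10181_p355");
        // prints filename:[b10181_p355 TO b10181_p702]
        System.out.println("filename:[" + bounds[0] + " TO " + bounds[1] + "]");
    }
}
```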
Re: Range Query Sombody HELP please
On Jun 2, 2004, at 6:20 AM, Karthik N S wrote:
> Hey Ype/Erick
If you're gonna ask for help, the least ya could do is spell my name correctly :)
> I still have 3 small questions. 1) While creating the range query, is it possible for Lucene to do something like
>     +(button AND shirt) +filename:[b10181_p100 TO b10181_p200]
> [Do you think this will work?] It does not return hits containing both terms; it only returns hits with either one of them, shirt or button.
My guess is that none of your documents in that range have both button AND shirt in them.
> 2) When the indexer starts indexing, does it proceed in alphabetical order, or in some other way?
I don't understand the question, sorry. Terms in the index are ordered lexicographically, if that is what you mean.
> 3) The Keyword field type does not accept file names when indexing [try indexing file names and then searching on them; the search will definitely return 0 hits, lucene1.3-final version]:
>     doc.add(Field.Text("filename", file.getName()))      will return hits
>     doc.add(Field.Keyword("filename", file.getName()))   will not return hits
> Why???
Because of your analyzer. Try indexing as a Keyword and search using a TermQuery. Don't use QueryParser at first - it gets in the way of understanding what is really going on. For fun, look at the .toString of the Query generated by QueryParser if you like. Look at the AnalysisParalysis page on the wiki for more details. Read my java.net articles to get a better understanding.
The short answer is that it is analysis that is bogging you down here. You need to decide how to index file names based on how you plan on querying for them. We cannot answer this for you.
Erik
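To make the analysis point concrete, here is a sketch with no Lucene classes at all. It assumes (this is my reading, not something confirmed in the thread) that a Keyword field stores the file name verbatim, e.g. "B10181_P388", while the analyzer applied to Text fields and to QueryParser input lowercases it. The analyze method is a deliberately crude stand-in for a real analyzer, and all names are hypothetical.

```java
import java.util.Locale;

class KeywordVsAnalyzedDemo {
    /** Stand-in for an analyzer that only lowercases (real analyzers also tokenize). */
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String fileName = "B10181_P388";
        String keywordTerm = fileName;               // stored as-is, not analyzed
        String textTerm = analyze(fileName);         // analyzed at index time
        String parsedQuery = analyze("b10181_p388"); // the query text is analyzed too

        System.out.println(textTerm.equals(parsedQuery));    // prints true: Text field matches
        System.out.println(keywordTerm.equals(parsedQuery)); // prints false: Keyword field does not
        System.out.println(keywordTerm.equals(fileName));    // prints true: exact term lookup matches
    }
}
```

This is why searching a Keyword field with an exact, unanalyzed term behaves differently from pushing the same text through the query parser.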
Re: Range Query Sombody HELP please
On Wednesday 02 June 2004 14:46, Erik Hatcher wrote:
> On Jun 2, 2004, at 6:20 AM, Karthik N S wrote:
> > ... I still have 3 small questions. 1) While creating the range query, is it possible for Lucene to do something like
> >     +(button AND shirt) +filename:[b10181_p100 TO b10181_p200]
> > [Do you think this will work?] It does not return hits containing both terms; it only returns hits with either one of them, shirt or button.
> My guess is that none of your documents in that range have both button AND shirt in them.
You can also try this:
    +button +shirt +filename:[b10181_p100 TO b10181_p200]
I never got to completely understand the way the query parser deals with AND and OR, so I prefer to avoid them.
Regards,
Ype
Range Query Sombody HELP please
Hey Ype,
The range query
    +button +shirt +filename:[b10181_p100 TO b10181_p200]
did not work for me, but the other way around,
    +(button OR shirt) +filename:[b10181_p100 TO b10181_p200]
gave me 2 hits, each page containing only one of the terms button / shirt, but not both. I found from the HTML files that both words are present in more than 2 files. Are there any other possibilities for getting both words?
With regards,
Karthik
-Original Message-
From: Ype Kingma [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 03, 2004 12:26 AM
To: [EMAIL PROTECTED]
Subject: Re: Range Query Sombody HELP please
You can also try this: +button +shirt +filename:[b10181_p100 TO b10181_p200] I never got to completely understand the way the query parser deals with AND and OR, so I prefer to avoid them. Regards, Ype
Range Query Sombody HELP please
Hey Ype/Erik,

Apologies. I sent you both some code by mail. Did you receive it, or shall I resend it?

With regards,
Karthik
Re: Range Query Sombody HELP please
On Jun 1, 2004, at 8:10 AM, Karthik N S wrote:
> Hey Ype/Erick Apologies please I sent u guys some code as per mail did u recieve it or shall i re send them.

I did not send it. Please just copy/paste it into an e-mail to the list.

Erik
Re: Range Query Sombody HELP please
Karthik,

On Monday 31 May 2004 06:12, Karthik N S wrote:
> Hey Ype ... My question now is: if I want to use a range query to get search hits between fileName B10181_P702 and B10181_P355 only, instead of all the 67 hits,

In this case there is no need to override the range query; just use

    +fileName:[B10181_P702 TO B10181_P355]

as part of the query.

Kind regards,
Ype
Range Query Sombody HELP please
Hey Ype,

Apologies again. I did as per your mail, but see the error:

Search Keyword : +king+filename:[b10181_p702 TO b01081_p355]
Source path [ E:/po/ ] : e:/indexer3/b10181
The Exception Raised file = SearchFiles.searchIndx0
java.lang.NegativeArraySizeException
    at org.apache.lucene.index.TermInfosReader.readIndex(TermInfosReader.java:106)
    at org.apache.lucene.index.TermInfosReader.<init>(TermInfosReader.java:82)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:141)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:120)
    at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:118)
    at org.apache.lucene.store.Lock$With.run(Lock.java:148)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:111)
    at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:80)
    at com.controlnet.indexing.search.SearchFiles.searchIndex0(SearchFiles.java:68)
    at com.controlnet.indexing.search.SearchFiles.main(SearchFiles.java:240)

[Note: the field filename is in lower case, not fileName. Sorry about that.]

Am I doing something wrong here?

With regards,
Karthik
Range Query Sombody HELP please
Hey Ype,

Sorry once again, and apologies for my last mail. I re-indexed my folder 10181 [it seems to have been corrupted]. Now I am getting the hits as follows:

D:\JAVA\lucene\src\demojava org.lucene.src.indexer.search.SearchFiles
Search Keyword : +button+filename:[B10181_P702 TO B01081_P355]
Source path [ E:/po/ ] : e:/indexer3/b10181
Query: ['+button+filename:[B10181_P702 TO B01081_P355]'] in Folder e:/indexer3/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':
Not a Found document(s) that matched query Field 'bookid':
Not a Found document(s) that matched query Field 'creation':
Not a Found document(s) that matched query Field 'contents':
Not a Found document(s) that matched query Field 'chapNme':
Not a Found document(s) that matched query Field 'itmName':
204 Total milliseconds

D:\JAVA\lucene\src\demojava java org.lucene.src.indexer.search.SearchFiles
Search Keyword : button+filename:[B10181_P702 TO B01081_P355]
Source path [ E:/po/ ] : e:/indexer3/b10181
Query: ['button+filename:[B10181_P702 TO B01081_P355]'] in Folder e:/indexer3/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':
Not a Found document(s) that matched query Field 'bookid':
Not a Found document(s) that matched query Field 'creation':
Not a Found document(s) that matched query Field 'contents':
Not a Found document(s) that matched query Field 'chapNme':
Not a Found document(s) that matched query Field 'itmName':

Is this correct, or is something still wrong as far as the query parse string is concerned?

With regards,
Karthik
Re: Range Query Sombody HELP please
On Monday 31 May 2004 11:09, Karthik N S wrote:
> ... I re indexed my folder 10181 [Seem's to be corrupted]

Was the index writer closed?

> Now I am getting the hits as
> D:\JAVA\lucene\src\demojava org.lucene.src.indexer.search.SearchFiles
> Search Keyword : +button+filename:[B10181_P702 TO B01081_P355]

The query needs to have a space before the 2nd +:

    +button +filename:[B10181_P702 TO B01081_P355]

> Source path [ E:/po/ ] : e:/indexer3/b10181
> Query: ['+button+filename:[B10181_P702 TO B01081_P355]'] in Folder e:/indexer3/b10181/b10181_indx_
> Not a Found document(s) that matched query Field 'filename':
> Not a Found document(s) that matched query Field 'bookid':
> Not a Found document(s) that matched query Field 'creation':
> Not a Found document(s) that matched query Field 'contents':
> Not a Found document(s) that matched query Field 'chapNme':
> Not a Found document(s) that matched query Field 'itmName':

You seem to use a search mechanism that searches all these fields. I'd recommend switching this off until a query with explicit fields works, e.g.:

    +contents:button +filename:[B10181_P702 TO B01081_P355]

Btw., you'll need to make sure that a term like B10181_P702 is not split at the underscore _ by a tokenizer at indexing time. If your filename is not a keyword field, you might consider changing it into a keyword field.

You seem to index book pages as Lucene documents, which is ok. However, you may also need to index larger parts of the books in order to retrieve books with multiple subjects on different pages. Is this what your original question is about?

Have fun,
Ype
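A minimal sketch of why the tokenizer matters for a term like B10181_P702. The two methods below are hypothetical stand-ins written in plain Java, not Lucene's analyzers: one lower-cases and splits on every character that is not a letter or digit (the kind of behaviour an analyzed text field may show), the other keeps the whole value as a single term, as a keyword field would. What a given Lucene analyzer really emits should be checked on the actual index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class TokenSplitDemo {
    // Hypothetical "analyzed" behaviour: lower-case, then split on any
    // character that is not a letter or digit (so '_' separates tokens).
    static List<String> splittingTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Hypothetical "keyword" behaviour: the value is one term, unchanged.
    static List<String> keywordTokens(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        String filename = "B10181_P702";
        System.out.println("split:   " + splittingTokens(filename));
        System.out.println("keyword: " + keywordTokens(filename));
    }
}
```

If the field is split this way, the index holds no single term b10181_p702 for a range to cover. If it is kept as one unchanged term, the letter case in the query has to match the case that was indexed.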
RE: Range Query Sombody HELP please
Hey Ype...

1) I switched off the multi-search scenario.

2) Changing the field type from Text to Keyword makes the search on the filename field fail, so I have kept it as Text.

D:\JAVA\lucene\src\demojava org.lucene.src.indexer.search.SearchFiles
Search Keyword : b10181_p388
Source path [ E:/po/ ] : e:/indexer3/b10181
Query: ['b10181_p388'] in Folder e:/indexer3/b10181/b10181_indx_
Found document(s) that matched : 'b10181_p388' no of hits :'1' in query Field :'filename'
File Name : B10181_P388

3) A search for the range between the 2 file names B10181_P702 and B01081_P355 still returns 0 hits [I included the space before the 2nd '+']:

D:\JAVA\lucene\src\demojava org.lucene.src.indexer.search.SearchFiles
Search Keyword : +button +filename:[b10181_p702 TO b10181_p355]
Source path [ E:/po/ ] : e:/indexer3/b10181
Query: ['+button +filename:[b10181_p702 TO b10181_p355]'] in Folder e:/indexer3/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':

or

D:\JAVA\lucene\src\demojava com.controlnet.indexing.search.SearchFiles
Search Keyword : +contents:button +filename:[b10181_p702 TO b10181_p355]
Source path [ E:/po/ ] : e:/indexer3/b10181
Query: ['+contents:button +filename:[b10181_p702 TO b10181_p355]'] in Folder e:/indexer3/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':

Also, does the search vary with the field type? If so, my indexed field types are as below:

    doc.add(Field.Text("path", fhtml.getPath()));
    doc.add(Field.Keyword("modified", fhtml.lastModified() + ""));
    doc.add(Field.Text("filename", fhtml.getName()));
    doc.add(Field.Keyword("creation", CREATION_));
    doc.add(Field.Keyword("bookid", BOOKID_));
    doc.add(Field.Text("chapNme", CHAPNAME_));
    doc.add(Field.Text("itmName", ITEMNAME_));

Please do advise me.

Karthik

[ James Goslink says Microsoft has more money to burn than GOD has ... on his visit to India, in an interview to MSNBC TV last night ]
Re: Range Query Sombody HELP please
Try my AnalysisDemo code on some filename field samples:

http://wiki.apache.org/jakarta-lucene/AnalysisParalysis

You mentioned earlier, I think, that you are using a custom analyzer. Give us the output of AnalysisDemo on some samples so we can see what is coming out.

If you can put together a 10-line Java program that uses RAMDirectory and has some sample hard-coded text that I can easily run standalone, I would look into your situation further. As it is, you are providing far more complexity than I have time to delve into. Narrow it down to a very, very simple example that we can all see in one screen.

Erik
Re: Range Query Sombody HELP please
Karthik,

On Monday 31 May 2004 13:47, Karthik N S wrote:
> Hey Ype...
> 1) I switched Off the Multi search Senerio.
> 2) Changing the Field type from Text to Keyword will fail When I search for the the Field type filename so,I still maintained it to be Text

Just make sure the file name is indexed as you show it, i.e. the underscore should be in the indexed term. The best way to do that is to index the filename as keyword. Check the output of the analyzer, or use Luke to see what is in the index for the filename field.

> D:\JAVA\lucene\src\demojava org.lucene.src.indexer.search.SearchFiles
> Search Keyword : b10181_p388
> Source path [ E:/po/ ] : e:/indexer3/b10181
> Query: ['b10181_p388'] in Folder e:/indexer3/b10181/b10181_indx_
> Found document(s) that matched : 'b10181_p388' no of hits :'1' in query Field :'filename'
> File Name : B10181_P388
>
> 3)On Search for range between 2 file names B10181_P702 to B01081_P355 still returns me 0 hits [Included space before the 2nd '+' ]
> D:\JAVA\lucene\src\demojava org.lucene.src.indexer.search.SearchFiles
> Search Keyword : +button +filename:[b10181_p702 TO b10181_p355]

Could you try this:

    +button +filename:[b10181_p355 TO b10181_p702]

If this does not work, please narrow your problem down to a Java test program of 10-20 lines, and post the code.

Regards,
Ype
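A small sketch of why the order of the two range endpoints matters, assuming the range is evaluated by plain string comparison on whole terms. It uses only the JDK, not Lucene; the helper termsInRange is hypothetical, and the sample terms are the lower-cased file names from the search output earlier in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TermRangeDemo {
    // Keeps the terms t with lower <= t <= upper under String.compareTo.
    static List<String> termsInRange(List<String> terms, String lower, String upper) {
        List<String> matches = new ArrayList<String>();
        for (String t : terms) {
            if (t.compareTo(lower) >= 0 && t.compareTo(upper) <= 0) {
                matches.add(t);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
                "b10181_p355", "b10181_p379", "b10181_p40",
                "b10181_p512", "b10181_p702", "b10181_p703");

        // Larger endpoint first: no term can be >= p702 and <= p355 at once.
        System.out.println(termsInRange(terms, "b10181_p702", "b10181_p355"));

        // Smaller endpoint first: the range now contains terms.
        System.out.println(termsInRange(terms, "b10181_p355", "b10181_p702"));
    }
}
```

Under string ordering b10181_p40 also falls between b10181_p355 and b10181_p702, because the comparison is decided at the character '4'. If the page numbers are meant to be compared as numbers, they would need a fixed number of digits.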
Range Query Sombody HELP please
Hey Ype,

Apologies. Please have a look at the search hits in this sample output from my indexed files:

== Start Searching ==
Search Keyword : king~
Source path [ E:/po/ ] : e:/indexer/b10181
Query: ['king~'] in Folder e:/indexer/b10181/b10181_indx_
Not a Found document(s) that matched query Field 'filename':
Not a Found document(s) that matched query Field 'bookid':
Not a Found document(s) that matched query Field 'creation':
Not a Found document(s) that matched query Field 'chapNme':
Not a Found document(s) that matched query Field 'itmName':
Found document(s) that matched : 'king~' no of hits :'67' in query Field :'contents'

File Name : B10181_P703
File Path : E:\po\catalog\B10181\B10181_P703
Modified Date : 1080036442000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P702
File Path : E:\po\catalog\B10181\B10181_P702
Modified Date : 1080036442000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P512
File Path : E:\po\catalog\B10181\B10181_P512
Modified Date : 1080036438000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P40
File Path : E:\po\catalog\B10181\B10181_P40
Modified Date : 1080036444000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P355
File Path : E:\po\catalog\B10181\B10181_P355
Modified Date : 1080036436000
Bookid : B10181
Chapter Name :
Item Name :

File Name : B10181_P379
File Path : E:\po\catalog\B10181\B10181_P379
Modified Date : 1080036436000
Bookid : B10181
Chapter Name :
Item Name :

. . . . ..
328 Total milliseconds
== End Searching

The output reports 67 hits in total [I have snipped most of them for brevity]. The search word is present in the field contents, where the content part of each HTML file is indexed. As you can see, the File Name field is unique and is indexed/viewed as per the case shown in Windows Explorer.

My question now is: if I want to use a range query to get search hits between fileName B10181_P702 and B10181_P355 only, instead of all the 67 hits, how do I do it? [Please answer with a clear example or send me an attachment for the same. I overrode the getRange() query method as per your last mail, but I am still not able to achieve the results.]

With regards,
Karthik
Re: Range Query Sombody HELP please
Karthik,

On Friday 28 May 2004 05:54, Karthik N S wrote:
> ... When we do a search in SQL using '*' we all know that the result would be the total no of records in the table, but when we want to limit our records we apply a range between 2 specific row records [which we call a subsearch]. Similarly, on an indexed record I would like to perform the same technique as above.

In case you need to reuse the limitation, a filter is the way to go in Lucene. However, it seems to be better to get the range query working first.

> In fact I was looking at the url u sent me in the last mail on using getRange Queries and was working on the same
> http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

The query I gave uses two +'s prefixed to the query parts:

    +search_word +(book:[100 TO 200])

Both query parts are required because of the +'s, i.e. it works as the AND operator in SQL. The TO operator queries the range in the book field.

> and http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html
> but without results for the last 12 hrs.

You have probably seen a lot of different things that will be useful later.

> If u could spare a few minutes and please explain or provide a simple [ full ] example using and overriding the getRange() method.

The problem you'll probably run into is that Lucene does not support numbers directly; you'll have to index them as strings, e.g. by prefixing zeros, as Erik indicated:

http://wiki.apache.org/jakarta-lucene/SearchNumericalFields

You may have to reindex your data for this. In case you have a lot of data, consider setting up a test first. Then, in the getRangeQuery() method of your parser, you'll need to prefix the queried numbers in the same way. The example in the article is about date fields, but the adaptation to numbers shouldn't be a problem.

When you override this in your query parser:

    getRangeQuery(String field, Analyzer analyzer, String start, String end, boolean inclusive)

it will be called for the example query with start = 100 and end = 200. (See http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html under "Customizing query parser".) In the overriding method you can then call the super method with the start and end prefixed with zeros, as indicated in the page on searching numerical fields referred to above.

Have fun, you'll get it working,
Ype

> with regards
> Karthik
>
> -Original Message-
> From: Ype Kingma [mailto:[EMAIL PROTECTED]
> Sent: Thursday, May 27, 2004 11:03 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Range Query Sombody HELP please
>
> On Thursday 27 May 2004 09:37, Karthik N S wrote:
>> Hi Lucene -Developer My main intention was Search for an word hit in a Unique Field between ranges say book100 - book 200 indexed numbers It's something like creating a SUBSEARCH with in the SEARCHINDEX. ...
>
> Could you explain what you mean by subsearch? I suppose you might want to have a look at the various filter classes in the org.apache.lucene.search package.
>
> Regards,
> Ype
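A rough sketch of the override pattern described above, written against stand-in classes so that it runs on its own. BaseParser and PaddingParser are hypothetical and are NOT Lucene's QueryParser; the base class only builds a string, where a real parser would build a query object. The width of 6 digits is likewise an assumption and would have to match whatever width was used at indexing time.

```java
final class RangeOverrideSketch {
    // Stand-in for a query parser base class (hypothetical, not Lucene's).
    static class BaseParser {
        String getRangeQuery(String field, String start, String end, boolean inclusive) {
            String open = inclusive ? "[" : "{";
            String close = inclusive ? "]" : "}";
            return field + ":" + open + start + " TO " + end + close;
        }
    }

    // Subclass that zero-pads numeric endpoints, then delegates to super.
    static class PaddingParser extends BaseParser {
        private final int width;

        PaddingParser(int width) {
            this.width = width;
        }

        @Override
        String getRangeQuery(String field, String start, String end, boolean inclusive) {
            return super.getRangeQuery(field, pad(start), pad(end), inclusive);
        }

        // Pads digit-only endpoints on the left; anything else is passed through.
        private String pad(String endpoint) {
            if (!endpoint.matches("\\d+")) {
                return endpoint;
            }
            StringBuilder sb = new StringBuilder();
            for (int i = endpoint.length(); i < width; i++) {
                sb.append('0');
            }
            return sb.append(endpoint).toString();
        }
    }

    public static void main(String[] args) {
        BaseParser parser = new PaddingParser(6);
        // Endpoints as they would arrive for the example query [100 TO 200].
        System.out.println(parser.getRangeQuery("book", "100", "200", true));
    }
}
```

The point of the pattern is that the user keeps typing [100 TO 200] while the query that reaches the index uses the same padded form as the indexed terms.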
Range Query Sombody HELP please
Hey Ype,

Thanks for the advice, but I still need to get the exact situation working.

1) I have a unique field [called filename] which is indexed as type Text. It takes the name of the HTML file as the indexing parameter. There is also another field called Contents which stores all the contents of that uniquely named HTML file.

2) The indexer completes indexing of about 5000 HTML files successfully.

3) When I do a search for a word, it returns 400 hits on various HTML files.

Now, in this situation, if I want to limit the hits to those between the first 200 and 400 HTML page names only, what exactly should I do using the getRange() method?

Please advise on how to proceed ...

With regards,
Karthik
Re: Range Query Sombody HELP please
On May 28, 2004, at 4:54 AM, Karthik N S wrote:
> 1) I have a unique Field [ called filename ] which is indexed of type Text.

You probably do not want to use Field.Text for a filename. Use Field.Keyword instead.

> 2) The indexer complete indexes for about 5000 html files sucessfully .

Now use Luke (Google for _luke lucene_) to browse your index, and check that you are getting what you think. You can do ad-hoc queries there also.

> Now in this situation if I want to limit the hits between First 200 to 400 html Page Names only what exactly should I do to using getRange() method.

If you want the first 200 - 400, start your Hits walking at index 200, and proceed through 400. Is there some field you want to key off to do the range? Or do you just want the 200th - 400th hits from the search, which is an entirely different question than about ranges.

> Please advise on how to proceed ...

Please send (succinct) code examples in the future to really keep this discussion concrete and clear.

Erik
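A minimal sketch of "walking" a result list from position 200 through 400, which is a matter of positions in the ranked results rather than a range query. A plain java.util.List stands in for the search results here, so nothing below is Lucene API; the helper window and the hit names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class HitWindowDemo {
    // Returns the hits at positions from (inclusive) to to (exclusive),
    // clamped to the number of hits that actually exist.
    static List<String> window(List<String> rankedHits, int from, int to) {
        int start = Math.max(0, Math.min(from, rankedHits.size()));
        int end = Math.max(start, Math.min(to, rankedHits.size()));
        List<String> page = new ArrayList<String>();
        for (int i = start; i < end; i++) {
            page.add(rankedHits.get(i));
        }
        return page;
    }

    public static void main(String[] args) {
        // Made-up result list of 450 hits, in ranked order.
        List<String> hits = new ArrayList<String>();
        for (int i = 0; i < 450; i++) {
            hits.add("hit-" + i);
        }
        List<String> page = window(hits, 200, 400);
        System.out.println(page.size() + " hits, first=" + page.get(0)
                + ", last=" + page.get(page.size() - 1));
    }
}
```

The clamping matters when a search returns fewer hits than the requested window: with 250 hits, the same call yields only positions 200 to 249.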
Re: Range Query Sombody HELP please
On Friday 28 May 2004 10:54, Karthik N S wrote:
> Hey ype Thx for the advice but still I need to get the exact situation working ,
> 1) I have a unique Field [ called filename ] which is indexed of type Text. It accepts the name of the HTML files as the indexing parameter , Also there is another Field called Contents which stores all the contents of that indicated unique named html file.
> 2) The indexer complete indexes for about 5000 html files sucessfully .
> 3) When I do a search for word ,it returns a hit of 400 on various html files
> Now in this situation if I want to limit the hits between First 200 to 400 html Page Names only what exactly should I do to using getRange() method.

A range query will provide a range of indexed values, and I thought you needed to add the record number as an indexed field in each record. However, you seem to use the 200 and 400 here as the order number for each record in the result of the query on the Contents field. Is that correct? If so, in which order do you expect the results of your query?

Kind regards,
Ype
Re: Range Query Sombody HELP please
On Thursday 27 May 2004 07:00, Karthik N S wrote:
> Hi Lucene developers
> Is it possible to do Search and retrieve relevant information on the Indexed Document within in specific range settings which may be similar to an Query in SQL = select * from BOOKSHELF where book1 between 100 and 200
> ex:- search_word , Book between 100 AND 200
> [ Note:- where Book uniquefield hit info which is already Indexed ]

The query parser can construct this query for you (assuming search_word is in the query default field):

    +search_word +(book:[100 TO 200])

See also: http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

One problem you might run into is that Lucene does not support numbers directly; only strings are indexed. You can index these numbers with sufficient zeros prefixed and add these prefix zeros in the query.

Erik Hatcher wrote an article on how to make the query:

http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html

You'll need to override the getRangeQuery() method.

Have fun,
Ype
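A minimal sketch of the zero-prefix idea in plain Java, with no Lucene involved. The helpers pad and inRange are hypothetical, and the width of 5 digits is an assumption that must be large enough for the largest number being indexed. The comparison uses String.compareTo to mimic a range evaluated on strings.

```java
final class ZeroPadDemo {
    // Left-pads a non-negative number with zeros to a fixed width.
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        String digits = Integer.toString(value);
        if (digits.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    // True when lower <= term <= upper under plain string comparison.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Unpadded: "1500" sorts between "100" and "200" as a string.
        System.out.println(inRange("1500", "100", "200"));
        // Padded on both the indexed value and the query endpoints.
        System.out.println(inRange(pad(1500, 5), pad(100, 5), pad(200, 5)));
        System.out.println(inRange(pad(150, 5), pad(100, 5), pad(200, 5)));
    }
}
```

The first check is true even though 1500 lies outside 100..200, which is the kind of surprise the padding removes; once both sides are padded to the same width, string order and numeric order agree for non-negative integers.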
RE: Range Query Sombody HELP please
Hi Lucene developers, My main intention was to search for a word hit in a unique field between ranges, say book100 - book200 indexed numbers. It's something like creating a SUBSEARCH within the SEARCHINDEX. This is similar to SQL: select * from BOOKSHELF. or select * from BOOKSHELF where book1 between 100 and 200.

with regards Karthik

-Original Message- From: Ype Kingma [mailto:[EMAIL PROTECTED] Sent: Thursday, May 27, 2004 12:46 PM To: [EMAIL PROTECTED] Subject: Re: Range Query Sombody HELP please

On Thursday 27 May 2004 07:00, Karthik N S wrote:

Hi Lucene developers, Is it possible to search and retrieve relevant information from the indexed documents within a specific range, similar to this query in SQL: select * from BOOKSHELF where book1 between 100 and 200 ex:- search_word , Book between 100 AND 200 [ Note: Book is a unique field whose hit info is already indexed ]

The query parser can construct this query for you (assuming search_word is in the query default field): +search_word +(book:[100 TO 200]) See also: http://jakarta.apache.org/lucene/docs/queryparsersyntax.html

One problem you might run into is that Lucene does not support numbers directly; only strings are indexed. You can index these numbers with sufficient zeros prefixed, and add the same prefix zeros in the query. Erik Hatcher wrote an article on how to make the query: http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html You'll need to override the getRangeQuery() method.

Have fun, Ype
Re: Range Query Sombody HELP please
On May 27, 2004, at 3:37 AM, Karthik N S wrote:

Hi Lucene developers, My main intention was to search for a word hit in a unique field between ranges, say book100 - book200 indexed numbers. It's something like creating a SUBSEARCH within the SEARCHINDEX. This is similar to SQL: select * from BOOKSHELF. or select * from BOOKSHELF where book1 between 100 and 200.

Karthik - Unfortunately, I'm having a hard time understanding your questions. Ype replied with a suggested solution: override getRangeQuery on a custom QueryParser subclass. You also need to ensure you are indexing numbers in a padded fashion: http://wiki.apache.org/jakarta-lucene/SearchNumericalFields

Erik
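As a rough sketch of the query side of this advice, the snippet below pads both bounds and assembles the range clause as a query string. It does not touch QueryParser itself. The class PaddedRange, its WIDTH of 6 digits and the method names are assumptions for illustration, and the field must have been indexed with the same padding for the clause to match anything.

```java
import java.util.Locale;

// Hypothetical helper: builds a range clause whose bounds are padded like the indexed values.
final class PaddedRange {

    // Assumed width; it must match the width used at indexing time.
    static final int WIDTH = 6;

    static String pad(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }

    static String rangeClause(String field, int low, int high) {
        return field + ":[" + pad(low) + " TO " + pad(high) + "]";
    }

    public static void main(String[] args) {
        // Combines the search word with the padded range, in the form Ype suggested.
        System.out.println("+search_word +(" + rangeClause("book", 100, 200) + ")");
    }
}
```

Overriding getRangeQuery() in a parser subclass would move this padding step inside the parser, so that users can keep typing unpadded numbers.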
Re: Range Query Sombody HELP please
Karthik, namaste! I seem to be getting multiple copies of your email; I received 4 copies of this one. Could you please limit things to 1 message per subject? I get hundreds of messages every day as is. :(

Thank you, Otis

--- Karthik N S [EMAIL PROTECTED] wrote:

Hi Lucene developers, Is it possible to search and retrieve relevant information from the indexed documents within a specific range, similar to this query in SQL: select * from BOOKSHELF where book1 between 100 and 200 ex:- search_word , Book between 100 AND 200 [ Note: Book is a unique field whose hit info is already indexed ] Somebody please help me :(

with regards Karthik
Re: Range Query Sombody HELP please
On Thursday 27 May 2004 09:37, Karthik N S wrote:

Hi Lucene developers, My main intention was to search for a word hit in a unique field between ranges, say book100 - book200 indexed numbers. It's something like creating a SUBSEARCH within the SEARCHINDEX.

You don't need to shout (uppercase); I've been teaching SQL. Could you explain what you mean by subsearch? I suppose you might want to have a look at the various filter classes in the org.apache.lucene.search package.

Regards, Ype
RE: Range Query Sombody HELP please
Hey Ype, Apologies for the misconduct. When we do a search in SQL using '*', we all know that the result is the total number of records in the table, but when we want to limit the records we apply a range between 2 specific rows [which we call a subsearch]. Similarly, on an indexed record I would like to perform the same technique. In fact, I was looking at the URLs you sent me in the last mail on range queries and was working on the same, http://jakarta.apache.org/lucene/docs/queryparsersyntax.html and http://today.java.net/pub/a/today/2003/11/07/QueryParserRules.html, but without results for the last 12 hrs. If you could spare a few minutes, please explain or provide a simple [full] example of using and overriding the getRange() method.

with regards Karthik

-Original Message- From: Ype Kingma [mailto:[EMAIL PROTECTED] Sent: Thursday, May 27, 2004 11:03 PM To: [EMAIL PROTECTED] Subject: Re: Range Query Sombody HELP please

On Thursday 27 May 2004 09:37, Karthik N S wrote:

Hi Lucene developers, My main intention was to search for a word hit in a unique field between ranges, say book100 - book200 indexed numbers. It's something like creating a SUBSEARCH within the SEARCHINDEX.

You don't need to shout (uppercase); I've been teaching SQL. Could you explain what you mean by subsearch? I suppose you might want to have a look at the various filter classes in the org.apache.lucene.search package.

Regards, Ype
Range Query Sombody HELP please
Hi Lucene developers, Is it possible to search and retrieve relevant information from the indexed documents within a specific range, similar to this query in SQL: select * from BOOKSHELF where book1 between 100 and 200 ex:- search_word , Book between 100 AND 200 [ Note: Book is a unique field whose hit info is already indexed ] Somebody please help me :(

with regards Karthik
Range Query
Hi, When I do a range query like id:[0* to 9*], the result set excludes documents having id 0, 90 ..., i.e. boundary values are excluded. Is this expected, or am I going wrong somewhere?

thanks, vikas.
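One possible explanation, offered here as an assumption and not confirmed anywhere in this thread: if the range bounds are compared as literal strings and the '*' is not expanded as a wildcard, then "0" sorts before "0*" and "90" sorts after "9*", so both fall outside the range. The sketch below checks only that string arithmetic in plain Java; the class RangeBoundaryCheck and the method inRange() are hypothetical and no Lucene code is involved.

```java
// Hypothetical check: treats the range bounds as literal strings, as in an inclusive term range.
final class RangeBoundaryCheck {

    /** True when lower <= term <= upper in plain string order. */
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // "0" is a proper prefix of "0*", so it sorts before the lower bound.
        System.out.println(inRange("0", "0*", "9*"));   // false
        // '0' (code 48) is greater than '*' (code 42), so "90" sorts after "9*".
        System.out.println(inRange("90", "0*", "9*"));  // false
        // A value strictly between the bounds is accepted.
        System.out.println(inRange("5", "0*", "9*"));   // true
    }
}
```

Under that assumption, plain bounds without the asterisks, combined with padded ids, would avoid the surprise.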
Re: Change in Range Query Syntax?
I was surprised by this change too. I think the syntax changed from [from - to] to [from TO to].

- Original Message - From: Terry Steichen [EMAIL PROTECTED] To: Lucene Users Group [EMAIL PROTECTED] Sent: Thursday, November 14, 2002 12:18 AM Subject: Change in Range Query Syntax?

I recently upgraded (from 1.2) to the latest build (1.3.1) and found that my range queries no longer work. Here's what a simple query against my index yields:

pub_date:20021109 yields 133 hits
pub_date:20021110 yields 225 hits
pub_date:2002 yields 144 hits

With 1.2RC5 and 1.2, here's how the range query works:

pub_date:[20021109 - 2002] yields 502 hits (note space on both sides of dash)

With 1.3 (nightly build as of 11/11/02), here's how the range query now works:

pub_date:[20021109 - 2002] yields 0 hits (note space on both sides of dash)
pub_date:[20021109- 2002] yields 369 hits (note space only following the dash)
pub_date:[20021109-2002] yields 0 hits (note no spaces on either side of dash)

Also, note that pub_date:]20021109- 20021110] does *not* include the hits for 20021109 as it did previously.

The errors (ParseExceptions) generated were these:

Was expecting one of: TO ... RANGEIN_QUOTED ... RANGEIN_GOOP ... , Encountered ] at line 1, column 27.
Was expecting one of: TO ... RANGEIN_QUOTED ... RANGEIN_GOOP ...

Has the syntax changed, or is this a bug?

Regards, Terry
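If the change is indeed from the dash form to the TO form, stored or user-supplied queries in the old style could be rewritten before they reach the parser. The following is a minimal sketch under that assumption. The class RangeSyntaxMigrator and the method toNewSyntax() are hypothetical, and the pattern only handles bounds that contain no spaces, brackets or dashes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites "[lower - upper]" ranges into the "[lower TO upper]" form.
final class RangeSyntaxMigrator {

    // Two bounds separated by a dash, with optional whitespace around the dash.
    private static final Pattern OLD_RANGE =
            Pattern.compile("\\[\\s*([^\\s\\[\\]-]+)\\s*-\\s*([^\\s\\[\\]-]+)\\s*\\]");

    static String toNewSyntax(String query) {
        Matcher m = OLD_RANGE.matcher(query);
        return m.replaceAll("[$1 TO $2]");
    }

    public static void main(String[] args) {
        // All three spacing variants from the report are normalised to the same clause.
        System.out.println(toNewSyntax("pub_date:[20021109 - 20021110]"));
        System.out.println(toNewSyntax("pub_date:[20021109- 20021110]"));
        System.out.println(toNewSyntax("pub_date:[20021109-20021110]"));
    }
}
```

Queries that contain no bracketed dash range pass through unchanged, so the rewrite can be applied to every incoming query string.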
range query error
Why does this query: _published:[20010101 - 20020101] return an error like this: Encountered 20020101 at line 1, column 27. Was expecting: ] ...? What's wrong with the syntax? If I query with string (_published:[ - 20020101]) it works with no problems...