problem restoring index
Hi, when I restart Tomcat, the index gets corrupted. If I take a backup of the index and then restart Tomcat, the index does not work properly. Do I have to index all the documents again whenever I restart Tomcat?
Help to remove document
Hello, please help me: I want to know how to remove a document from the index. Alex Kiselevsky
Re: problem restoring index
There is no need to reindex. However, I also don't quite get what the problem is :) Otis
Re: searching with special characters
A leading wildcard character (*) is not allowed if you use the QueryParser that comes with Lucene. Reason: performance. See the many discussions about this on the lucene-user mailing list. Also see the search syntax document on the Lucene site. What other characters are you having trouble with? Otis
--- Santosh wrote:
> Whenever I search with special characters like *world I get an exception. How can I avoid this? And for which other characters does Lucene throw this kind of exception?
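One way to avoid the exception is to check or escape user input before it reaches QueryParser. The following is a minimal sketch in plain Java, not part of Lucene: the class and method names (`QuerySanitizer`, `hasLeadingWildcard`, `escape`) are hypothetical, and the character list is taken from the special characters named in the query syntax document.

```java
final class QuerySanitizer {
    // Characters the query syntax document lists as special (assumed list).
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    /** Returns true if the term starts with * or ?, which QueryParser rejects. */
    static boolean hasLeadingWildcard(String term) {
        return !term.isEmpty() && (term.charAt(0) == '*' || term.charAt(0) == '?');
    }

    /** Backslash-escapes every special character so the term is read literally. */
    static String escape(String term) {
        StringBuilder sb = new StringBuilder(term.length() + 8);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

Note that escaping turns `*world` into a search for a literal asterisk, not a wildcard search; if a leading wildcard is really meant, rejecting the input with a clear message is probably the better choice.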
Re: Empty/non-empty field indexing question
Hi Otis, what kind of implications does that have for search? If I understand correctly, a record will not be found by a search on a field that is not there, correct? But then is there any point in putting an empty value in it if the application will never search for empty values? Thanks, -pedja
Otis Gospodnetic said the following on 12/8/2004 1:31 AM:
> Empty fields won't add any value; you can skip them. Documents in an index don't have to be uniform. Each Document could have a different set of fields. Of course, that has some obvious implications for search, but it is perfectly fine technically. Otis
> --- pedja wrote:
>> Here's probably a silly question, very newbish, but I had to ask. Since I have MySQL documents that contain over 30 fields each, and most of them are added to the index, is it common practice to add fields to the index with empty values for that particular record, or should the field be omitted entirely? What I mean is: if, say, a Title field is empty on a specific record (in MySQL), should I still add that field to the Lucene index with an empty value, or just skip it and only add the fields that contain non-empty values? Thanks, -pedja
Re: Empty/non-empty field indexing question
Correct. No, there is no point in putting an empty field there. Otis
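The advice in this thread can be sketched as a small filter applied to each database row before a Document is built. This is only an illustration in plain Java: the row is modeled as a `Map`, and `NonEmptyFields` and `nonEmpty` are hypothetical names. Creating the Lucene fields from the surviving entries is left to the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NonEmptyFields {
    /** Returns only the columns whose value is non-null and not blank, in the original order. */
    static Map<String, String> nonEmpty(Map<String, String> row) {
        Map<String, String> kept = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> e : row.entrySet()) {
            String value = e.getValue();
            if (value != null && value.trim().length() > 0) {
                kept.put(e.getKey(), value);
            }
        }
        return kept;
    }
}
```

Each entry of the returned map would then become one field on the Document, so a record with an empty Title simply has no Title field.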
Does Lucene support the XOR operator?
Does Lucene support the XOR operator?
Lucene Vs Ixiasoft
Does anyone know about the Ixiasoft server? It's an XML repository/search engine. If anyone knows it, do you also know how it compares to Lucene? Which is faster? Praveen
Re: 'IN' type search
Hello, you can use BooleanQuery for that. Otis
--- Ravi wrote:
> Hi, how do you get all documents in Lucene where a particular field value is in a given list of values (like SQL IN)? What kind of Query class should I use? Thanks in advance. Ravi.
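Otis suggests building a BooleanQuery directly. As a different route to a similar result, the sketch below builds a query string in the field-grouping form `field:("v1" OR "v2")`, which QueryParser would be expected to turn into a set of optional clauses. It is plain Java with hypothetical names (`InQueryBuilder`, `inQuery`), and it assumes the field's values survive the analyzer unchanged; for untokenized keyword fields, building the BooleanQuery by hand is likely safer.

```java
import java.util.List;

final class InQueryBuilder {
    /** Builds field:("v1" OR "v2" ...) from a list of values; returns "" for an empty list. */
    static String inQuery(String field, List<String> values) {
        if (values.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder(field).append(":(");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(" OR ");
            }
            // Quote each value and escape embedded backslashes and quotes.
            String v = values.get(i).replace("\\", "\\\\").replace("\"", "\\\"");
            sb.append('"').append(v).append('"');
        }
        return sb.append(')').toString();
    }
}
```

For example, `inQuery("color", Arrays.asList("red", "blue"))` produces `color:("red" OR "blue")`.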
Re: Does Lucene support the XOR operator?
On Dec 8, 2004, at 2:05 PM, [EMAIL PROTECTED] wrote:
> Does Lucene support the XOR operator?
XOR is not a built-in operation. However, in a few lines of code (a custom subclass of BooleanQuery) I was able to implement it. I built this functionality under contract, and I'm still working out how much of my work can be contributed back. Most of it is custom and isn't generalizable, but some of it, like an XOR query, is general-purpose enough. However, I will give some hints: all the details of providing a custom Similarity have been mentioned on this list, and that's the trick. Erik
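Erik's implementation (a BooleanQuery subclass with a custom Similarity) is not shown in the thread. As a simpler, different technique, XOR can be computed on the sets of matching document numbers: a document qualifies if exactly one of the two sub-queries matched it. The sketch below uses `java.util.BitSet` and hypothetical names (`XorDocs`, `xor`); how the two bit sets are obtained from the index is left open, and no scoring is involved.

```java
import java.util.BitSet;

final class XorDocs {
    /** Returns the documents matched by exactly one of the two result sets. */
    static BitSet xor(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone(); // copy so the inputs stay untouched
        result.xor(b);
        return result;
    }
}
```

The same logic can also be written in query syntax as `(+a -b) OR (+b -a)`, although the scores of such a query may differ from what a dedicated XOR query would produce.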
RE: Index delete failing
I got this working. I had to close all index searchers and the writer on the index, set them to null, and call System.gc() before the delete process. I think Windows still thinks the writer and searchers are pointing to the index directory even after you close them. Ravi
-----Original Message-----
From: Otis Gospodnetic
Sent: Monday, December 06, 2004 4:48 PM
To: Lucene Users List
Subject: Re: Index delete failing
> This smells like a Windows issue. It is possible that something in your JVM is still holding onto the index directory (for example, FSDirectory), and Winblows is not letting you remove the directory. I bet this will work if you exit the JVM and run java.io.File.delete() without calling Lucene. Sorry, my Windows + Lucene experience is limited. Otis
> --- Ravi wrote:
>> Hi, we need to delete a Lucene index from our application using java.io.File.delete(). We are closing the IndexWriter and all the index searchers on that folder, but a call to delete returns false. There is no lock on the index directory. The interesting thing is that the deletable and segments files are getting removed, but the rest of the .cfs files are not. Has somebody had a similar problem? Thanks in advance, Ravi.
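When a delete fails like this, it helps to know exactly which files are still held open. The following sketch uses only `java.io.File`; the names `IndexDirDeleter` and `deleteRecursively` are hypothetical. It should be called only after every searcher and writer on the directory has been closed, as Ravi describes.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class IndexDirDeleter {
    /** Deletes dir and everything under it; returns the paths that could not be deleted. */
    static List<File> deleteRecursively(File dir) {
        List<File> failed = new ArrayList<File>();
        File[] children = dir.listFiles(); // null if dir is a plain file or does not exist
        if (children != null) {
            for (File child : children) {
                failed.addAll(deleteRecursively(child));
            }
        }
        // File.delete() returns false instead of throwing, so record the failure.
        if (dir.exists() && !dir.delete()) {
            failed.add(dir);
        }
        return failed;
    }
}
```

An empty result means the whole directory is gone; otherwise the returned list names the files (for example, leftover .cfs files) that something in the process still has open.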
RE: Sorting in Lucene
Hi Erik, I am not getting any error. Yes, I am indexing multiple keyword fields by the same name in a single document. Does that work with Lucene? Thanks, Ramon
-----Original Message----- From: Erik Hatcher Sent: Tuesday, December 07, 2004 5:13 PM To: Lucene Users List Subject: Re: Sorting in Lucene
Ramon, more details would be most helpful in being able to assist. You said you cannot sort, but you did not tell us what error you're getting. Are you indexing multiple keyword fields by the same name for a single document? As for searching: depending on the type of text you're searching for, the analyzer may be making it difficult or impossible to search for. QueryParser doesn't know about keyword fields. Erik
On Dec 7, 2004, at 7:13 PM, Ramon Aseniero wrote:
Hi Chuck, yes, I tried to search with the exact string stored in the index, but I don't get a match. I tried the search using LIMO and Luke. It seems like untokenized fields are not searchable. Thanks, Ramon
-----Original Message----- From: Chuck Williams Sent: Tuesday, December 07, 2004 4:04 PM To: Lucene Users List Subject: RE: Sorting in Lucene
Since it's untokenized, are you searching with the exact string stored in the field? Chuck
-----Original Message----- From: Ramon Aseniero Sent: Tuesday, December 07, 2004 3:29 PM To: 'Lucene Users List'; 'Chris Fraschetti' Subject: RE: Sorting in Lucene
I also tried searching the said field in LIMO and I don't get a match. Thanks, Ramon
-----Original Message----- From: Ramon Aseniero Sent: Tuesday, December 07, 2004 3:20 PM To: 'Lucene Users List'; 'Chris Fraschetti' Subject: RE: Sorting in Lucene
Hi, I use LIMO to look into my index. LIMO tells me that the field is untokenized but is indexed. Is it possible to search on an untokenized field? Thanks, Ramon
-----Original Message----- From: Chris Fraschetti Sent: Tuesday, December 07, 2004 3:14 PM To: Lucene Users List Subject: Re: Sorting in Lucene
I would try 'luke' to look at your index and use its search functionality to make sure it's not your code that is the problem, as well as to ensure your document is appearing in the index as you intend it. It's been a lifesaver for me. http://www.getopt.org/luke/
On Tue, 7 Dec 2004 15:02:26 -0800, Ramon Aseniero wrote:
Hi All, any idea why a Keyword field is not searchable? In my index I have a field of type Keyword, but somehow I cannot search on the field. Thanks in advance. Ramon
partial updating of lucene
Hi all, I have a question about updating a Lucene document. I know that there is no API to do that now, so this is what I am doing in order to update the title field of a document:
1) Get the document from the Lucene index.
2) Remove the field called title and add the same field with a modified value.
3) Remove the document (based on one of our fields) using a Reader, and then close the Reader.
4) Add the document that was obtained in step 1 and modified in step 2.
I am not sure if this is the right way of doing it, but I am having problems searching for that document after updating it. The problem occurs only with unstored fields. For example, I search for description:boy, where description is an unstored, indexed, tokenized field in the document, and I find 1 document. Now I update the document's title as described above and repeat the same search, description:boy, and I don't find any results. I have not touched the description field at all; I only updated the title field. Is this expected behaviour? If not, is it a bug? If I change the description field to stored, indexed, and tokenized, the search works fine before and after updating. Praveen
Re: partial updating of lucene
Your unstored fields were not stored in the index; only their terms were. When you get the document from the index and modify it, those terms are lost when you add the document again. You can either create a new document, populate all the fields, and add that document to the index, or you can add the unstored fields to the document retrieved in step 1.
On Wed, 8 Dec 2004 17:53:26 -0500, Praveen Peddi wrote:
> I have a question about updating the lucene document. [...] The problem is only with the unstored fields.
Re: Lucene Vs Ixiasoft
Hi, think first about how relevant the model of each of these two search engines is for XML document retrieval. Lucene is a classic full-text search engine using the vector space model. This model is efficient for indexing unstructured documents (like plain text files) and is not made for structured documents like XML. There is an XML demo in the Lucene sandbox, but it is not very efficient because it doesn't take advantage of the document structure in the indexing and ranking model, so it loses semantic information and relevance. I don't know Ixiasoft; check its documentation to see how it indexes and ranks XML documents. nicolas
On Wed, 8 Dec 2004 14:20:45 -0500, Praveen Peddi wrote:
> Does anyone know about the Ixiasoft server? [...] Which is faster?
Unexpected TermEnum behavior
My application needs to enumerate all terms for a specific field. To do that I get the TermEnum using the following code: TermEnum terms = reader.terms(new Term(fieldName, "")); I noticed that initially the TermEnum is positioned at the first term. In other words, I don't have to call terms.next() before calling terms.term(). This is different from the behavior of Iterator, Enumeration and ResultSet, whose initial position is before the first result. I wonder whether it is this way by design. If it is by design, what is the defined TermEnum behavior if there are no terms for the field name in question? Will the call to terms.term() return null? Or will it be positioned at the first term with the field name that comes after the provided field name? What if there are no field names after it? In any case, some javadoc describing the behavior would be extremely useful. Being used to Iterators and ResultSets, I automatically wrote the code the same way, calling next() first. Fortunately, I had a field with only 2 terms, which is how I noticed that I was missing the first element. Thanks, Alexey
Re: Unexpected TermEnum behavior
: TermEnum terms = reader.terms(new Term(fieldName, ""));
:
: I noticed that initially TermEnum is positioned at the first term. In other
: words, I don't have to call terms.next() before calling terms.term(). This
: is different from the behavior of Iterator, Enumeration and ResultSet whose
Well, strictly speaking it's very different -- in particular, the next method doesn't return the item, which is also very different from Iterators and Enumeration. I agree it's a little confusing, especially since TermDocs and TermEnum are different.
: If it is by design, what is the defined TermEnum behavior if there are no
: terms for the field name in question? Will the call to terms.term() return
: null? Or get positioned at the first term with the field name that comes
: after the provided field name? What if there are no field names after it?
I believe that in those cases, the TermEnum object itself will be null.
: In any case, some javadoc describing the behavior would be extremely useful.
I thought it was documented in the TermEnum interface, but looking at it now I realize that not only does the TermEnum javadoc not explain it very well, but the class FilteredTermEnum (which implements TermEnum) actually documents the opposite behavior...
public Term term()
Returns the current Term in the enumeration. Initially invalid, valid after next() called for the first time.
-Hoss
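The iteration pattern discussed in this thread can be sketched with a do/while loop, which reads the current element before advancing. The class below is a hypothetical stand-in written in plain Java, not Lucene's TermEnum: it only mimics the "already positioned on the first element" behavior Alexey describes, with `term()` and `next()` named after the methods mentioned above. The field check and the null check are defensive assumptions covering the open questions in the thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a cursor that starts ON its first element. */
final class PrePositionedCursor {
    private final String[][] terms; // {field, text} pairs, sorted by field, then text
    private int pos;

    PrePositionedCursor(String[][] sortedTerms, int start) {
        this.terms = sortedTerms;
        this.pos = start;
    }

    /** Current {field, text} pair, or null once the cursor is exhausted. */
    String[] term() {
        return pos < terms.length ? terms[pos] : null;
    }

    /** Advances the cursor; returns false when nothing is left. */
    boolean next() {
        return ++pos < terms.length;
    }

    /** Collects every text of one field; do/while keeps the first term from being skipped. */
    static List<String> textsForField(PrePositionedCursor cursor, String field) {
        List<String> out = new ArrayList<String>();
        do {
            String[] t = cursor.term();
            if (t == null || !t[0].equals(field)) {
                break; // no terms at all, or the cursor has moved on to another field
            }
            out.add(t[1]);
        } while (cursor.next());
        return out;
    }
}
```

With the real API, the same loop shape would wrap `terms.term()` and `terms.next()`, followed by `terms.close()` in a finally block.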
Re: problem restoring index
You cannot use a wildcard character as the first character of the search. http://jakarta.apache.org/lucene/docs/queryparsersyntax.html
----- Original Message -----
From: Santosh
To: Lucene Users List
Sent: Wednesday, December 08, 2004 6:21 PM
Subject: problem restoring index
> Hi, when I restart Tomcat, the index gets corrupted. If I take a backup of the index and then restart Tomcat, the index does not work properly. Do I have to index all the documents again whenever I restart Tomcat?
Re: searching with special characters
You cannot use a wildcard character as the first character of the search. http://jakarta.apache.org/lucene/docs/queryparsersyntax.html
----- Original Message -----
From: Santosh
To: Lucene Users List
Sent: Wednesday, December 08, 2004 6:24 PM
Subject: searching with special characters
> Whenever I search with special characters like *world I get an exception. How can I avoid this? And for which other characters does Lucene throw this kind of exception?
Re: java.io.FileNotFoundException: ... (No such file or directory)
Justin Swanhart wrote:
> The indexes are located on a NFS mountpoint. Could this be the problem?
Yes. Lucene's lock mechanism is designed to keep this from happening, but the sort of lock files that FSDirectory uses are known to be broken with NFS. Doug
Re: Sorting in Lucene
On Dec 8, 2004, at 5:02 PM, Ramon Aseniero wrote:
> Yes, I am indexing multiple keyword fields by the same name in a single document. Does that work with Lucene?
No, logically it doesn't make sense. How would Lucene determine which of those field values to sort by? You need a single field value to sort by. I think you should get an error when sorting by a field with duplicate values, though. Again, it would be most helpful if you could provide code that demonstrates what you're doing during indexing and searching, specifically related to the sorting issue. Erik
Re: Lucene Vs Ixiasoft
I thought Lucene implements the Boolean model. -John
On Thu, 9 Dec 2004 00:19:21 +0100, Nicolas Maisonneuve wrote:
> Lucene is a classic full-text search engine using the vector space model. [...]
RE: Lucene Vs Ixiasoft
Lucene contains a complete set of Boolean query operators, and it uses the vector space model to determine scores for relevance ranking. It's fast. It works. Chuck
-----Original Message-----
From: John Wang
Sent: Wednesday, December 08, 2004 7:13 PM
To: Lucene Users List; Nicolas Maisonneuve
Subject: Re: Lucene Vs Ixiasoft
> I thought Lucene implements the Boolean model. -John
Conditional Operator in Lucene
Hi All, does Lucene support conditional operators? For example, to retrieve all documents where age is greater than 21, how do I compose a query like this in Lucene? Is there a different Query object to use? Thanks, Ramon
Re: Unexpected TermEnum behavior
Chris Hostetter writes:
> I thought it was documented in the TermEnum interface, but looking at it now I realize that not only does the TermEnum javadoc not explain it very well, but the class FilteredTermEnum (which implements TermEnum) actually documents the opposite behavior...
> public Term term()
> Returns the current Term in the enumeration. Initially invalid, valid after next() called for the first time.
That's a documentation bug. Fixed in CVS. http://issues.apache.org/bugzilla/show_bug.cgi?id=32353 Morus
SEARCH +HITS+LIMIT
Hi guys, apologies. One question for the forum [especially Erik]:
1) I have a merged index with 100,000 files indexed into it (Content is one of the fields, of type 'Text').
2) A search for a simple word, Camera, returns 6000 hits.
3) Since the search runs in a web app, a simple JSP is used to display the content.
Question: how can I display the contents of the hits incrementally [each request hits the merged index again with an incremented offset X]? This would solve the Out of Memory problem caused by prefetching all the hits in one go.
Ex: Total hits: 6000
1st page - hits returned (1 to 25)
2nd page - hits returned (26 to 50)
...
Nth page - hits returned (5975 to 6000)
Hint: this is similar to the SQL query SELECT * FROM LUCENE LIMIT 10, 5
With warm regards, have a nice day, N.S.KARTHIK
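The page arithmetic behind this question can be sketched independently of Lucene. The helper below is plain Java with hypothetical names (`HitPager`, `pageBounds`); it converts a 1-based page number into a zero-based range of hit positions, clamped to the total number of hits.

```java
final class HitPager {
    /** Returns {from, to}: zero-based start (inclusive) and end (exclusive) of the page. */
    static int[] pageBounds(int totalHits, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        long start = (long) (pageNumber - 1) * pageSize; // long avoids int overflow
        int from = (int) Math.min(start, (long) totalHits);
        int to = (int) Math.min(start + pageSize, (long) totalHits);
        return new int[] { from, to };
    }
}
```

A JSP could then re-run the search on each request and loop from `from` up to (but not including) `to`, calling `hits.doc(i)` only for that range, assuming the Hits object is used as the result holder. Page 2 of 6000 hits with a page size of 25 gives the range 25 to 50, which corresponds to hits 26 to 50 in the 1-based numbering of the question.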