How to compile .jj files
Hello, I am trying to run the IndexHTML example, but it uses two classes generated by JavaCC from StandardTokenizer.jj and HTMLParser.jj, and I don't know how to compile these files. How can I solve this? Thanks in advance.
Wender Magno Cota
--
To unsubscribe, e-mail: mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]
Size Capabilities of Lucene Index
Can anyone tell me how much data Lucene is able to index? Can it handle up to 3 terabytes? How large are the indexes it creates (half the size of the data)?
Thanks, Scott
The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient, or an employee or agent responsible for delivering this message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to the message and deleting it from your computer. Thank you. Ernst & Young LLP
RE: Size Capabilities of Lucene Index
Since it's a file-system-based index, I don't see any limitations other than the OS maximum file size. I imagine that if your data is 3 terabytes, you have monster machines with monster memory (you'll need it). You'll also need to raise the file handle limit on the OS and probably use a high MERGE_FACTOR. PS: I'm hypothesizing here, so anyone please feel free to jump in.
Nader Henein
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Wednesday, July 31, 2002 6:32 PM To: Lucene Users List Subject: Size Capabilities of Lucene Index
Can anyone tell me the amount of data that Lucene is able to index? Can it handle up to 3 terabytes, how large are the indexes it creates (1/2 the size of the data)? Thanks, Scott
RE: Using Filters in Lucene
Cool. But instead of adding a new class, why not change Hits to inherit from Filter and add the bits() method to it? Then one could pipe the output of one Query into another search without modifying the Queries...
Scott
-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]] Sent: Monday, July 29, 2002 12:03 PM To: Lucene Users List Subject: Re: Using Filters in Lucene
Peter Carlson wrote: Would you suggest that search in selection type functionality use filters or redo the search with an AND clause?
I'm not sure I fully understand the question. If you have a condition that is likely to recur in subsequent queries, then using a Filter which caches its bit vector is much faster than using an AND clause. However, you probably cannot afford to keep a large number of such filters around, as the cached bit vectors use a fair amount of memory: one bit per document in the index.
Perhaps the ultimate filter is something like the attached class, QueryFilter. This caches the results of an arbitrary query in a bit vector. The filter can then be reused with multiple queries, and (so long as the index isn't altered) that part of the query computation will be cached. For example, RangeQuery could be used with this, instead of using DateFilter, which does not cache (yet).
Caution: I have not yet tested this code. If someone does try it, please send a message to the list telling how it goes. If this is useful, I can document it better and add it to Lucene.
Doug
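A minimal plain-Java sketch of the caching idea Doug describes: a condition's matches are stored once as a bit vector (one bit per document) and then intersected with the hits of later queries. This is not the attached QueryFilter and uses no Lucene classes; the class name, the string cache key, and the IntPredicate standing in for a query are all assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

/** Illustration only: a filter that caches its bit vector. Not Lucene code. */
class CachedBitFilterSketch {
    private final int maxDoc;                              // documents in the hypothetical index
    private final Map<String, BitSet> cache = new HashMap<>();
    private int computations = 0;                          // how often a bit vector was built

    CachedBitFilterSketch(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    /** Returns the cached bit vector for a condition, building it on first use. */
    BitSet bits(String key, IntPredicate matches) {
        BitSet cached = cache.get(key);
        if (cached == null) {
            cached = new BitSet(maxDoc);
            for (int doc = 0; doc < maxDoc; doc++) {
                if (matches.test(doc)) {
                    cached.set(doc);
                }
            }
            computations++;
            cache.put(key, cached);
        }
        return cached;
    }

    /** Restricts a query's hits to the documents the filter allows. */
    static BitSet apply(BitSet queryHits, BitSet filterBits) {
        BitSet result = (BitSet) queryHits.clone();        // leave the caller's hits untouched
        result.and(filterBits);
        return result;
    }

    int computations() {
        return computations;
    }

    /** Rough memory cost of one cached bit vector: one bit per document. */
    static long bytesPerFilter(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    public static void main(String[] args) {
        CachedBitFilterSketch filters = new CachedBitFilterSketch(10);
        BitSet evenDocs = filters.bits("even", doc -> doc % 2 == 0);
        BitSet hits = new BitSet(10);
        hits.set(2);
        hits.set(3);
        hits.set(4);
        System.out.println(apply(hits, evenDocs));         // {2, 4}
        filters.bits("even", doc -> doc % 2 == 0);          // second call is served from the cache
        System.out.println(filters.computations());         // 1
        System.out.println(bytesPerFilter(1_000_000));      // 125000
    }
}
```

The bytesPerFilter helper makes the memory caveat concrete: at one bit per document, a million-document index costs roughly 125 KB per cached filter, which is why keeping many such filters around adds up. As in the message, the cached bits are only valid while the index is unchanged.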
Re: is this possible in a query?
This depends on your analyzer. Currently it splits on words. How do you want it to split? Is there other text around this? I guess you could write your own analyzer that, if it finds a special phrase, adds it as a phrase. If you did it this way, you would have to use a similar method to parse the query string itself. Another option I can think of, if you are just indexing text, is to have something that looks for your product names in the query. So if only OrthoMed is in the query string and not Cathflo, then add the query clause NOT Cathflo. This seems like it might get very complicated, though. I hope someone else comes up with a more elegant solution.
--Peter
On 7/31/02 5:07 PM, Robert A. Decker [EMAIL PROTECTED] wrote:
I have a Text Field named product. Two of the products are:
Cathflo OrthoMed
OrthoMed
When I search for Cathflo OrthoMed, I correctly only get items that have the product Cathflo OrthoMed. However, when I search for OrthoMed, not only do I get all OrthoMed products, but I also get all Cathflo OrthoMed products. Is there a way, when searching on a Field.Text type, to limit the above OrthoMed search to only OrthoMed, and to exclude Cathflo OrthoMed? The solution has to be generic enough to work with any combination of product names.
thanks, rob
http://www.robdecker.com/ http://www.planetside.com/
Re: is this possible in a query?
I think this may be what I end up doing... Unfortunately this means reindexing the documents...
thanks, rob
http://www.robdecker.com/ http://www.planetside.com/
On Wed, 31 Jul 2002 [EMAIL PROTECTED] wrote:
If you make the product name a Field.Keyword, it will still be indexed and searchable, but will not be tokenized. --dmg
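A small plain-Java sketch of why the untokenized field behaves differently for the two product names in this thread. It does not call Lucene; matchesTokenized and matchesKeyword are hypothetical helpers that only imitate a word-splitting, lowercasing analyzer and a whole-value keyword term, respectively.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Illustration only: tokenized versus untokenized matching. Not Lucene code. */
class FieldMatchSketch {

    /** Tokenized field: the value is split into lowercase words and any word may match. */
    static boolean matchesTokenized(String fieldValue, String term) {
        String[] words = fieldValue.toLowerCase(Locale.ROOT).split("\\s+");
        Set<String> tokens = new HashSet<>(Arrays.asList(words));
        return tokens.contains(term.toLowerCase(Locale.ROOT));
    }

    /** Keyword field: the whole value is a single term and must match exactly. */
    static boolean matchesKeyword(String fieldValue, String term) {
        return fieldValue.equals(term);
    }

    public static void main(String[] args) {
        // Tokenized: a search for OrthoMed also hits "Cathflo OrthoMed".
        System.out.println(matchesTokenized("Cathflo OrthoMed", "OrthoMed")); // true
        System.out.println(matchesTokenized("OrthoMed", "OrthoMed"));         // true

        // Keyword: only the exact product name matches.
        System.out.println(matchesKeyword("Cathflo OrthoMed", "OrthoMed"));   // false
        System.out.println(matchesKeyword("OrthoMed", "OrthoMed"));           // true
    }
}
```

Note that in this sketch the keyword comparison is case-sensitive, so the query side would have to supply the product name exactly as it was stored.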
RE: Using Filters in Lucene
My index changes (it is updated every 15 minutes and deletions run every 2 minutes), so using the filter is not going to work for me, because the order of the documents might change between the time the initial search is done and the time the filter is applied. I'm currently using a crude method ( ... doc_id:(23 AND 78 .. ) ) to filter, and it works surprisingly well. I thought the query parser would cave, but it's doing great even with sets as large as filtering within 2000 documents.
-----Original Message-----
From: Scott Ganyo [mailto:[EMAIL PROTECTED]] Sent: Wednesday, July 31, 2002 10:24 PM To: 'Lucene Users List' Subject: RE: Using Filters in Lucene
Cool. But instead of adding a new class, why not change Hits to inherit from Filter and add the bits() method to it? Then one could pipe the output of one Query into another search without modifying the Queries... Scott
RE: is this possible in a query?
This is a long shot, but if you want your search to yield only exact results on that specific field, you might want to think about replacing the spaces between words with underscores (make sure the analyzer doesn't split them up) and then applying the same rule to the query string. Cathflo OrthoMed becomes Cathflo_OrthoMed and OrthoMed stays the same, so when you search for OrthoMed you'll only get exact results. This does not save you from re-indexing (unfortunately), but it does save you from writing a whole new analyzer.
Nader Henein
-----Original Message-----
From: Robert A. Decker [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 01, 2002 6:35 AM To: Lucene Users List Subject: Re: is this possible in a query?
I think this may be what I end up doing... Unfortunately this means reindexing the documents... thanks, rob
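The underscore rule could be sketched in plain Java roughly as follows. The class and method names are hypothetical, no Lucene API is involved, and the sketch assumes the same toKey function is applied both when the product field is indexed and when the query string is built.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: joining a multi-word product name into one term. Not Lucene code. */
class ProductKeySketch {

    /** Trims the name and replaces each run of whitespace with a single underscore. */
    static String toKey(String productName) {
        return productName.trim().replaceAll("\\s+", "_");
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the index: one term per product, mapped to a document id.
        Map<String, Integer> index = new HashMap<>();
        index.put(toKey("Cathflo OrthoMed"), 1);
        index.put(toKey("OrthoMed"), 2);

        System.out.println(toKey("Cathflo OrthoMed"));            // Cathflo_OrthoMed
        System.out.println(index.get(toKey("OrthoMed")));          // 2
        System.out.println(index.get(toKey("Cathflo OrthoMed")));  // 1
    }
}
```

Because each product name becomes a single term, a lookup for OrthoMed no longer touches the Cathflo_OrthoMed entry. As the message notes, this only works if the analyzer in use leaves the underscore-joined term intact.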