Zip Files

2005-03-01 Thread Luke Shannon
Hello; Does anyone have any ideas on how to index the contents of zip files? Thanks, Luke

Re: Zip Files

2005-03-01 Thread Luke Shannon
with zis (ZipInputStream) } good luck Ernesto Luke Shannon wrote: Hello; Anyone have any ideas on how to index the contents within zip files? Thanks, Luke
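The reply above is cut off, so the following is only a rough sketch of the ZipInputStream idea, not Ernesto's actual code. It walks the entries of an archive and collects each entry's text so it could then be handed to whatever indexing code you already have. The class name ZipTextExtractor, the helper sampleZip and the UTF-8 assumption are all made up for illustration; binary entries (PDF, Office files) would need their own parsers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: not part of Lucene or of the original post.
final class ZipTextExtractor {

    /** Returns entry name -> entry text for every non-directory entry in the archive. */
    static Map<String, String> readEntries(InputStream in) {
        Map<String, String> contents = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(in)) {
            byte[] buffer = new byte[4096];
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content to index
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry, not of the whole archive
                while ((n = zis.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                // Assumption: entries are UTF-8 text files
                contents.put(entry.getName(), new String(out.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return contents;
    }

    /** Builds a tiny in-memory archive so the sketch can be run without a file on disk. */
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("notes/readme.txt"));
            zos.write("hello lucene".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, String> entries = readEntries(new ByteArrayInputStream(sampleZip()));
        entries.forEach((name, text) -> System.out.println(name + " -> " + text));
    }
}
```

Each map entry could become one Lucene Document, with the entry name stored as a path-like field and the text as the tokenized body.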

Filtering Question

2005-02-23 Thread Luke Shannon
Hello; I'm trying to create a Filter that only retrieves documents with a path field containing one or more substrings. I can get the Filter to work if the BooleanQuery below (used to create the Filter) contains only TermQueries (this requires me to know the exact path). But not if it contains

MultiField Queries without the QueryParser

2005-02-22 Thread Luke Shannon
Hello; The book mentions the MultiFieldQueryParser as one way of dealing with multifield queries. Can someone point me in the direction of other ways? Thanks, Luke

Re: MultiField Queries without the QueryParser

2005-02-22 Thread Luke Shannon
Please disregard this post. Sorry. - Original Message - From: Luke Shannon [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Tuesday, February 22, 2005 5:16 PM Subject: MultiField Queries without the QueryParser Hello; The book mentions

Handling Synonyms

2005-02-21 Thread Luke Shannon
Hello; Does anyone see a problem with the following approach? For synonyms, rather than putting them in the index, I put the original term and all the synonyms in the query. Every time I create a query, I check if the term has any synonyms. If it does, I create a BooleanQuery OR'ing one Query
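To make the query-time synonym idea concrete, here is a small sketch. Note that it swaps in a simpler technique than the post describes: instead of building a BooleanQuery object, it builds the equivalent query-parser string, so it runs without Lucene on the classpath. The class SynonymQueryExpander and its synonym map are hypothetical names, not anything from the original message.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating query-time synonym expansion.
final class SynonymQueryExpander {
    private final Map<String, List<String>> synonyms;

    SynonymQueryExpander(Map<String, List<String>> synonyms) {
        this.synonyms = synonyms;
    }

    /** Returns "field:term", or an OR group covering the term and all of its synonyms. */
    String expand(String field, String term) {
        List<String> alternatives = synonyms.getOrDefault(term, Collections.emptyList());
        if (alternatives.isEmpty()) {
            return field + ":" + term; // no synonyms known, query the term as-is
        }
        StringBuilder sb = new StringBuilder("(").append(field).append(':').append(term);
        for (String alt : alternatives) {
            sb.append(" OR ").append(field).append(':').append(alt);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new HashMap<>();
        table.put("fast", Arrays.asList("quick", "speedy"));
        SynonymQueryExpander expander = new SynonymQueryExpander(table);
        System.out.println(expander.expand("name", "fast"));
        System.out.println(expander.expand("name", "slow"));
    }
}
```

The trade-off is the usual one: the index stays small and the synonym list can change without reindexing, but every query gets larger.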

Optional Terms in a single query

2005-02-21 Thread Luke Shannon
Hi; I'm trying to create a query that looks for documents where type is 181 and name doesn't contain tim, bill or harry. +(type: 181) +((-name: tim -name:bill -name:harry +oldfaith:stillHere)) +(type: 181) +((-name: tim OR bill OR harry +oldfaith:stillHere)) +(type: 181) +((-name:*(tim bill

Re: Optional Terms in a single query

2005-02-21 Thread Luke Shannon
, 2005 5:31 PM Subject: Re: Optional Terms in a single query On Monday 21 February 2005 23:23, Luke Shannon wrote: Hi; I'm trying to create a query that look for a field containing type:181 and name doesn't contain tim, bill or harry. type: 181 -(name: tim name:bill name:harry

Re: Optional Terms in a single query

2005-02-21 Thread Luke Shannon
Terms in a single query Luke Shannon wrote: Hi; I'm trying to create a query that look for a field containing type:181 and name doesn't contain tim, bill or harry. +(type: 181) +((-name: tim -name:bill -name:harry +oldfaith:stillHere)) +(type: 181) +((-name: tim OR bill OR harry

Re: Optional Terms in a single query

2005-02-21 Thread Luke Shannon
. Thanks! Luke - Original Message - From: Todd VanderVeen [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Monday, February 21, 2005 6:26 PM Subject: Re: Optional Terms in a single query Luke Shannon wrote: The API I'm working with combines a series of queries

Re: Query Question

2005-02-18 Thread Luke Shannon
Thanks Erik. Option 2 sounds like the path of least resistance. Luke - Original Message - From: Erik Hatcher [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Thursday, February 17, 2005 9:05 PM Subject: Re: Query Question On Feb 17, 2005, at 5:51 PM, Luke

Re: Lucene in the Humanties

2005-02-18 Thread Luke Shannon
Nice work Eric. I would like to spend more time playing with it, but I saw a few things I really liked. When a specific query turns up no results you prompt the client to perform a free form search. Less savvy search users will benefit from this strategy. I also like the display of information

Analyzing Advise

2005-02-18 Thread Luke Shannon
Hi; I'm having a situation where my synonyms weren't working for a particular field. When I looked at the indexing I noticed it was a Keyword, thus not tokenized. The problem is when I switched that field to Text (now tokenized with my SynonymAnalyzer) a bunch of queries broke that were

Re: Analyzing Advise

2005-02-18 Thread Luke Shannon
This is exactly what I was looking for. Thanks - Original Message - From: Steven Rowe [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Friday, February 18, 2005 4:41 PM Subject: Re: Analyzing Advise Luke Shannon wrote: But now that I'm looking at the API

More Analyzer Question

2005-02-18 Thread Luke Shannon
reader) { TokenStream result = new SynonymFilter(new LowerCaseTokenizer(reader), engine); return result; } }

Re: Query Question

2005-02-17 Thread Luke Shannon
:*home\**) This looks right to me. Any theories as to why it would not match: Document (relevant fields): Keywordtype:203 Keywordname:marcipan + home* Is the \ escaping both * characters? Thanks, Luke - Original Message - From: Luke Shannon [EMAIL PROTECTED] To: Lucene Users

Re: Problem searching Field.Keyword field

2005-02-10 Thread Luke Shannon
Are there any issues with having a bunch of boolean queries and then adding them to one big BooleanQuery (making them all required)? Or should I be looking at Query.combine()? Thanks, Luke - Original Message - From: Erik Hatcher [EMAIL PROTECTED] To: Lucene Users List

RangeQuery With Date

2005-02-07 Thread Luke Shannon
Hi; I am working on a set of queries that allow you to find modification dates before, after and equal to a given date. Here are some of the before queries I have been playing with. I want a query that pulls up documents modified before Nov 11 2004: Query query = new RangeQuery(null, new

Re: RangeQuery With Date

2005-02-07 Thread Luke Shannon
Bingo. Thanks! Luke - Original Message - From: Chris Hostetter [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Monday, February 07, 2005 5:10 PM Subject: Re: RangeQuery With Date : Your dates need to be stored in lexicographical order for the RangeQuery
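The point about lexicographical order can be illustrated without Lucene at all: a range over terms compares strings, so the stored date has to sort as a string the same way it sorts as a date. A minimal sketch, assuming a yyyyMMdd representation (the class SortableDates and its methods are made-up names, and this is not necessarily the format used in the thread):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper showing why a fixed-width, most-significant-first date format matters.
final class SortableDates {
    // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, e.g. 20041111
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.BASIC_ISO_DATE;

    /** Converts a date to a term whose string order matches chronological order. */
    static String toIndexTerm(LocalDate date) {
        return date.format(FORMAT);
    }

    /** Plain string comparison, which is all a term range query has to work with. */
    static boolean isBefore(String term, String upperBound) {
        return term.compareTo(upperBound) < 0;
    }

    public static void main(String[] args) {
        String sept = toIndexTerm(LocalDate.of(2004, 9, 5));
        String nov = toIndexTerm(LocalDate.of(2004, 11, 11));
        System.out.println(sept + " before " + nov + "? " + isBefore(sept, nov));
        // A human-friendly format breaks the ordering: '9' sorts after '1'
        System.out.println("9/5/2004 before 11/11/2004? " + isBefore("9/5/2004", "11/11/2004"));
    }
}
```

The same reasoning applies to any numeric value used in a range: zero-pad it to a fixed width before indexing.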

Re: Starts With x and Ends With x Queries

2005-02-07 Thread Luke Shannon
I implemented this concept for my ends with query. It works very well! - Original Message - From: Chris Hostetter [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Friday, February 04, 2005 9:37 PM Subject: Re: Starts With x and Ends With x Queries : Also

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-04 Thread Luke Shannon
, Luke Shannon [EMAIL PROTECTED] wrote: Hello; I have a query that finds document that contain fields with a specific value. query1 = QueryParser.parse(jpg, kcfileupload, new StandardAnalyzer()); This works well. I would like a query that find documents containing all kcfileupload fields

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-04 Thread Luke Shannon
()); assertEquals(elwood is safe, jakes sensitive info, hits.doc(0).get(keywords)); } } On Thu, 3 Feb 2005 13:04:50 -0500, Luke Shannon [EMAIL PROTECTED] wrote: Hello; I have a query that finds document that contain fields with a specific value. query1 = QueryParser.parse(jpg, kcfileupload, new

Re: Parsing The Query: Every document that doesn't have a field containing x (but still has the field)

2005-02-04 Thread Luke Shannon
Hello; I think Chris's approach might be helpful, but I can't seem to get it to work. So since I am running out of time and I still need to figure out starts with and ends with queries, I have implemented a hacky solution to getting all documents with a kcfileupload field present that does not

Starts With x and Ends With x Queries

2005-02-04 Thread Luke Shannon
Hello; I have these two documents: Textsort:9 Keywordmodified:0e1as4og8 Textprogress_ref:1099927045180 Textname:FutureBrand Testing Textdesc:Demo Textanouncement:We are testing our project Textcategory:Category 1 Textolfaithfull:stillhere Textposter:hello Texturgent:yes Textprovider:Mo

Re: Lock failure recovery

2005-02-03 Thread Luke Shannon
The indexing process is totally synchronized in our system. Thus if an Indexing thread starts up and the index exists, but is locked, I know this to be the only indexing process running so the lock must be from a process that got stopped before it could finish. So right before I begin writing

Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
Hello; I have a query that finds documents that contain fields with a specific value. query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer()); This works well. I would like a query that finds documents containing all kcfileupload fields that don't contain jpg. The example I found

Re: Synonyms Not Showing In The Index

2005-02-03 Thread Luke Shannon
Thanks! I can wait for the release. Luke - Original Message - From: Andrzej Bialecki [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Thursday, February 03, 2005 2:53 PM Subject: Re: Synonyms Not Showing In The Index Andrzej Bialecki wrote: Luke Shannon

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
Ok. I have added the following to every document: doc.add(Field.UnIndexed("olFaithfull", "stillHere")); The plan is a query that says: olFaithull = stillHere and kcfileupload!=jpg. I have been experimenting with the MultiFieldQueryParser, but this is not working out for me. From a syntax how is this

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
Hello, Still working on the same query, here is the code I am currently working with. I am thinking this should bring up all the documents that have olFaithFull=stillHere and kcfileupload!=jpg (so anything else) query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer()); query2 =

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
Yes. There should be 119 with stillHere, and if I run a query in Luke on kcfileupload = ppt, it returns one result. I am thinking I should at least get this result back with: -kcfileupload:jpg +olFaithFull:stillhere? Luke - Original Message - From: Maik Schreiber [EMAIL PROTECTED] To:

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
I did. I have run both queries in Luke. kcfileupload:ppt returns 1 olFaithfull:stillhere returns 119 Luke - Original Message - From: Maik Schreiber [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent: Thursday, February 03, 2005 4:55 PM Subject: Re: Parsing

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
This works: query1 = QueryParser.parse("jpg", "kcfileupload", new StandardAnalyzer()); query2 = QueryParser.parse("stillHere", "olFaithFull", new StandardAnalyzer()); BooleanQuery typeNegativeSearch = new BooleanQuery(); typeNegativeSearch.add(query1, false, false); typeNegativeSearch.add(query2, false,

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
out is case-sensitivity. Does your olFaithFull field contain stillHere or stillhere? --Leto -Original Message- From: Luke Shannon [mailto:[EMAIL PROTECTED] This works: query1 = QueryParser.parse(jpg, kcfileupload, new StandardAnalyzer()); query2 = QueryParser.parse(stillHere

Re: Parsing The Query: Every document that doesn't have a field containing x

2005-02-03 Thread Luke Shannon
in that field. It depends on how you built the index (index and stored fields are different), but I would check on that. Also maybe try out TermQuery and see if that does anything for you. -Original Message- From: Luke Shannon [mailto:[EMAIL PROTECTED] Sent: Friday, 4 February 2005

Re: which HTML parser is better?

2005-02-02 Thread Luke Shannon
In our application I use regular expressions to strip all tags in one situation and specific ones in another situation. Here is sample code for both: This strips all html 4.0 tags except p, ul, br, li, strong, em, u: html_source =
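The sample code in this message is cut off, so what follows is not the original regular expressions, only a sketch of the same two situations: one pattern that strips every tag, and one that strips every tag except p, ul, br, li, strong, em and u. TagStripper and both patterns are my own illustration. Regexes like these are fine for tidy markup but are not an HTML parser (comments, scripts and attribute values containing > will trip them up).

```java
import java.util.regex.Pattern;

// Hypothetical helper: regex-based tag stripping for well-formed, simple markup.
final class TagStripper {
    // Any opening or closing tag
    private static final Pattern ALL_TAGS = Pattern.compile("</?[a-zA-Z][^>]*>");

    // Any tag whose name is NOT one of the allowed ones. The negative lookahead
    // rejects allowed names; \b stops "p" from also protecting "pre" or "param".
    private static final Pattern DISALLOWED_TAGS =
            Pattern.compile("(?i)</?(?!(?:p|ul|br|li|strong|em|u)\\b)[a-z][^>]*>");

    /** Removes every tag, leaving only the text between them. */
    static String stripAll(String html) {
        return ALL_TAGS.matcher(html).replaceAll("");
    }

    /** Removes every tag except p, ul, br, li, strong, em and u. */
    static String stripDisallowed(String html) {
        return DISALLOWED_TAGS.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<div><p>Hi <b>there</b><br/></p></div>";
        System.out.println(stripAll(html));
        System.out.println(stripDisallowed(html));
    }
}
```

For indexing, stripAll is usually what you want; stripDisallowed is more useful when the cleaned text is shown back to users.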

QueryParser Help

2005-02-02 Thread Luke Shannon
Hello; Getting squinted with Query Parsing. I have a question: Query query = MultiFieldQueryParser .parse("mario", new String[] { "name", "desc" }, new int[] { MultiFieldQueryParser.NORMAL_FIELD, MultiFieldQueryParser.NORMAL_FIELD }, new StandardAnalyzer());

Re: QueryParser Help

2005-02-02 Thread Luke Shannon
This is it. Thanks, Maik. One of the docs had the result in both name and desc. Not sure how to handle this yet, I still don't know enough about QueryParsing. Luke - Original Message - From: Maik Schreiber [EMAIL PROTECTED] To: Lucene Users List lucene-user@jakarta.apache.org Sent:

Re: QueryParser Help

2005-02-02 Thread Luke Shannon
Actually now that I am looking at it, I think I am already accomplishing it. I wanted all the documents with Mario in either field to show up. There are two, but one has them in both fields in the Document. This is correct. Thanks for the help. It would have taken me a while to catch that.

Synonyms Not Showing In The Index

2005-02-02 Thread Luke Shannon
Hello; It seems my Synonym analyzer is working (based on some successful queries). But I can't see the synonyms in the index using Luke. Is this correct? Thanks, Luke

Re: How to get document count?

2005-02-01 Thread Luke Shannon
Not sure if the API provides a method for this, but you could use Luke: http://www.getopt.org/luke/ It gives you a count and lets you step through each Doc looking at their fields. - Original Message - From: Jim Lynch [EMAIL PROTECTED] To: Lucene Users List

Combining Documents

2005-02-01 Thread Luke Shannon
Hello; I have a situation where I need to combine the fields returned from one document to an existing document. Is there something in the API for this that I'm missing or is this the best way: //add the fields contained in the PDF document to the existing doc Document Document attachedDoc =

Boosting Questions

2005-01-27 Thread Luke Shannon
Hi All; I just want to make sure I have the right idea about boosting. So if I boost a document (Document A) after I index it (let's say a score of 2.0) Lucene will now consider this document relatively more important than other documents in the index with a boost factor less than 2.0. This boost

Re: Boosting Questions

2005-01-27 Thread Luke Shannon
the Explanation class, which can dump all scoring factors in text or HTML format. Otis --- Luke Shannon [EMAIL PROTECTED] wrote: Hi All; I just want to make sure I have the right idea about boosting. So if I boost a document (Document A) after I index it (lets say a score

Getting Into Search

2005-01-26 Thread Luke Shannon
Hello; My Lucene application has been performing well in our company's CMS application. The plan now is to offer advanced searching. I just bought the eBook version of Lucene in Action to help with my research (it is taking Amazon forever to ship the printed version to Canada). The book looks

Re: Getting Into Search

2005-01-26 Thread Luke Shannon
the section name and page number, so you can quickly locate this stuff in your ebook. Otis P.S. Do you know if Indigo/Chapters has Lucene in Action on their book shelves yet? --- Luke Shannon [EMAIL PROTECTED] wrote: Hello; My lucene application has been performing well in our

Re: what if the IndexReader crashes, after delete, before close.

2005-01-10 Thread Luke Shannon
One thing that will happen is the lock file will get left behind. This means when you start back up and try to create another Reader you will get a file lock error. Our system is threaded and synchronized. Thus when a Reader is being created I know it is the only one (the Writer comes after the

Re: questions

2005-01-07 Thread Luke Shannon
Hello Jac; If you have verified that the index folder is indeed being created and there is a segments file in it, check that the IndexSearcher in the demo is pointing to that location. This is an easy error to make and would account for the error message no segments folder. Luke -

Re: Check to see if index is optimized

2005-01-07 Thread Luke Shannon
This may not be a simple way, but you could just do a quick check on the folder to see if there is more than one file containing the name segment. Luke - Original Message - From: Crump, Michael [EMAIL PROTECTED] To: lucene-user@jakarta.apache.org Sent: Friday, January 07, 2005 2:24 PM

Re: Deleting an index

2005-01-04 Thread Luke Shannon
If you opened an IndexReader, has it also been closed before you attempt to delete? - Original Message - From: Scott Smith [EMAIL PROTECTED] To: lucene-user@jakarta.apache.org Sent: Monday, January 03, 2005 7:39 PM Subject: Deleting an index I'm writing some junit tests for my

Re: Problems...

2005-01-04 Thread Luke Shannon
I had a similar situation with the same problem. I found the previous system was creating all the objects (including the Searcher) and then updating the Index. The result was the Searcher was not able to find any of the data just added to the Index. The solution for me was to move the creation

Re: how to create a long lasting unique key?

2005-01-04 Thread Luke Shannon
This is taken from the example code written by Doug Cutting that ships with Lucene. It is the key our system uses. It also comes in handy when incrementally updating. Luke public static String uid(File f) { // Append path and date into a string in such a way that lexicographic // sorting
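The quoted method is cut off here, so the following is not the demo's code, just a standalone sketch of the idea its comment describes: join the path and the modification time into one string so that sorting the keys lexicographically walks the file hierarchy, with the time encoded at a fixed width so newer versions of the same file sort later. DocumentUid, encodeTime and the 9-character base-36 width are assumptions of this sketch; the Lucene demo uses its own date encoding helper.

```java
// Hypothetical stand-in for a path + timestamp document key.
final class DocumentUid {
    // Assumed width: 9 base-36 digits cover millisecond timestamps for thousands of years
    private static final int TIME_WIDTH = 9;

    /** Encodes a non-negative millisecond timestamp as a zero-padded base-36 string. */
    static String encodeTime(long millis) {
        if (millis < 0) {
            throw new IllegalArgumentException("negative timestamp: " + millis);
        }
        String digits = Long.toString(millis, Character.MAX_RADIX);
        if (digits.length() > TIME_WIDTH) {
            throw new IllegalArgumentException("timestamp too large: " + millis);
        }
        StringBuilder sb = new StringBuilder(TIME_WIDTH);
        for (int i = digits.length(); i < TIME_WIDTH; i++) {
            sb.append('0'); // left-pad so string order matches numeric order
        }
        return sb.append(digits).toString();
    }

    /** Builds the key: NUL replaces directory separators and also separates path from time. */
    static String uid(String path, char dirSep, long lastModified) {
        return path.replace(dirSep, '\u0000') + '\u0000' + encodeTime(lastModified);
    }

    public static void main(String[] args) {
        String key = uid("docs/a.txt", '/', 1000L);
        // Print with the NUL characters made visible
        System.out.println(key.replace('\u0000', '|'));
    }
}
```

With a real file you would pass f.getPath(), File.separatorChar and f.lastModified(). A changed file gets a new key, which is what makes the key handy for incremental updates: a key in the index with no matching file on disk marks a stale document.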

Re: Lucene in Action e-book now available!

2004-12-10 Thread Luke Shannon
Nice Work! Congratulations Guys. - Original Message - From: Erik Hatcher [EMAIL PROTECTED] To: Lucene User [EMAIL PROTECTED]; Lucene List [EMAIL PROTECTED] Sent: Friday, December 10, 2004 3:52 AM Subject: Lucene in Action e-book now available! The Lucene in Action e-book is now

Re: LIMO problems

2004-12-09 Thread Luke Shannon
I use Luke. It is pretty good. http://www.getopt.org/luke/ Luke - Original Message - From: Daniel Cortes [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, December 09, 2004 8:32 AM Subject: LIMO problems Hi, I'm tying Limo (Index Monitor of Lucene) and I have a problem,

Re: Read locks on indexes

2004-12-07 Thread Luke Shannon
I think the read locks are preventing you from deleting from the index with your reader and writing to the index with a writer at the same time. If you never use a writer then I guess you don't need to worry about this. But how do you create the indexes? Luke - Original Message -

Weird Behavior On Windows

2004-12-07 Thread Luke Shannon
Hello All; Things have been running smoothly on Linux for some time. We set up a version of the site on a Win2K machine, this is when all the fun started. A pdf would be added to the system. The indexer would run, find the new file, index it and successfully complete the update of the index

Re: Weird Behavior On Windows

2004-12-07 Thread Luke Shannon
there be logic in the flaw (swap that), or could you be catching an Exception that is thrown only on Winblows due to Windows not letting you do certain things with referenced files and dirs? Otis --- Luke Shannon [EMAIL PROTECTED] wrote: Hello All; Things have been running smoothly

Re: Weird Behavior On Windows

2004-12-07 Thread Luke Shannon
be catching an Exception that is thrown only on Winblows due to Windows not letting you do certain things with referenced files and dirs? Otis --- Luke Shannon [EMAIL PROTECTED] wrote: Hello All; Things have been running smoothly on Linux for sometime. We set up a version of the site

Re: PDF Indexing Error

2004-12-03 Thread Luke Shannon
This error is because of security settings that have been applied to the PDF document which disallow text extraction. Not sure why you would all of a sudden get this error, unless you upgraded recently. Older versions of PDFBox did not fully support PDF security. Ben On Thu, 2 Dec 2004, Luke

PDF Indexing Error

2004-12-02 Thread Luke Shannon
Hello All; Perhaps this should be on the PDFBox forum but I was curious if anyone has seen this error parsing PDF documents using packages other than PDFBox. /usr/tomcat/fb_hub/GM/Administration/Document/java/java_io.pdf java.io.IOException: You do not have permission to extract text The weird

Re: How much time indexing doc ??

2004-11-22 Thread Luke Shannon
PDF(s) can definitely slow things down, depending on their size. If there are a few larger PDF documents that time is definitely possible. Luke - Original Message - From: Miguel Angel [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Saturday, November 20, 2004 11:25 AM Subject: How much

Re: Optimized??

2004-11-22 Thread Luke Shannon
As I understand it optimization is when you merge several segments into one allowing for faster queries. The FAQs and API have further details. http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexingtoc=faq#q24 Luke - Original Message - From: Miguel Angel [EMAIL

False Locking Conflict?

2004-11-19 Thread Luke Shannon
Hey All; Is it possible for there to be a situation where the locking file is in place after the reader has been closed? I have extra logging in place and have followed the code execution. The reader finishes deleting old content and closes (I know this for sure). This is the only reader

Re: urgent help needed

2004-11-18 Thread Luke Shannon
These are the ones I think. They were the first things I read on Lucene and were very helpful. http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html http://www.onjava.com/pub/a/onjava/2003/01/15/lucene.html - Original Message - From: Neelam Bhatnagar [EMAIL PROTECTED] To: Otis

PDF Index Time

2004-11-18 Thread Luke Shannon
Hi; I am using the PDFBox's getLuceneDocument method to parse my PDF documents. It returns good results and was very easy to integrate into the project. However it is slow. Does anyone know of a faster package? Someone mentioned snowtide on an earlier post. Anyone have experience with this

Re: PDF Index Time

2004-11-18 Thread Luke Shannon
a trial download and the docs show that it should be just as easy to integrate as PDFBox is. They list pricings on there site as well, which is nice that it is not hidden as some software companies do. Ben On Thu, 18 Nov 2004, Luke Shannon wrote: Hi; I am using the PDFBox's

Re: version documents

2004-11-18 Thread Luke Shannon
Thank you for the suggestion. I ended up biting the bullet and re-working my indexing logic. Luckily the system itself knows what the current version of a document is (otherwise it won't know which one to display to the user) for any given folder. I was able to get a static method I could call

Re: DOC, PPT index???

2004-11-18 Thread Luke Shannon
Check out: http://jakarta.apache.org/poi/ - Original Message - From: Miguel Angel [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, November 18, 2004 4:49 PM Subject: DOC, PPT index??? Hi !!! Lucene can index the files (do, ppt the MS OFFICE ??) How do you can this index

Re: tool to check the index field

2004-11-17 Thread Luke Shannon
Try this: http://www.getopt.org/luke/ Luke - Original Message - From: lingaraju [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Wednesday, November 17, 2004 10:00 AM Subject: tool to check the index field HI ALL I am having index file created by other people Now

version documents

2004-11-17 Thread Luke Shannon
Hey all; I have run into an interesting case. Our system has notes. These need to be indexed. They are xml files called default.xml and are easily parsed and indexed. No problem, have been doing it all week. The problem is if someone edits the note, the system doesn't update the default.xml.

Re: version documents

2004-11-17 Thread Luke Shannon
your query by version descending, and only use the first basefile you encounter. On Wed, 17 Nov 2004 15:05:19 -0500, Luke Shannon [EMAIL PROTECTED] wrote: Hey all; I have ran into an interesting case. Our system has notes. These need to be indexed. They are xml files called default.xml

Re: IndexSearcher Refresh

2004-11-16 Thread Luke Shannon
It would be nice if the IndexSearcher contained a method that could return the last modified date of the index folder it was created with. This would make it easier to know when you need to create a new Searcher. - Original Message - From: Otis Gospodnetic [EMAIL PROTECTED] To: Lucene

Re: how do you work with PDF

2004-11-16 Thread Luke Shannon
www.pdfbox.org Once you get the package installed the code you can use is: Document doc = LucenePDFDocument.getDocument(file); writer.addDocument(doc); This method returns the PDF in Lucene document format. Luke - Original Message - From: Miguel Angel [EMAIL PROTECTED] To:

Re: IndexSearcher Refresh

2004-11-16 Thread Luke Shannon
/IndexReader.html#getCurrentVersion(org.apache.lucene.store.Directory) Otis --- Luke Shannon [EMAIL PROTECTED] wrote: It would nice if the IndexerSearcher contained a method that could return the last modified date of the index folder it was created with. This would make it easier
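One way to use a version number like the one IndexReader.getCurrentVersion(Directory) returns is to keep a single searcher around and only reopen it when the version has changed. The sketch below shows just that caching pattern in plain Java, so it runs without Lucene; VersionedCache and its two suppliers are hypothetical, and in a real application the factory would open an IndexSearcher and the old one would need to be closed once in-flight searches finish.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical holder: rebuilds the cached object only when the observed version changes.
final class VersionedCache<T> {
    private final LongSupplier currentVersion; // e.g. a wrapper around the index version lookup
    private final Supplier<T> factory;         // e.g. code that opens a new searcher
    private long cachedVersion;
    private T cached;

    VersionedCache(LongSupplier currentVersion, Supplier<T> factory) {
        this.currentVersion = currentVersion;
        this.factory = factory;
    }

    /** Returns the cached instance, recreating it if the version has moved on. */
    synchronized T get() {
        long version = currentVersion.getAsLong();
        if (cached == null || version != cachedVersion) {
            cached = factory.get(); // assumption: the factory never returns null
            cachedVersion = version;
        }
        return cached;
    }
}
```

This avoids both extremes discussed in these threads: a searcher created once per session never sees new documents, while a searcher created per request pays the open cost every time.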

_4c.fnm missing

2004-11-16 Thread Luke Shannon
I received the error below when I was attempting to overwhelm my system with incremental update requests. What is this file it is looking for? I checked the index. It contains: _4c.del _4d.cfs deletable segments Where does _4c.fnm come from? Here is the error: Unable to create the create

Re: _4c.fnm missing

2004-11-16 Thread Luke Shannon
index files described at the above URL). Maybe you can provide the code that causes this error in Bugzilla for somebody to look at. Does it consistently break? Otis --- Luke Shannon [EMAIL PROTECTED] wrote: I received the error below when I was attempting to over whelm my system

Re: _4c.fnm missing

2004-11-16 Thread Luke Shannon
of doing to overwhelm Lucene. What's your update schedule, how big is the index, and after how many updates does the system crash? Nader Henein Luke Shannon wrote: It conistantly breaks when I run more than 10 concurrent incremental updates. I can post the code on Bugzilla (hopefully when

Re: _4c.fnm missing

2004-11-16 Thread Luke Shannon
what kind of increments are we talking about it takes a bit of doing to overwhelm Lucene. What's your update schedule, how big is the index, and after how many updates does the system crash? Nader Henein Luke Shannon wrote: It conistantly breaks when I run more than 10 concurrent

Re: _4c.fnm missing

2004-11-16 Thread Luke Shannon
: 'Concurrent' and 'updates' in the same sentence sounds like a possible source of the problem. You have to use a single IndexWriter and it should not overlap with an IndexReader that is doing deletes. Otis --- Luke Shannon [EMAIL PROTECTED] wrote: It conistantly breaks when I run
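The advice above (one writer at a time, never overlapping with a reader that is deleting) comes down to serializing the whole update. A minimal sketch of that, using a plain lock rather than any Lucene class; SerializedIndexUpdater and the two Runnable phases are made-up names standing in for "delete stale documents with the reader" and "add new documents with the writer":

```java
// Hypothetical wrapper: guarantees that delete and add phases never interleave across threads.
final class SerializedIndexUpdater {
    private final Object lock = new Object();
    private int completedUpdates;

    /** Runs one full incremental update; concurrent callers wait their turn. */
    void update(Runnable deletePhase, Runnable addPhase) {
        synchronized (lock) {
            deletePhase.run(); // reader-based deletes finish (and the reader closes) first
            addPhase.run();    // only then is the writer opened and used
            completedUpdates++;
        }
    }

    /** Number of updates that have run to completion. */
    int completedUpdates() {
        synchronized (lock) {
            return completedUpdates;
        }
    }

    public static void main(String[] args) {
        SerializedIndexUpdater updater = new SerializedIndexUpdater();
        updater.update(() -> System.out.println("deleting stale docs"),
                       () -> System.out.println("adding new docs"));
        System.out.println("updates: " + updater.completedUpdates());
    }
}
```

This only protects threads inside one JVM. If several processes update the same index, they still depend on Lucene's own lock files.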

Index Locking Issues Resolved...I hope

2004-11-16 Thread Luke Shannon
sounds like a possible source of the problem. You have to use a single IndexWriter and it should not overlap with an IndexReader that is doing deletes. Otis --- Luke Shannon [EMAIL PROTECTED] wrote: It conistantly breaks when I run more than 10 concurrent incremental updates. I can

Parsing .ppt

2004-11-15 Thread Luke Shannon
Hey All; Anyone know a good API for parsing MS powerpoint files? Luke

Re: Index File

2004-11-15 Thread Luke Shannon
Based on Otis's suggestion I was able to resolve this issue. The class I was integrating with for search created one IndexSearcher when it was instantiated and kept that same reference throughout the session. Once this was modified to create a new IndexSearcher for every search request, all my

Re: Index File

2004-11-15 Thread Luke Shannon
] Sent: Monday, November 15, 2004 11:03 AM Subject: Re: Index File On Mon, 2004-11-15 at 09:52, Luke Shannon wrote: Once this was modified to create a new IndexerSearch for every search request, all my problems went away. Be careful with this. You could conceivably run out of file handles

Re: Index File

2004-11-15 Thread Luke Shannon
Hi Luke; I implemented the logging like you said. At present I am spending about 678 milliseconds creating a new IndexSearcher. I am going to implement your scheme to resolve this but at a later point since I don't think this is a huge time factor to be worried about at present. Thanks for all

Re: Incremental Indexing

2004-11-15 Thread Luke Shannon
Take a peek at IndexHTML.java in the demo that ships with Lucene. It performs an incremental update as you have described. - Original Message - From: Hetan Shah [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Monday, November 15, 2004 4:45 PM Subject: Incremental Indexing

Re: Is opening IndexReader multiple times safe?

2004-11-15 Thread Luke Shannon
Hi Satoshi; I troubleshot a problem similar to this by moving around an IndexReader.isLocked(indexFileLocation) call to determine exactly when the reader was closed. Note: the method throws an error if the index file you are checking doesn't exist. Luke

Re: HTMLParser.getReader returning null

2004-11-12 Thread Luke Shannon
me know if you need anything else. L - Original Message - From: sergiu gordea [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Friday, November 12, 2004 3:39 AM Subject: Re: HTMLParser.getReader returning null Luke Shannon wrote: Hi, May I ask you which library you are using

Re: HTMLParser.getReader returning null

2004-11-12 Thread Luke Shannon
the document String itself, so that's probably the origin of the confusion. -Original Message- From: Luke Shannon [mailto:[EMAIL PROTECTED] Sent: donderdag 11 november 2004 20:17 To: Lucene Users List Subject: HTMLParser.getReader returning null Hello; Things were working fine. I

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Shannon
. I am curious, though, how many people on this list are using Lucene in the incremental update case. Most examples I've seen all assume batch indexing. Regards, Luke Francl On Thu, 2004-11-11 at 18:33, Luke Shannon wrote: Synchronizing the method didn't seem to help. The lock is being

Re: Lucene : avoiding locking

2004-11-12 Thread Luke Shannon
(); } } } - Original Message - From: Otis Gospodnetic [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 12, 2004 11:03 AM Subject: Re: Lucene : avoiding locking Hello, --- Luke Shannon [EMAIL PROTECTED] wrote: Currently I am experimenting

Index File

2004-11-12 Thread Luke Shannon
Hi; Is there some way to determine if specific contents are in the index folder other than running a query against it? I see that my document is being indexed. But when I run a query against the index I get no results returned. The weird thing is if I restart Tomcat and run the search again

Re: Acedemic Question About Indexing

2004-11-11 Thread Luke Shannon
it takes to index by about 5X. -Original Message- From: Luke Shannon [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 10, 2004 2:39 PM To: Lucene Users List Subject: Re: Acedemic Question About Indexing Don't worry, regardless of what I learn in this forum I am telling my company

HTMLParser.getReader returning null

2004-11-11 Thread Luke Shannon
Hello; Things were working fine. I have been re-organizing my code to drop into QA when I noticed I was no longer getting search results for my HTML files. When I checked things out I confirmed I was still creating the Documents but realized no content was being indexed. HTMLParser parser = new

Lucene : avoiding locking

2004-11-11 Thread Luke Shannon
:-). The traditional approach is to put a prefix on your subject line -- for commons package foo it would be: [foo] avoiding locking It's also generally helpful to see the entire stack trace, not just the exception message itself. Craig On Thu, 11 Nov 2004 17:27:19 -0500, Luke Shannon

Re: Lucene : avoiding locking

2004-11-11 Thread Luke Shannon
is occuring at a time. Synchronizing that method should do it. --- Luke Shannon [EMAIL PROTECTED] wrote: Hi All; I have hit a snag in my Lucene integration and don't know what to do. My company has a content management product. Each time someone changes the directory structure

Re: Lucene : avoiding locking

2004-11-11 Thread Luke Shannon
PROTECTED] Sent: Thursday, November 11, 2004 6:56 PM Subject: Re: Lucene : avoiding locking I'm working on a similar project... Make sure that only one call to the index method is occuring at a time. Synchronizing that method should do it. --- Luke Shannon [EMAIL PROTECTED] wrote: Hi All; I

Indexing MS Files

2004-11-10 Thread Luke Shannon
I need to index Word, Excel and Power Point files. Is this the place to start? http://jakarta.apache.org/poi/ Is there something better? Thanks, Luke

Re: Indexing MS Files

2004-11-10 Thread Luke Shannon
--- Luke Shannon [EMAIL PROTECTED] wrote: I need to index Word, Excel and Power Point files. Is this the place to start? http://jakarta.apache.org/poi/ Is there something better? Thanks, Luke - To unsubscribe, e

Re: Indexing MS Files

2004-11-10 Thread Luke Shannon
, you should be able to get it for your grandma this Christmas. Otis --- Luke Shannon [EMAIL PROTECTED] wrote: Thanks Otis. I am looking forward to this book. Any idea when it may be released? - Original Message - From: Otis Gospodnetic [EMAIL PROTECTED] To: Lucene Users

Acedemic Question About Indexing

2004-11-10 Thread Luke Shannon
I am working on debugging an existing Lucene implementation. Before I started, I built a demo to understand Lucene. In my demo I indexed the entire content hierarchy all at once, and then optimized this index and used it for queries. It was time consuming but very simple. The code I am currently
