Re: UNIX command-line indexing script?

2004-03-15 Thread Erik Hatcher
Have a look at the Ant index task in the Lucene sandbox.  You're on 
your own, currently, to build this and understand it, but I use it 
frequently.  In fact, the sample index from our book is generated with 
this:

<index index="${build.dir}/index"
       documenthandler="lia.common.TestDataDocumentHandler">
  <fileset dir="${data.dir}"/>
  <config basedir="${data.dir}"/>
</index>

You can plug in your own DocumentHandler implementation to index 
different document types however you like.  The default one indexes 
.txt and .html files, but a custom implementation can do its own thing. 
 Again, writing a DocumentHandler that knows about various document 
types is not hard, but you will have to write your own at the moment.

Despite the (minor) amount of work you'll have to do to start using 
<index>, the infrastructure adds a lot of value: an incremental file 
system indexer (only new docs get indexed on successive runs).  
Plugging this into cron would be trivial.

	Erik
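
The "only new docs get indexed on successive runs" idea can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the sandbox task's actual logic: the class name IncrementalScan, the method changedSince, and the choice of a last-run timestamp as the comparison point are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: not part of Lucene or the sandbox Ant task.
class IncrementalScan {

    /** Returns regular files under root modified strictly after lastRunMillis, sorted by path. */
    static List<Path> changedSince(Path root, long lastRunMillis) throws IOException {
        List<Path> changed = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p)
                        && Files.getLastModifiedTime(p).toMillis() > lastRunMillis) {
                    changed.add(p);
                }
            }
        }
        Collections.sort(changed);
        return changed;
    }

    // Usage: java IncrementalScan <data-dir> <last-run-epoch-millis>
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args[0]);
        long lastRun = Long.parseLong(args[1]);
        for (Path p : changedSince(root, lastRun)) {
            System.out.println(p);   // a real indexer would hand each file to a document handler here
        }
    }
}
```

A cron job could record the time of each run and pass it to the next one, so that each run only touches files changed in between.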

On Mar 13, 2004, at 11:45 AM, Charlie Smith wrote:

Anyone written a simple UNIX command-line indexing script which will 
read a bunch of different kinds of docs and index them?  I'd like to 
make a cron job out of this so as to be able to come back and read it 
later during a search.

PERL or JAVA script would be fine.




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Re: Reader Text input as field for HTML data text leading to null retrieval

2004-03-15 Thread Otis Gospodnetic
Re-directing this message to lucene-user list.

That is the correct behaviour.
Use
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/document/Field.html#Text(java.lang.String,%20java.lang.String)
if you want to be able to retrieve the original value of the indexed
text.

Otis
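
The difference between a text field that is indexed and stored and one that is only indexed can be modelled with plain Java collections. This is a toy sketch, not Lucene code: ToyDocument and its methods are hypothetical names, and the tokenizer is a crude split on non-word characters.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: not Lucene code. All names are hypothetical.
class ToyDocument {
    private final Map<String, String> stored = new HashMap<>();
    private final Map<String, Set<String>> terms = new HashMap<>();

    /** Indexes the text and also keeps the original value (like a String-valued text field). */
    void addStoredText(String field, String value) {
        stored.put(field, value);
        index(field, value);
    }

    /** Indexes the text but discards the original value (like a Reader-valued text field). */
    void addUnstoredText(String field, String value) {
        index(field, value);
    }

    private void index(String field, String value) {
        Set<String> set = terms.computeIfAbsent(field, k -> new HashSet<>());
        for (String token : value.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                set.add(token);
            }
        }
    }

    /** Returns the stored value, or null if the field was indexed without being stored. */
    String get(String field) {
        return stored.get(field);
    }

    /** True if the field's indexed terms contain the given term (case-insensitive). */
    boolean matches(String field, String term) {
        Set<String> set = terms.get(field);
        return set != null && set.contains(term.toLowerCase());
    }
}
```

In this model an unstored field is still searchable through matches(), but get() returns null for it, which mirrors the null retrieval described in the question; storing the value is also what makes the stored variant larger.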

--- jitender ahuja [EMAIL PROTECTED] wrote:
 I am working to make an index using Lucene over HTML files. I intend
 to use the Reader as the type of the text field so as to not store
 the HTML files verbatim in the index. But the data retrieval yields
 null as the text retrieved.
 
 However, if I do not use the Reader class as the Text field type,
 then I get the whole file back. Also, the index directory is then
 nearly four times larger.
 
 The indexer code that deals with the Reader data type is:
 
    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.StringReader;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class IndexData {

        protected static final String INDEX_FOLDER = "C:\\Temp\\DB_GT11";

        public static void main(String[] args) {
            try {
                IndexData objDBdex = new IndexData();
                boolean createDex = !objDBdex.indexExists();

                IndexWriter writ = new IndexWriter(INDEX_FOLDER,
                        new StandardAnalyzer(), createDex);

                for (int i = 0; i < args.length; i++) {
                    System.out.println("Indexing File " + args[i]);
                    InputStream is = new FileInputStream(args[i]);

                    Document doc = new Document();
                    // Stored but not indexed: the path is returned with each hit.
                    doc.add(Field.UnIndexed("path", args[i]));

                    BufferedReader rdr =
                            new BufferedReader(new InputStreamReader(is));
                    StringBuffer fileBuffer = new StringBuffer();
                    String line;
                    while ((line = rdr.readLine()) != null) {
                        fileBuffer.append(line);
                    }

                    System.out.println("File contents from buffer: ");
                    System.out.println(fileBuffer.toString());

                    StringReader ab = new StringReader(fileBuffer.toString());
                    // Reader-valued field: tokenized and indexed, but not stored.
                    doc.add(Field.Text("body", ab));

                    writ.addDocument(doc);
                    is.close();
                }
                writ.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }

        public boolean indexExists() {
            return false;
        }
    }
 
 
 





Re: UNIX command-line indexing script?

2004-03-15 Thread Otis Gospodnetic
To add to this: the upcoming Lucene in Action book has ready-to-use
code that will handle and index files in most popular file formats.

Otis




Re: UNIX command-line indexing script?

2004-03-15 Thread Charlie Smith
So, how upcoming is this book going to be?




java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe

I am using Lucene 1.3 final and am having an error
that I can't seem to shake. Basically, I am updating a
Document in the index incrementally by calling an
IndexReader to remove the document. This works. Then,
I close the IndexReader with the following code:

reader.unlock(reader.directory());
reader.close();

I put the first of the two lines in to try to force
the lock to be released. According to the logging, this
code is being called and the IndexReader is being
closed.

However, when I then open a writer to add the document, I
get the following:

java.io.IOException: Lock obtain timed out
    at org.apache.lucene.store.Lock.obtain(Lock.java:97)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:173)
at 

...

I open the writer by calling:
return new IndexWriter(INDEX_DIR, analyzer, false);

where analyzer=new StandardAnalyzer();

I get the reader by calling:
IndexReader reader=IndexReader.open(INDEX_DIR);

Thanks for any help,
Gabe




Re: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Otis Gospodnetic
There is no need for that .unlock call, just .close()

Otis




Re: UNIX command-line indexing script?

2004-03-15 Thread Otis Gospodnetic
Erik and I are putting finishing touches on it, so by Summer (this one
;)).

Otis




Re: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe

Otis,

I only put the unlock call in because I had the error
in the first place. Removing it, the IOException still
occurs, when trying to instantiate the IndexWriter.

Thanks,
Gabe




RE: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Nguyen, Tri (NIH/NLM/LHC)
Did you close your writer if an Exception occurred?

I had a similar problem, but it was fixed when I closed the writer in the
finally block.

Below is my original code (which generated java.io.IOException: Lock obtain
timed out when an Exception was thrown):

public static void index(File indexDir, List cList, boolean ow)
    throws Exception {
  IndexWriter writer = null;
  try {
    writer = new IndexWriter(indexDir, new MyAnalyzer(), ow);
    // index documents
  }
  catch (Exception e) {
    writer = new IndexWriter(indexDir, new MyAnalyzer(), true);
    try {
      // index documents
    }
    catch (Exception ee) {
      throw ee;
    }
  }
  writer.close();  // never reached when an exception propagates out of the catch block
}


// revised code to force a close on the IndexWriter
public static void index(File indexDir, List cList, boolean ow)
    throws Exception {
  IndexWriter writer = null;
  try {
    writer = new IndexWriter(indexDir, new MyAnalyzer(), ow);
    // index documents
    writer.close();
  }
  catch (Exception e) {
    writer = new IndexWriter(indexDir, new MyAnalyzer(), true);
    try {
      // index documents
    }
    catch (Exception ee) {
      throw ee;
    }
    finally {
      writer.close();
    }
  }
}
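
The close-in-finally idea can be shown in isolation with a stand-in class. This is a sketch under stated assumptions, not Lucene's locking code: LockingWriter is a hypothetical class that creates a lock file when opened and deletes it on close, and it fails immediately on a held lock rather than timing out.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an index writer that holds a lock file while open.
class LockingWriter implements AutoCloseable {
    private final Path lock;

    LockingWriter(Path dir) throws IOException {
        lock = dir.resolve("write.lock");
        try {
            Files.createFile(lock);            // fails if another writer is still open
        } catch (FileAlreadyExistsException e) {
            throw new IOException("Lock obtain failed: " + lock);
        }
    }

    /** Pretends to index a document; a null document simulates an indexing failure. */
    void add(String doc) throws IOException {
        if (doc == null) {
            throw new IOException("bad document");
        }
    }

    @Override
    public void close() throws IOException {
        Files.deleteIfExists(lock);            // releases the lock
    }

    /** Adds all docs; the finally block releases the lock even if add() throws. */
    static void indexAll(Path dir, String[] docs) throws IOException {
        LockingWriter writer = new LockingWriter(dir);
        try {
            for (String d : docs) {
                writer.add(d);
            }
        } finally {
            writer.close();
        }
    }
}
```

With this shape, a failed run still leaves the directory unlocked, so the next writer can be opened; a writer that is opened and never closed keeps the lock and blocks every later attempt.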






RE: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe

I notice that in your catch clause you always pass true
for the create flag (i.e. new IndexWriter(INDEX_DIR,
analyzer, true)).

If I am not mistaken in reading the docs, this overwrites
the entire index, no? That is why I was setting that
flag to false when doing an incremental update.
When I reindex all documents, I have no problem.

Gabe




RE: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe

I figured it out: an errant open IndexWriter.





Can lucene index both Big5 and GB2312 encoding character?

2004-03-15 Thread Tuan Jean Tee
If I have both Big5 and GB2312 encoded HTML files in two separate
directories, is Lucene able to distinguish the character sets when I
build the index? Or does Lucene only work with a single encoding?

Thank you.

