Re: JSP Parser class wanted

2002-02-24 Thread Chris Opler

Hi,

this is a great tool to retrieve and scrape html pages (rendered or not)...

http://www.research.compaq.com/SRC/WebL/

:-)

Chris Opler

w i l l i a m__b o y d wrote:

   If they're mostly static, why not just code a little crawler to
  request the pages via the web-server and parse the rendered HTML?
 

 right then. i've added that onto my list of things to do. immediately after
 meet project deadline and ...learning javacc and lucene inside and
 out... ;¬) if anyone has such code they're willing to contribute i would
 put it to good use.

 - Original Message -
 From: Steven J. Owens [EMAIL PROTECTED]
 To: Lucene Users List [EMAIL PROTECTED]; w i l l i a m__b o y d [EMAIL PROTECTED]
 Sent: Sunday, February 24, 2002 1:25 AM
 Subject: Re: JSP Parser class wanted

  w i l l i a m__b o y d [EMAIL PROTECTED] writes:
 
   i have had some success in solving my problem. mind you, it is a
   hack; a quick fix. it may or may not work for everyone. also the jsp
   pages i am indexing/searching have very little dynamically generated
   content. they are mostly static.
 
   If they're mostly static, why not just code a little crawler to
  request the pages via the web-server and parse the rendered HTML?
 
  Steven J. Owens
  [EMAIL PROTECTED]

 --
 To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
 For additional commands, e-mail: mailto:[EMAIL PROTECTED]

--
===
http://www.openwine.org
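
For readers following up on the crawler suggestion quoted above, here is a
minimal Java sketch, not code from anyone on this thread. It assumes the JSP
pages are reachable over HTTP, that they are served as UTF-8, and that a crude
regex-based tag stripper is good enough for mostly static pages. The class
name RenderedPageFetcher and both method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: fetch a rendered page and reduce it to plain text. */
final class RenderedPageFetcher {

    /** Downloads the HTML the server rendered for the given URL (UTF-8 assumed). */
    static String fetch(String pageUrl) throws IOException {
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                URI.create(pageUrl).toURL().openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    /** Crude text extraction: drops script/style blocks, then all remaining tags. */
    static String toText(String html) {
        return html
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", " ")
                .replaceAll("(?s)<[^>]*>", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }
}
```

The string returned by toText(fetch(url)) could then be added to a Lucene
document as the page body. A real HTML parser would handle entities and
malformed markup better than these regular expressions do.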







Does Lucene support term highlighting?

2002-02-24 Thread Biswas, Goutam_Kumar


If it does, how do I implement it?

Thanks in advance
-Goutam





RE: Index Locked For Write

2002-02-24 Thread Otis Gospodnetic


--- Howk, Michael [EMAIL PROTECTED] wrote:
 Out of curiosity, why didn't we need to close the writer in rc2 or rc3?
 
 When you suggest a synchronized keyword, are you suggesting that the
 writer is not inherently thread-safe? Do we need to write our own thread
 management on top of Lucene?

Sorry, that might have been a wrong suggestion; IndexWriter (at least
the add method) is supposed to be thread safe.

Otis
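
On Michael's question of who gets to close a shared writer: one possible
arrangement, sketched here under assumptions, is to route every use of the
singleton through a small gate object so that the close-then-delete step runs
while no other thread is using the writer. SharedWriterGate, WriterTask and
ExclusiveTask are hypothetical names, not Lucene classes. The sketch
serializes all writer calls with synchronized, which is simpler but stricter
than necessary if the add method is thread safe as noted above.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: serializes access to one shared, closeable writer. */
final class SharedWriterGate<W extends AutoCloseable> {

    /** Work that needs the open writer, e.g. adding a document. */
    interface WriterTask<W> {
        void run(W writer) throws Exception;
    }

    /** Work that needs the writer closed, e.g. deleting through a reader. */
    interface ExclusiveTask {
        void run() throws Exception;
    }

    // Assumed factory supplied by the caller; it opens a fresh writer on demand.
    private final Callable<W> opener;
    // Null while the writer is closed.
    private W writer;

    SharedWriterGate(Callable<W> opener) {
        this.opener = opener;
    }

    /** Runs the task against the shared writer, opening it first if needed. */
    synchronized void withWriter(WriterTask<W> task) throws Exception {
        if (writer == null) {
            writer = opener.call();
        }
        task.run(writer);
    }

    /** Closes the writer if it is open, then runs the task while it stays closed. */
    synchronized void withWriterClosed(ExclusiveTask task) throws Exception {
        if (writer != null) {
            writer.close();
            writer = null;
        }
        task.run();
    }
}
```

With Lucene, the opener would presumably wrap new IndexWriter(path, analyzer,
false), and the reader.delete(...) cleanup from the original post would go
inside withWriterClosed, so the write lock is released before the reader asks
for it. The next withWriter call reopens the writer.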


 -Original Message-
 From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, February 21, 2002 4:07 PM
 To: Lucene Users List
 Subject: RE: Index Locked For Write
 
 
 You could use the synchronized keyword and use IndexReader.isLocked() or
 something like that, no?
 
 Otis
 
 --- Howk, Michael [EMAIL PROTECTED] wrote:
  Thank you for your quick responses. But in our application, we're working
  in a transactional environment where multiple threads are accessing a
  single writer using the recommended singleton pattern. Since no thread has
  exclusive access to the writer, how can we have one thread arbitrarily
  decide to close the writer?
  
  Michael
  
  -Original Message-
  From: Mark Tucker [mailto:[EMAIL PROTECTED]]
  Sent: Thursday, February 21, 2002 3:51 PM
  To: Lucene Users List
  Subject: RE: Index Locked For Write
  
  
  You forgot to close your writer after the call to optimize.
  
  -Original Message-
  From: Howk, Michael [mailto:[EMAIL PROTECTED]]
  Sent: Thursday, February 21, 2002 2:49 PM
  To: Lucene Mailing List (E-mail)
  Subject: Index Locked For Write
  
  
  We just got the newest daily build (to try to fix some NullPointer errors
  with ? and _ characters), and we're getting the same problem that Daniel
  Calvo mentioned: Index Locked for Write. Here's basically what our code is
  doing:
    IndexWriter writer = new IndexWriter(path, analyzer, create);
    try {
        Document doc = new Document();
        doc.add(Field.Keyword(DOC_ID, "14"));
        doc.add(Field.UnStored(ANY, "mushu"));
        writer.addDocument(doc);
        writer.optimize();

        // Search the document for our keyword
        {
            IndexReader reader = IndexReader.open(path);
            IndexSearcher searcher = new IndexSearcher(reader);
            Vector returnStuff = searcher.search("mushu");
        }

        // Verify that we got one record back
        assertNotNull(returnStuff);
        assertEquals(1, returnStuff.size());
    }
    finally {
        // Clean up after ourselves
        IndexReader reader = IndexReader.open(path);
        reader.delete(new Term(DOC_ID, "14"));
        reader.close();
    }
  
  And the exception we're getting on the reader.delete line in the finally
  clause:
  
  java.io.IOException: Index locked for write: Lock@C:\devtools\JBossTomcat\jboss\indexes\marc\write.lock
      at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:245)
      at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:220)
      at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
      at org.jboss.ejb.plugins.jrmp.server.JRMPContainerInvoker_Stub.invoke(Unknown Source)
      at org.jboss.ejb.plugins.jrmp.interfaces.GenericProxy.invokeContainer(GenericProxy.java:357)
      at org.jboss.ejb.plugins.jrmp.interfaces.StatelessSessionProxy.invoke(StatelessSessionProxy.java:123)
      at $Proxy5.deleteDocument(Unknown Source)
  
  Are we using the right approach? Any suggestions? Thank you.
  
  Michael Howk
  
  
 
 
 

