David, this is not really a Lucene issue. Here is some Perl code that you could either use as-is or port to Java if you need it there:
http://search.cpan.org/dist/Date-Extract/

Tika won't help with this, and I believe UIMA itself will not help either, although there may be components for date extraction that plug into UIMA.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: David Lee <[EMAIL PROTECTED]>
> To: java-user@lucene.apache.org
> Sent: Thursday, October 2, 2008 7:18:22 PM
> Subject: Extracting Dates
>
> What should I use if I want to try to extract events (dates/times) out of an
> HTML page? I looked at Tika since it's a parsing project. Am I on the right
> track or is there something better to use? It also seems like Apache UIMA is
> kind of doing that, but I'm not sure. I thought since a lot of these
> projects are associated to lucene, someone might know.
>
> David Lee