Hello all,

I am a newbie to Nutch and Lucene and am experimenting with this combination to 'scrape' web pages. To this end, I need to use regular expressions in combination with Lucene to search the pages fetched by Nutch.
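To make the goal concrete, here is a minimal sketch of the kind of matching I have in mind. It uses only the JDK's java.util.regex package, not any Lucene or Nutch API, and the class name, the sample page text and the pattern are all made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexScrape {
    // Return every substring of pageText that matches the given regex.
    public static List<String> findAll(String regex, String pageText) {
        Matcher m = Pattern.compile(regex).matcher(pageText);
        List<String> hits = new ArrayList<>();
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical page text standing in for content fetched by Nutch.
        String page = "Contact: alice@example.com or bob@example.org";
        // Deliberately loose pattern, for illustration only.
        System.out.println(findAll("[\\w.]+@[\\w.]+", page));
    }
}
```

This scans one string at a time, so it says nothing about searching an index; what I am after is the equivalent over the pages Lucene has indexed.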
Does Lucene support regular expressions? The book 'Lucene in Action' talks about wildcard queries but not so much about regex.

Thanks in advance for your help,
Renuka

-----Original Message-----
From: Zaheed Haque [mailto:[EMAIL PROTECTED]
Sent: Monday, October 31, 2005 9:58 AM
To: [email protected]
Subject: Re: Jira - Nutch 48 - did you mean patch :-)

Yep! Works Great!

/Z

On 10/31/05, Byron Miller <[EMAIL PROTECTED]> wrote:
> brainfar, meant mozdex.com using slashdot.org as an
> example
>
> http://www.mozdex.com/search.jsp?query=slashdt
>
> Try that one.
>
> --- Zaheed Haque <[EMAIL PROTECTED]> wrote:
>
> > I just tried
> >
> > http://slashdot.org/search.pl?query=slashdt
> >
> > doesn't work! or maybe the URL above is not correct?
> >
> > Cheers
> > Zaheed
> >
> > On 10/31/05, Byron Miller <[EMAIL PROTECTED]> wrote:
> > > I got this to work this evening.. was a problem with
> > > patch on the system i was working on..
> > >
> > > feel free to check it out on slashdot.org.. you can
> > > try an example of searching for "slashdt" and it
> > > should recommend the good site :)
> > >
> > > -byron
> > >
> > > --- Byron Miller <[EMAIL PROTECTED]> wrote:
> > >
> > > > Anyone using this patch?
> > > >
> > > > http://issues.apache.org/jira/browse/NUTCH-48
> > > >
> > > > I would like to incorporate this, but not having much
> > > > luck getting the patch to install over svn release
> > > > (branch .7)
> > > >
> > > > -byron
