I actually tried that, but it also picks up

http://123.com/456/789/folderA/folderB/
http://123.com/456/789/folderA/folderB/folderC/

What I really need is a way to say "stop at the first slash after the previous one", so that only one folder level matches.

--- On Mon, 2/16/09, Cool The Breezer <[email protected]> wrote:

> From: Cool The Breezer <[email protected]>
> Subject: Re: regex for a folder only crawl
> To: [email protected]
> Date: Monday, February 16, 2009, 9:47 AM
> Try
> ^http://123.com/456/789/.*/$, which says end should be /
>
> - RB
>
> --- On Mon, 2/16/09, Alex Basa <[email protected]> wrote:
>
> > From: Alex Basa <[email protected]>
> > Subject: regex for a folder only crawl
> > To: [email protected]
> > Date: Monday, February 16, 2009, 9:54 AM
> > Hi guys,
> >
> > I'm trying to make a regex to only crawl a folder. So
> > if I was crawling 123.com/456/789
> >
> > I would only want to crawl
> > ^http://123.com/456/789/(.*)/
> >
> > I tried
> > ^http://123.com/456/789/*\.*
> >
> > but there are many web pages with no file extensions.
> >
> > I'm not sure how to specify only one forward slash
> > after in the regex. Any ideas?
> >
> > Thanks as always in advance,
> >
> > Alex
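
For what it's worth, one common way to say "only one slash after the prefix" is a negated character class: [^/]+ matches a run of characters that contains no slash, so it cannot cross into a deeper folder the way .* does. Below is a minimal Python sketch of that idea, not a tested crawler config. It assumes the crawler's URL filter accepts ordinary regex syntax, which this thread does not confirm. The names ANY_DEPTH and ONE_LEVEL and the sample URL list are made up for illustration.

```python
import re

# Pattern suggested earlier in the thread: ".*" also matches slashes,
# so it accepts folders at any depth below /456/789/.
ANY_DEPTH = re.compile(r"^http://123.com/456/789/.*/$")

# Hypothetical alternative: "[^/]+" matches one path segment only
# (one or more characters, none of them a slash). The dot in the
# host name is escaped so it matches a literal ".".
ONE_LEVEL = re.compile(r"^http://123\.com/456/789/[^/]+/$")

# Sample URLs for illustration only.
urls = [
    "http://123.com/456/789/folderA/",
    "http://123.com/456/789/folderA/folderB/",
    "http://123.com/456/789/folderA/folderB/folderC/",
]

any_depth_hits = [u for u in urls if ANY_DEPTH.match(u)]
one_level_hits = [u for u in urls if ONE_LEVEL.match(u)]

print(len(any_depth_hits))  # all three sample URLs match
print(one_level_hits)       # only the folderA/ URL matches
```

If pages directly under a folder (with no trailing slash) should also be crawled, the trailing "/" in the pattern could be made optional with "/?", but whether that is wanted depends on the site layout.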
