Oh, then try ^http://123.com/456/789/\w*/$. That allows only word characters (a-zA-Z0-9_) between http://123.com/456/789/ and the final /, so URLs with deeper subfolders are not matched.
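
A quick way to compare the patterns from this thread is to run them against a few sample URLs. The sketch below uses Python's re module only for illustration; the crawler's own regex engine may differ in small details. The helper name matching_urls, the sample URLs folder-A/ and page, and the [^/]+ pattern are assumptions added here, not something taken from the thread.

```python
import re

# Hypothetical helper for illustration; not part of any crawler's API.
def matching_urls(pattern, urls):
    """Return the URLs that the pattern matches, in their original order."""
    regex = re.compile(pattern)
    return [u for u in urls if regex.match(u)]

# Sample URLs; the last two are made up to show edge cases.
urls = [
    "http://123.com/456/789/folderA/",
    "http://123.com/456/789/folderA/folderB/",
    "http://123.com/456/789/folderA/folderB/folderC/",
    "http://123.com/456/789/folder-A/",
    "http://123.com/456/789/page",
]

# Earlier suggestion: ".*" also matches "/", so deeper folders get through.
greedy = r"^http://123.com/456/789/.*/$"

# Suggestion in this message: "\w" cannot match "/", so only one level matches.
word_only = r"^http://123.com/456/789/\w*/$"

# Assumed alternative: any characters except "/", which also accepts names
# containing "-" or "." that "\w" would reject.
one_segment = r"^http://123.com/456/789/[^/]+/$"

for name, pattern in [("greedy", greedy),
                      ("word_only", word_only),
                      ("one_segment", one_segment)]:
    print(name, matching_urls(pattern, urls))
```

With these samples, the \w* pattern keeps only folderA/, while [^/]+ also keeps folder-A/. Which one fits depends on whether the folder names contain characters outside a-zA-Z0-9_.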

- RB

--- On Mon, 2/16/09, Alex Basa <[email protected]> wrote:

> From: Alex Basa <[email protected]>
> Subject: Re: regex for a folder only crawl
> To: [email protected]
> Date: Monday, February 16, 2009, 12:08 PM
>
> I actually tried that but it also picks up
>
> http://123.com/456/789/folderA/folderB/
> http://123.com/456/789/folderA/folderB/folderC/
>
> what I really need is something to say the first slash
> after the previous one.
>
> --- On Mon, 2/16/09, Cool The Breezer <[email protected]> wrote:
>
> > From: Cool The Breezer <[email protected]>
> > Subject: Re: regex for a folder only crawl
> > To: [email protected]
> > Date: Monday, February 16, 2009, 9:47 AM
> >
> > Try
> > ^http://123.com/456/789/.*/$, which says end should be /
> >
> > - RB
> >
> > --- On Mon, 2/16/09, Alex Basa <[email protected]> wrote:
> >
> > > From: Alex Basa <[email protected]>
> > > Subject: regex for a folder only crawl
> > > To: [email protected]
> > > Date: Monday, February 16, 2009, 9:54 AM
> > >
> > > Hi guys,
> > >
> > > I'm trying to make a regex to only crawl a folder. So
> > > if I was crawling 123.com/456/789
> > >
> > > I would only want to crawl
> > > ^http://123.com/456/789/(.*)/
> > >
> > > I tried
> > > ^http://123.com/456/789/*\.*
> > >
> > > but there are many web pages with no file extensions.
> > >
> > > I'm not sure how to specify only one forward slash
> > > after in the regex. Any ideas?
> > >
> > > Thanks as always in advance,
> > >
> > > Alex
