Oh, then try
^http://123.com/456/789/\\w*/$
That allows only word characters (a-zA-Z_0-9) between
http://123.com/456/789/ and the final /, so a URL with a further
slash in that position no longer matches.
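
A minimal sketch of how the two patterns from this thread differ, using
Python's re module as a stand-in (an assumption; the crawler's own regex
engine may behave slightly differently). The [^/]+ variant and the
matching() helper are hypothetical additions for illustration, and the
dot in 123.com is escaped here so that it matches only a literal dot.

```python
import re

# Patterns from the thread, written as Python raw strings with a single
# backslash. Python's re is an assumption, not the crawler's engine.
GREEDY = re.compile(r"^http://123\.com/456/789/.*/$")
ONE_WORD = re.compile(r"^http://123\.com/456/789/\w*/$")
# Hypothetical alternative: any run of characters that are not a slash.
ONE_SEGMENT = re.compile(r"^http://123\.com/456/789/[^/]+/$")

# Sample URLs; the first three come from the thread, the last is made up
# to show a folder name containing a hyphen.
URLS = [
    "http://123.com/456/789/folderA/",
    "http://123.com/456/789/folderA/folderB/",
    "http://123.com/456/789/folderA/folderB/folderC/",
    "http://123.com/456/789/folder-a/",
]


def matching(pattern, urls=URLS):
    """Return the URLs that the compiled pattern accepts."""
    return [u for u in urls if pattern.match(u)]


if __name__ == "__main__":
    for name, pat in [(".*", GREEDY), (r"\w*", ONE_WORD), ("[^/]+", ONE_SEGMENT)]:
        print(name, matching(pat))
```

With these sample URLs the .* pattern accepts all four, the \w* pattern
accepts only folderA/, and [^/]+ also accepts folder-a/, because \w does
not match a hyphen.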

- RB


--- On Mon, 2/16/09, Alex Basa <[email protected]> wrote:

> From: Alex Basa <[email protected]>
> Subject: Re: regex for a folder only crawl
> To: [email protected]
> Date: Monday, February 16, 2009, 12:08 PM
> I actually tried that but it also picks up
> 
> http://123.com/456/789/folderA/folderB/
> http://123.com/456/789/folderA/folderB/folderC/
> 
> What I really need is a way to say "stop at the first slash
> after the previous one."
> 
> --- On Mon, 2/16/09, Cool The Breezer <[email protected]> wrote:
> 
> > From: Cool The Breezer <[email protected]>
> > Subject: Re: regex for a folder only crawl
> > To: [email protected]
> > Date: Monday, February 16, 2009, 9:47 AM
> > Try
> > ^http://123.com/456/789/.*/$
> > which says the URL must end with /
> > 
> > - RB
> > 
> > --- On Mon, 2/16/09, Alex Basa <[email protected]> wrote:
> > 
> > > From: Alex Basa <[email protected]>
> > > Subject: regex for a folder only crawl
> > > To: [email protected]
> > > Date: Monday, February 16, 2009, 9:54 AM
> > > Hi guys,
> > > 
> > > I'm trying to make a regex to only crawl a folder.  So
> > > if I was crawling 123.com/456/789
> > > 
> > > I would only want to crawl
> > > ^http://123.com/456/789/(.*)/
> > > 
> > > I tried
> > > ^http://123.com/456/789/*\.*
> > > 
> > > but there are many web pages with no file extensions.
> > > 
> > > I'm not sure how to specify only one forward slash
> > > after that in the regex.  Any ideas?
> > > 
> > > Thanks as always in advance,
> > > 
> > > Alex


      
