Re: Repost: RegEx problem

jianguo cai Sat, 25 Oct 2008 06:11:53 -0700

 +^http://yyy\\.www\\.com/(used-cars|ID\\w*)/((\\w*/(\\w*/)?)|(.*\\.html))
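Maybe a single \ is enough in the text file, though you do need the double \\
inside " " string literals. The difference comes from how the regex is read in
the two cases: a Java string literal is unescaped by the compiler before the
regex engine sees it, while a line read from a text file is passed on as-is.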
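
A small sketch of the difference, in case it helps. The class and method names
are made up for illustration, and I am assuming the leading + in the filter
file is only the include marker and not part of the regex itself. SINGLE is
what the regex engine receives if the file line uses one backslash, DOUBLED is
what it receives if the file line uses two.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Nutch.
public class UrlFilterEscapeDemo {

    // File line written with ONE backslash. In Java source every backslash
    // must be doubled, so "\\." below is the two characters \. (a literal dot).
    static final String SINGLE =
        "^http://yyy\\.www\\.com/(used-cars|ID\\w*)/((\\w*/(\\w*/)?)|(.*\\.html))";

    // File line written with TWO backslashes. The engine now sees \\. which
    // means "a literal backslash followed by any character".
    static final String DOUBLED =
        "^http://yyy\\\\.www\\\\.com/(used-cars|ID\\\\w*)/((\\\\w*/(\\\\w*/)?)|(.*\\\\.html))";

    // Same call the original post uses: pattern.matcher(url).find()
    static boolean found(String regex, String url) {
        return Pattern.compile(regex).matcher(url).find();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://yyy.www.com/used-cars/902/1/",
            "http://yyy.www.com/used-cars/902/2/",
            "http://yyy.www.com/ID_101360033/Honda-city-GXI-2005-for-sale.html"
        };
        for (String url : urls) {
            System.out.println(url
                + "  single=" + found(SINGLE, url)
                + "  doubled=" + found(DOUBLED, url));
        }
    }
}
```

If the test case builds the pattern from a Java string literal and the filter
file contains the same text copied verbatim, the two would behave differently
in exactly this way.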



2008/10/22 Cool The Breezer <[EMAIL PROTECTED]>

> It has been bugging me for the last two days: the regex in crawl-urlfilter.txt
> does not recognize the URLs as expected.
>
> URLS
> http://yyy.www.com/
> http://yyy.www.com/used-cars/902/1/
> http://yyy.www.com/used-cars/902/2/
> http://yyy.www.com/ID_101360033/Honda-city-GXI-2005-for-sale.html
>
> Pattern
> +^http://yyy\\.www\\.com/(used-cars|ID\\w*)/((\\w*/(\\w*/)?)|(.*\\.html))
>
> return pattern.matcher(url).find(); should return true for the URLs above,
> but it returns false. Not sure why.
> I captured all the URLs in a test file, and my test case correctly recognizes
> them, i.e. pattern.matcher(url).find() returns true when it is given any of
> the URLs above.
>
> I understand that this may be an easy question, but it has already taken three
> days and I am still banging my head.
>
> Appreciate your help on this.
>
> - RB
