Yes, there are a large number of URLs that contain 'hello', and they are all
different, so they are not filtered out as duplicates.
I need to fetch only the first URL that contains 'hello' and ignore all the rest.

Currently I fetch the first URL that contains 'hello' and set a boolean flag
to ignore the other URLs containing 'hello', but the problem is that Scrapy
still parses all of the ignored URLs as well.

I need a way to update the rules (deny_urls) as soon as I get the required
URL containing 'hello'.
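
To make the idea concrete, here is a rough sketch of the behaviour I am after,
written as a plain Python filter rather than a real rule update. The class name
FirstMatchFilter and the example URLs are made up for illustration. The
assumption is that such a callable could be passed as the process_links
argument of a CrawlSpider Rule, so that later 'hello' links are dropped before
they are ever requested; I have not verified this against my spider.

```python
class FirstMatchFilter:
    """Keep only the first link whose URL contains `keyword`.

    Hypothetical helper: links without the keyword always pass through;
    links with the keyword pass through once, then are dropped.
    """

    def __init__(self, keyword):
        self.keyword = keyword
        self.seen = False  # becomes True after the first matching link

    def __call__(self, links):
        kept = []
        for link in links:
            # Accept either plain URL strings or objects with a .url attribute
            # (such as the Link objects produced by a Scrapy link extractor).
            url = getattr(link, "url", link)
            if self.keyword in url:
                if self.seen:
                    continue  # a 'hello' URL was already taken: skip this one
                self.seen = True
            kept.append(link)
        return kept


# Assumed wiring inside a CrawlSpider (not run here):
#   rules = (Rule(LinkExtractor(), callback="parse_item", follow=True,
#                 process_links=FirstMatchFilter("hello")),)

hello_filter = FirstMatchFilter("hello")
first_page = hello_filter(["http://example.com/hello/1",
                           "http://example.com/about",
                           "http://example.com/hello/2"])
second_page = hello_filter(["http://example.com/hello/3",
                            "http://example.com/contact"])
```

Because the flag lives on the filter object, it carries over from one response
to the next, which is the "update as soon as I get the required URL" part.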

Thanks,
Sunny

On Wednesday, June 4, 2014 5:48:13 PM UTC+1, Lhassan Baazzi wrote:
>
> Hi
>
> Are there many URLs that contain "hello"? If not, Scrapy already filters
> duplicate requests (i.e. repeated URLs).
>
> Cheers.
> On 4 June 2014 at 14:45, "sunny arora" <sunnya...@gmail.com> wrote:
>
>>  Hi All,
>>
>> Is it possible in Scrapy to crawl a URL that contains 'hello' only once,
>> then update the rules dynamically to exclude such URLs, and continue
>> scraping and following the rest of the URLs?
>>
>> Any suggestions/help is appreciated.
>>
>> Thanks,
>> Sunny
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "scrapy-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to scrapy-users...@googlegroups.com.
>> To post to this group, send email to scrapy...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/scrapy-users.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
