Hi all...

My crawler only visits the first page. It is not following the other links
on the site.

Example:

from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor


class hugo_spyder(CrawlSpider):
    name = "hugo_spyder"
    allowed_domains = ["site.com"]
    start_urls = ["http://www.site.com/"]
    # An empty allow pattern matches every link on the page.
    rules = [Rule(LinkExtractor(allow=('')), callback='parse', follow=True)]

    def parse(self, response):
        url = response.url
        code = response.status
        print(url, code)
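
One possible cause, offered as a guess rather than a confirmed diagnosis: the Scrapy documentation warns against using parse as a rule callback, because CrawlSpider relies on its own parse method to apply the rules. The sketch below is a toy stand-in, not Scrapy's real implementation; MiniCrawlSpider, its in-memory site, and the name parse_item are all hypothetical and only illustrate why overriding parse can stop the link following.

```python
# Toy stand-in, NOT Scrapy's real implementation: the base class uses
# parse() internally to run the user callback and follow links.
class MiniCrawlSpider:
    # Hypothetical in-memory "site": page -> links found on that page.
    site = {"/": ["/a", "/b"], "/a": ["/c"], "/b": [], "/c": []}
    callback = None  # name of the user callback, as in a Rule

    def parse(self, page):
        # Internal machinery: call the user callback, then return links.
        if self.callback:
            getattr(self, self.callback)(page)
        return self.site[page]

    def crawl(self, start="/"):
        # Breadth-first walk driven entirely by what parse() returns.
        seen, queue = [], [start]
        while queue:
            page = queue.pop(0)
            if page in seen:
                continue
            seen.append(page)
            queue.extend(self.parse(page) or [])
        return seen


class BrokenSpider(MiniCrawlSpider):
    callback = "parse"

    def parse(self, page):
        # Overrides the internal machinery and returns None,
        # so no links are ever queued.
        print(page)


class FixedSpider(MiniCrawlSpider):
    callback = "parse_item"

    def parse_item(self, page):
        # A differently named callback leaves parse() untouched.
        print(page)
```

If this is what is happening in the spider above, the corresponding change would be to rename the callback method and the callback argument in the Rule to something other than parse.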


Thanks..

