If you are using the CrawlSpider class, do not use parse() as the name of
your callback method! CrawlSpider uses parse() internally to implement its
rules, so overriding it breaks the link-following logic. This is spelled
out very clearly in the Scrapy docs.
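Besides renaming the callback (e.g. to parse_item), the allow pattern in the
quoted rule also looks broken as a regex: the literal "?" is a metacharacter
meaning "optional preceding character", and the "/" before "c9302" is missing,
so the pattern never matches the paginated URLs. A minimal sketch of the fix
using only the standard re module (the names broken/fixed are mine, and the
test URL is just the page-2 variant of the quoted start URL):

```python
import re

# The allow pattern from the quoted rule. The unescaped "?" makes the
# final "2" of "c9302" optional, and the "/" before "c9302" is missing,
# so it cannot match the real listing URLs.
broken = r'/s-jobs/page-\d+c9302?ad=wanted'

# Corrected pattern: restore the "/" and escape the literal "?".
fixed = r'/s-jobs/page-\d+/c9302\?ad=wanted'

# A URL of the shape the spider is trying to paginate over.
url = 'http://www.gumtree.com.au/s-jobs/page-2/c9302?ad=wanted'

print(re.search(broken, url))                 # None: never matches
print(re.search(fixed, url) is not None)      # True
```

With the fixed pattern in SgmlLinkExtractor(allow=...) and the callback
renamed to something other than parse, follow=True should then walk the
pagination links.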


On Tue, Mar 3, 2015 at 9:45 AM, JEBI93 <[email protected]>
wrote:

> Again, I don't know how to deal with pagination. Anyway, here's the problem:
> class GumtreespiderSpider(CrawlSpider):
>     name = "gumtreeSpider"
>     allowed_domains = ["gumtree.com.au"]
>     start_urls = ['http://www.gumtree.com.au/s-jobs/page-1/c9302?ad=wanted']
>
>     rules = (
>         Rule(SgmlLinkExtractor(allow=('/s-jobs/page-\d+c9302?ad=wanted')),
> callback='parse', follow=True),
>     )
>
> What I'm trying to do is iterate with \d+ to scrape 100+ pages, but it
> returns only the first one (the start_urls one).
> Here's full script: http://pastebin.com/CYrPvZuc
>
>  --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>

