Try: site.xpath("descendant::div[....]")
On Fri, Jan 3, 2014 at 3:36 PM, Carlos Espeleta <[email protected]> wrote:
> I'm using Scrapy to get information from this
> website <http://guia.bcn.cat/index.php?pg=search&q=*:*>.
>
> The code that I want to scrape has the following structure:
>
> <div id="llista-resultats">
> <div>
> <h3>
> <a href="URL"> Title </a>
> <div class="dades">
> <dl>
> <dt> </dt>
> <dd> </dd>
> ...
> </div>
> <div>
> And this repeats again
>
>
> I have done tests and I know how to get the information, but the problem
> with the following code is that I get all the titles, then all the URLs,
> etc., whereas what I want is to pair the first title with the first URL.
>
>
> class BcnSpider(CrawlSpider):
>     name = 'bcn'
>     allowed_domains = ['guia.bcn.cat']
>     start_urls = ['http://guia.bcn.cat/index.php?pg=search&q=*:*']
>
>     def parse(self, response):
>         sel = Selector(response)
>         sites = sel.xpath("//div[@id='llista-resultats']")
>         items = []
>         for site in sites:
>             item = BcnItem()
>             item['title'] = site.xpath(
>                 "//div[@id='llista-resultats']//h3/a/text()").extract()
>             item['url'] = site.xpath(
>                 "//div[@id='llista-resultats']//h3/a/@href").extract()
>             item['when'] = site.xpath(
>                 "//div[@id='llista-resultats']//div[@class='dades']/dl/dd/text()").extract()
>             items.append(item)
>         return items
>
>
> I think the error is because I'm using "//" on each item, but I haven't
> managed to get only the information that is a descendant of each node in
> sites = sel.xpath("//div[@id='llista-resultats']").
>
> Here is my post on
> StackOverflow <http://stackoverflow.com/questions/20908790/i-cant-scrape-div-parameters-of-the-website-scrapy>
>
> Thanks for everything
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/groups/opt_out.
>