Try: site.xpath("descendant::div[....]")
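
An XPath expression that starts with "//" is absolute: it searches the whole
document again, whatever node it is called on. That would explain why every
field comes back with all the matches at once. Note also that the loop in the
quoted code runs over the single container div, not over the individual
result blocks inside it.

Here is a minimal sketch of the relative approach. It uses the standard
library's xml.etree.ElementTree rather than Scrapy so that it runs on its own.
The sample markup, its URLs and the parse_results name are made up for
illustration, modelled on the structure quoted below, so the real page may
need different paths. The Scrapy expressions in the comments are my best guess
at the equivalents, not tested against the site.

```python
import xml.etree.ElementTree as ET

# Made-up sample page modelled on the structure quoted in this thread;
# the real markup may differ.
PAGE = """
<html><body>
<div id="llista-resultats">
  <div>
    <h3><a href="/first">First title</a></h3>
    <div class="dades"><dl><dt>When</dt><dd>Monday</dd></dl></div>
  </div>
  <div>
    <h3><a href="/second">Second title</a></h3>
    <div class="dades"><dl><dt>When</dt><dd>Tuesday</dd></dl></div>
  </div>
</div>
</body></html>
"""


def parse_results(markup):
    """Return one dict per result block, keeping title and URL together."""
    root = ET.fromstring(markup)
    items = []
    # Select each result block (the child divs of the container), not the
    # container itself. Rough Scrapy equivalent, as an assumption:
    #   sel.xpath("//div[@id='llista-resultats']/div")
    for site in root.findall(".//div[@id='llista-resultats']/div"):
        # These paths have no leading "/" or "//", so they are resolved
        # relative to `site`. In Scrapy: site.xpath("./h3/a/text()") etc.
        link = site.find("h3/a")
        if link is None:
            continue  # skip blocks without a title link
        items.append({
            "title": (link.text or "").strip(),
            "url": link.get("href"),
            "when": [dd.text for dd in site.findall("div[@class='dades']/dl/dd")],
        })
    return items


if __name__ == "__main__":
    for item in parse_results(PAGE):
        print(item)
```

Each dict now holds the title, URL and dates of one result only, because
every inner lookup starts from the current block instead of the document root.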

On Fri, Jan 3, 2014 at 3:36 PM, Carlos Espeleta <[email protected]> wrote:

> I'm using Scrapy to get information from this website:
> <http://guia.bcn.cat/index.php?pg=search&q=*:*>
>
> The markup that I want to scrape has the following structure:
>
>     <div id="llista-resultats">
>      <div>
>       <h3>
>        <a href="URL"> Title </a>
>       </h3>
>       <div class="dades">
>        <dl>
>         <dt> </dt>
>         <dd> </dd>
>         ...
>        </dl>
>       </div>
>      </div>
>      <div>
>       And so on, repeated for each result
>
>
> I have run tests and I know how to get the information, but the problem
> with the following code is that I get all the titles, then all the URLs,
> etc. What I want is to pair the first title with the first URL, and so on.
>
>
>     class BcnSpider(CrawlSpider):
>         name = 'bcn'
>         allowed_domains = ['guia.bcn.cat']
>         start_urls = ['http://guia.bcn.cat/index.php?pg=search&q=*:*']
>
>         def parse(self, response):
>             sel = Selector(response)
>             sites = sel.xpath("//div[@id='llista-resultats']")
>             items = []
>             for site in sites:
>                 item = BcnItem()
>                 item['title'] = site.xpath(
>                     "//div[@id='llista-resultats']//h3/a/text()").extract()
>                 item['url'] = site.xpath(
>                     "//div[@id='llista-resultats']//h3/a/@href").extract()
>                 item['when'] = site.xpath(
>                     "//div[@id='llista-resultats']//div[@class='dades']/dl/dd/text()").extract()
>                 items.append(item)
>             return items
>
>
> I think the error is that I'm using "//" on each item, but I haven't
> managed to select only the information that is a descendant of
> sites = sel.xpath("//div[@id='llista-resultats']").
>
> Here is my post on Stack Overflow:
> <http://stackoverflow.com/questions/20908790/i-cant-scrape-div-parameters-of-the-website-scrapy>
>
> Thanks for everything
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/groups/opt_out.
>
