Re: Newbie: XPath Basics

Joey Espinosa Wed, 25 Jan 2017 13:30:06 -0800

First, this part is wrong:

      for div in response.xpath("//div[@class]='lesson-status-icon'"):


The attribute match should be completely contained within the square
brackets:

      for div in response.xpath("//div[@class='lesson-status-icon']"):

Additionally, notice that the anchor tags you're trying to get are *not*
children of the divs you're selecting. They are *siblings*, which means they
are at the same "level" in the markup hierarchy as the divs. If you want to
keep selecting those divs (maybe because they're easier for you to select
reliably?), then you can use the "following-sibling" axis:

      # Select the href of every <a> that follows a matching div as a sibling.
      for anchor in response.xpath("//div[@class='lesson-status-icon']/following-sibling::a/@href"):
          print(anchor.extract())

I can't check it right now, but give that a shot.


On Tue, Jan 24, 2017 at 9:32 PM Peter <p...@dirac.org> wrote:

> Trying to scrape some URLs from this page (the stuff highlighted in yellow
> is what I'm looking for):
>
> I didn't quite understand the section on selectors and XPath, but this was
> my attempt at getting those URLs:
>
>
>    def grab_page(self, response):
>       for div in response.xpath("//div[@class]='lesson-status-icon'"):
>          print( div.xpath("a[@href]").extract() )
>          print( div.xpath("a[@href]/text()").extract() )
>          print( div.extract() )
>       for div in response.xpath("//div[@class]='lesson-status-icon'").xpath("/a[@href]"):
>          print( div.text().extract() )
>
>
> I'm flailing and drowning.  Can someone please put me on the right path?
> What's the right syntax to grab the URLs?
>
>
> Thanks!!!
>
> <https://lh3.googleusercontent.com/-MeiUXL6STxs/WIgIqtFhtPI/AAAAAAAACHU/KO9L90WRgMI9xKA2azq9upycDjQYXxo4ACLcB/s1600/cpod.jpg>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to scrapy-users+unsubscr...@googlegroups.com.
> To post to this group, send email to scrapy-users@googlegroups.com.
> Visit this group at https://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>
-- 
Respectfully,

*Joey Espinosa*
Chief Technology Officer
*Vote.org* <https://www.vote.org/>
Phone: (305) 747-1711
