Best way to record tag name (and tag attributes, if possible)

JS Wed, 04 Feb 2015 17:56:49 -0800

Hi,

I am extracting links (only) from a select number of webpages and would 
like to record each link's tag name (e.g. a, img, iframe, etc.) along with 
attributes such as name, target, etc.  I know I can select elements 
individually by tag name using "find_elements_by_tag_name", but I'm wondering 
if there is a better, more efficient way to achieve what I'm looking for. 
Selecting each type of tag separately will result in lengthy, repetitive 
code.


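def parse_url(self, response):
  # Current approach: one query per tag type (only 'a' shown here).
  anchor_tags = self.driver.find_elements_by_tag_name('a')
  for tag in anchor_tags:
    item = ExampleItem()
    item['tag_name'] = 'a'
    # Anchors carry their link in 'href'; 'src' is used by img, iframe, etc.
    item['link'] = tag.get_attribute('href')
    yield item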



Any thoughts on this?
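
One possible way to avoid a separate query per tag type is a single XPath 
query that matches every element carrying a link attribute, then reading 
the tag name from each matched element via its tag_name property. The 
sketch below is only an illustration under assumptions: collect_links, 
LINK_ATTRS and EXTRA_ATTRS are hypothetical names, the attribute lists are 
guesses at what is wanted, and plain dicts stand in for ExampleItem. It 
calls the generic driver.find_elements(by, value) with the locator string 
'xpath', which is the value of selenium's By.XPATH.

```python
# Attribute names that may hold a link (assumption; extend as needed).
LINK_ATTRS = ('href', 'src')
# Extra attributes to record alongside the link (assumption).
EXTRA_ATTRS = ('name', 'target')

# One XPath query matching any element that has at least one link attribute.
LINK_XPATH = '//*[' + ' or '.join('@' + a for a in LINK_ATTRS) + ']'


def collect_links(driver):
    """Return one dict per element that has an href or src attribute."""
    records = []
    # 'xpath' is the string value of selenium's By.XPATH locator.
    for element in driver.find_elements('xpath', LINK_XPATH):
        record = {'tag_name': element.tag_name}
        # Use the first link attribute that is actually set on the element.
        record['link'] = next(
            (element.get_attribute(a) for a in LINK_ATTRS
             if element.get_attribute(a)),
            None,
        )
        # Attributes that are absent come back as None.
        for attr in EXTRA_ATTRS:
            record[attr] = element.get_attribute(attr)
        records.append(record)
    return records
```

Inside parse_url, each dict returned by collect_links(self.driver) could be 
copied into an ExampleItem, provided the item declares matching fields. Note 
that this query also matches link and script elements, so the result may 
need filtering on tag_name if only a, img, iframe and similar are wanted.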

