You can scrape that with a little help from Splash, via the scrapy-splash plugin:
https://github.com/scrapy-plugins/scrapy-splash
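
The spider below only works if Scrapy is told where Splash is running and has the scrapy-splash middlewares enabled. As a rough sketch, these are the settings I would expect to need, based on the scrapy-splash README; the priorities may differ between versions, so check the README for yours. SPLASH_URL is an assumption taken from the log output further down (a local Splash instance on port 8050), and the name SPLASH_SETTINGS is only for illustration.

```python
# Sketch of the settings scrapy-splash expects (values follow its README;
# verify them against the version you have installed).
SPLASH_SETTINGS = {
    # Assumption: Splash is running locally on its default port.
    'SPLASH_URL': 'http://127.0.0.1:8050',
    'DOWNLOADER_MIDDLEWARES': {
        'scrapy_splash.SplashCookiesMiddleware': 723,
        'scrapy_splash.SplashMiddleware': 725,
        'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
    },
    'SPIDER_MIDDLEWARES': {
        'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
    },
    # Makes duplicate filtering aware of the Splash arguments.
    'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter',
}
```

With "scrapy runspider" there is no project settings.py, so one option is to assign this dict to custom_settings on the spider class; in a project, the same keys would go into settings.py.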

$ cat myspider.py
import scrapy
from scrapy_splash import SplashRequest


class MySpider(scrapy.Spider):
    name = 'myspider'
    start_urls = ['https://sapui5.hana.ondemand.com/']

    def parse(self, response):
        url = 'https://sapui5.hana.ondemand.com/sdk/#docs/api/symbols/sap.html'
        # Render the page in Splash and ask for the HTML of the page
        # and of its iframes in the JSON result.
        yield SplashRequest(url, self.parse_page,
                            args={
                                'wait': 5.,
                                'iframes': True,
                                'html': True,
                            },
                            endpoint='render.json')

    def parse_page(self, response):
        # The API listing is rendered inside the first child frame.
        iframe_html = response.data['childFrames'][0]['html']
        sel = scrapy.Selector(text=iframe_html)
        for div in sel.css('#content .sectionItem'):
            name = div.css('a::text').extract_first()
            desc = div.css('.description::text').extract_first() or ''
            print(': '.join([name, desc]))

$ scrapy runspider myspider.py
...
2016-05-28 01:36:18 [scrapy] DEBUG: Crawled (200) <GET https://sapui5.hana.ondemand.com/sdk/#docs/api/symbols/sap.html via http://127.0.0.1:8050/render.json> (referer: None)
apf: Analysis Path Framework
ca:
chart: Chart controls based on Vizframe
collaboration: SAP UI library: SAP Collaboration for Social Media Integration.
gantt: UI5 library: sap.gantt.
landvisz: sap.landvisz library for UI developments
...
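
One fragile spot in parse_page is the hard-coded childFrames[0]: if the page ever gains another iframe, or nests the listing one level deeper, the index will point at the wrong frame. A possible alternative, sketched here under the assumption that each frame dict in the render.json result carries 'url', 'html' and 'childFrames' keys, is to walk the frames and pick one by a piece of its URL. The helper names and the sample data are made up for illustration; they are not part of scrapy-splash.

```python
def iter_frames(frame):
    """Yield a render.json-style frame dict and all nested child frames."""
    yield frame
    for child in frame.get('childFrames', []):
        yield from iter_frames(child)


def find_frame_html(data, url_part):
    """Return the HTML of the first frame whose URL contains url_part, or None."""
    for frame in iter_frames(data):
        if url_part in frame.get('url', ''):
            return frame.get('html')
    return None


# Hypothetical sample shaped like a render.json result (not real site data).
sample = {
    'url': 'https://example.com/',
    'html': '<html>outer</html>',
    'childFrames': [
        {'url': 'about:blank', 'html': '', 'childFrames': []},
        {'url': 'https://example.com/docs/api.html',
         'html': '<html>inner</html>', 'childFrames': []},
    ],
}
print(find_frame_html(sample, 'docs/api'))  # prints <html>inner</html>
```

In the spider this would replace the indexing line with something like find_frame_html(response.data, 'some-url-fragment'), where the fragment has to be chosen by inspecting the frame URLs that Splash actually returns for the page.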



Rolando

On Fri, May 27, 2016 at 6:40 PM, David Fishburn <[email protected]>
wrote:

> Yes, public site:
>
> https://sapui5.hana.ondemand.com/sdk/#docs/api/symbols/sap.html
>
> I want to iterate over the "Namespaces & Classes" section.
>
> Then I want to follow each of those links and do the same on those pages.
>
> Thanks,
> David
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "scrapy-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/scrapy-users.
> For more options, visit https://groups.google.com/d/optout.
>
