Hi all,

I am trying to click the next-page link using Scrapy + Python + Selenium WebDriver.

Platform: Python + Scrapy + Selenium WebDriver


Note:

I am trying to get the jobs from all 11 pages, but the spider keeps looping over the first page's jobs; it never crawls the remaining pages.



Here is the HTML of the Next link:


<!-- language: lang-html -->

    <div id="win0divHRS_APPL_WRK_HRS_LST_NEXT">
      <span class="PSHYPERLINK" title="Next In List">
        <a id="HRS_APPL_WRK_HRS_LST_NEXT" class="PSHYPERLINK"
           href="javascript:submitAction_win0(document.win0,'HRS_APPL_WRK_HRS_LST_NEXT');"
           tabindex="74" ptlinktgt="pt_replace"
           name="HRS_APPL_WRK_HRS_LST_NEXT">Next</a>
      </span>
    </div>


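Since the link's href is a javascript:submitAction_win0(...) call, I understand Scrapy cannot follow it as a normal URL; the browser has to click it. For clarity, this is roughly how I locate and click it (a minimal standalone sketch; next_link is just an illustrative name):

<!-- language: lang-python -->

    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException

    driver = webdriver.Firefox()
    driver.get('https://eapplicant.northshore.org/psc/psapp/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL')

    try:
        # the href is "javascript:submitAction_win0(...)", so it has to be
        # clicked in the browser rather than requested as a URL
        next_link = driver.find_element_by_id('HRS_APPL_WRK_HRS_LST_NEXT')
        next_link.click()
    except NoSuchElementException:
        pass  # no "Next" link means this is the last page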

Here is my spider code (the relevant methods of my spider class):



<!-- language: lang-python -->

    # imports used by this spider
    from scrapy.http import Request
    from scrapy.selector import Selector
    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException

    def __init__(self):
        self.driver = webdriver.Firefox()

    def parse(self, response):
        self.driver.get('https://eapplicant.northshore.org/psc/psapp/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL')
        selector = Selector(response)

        while True:
            try:
                # locate the "Next" link rendered by the browser
                next_link = self.driver.find_element_by_id('HRS_APPL_WRK_HRS_LST_NEXT')
                for row in selector.xpath('.//*[@id="HRS_CE_JO_EXT_I$scroll$0"]'):
                    # pull the numeric job-opening ids out of the listing table
                    for jobid in selector.css('span.PSEDITBOX_DISPONLY').re(r'.*>(\d+)<.*'):
                        joburl = 'https://eapplicant.northshore.org/psp/psapp/EMPLOYEE/HRMS/c/HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_JOB_DTL&Action=A&JobOpeningId=' + jobid + '&SiteId=1&PostingSeq=1'
                        yield Request(joburl, callback=self.parse_iframe,
                                      headers={"X-Requested-With": "XMLHttpRequest"},
                                      dont_filter=True)
            except NoSuchElementException:
                break
        next_link.click()

    def parse_iframe(self, response):
        selector = Selector(response)
        # the job detail page loads its content in an iframe; follow its src
        url = selector.xpath('//*[@id="ptifrmtgtframe"]/@src').extract()[0]
        yield Request(url, callback=self.parse_listing_page,
                      headers={"X-Requested-With": "XMLHttpRequest"},
                      dont_filter=True)


Please let me know how to iterate over the next-page jobs using Selenium WebDriver + Scrapy. Can anyone guide me on crawling all 11 pages?
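Based on other posts I have seen, I suspect the selector has to be rebuilt from self.driver.page_source after every click, and the click has to happen inside the loop. Below is a rough sketch of what I think parse() should look like; the HtmlResponse wrapper and the time.sleep() are my own guesses, not tested code. Is this the right direction?

<!-- language: lang-python -->

    import time
    from scrapy.http import HtmlResponse, Request
    from selenium.common.exceptions import NoSuchElementException

    def parse(self, response):
        self.driver.get(response.url)
        while True:
            # re-parse whatever page the browser is showing right now
            page = HtmlResponse(url=self.driver.current_url,
                                body=self.driver.page_source,
                                encoding='utf-8')
            for jobid in page.css('span.PSEDITBOX_DISPONLY').re(r'.*>(\d+)<.*'):
                joburl = ('https://eapplicant.northshore.org/psp/psapp/EMPLOYEE/HRMS/c/'
                          'HRS_HRAM.HRS_CE.GBL?Page=HRS_CE_JOB_DTL&Action=A'
                          '&JobOpeningId=' + jobid + '&SiteId=1&PostingSeq=1')
                yield Request(joburl, callback=self.parse_iframe,
                              headers={"X-Requested-With": "XMLHttpRequest"},
                              dont_filter=True)
            try:
                # click "Next" inside the loop so the browser actually advances
                self.driver.find_element_by_id('HRS_APPL_WRK_HRS_LST_NEXT').click()
            except NoSuchElementException:
                break  # no "Next" link on the last page
            time.sleep(2)  # crude wait for the refresh; a WebDriverWait would be better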

Thanks in advance.


