I have a spider that crawls some data from a movie site, but I found
that the data is generated through AJAX after the page has loaded. My
code is:
require 'nokogiri'
require 'watir-webdriver'

browser = Watir::Browser.new 
browser.goto 'http://www.tudou.com/albumplay/2Dk1-JIVpzo/yp927-uKGMs.html?FR=LIAN'
browser.element(:css => "#digBury .dig_container").wait_until_present
puts '***************************************'
puts browser.html
puts '***************************************'
doc = Nokogiri::HTML(browser.html)
content = doc.css(".dig_container .num")
browser.close

From the output I can get the content that I want:

<div id="digBury" class="dig_wrap">
<div class="dig_container">
<a title="喜欢就挖一下吧,登录后双倍威力" class="btn" href="#">
<i class="iconfont"></i>
<i class="tip">+1</i>
<span class="num">332</span>
</a>

</div>
</div>

But I know that on the server I must run headless, so I changed my code to:
require 'nokogiri'
require 'watir-webdriver'
require 'headless'

headless = Headless.new 
headless.start

browser = Watir::Browser.new 
browser.goto 'http://www.tudou.com/albumplay/2Dk1-JIVpzo/yp927-uKGMs.html?FR=LIAN'
browser.element(:css => "#digBury .dig_container").wait_until_present
puts '***************************************'
puts browser.html
puts '***************************************'
doc = Nokogiri::HTML(browser.html)
content = doc.css(".dig_container .num")
browser.close
headless.destroy

This time I can't get my result; the output is:
<div id="digBury" class="dig_wrap disabled">
<a title="挖" class="btn" href="#">
<i class="iconfont"></i>
<span class="btn_desc">挖</span>
</a>
</div>

The only difference is that I added headless, and the effect is that either
the AJAX request is not sent or I miss the AJAX response. How can I fix this problem?
