python-list@python.org:
Hi everyone,
I want to scrape the book titles from
http://search.dangdang.com/search_pub.php?key=python
My code is:

import urllib
import lxml.html
down = 'http://search.dangdang.com/search_pub.php?key=python'
file = urllib.urlopen(down).read()
root = lxml.html.fromstring(file)
tnodes = root.xpath("//div[@class='listitem detail']//li[@class='maintitle']//a")
for i, x in enumerate(tnodes):
    print i, "  ", x.get('name'), x.get('href'), x.get('onclick'), x.text, "\n"

The output is:
0    p_name 
http://product.dangdang.com/product.aspx?product_id=20872365&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20872365_1_22591_p','','','');
 None

1    p_name 
http://product.dangdang.com/product.aspx?product_id=20255354&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20255354_2_12605_p','','','');
 None

2    p_name 
http://product.dangdang.com/product.aspx?product_id=20836565&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20836565_3_2361_p','','','');
 None

3    p_name 
http://product.dangdang.com/product.aspx?product_id=21004615&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','21004615_4_3387_p','','','');
 None

4    p_name 
http://product.dangdang.com/product.aspx?product_id=21063086&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','21063086_5_18815_p','','','');
 None

5    pr_name 
http://product.dangdang.com/product.aspx?product_id=20678461&ref=search-1-pub 
s('click','python','01.54.04.03,01.54.06.18','','86_1_25','','','20678461_6_3967_p','','','RECO');
 None

6    pr_name 
http://product.dangdang.com/product.aspx?product_id=20650363&ref=search-1-pub 
s('click','python','01.54.19.00','','86_1_25','','','20650363_7_62_p','','','RECO');
 黑客之道:漏洞发掘的艺术(原书第二版)(赠1CD)(电子制品CD-ROM)(

7    pr_name 
http://product.dangdang.com/product.aspx?product_id=20767932&ref=search-1-pub 
s('click','python','01.54.19.00','','86_1_25','','','20767932_8_4475_p','','','RECO');
 Binary Hacks――黑客秘笈100选

8    p_name 
http://product.dangdang.com/product.aspx?product_id=20596189&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20596189_9_639_p','','','');
 None

9    p_name 
http://product.dangdang.com/product.aspx?product_id=20947680&ref=search-1-pub 
s('click','python','01.54.24.00,01.54.06.18','','86_1_25','','','20947680_10_7295_p','','','');
 None

10    p_name 
http://product.dangdang.com/product.aspx?product_id=21050368&ref=search-1-pub 
s('click','python','01.54.19.00','','86_1_25','','','21050368_11_7039_p','','','');
 None

11    p_name 
http://product.dangdang.com/product.aspx?product_id=20667966&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20667966_12_383_p','','','');
 None

12    p_name 
http://product.dangdang.com/product.aspx?product_id=21022493&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','21022493_13_5183_p','','','');
 None

13    pr_name 
http://product.dangdang.com/product.aspx?product_id=479654&ref=search-1-pub 
s('click','python','01.54.06.08,01.54.06.18','','86_1_25','','','479654_14_2095_p','','','RECO');
 Perl语言编程(第三版)

14    pr_name 
http://product.dangdang.com/product.aspx?product_id=20999855&ref=search-1-pub 
s('click','python','01.54.10.00','','86_1_25','','','20999855_15_6715_p','','','RECO');
 程序员的思维修炼:开发认知潜能的九堂课

15    pr_name 
http://product.dangdang.com/product.aspx?product_id=20696203&ref=search-1-pub 
s('click','python','01.54.06.08','','86_1_25','','','20696203_16_31615_p','','','RECO');
 Perl语言入门(第五版)(原书名:Learning Perl,5/e)

16    p_name 
http://product.dangdang.com/product.aspx?product_id=20670643&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20670643_17_24_p','','','');
 可爱的

17    p_name 
http://product.dangdang.com/product.aspx?product_id=20362210&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20362210_18_32_p','','','');
 学习

18    p_name 
http://product.dangdang.com/product.aspx?product_id=9053236&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','9053236_19_4_p','','',''); 
学习

19    p_name 
http://product.dangdang.com/product.aspx?product_id=20850780&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','20850780_20_1055_p','','','');
 None

20    pr_name 
http://product.dangdang.com/product.aspx?product_id=20449068&ref=search-1-pub 
s('click','python','01.54.06.08','','86_1_25','','','20449068_21_38_p','','','RECO');
 精通Perl

21    p_name 
http://product.dangdang.com/product.aspx?product_id=21127816&ref=search-1-pub 
s('click','python','01.54.24.00,01.54.06.18','','86_1_25','','','21127816_22_12545_p','','','');
 None

22    p_name 
http://product.dangdang.com/product.aspx?product_id=21107633&ref=search-1-pub 
s('click','python','01.54.06.18','','86_1_25','','','21107633_23_19245_p','','','');
 Hadoop权威指南(第2版)修订升级版

23    None  http://bang.dangdang.com/product_redirect.php?product_id=9317290 
None None

24    p_name 
http://product.dangdang.com/product.aspx?product_id=9317290&ref=search-1-pub 
s('click','python','01.54.06.06,01.49.01.11,01.54.26.00','','86_1_25','','','9317290_24_81727_p','','','');
 Java编程思想(第4版)

25    p_name 
http://product.dangdang.com/product.aspx?product_id=20773186&ref=search-1-pub 
s('click','python','01.54.06.17','','86_1_25','','','20773186_25_80479_p','','','');
 Android应用开发揭秘

The problem is x.text. For example:

1.
<a name="p_name" target="_blank"
href="http://product.dangdang.com/product.aspx?product_id=20872365&ref=search-1-pub"
onclick="s('click','python','01.54.06.18','','86_1_25','','','20872365_1_22591_p','','','');">
<font class="skcolor_ljg">Python</font>
基础教程(第2版)
</a>
What I want to get is "Python 基础教程(第2版)", but the output is None.

2.
<a name="p_name" target="_blank"
href="http://product.dangdang.com/product.aspx?product_id=20670643&ref=search-1-pub"
onclick="s('click','python','01.54.06.18','','86_1_25','','','20670643_17_24_p','','','');">
可爱的
<font class="skcolor_ljg">Python</font>
</a>
What I want to get is "可爱的python", but the output is only "可爱的".
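
A minimal sketch of why x.text behaves this way in the two cases above. It uses the standard library's xml.etree.ElementTree, whose text/tail model lxml also follows, and Python 3 syntax rather than the Python 2 of the script. The markup is a trimmed copy of the two links (most attributes and the surrounding whitespace are dropped), and the names case1, case2 and describe are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two <a> elements (assumption: whitespace removed).
case1 = '<a name="p_name"><font class="skcolor_ljg">Python</font>基础教程(第2版)</a>'
case2 = '<a name="p_name">可爱的<font class="skcolor_ljg">Python</font></a>'


def describe(markup):
    """Show where each piece of text is stored in the element tree."""
    a = ET.fromstring(markup)
    font = a[0]  # the single <font> child
    return {
        "a.text": a.text,        # text before the first child only
        "font.text": font.text,  # text inside <font>
        "font.tail": font.tail,  # text after </font>, up to the next tag
        "all": "".join(a.itertext()),  # every text piece, in document order
    }


for markup in (case1, case2):
    print(describe(markup))
```

In case 1 the title starts with the <font> child, so a.text is None and the rest of the title sits in font.tail. In case 2 only the part before <font> is in a.text.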

Could you tell me how to revise my code?
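
One possible revision, as a hedged sketch rather than a definitive fix: lxml.html elements provide text_content(), which returns the text of an element together with the text of all its descendants. The sketch assumes lxml is installed and uses Python 3 syntax. The function name extract_titles and the inline sample page are hypothetical, and collapsing runs of whitespace into single spaces is an assumption about the desired output.

```python
import lxml.html


def extract_titles(page_html):
    """Return (href, title) pairs for the title links in a result page."""
    root = lxml.html.fromstring(page_html)
    links = root.xpath(
        "//div[@class='listitem detail']//li[@class='maintitle']//a")
    results = []
    for a in links:
        # text_content() joins the element's own text with that of its
        # children, so the part inside <font> is no longer lost.
        title = " ".join(a.text_content().split())
        results.append((a.get("href"), title))
    return results


# Hypothetical stand-in for the downloaded page, built from the two examples.
sample = """<html><body>
<div class="listitem detail"><ul>
<li class="maintitle"><a name="p_name" href="#1"><font class="skcolor_ljg">Python</font>
基础教程(第2版)
</a></li>
<li class="maintitle"><a name="p_name" href="#2">可爱的<font class="skcolor_ljg">Python</font></a></li>
</ul></div>
</body></html>"""

for href, title in extract_titles(sample):
    print(href, title)
```

In the original script, the same idea would amount to printing x.text_content() in place of x.text.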