some help for regex with scrapy

peter zhu Sat, 03 Sep 2016 10:05:35 -0700

Hey, guys!
The page I am scraping is http://www.nowdl.cn/all.html
My steps:
1, scrapy shell http://www.nowdl.cn/all.html
2, response.xpath('/html/body/div[3]/ul/li/a').extract()
I want to extract the part of each link that comes before the ".php" suffix.
For example:
u'<a href="http://www.nowdl.cn/city/beijing/*beijing*.php" target="_blank">\u5317\u4eac</a>',
I need the part in bold, "beijing", and I want to change the Unicode escapes "\u5317\u4eac" into Chinese text --> "北京市".
Now my questions are:
1, How can I use a regex to extract the content I need?
2, How can I convert the Unicode escapes to Chinese?
Thanks for any suggestions!
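
For reference, here is a minimal sketch of one possible approach. It uses only Python's standard re module and assumes every element looks like the sample above. SAMPLE, LINK_RE, parse_link and unescape_label are hypothetical names made up for illustration; they are not part of Scrapy, and the pattern has not been tested against the live page.

```python
import re

# Sample element copied from the shell output above (bold markers dropped).
# In Python 3 the \u5317\u4eac escapes in this literal are already the
# two characters 北京.
SAMPLE = ('<a href="http://www.nowdl.cn/city/beijing/beijing.php" '
          'target="_blank">\u5317\u4eac</a>')

# Group 1: last path segment before ".php". Group 2: the link text.
LINK_RE = re.compile(r'href="[^"]*/([^/"]+)\.php"[^>]*>([^<]*)</a>')


def parse_link(html):
    """Return (slug, label) for one <a> element, or None if nothing matches."""
    match = LINK_RE.search(html)
    if match is None:
        return None
    return match.group(1), match.group(2)


def unescape_label(escaped):
    """Decode literal backslash-u sequences (ASCII-only input) into text.

    Only needed if the string really contains the six characters
    backslash, u, 5, 3, 1, 7 and so on, e.g. after copying a repr.
    """
    return escaped.encode("ascii").decode("unicode_escape")


slug, label = parse_link(SAMPLE)  # ('beijing', '北京')
```

Under Python 2 the \uXXXX form is only how the shell displays a unicode string; the string itself already holds the Chinese characters, so printing it should show them directly. The two code points in the sample are 北京, so the trailing 市 would have to be appended separately. Inside the Scrapy shell, selecting the href attribute and calling the selector's .re() method with a similar pattern may do the same job.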


