Hey guys,

I'm scraping http://www.nowdl.cn/all.html. My steps:

1. scrapy shell http://www.nowdl.cn/all.html
2. response.xpath('/html/body/div[3]/ul/li/a').extract()

I want to extract the part of each link that comes before the ".php" suffix. For example, from

u'<a href="http://www.nowdl.cn/city/beijing/*beijing*.php" target="_blank">\u5317\u4eac</a>',

I need the bold part, "beijing" (marked with asterisks above), and I want the Unicode escape "\u5317\u4eac" to be shown as Chinese, "北京市".

My questions:

1. How do I use a regex to extract the part I need?
2. How do I convert the Unicode escapes to Chinese?

Thanks for any suggestions!
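
For reference, here is a minimal sketch of one way both questions could be approached with Python's standard `re` module. It assumes the asterisks in the example are only bold markers from the mail client, so the real href ends in `/beijing.php`. The names `LINK_RE` and `parse_link` are made up for illustration. On the second question: the `\u5317\u4eac` escapes are most likely just how Python 2 displays a unicode string inside a list, so the string already holds the Chinese characters and printing it on its own should show them. Note that `\u5317\u4eac` is "北京"; the "市" would have to be appended separately if it is wanted.

```python
# -*- coding: utf-8 -*-
import re

# Sample element copied from the question (asterisks removed, see assumption above).
# In the Scrapy shell this string would be one item of response.xpath(...).extract().
sample = u'<a href="http://www.nowdl.cn/city/beijing/beijing.php" target="_blank">\u5317\u4eac</a>'

# Group 1: last path segment of the href, without the ".php" suffix.
# Group 2: the link text between <a ...> and </a>.
LINK_RE = re.compile(r'href="[^"]*/([^/"]+)\.php"[^>]*>([^<]*)</a>')


def parse_link(html):
    """Return (slug, text) for one <a> element, or None if it does not match."""
    match = LINK_RE.search(html)
    if match is None:
        return None
    return match.group(1), match.group(2)


slug, name = parse_link(sample)
print(slug)  # the part before ".php"
print(name)  # printing the string itself shows the characters, not the escapes
```

Inside the Scrapy shell the same pattern could probably be applied directly with the selector's `.re()` method, for example `response.xpath('/html/body/div[3]/ul/li/a/@href').re(r'/([^/]+)\.php$')`, with the link text taken from `a/text()`. This has not been checked against the live page.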