For testing purposes, I wrote a small Scrapy script that should find the next
page based on the text of an <a> element.
This works on UTF-8 pages, but I run into a problem with web pages encoded in
"windows-1250", while my Scrapy script and its string literals are written
in UTF-8.
Let's have a look at: https://www.vsetkyfirmy.sk/autoskoly
The pagination at the bottom displays "Next page" in the local language, "Ďalšie >>
<https://www.vsetkyfirmy.sk/autoskoly/strana_2.html>", and I would like to
retrieve the complete URL of this <a>. On the first page that is
https://www.vsetkyfirmy.sk/autoskoly/strana_2.html, on page 2 it is
https://www.vsetkyfirmy.sk/autoskoly/strana_3.html, and so on.
However, this web page is encoded in "windows-1250", and I'm confused because my
Scrapy script uses UTF-8. I use the following code to retrieve the <a> URL:
t = Selector(response).xpath('//*[text()[contains(.,
But when I run it, Scrapy says:
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL
bytes or control characters
So what should I do to achieve what I want?
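
One possible way to think about this, as a minimal sketch rather than a confirmed fix: the page encoding and the script encoding only need to agree after both sides have been decoded to Unicode text. The example below uses only the Python 3 standard library, not Scrapy itself, and the tiny HTML body, the LinkFinder class and the next_page_url helper are all invented for illustration. It decodes windows-1250 bytes, looks for an <a> whose text contains "Ďalšie", and builds the complete URL. A likely (unverified) cause of the ValueError above is a non-ASCII byte string being passed to the XPath engine under Python 2; there, a Unicode literal such as u'Ďalšie' together with a "# -*- coding: utf-8 -*-" line at the top of the script would be the usual thing to try.

```python
# -*- coding: utf-8 -*-
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical stand-in for a raw response body: invented markup, encoded
# in windows-1250 like the real site. Not the real page source.
PAGE_URL = "https://www.vsetkyfirmy.sk/autoskoly"
RAW_BODY = (
    '<html><body><a href="/autoskoly/strana_2.html">Ďalšie &gt;&gt;</a>'
    "</body></html>"
).encode("windows-1250")


class LinkFinder(HTMLParser):
    """Collect the href of every <a> whose text contains `needle`."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.current_href = None   # href of the <a> being read, if any
        self.current_text = []     # text chunks seen inside that <a>
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.current_href = dict(attrs).get("href")
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            if self.needle in "".join(self.current_text):
                self.matches.append(self.current_href)
            self.current_href = None


def next_page_url(raw_body, page_url, encoding="windows-1250"):
    """Return the absolute URL of the first 'Ďalšie' link, or None."""
    # Decode the bytes first, so the comparison is Unicode against Unicode.
    text = raw_body.decode(encoding)
    finder = LinkFinder("Ďalšie")
    finder.feed(text)
    finder.close()
    if not finder.matches:
        return None
    # urljoin handles both relative and absolute href values.
    return urljoin(page_url, finder.matches[0])


print(next_page_url(RAW_BODY, PAGE_URL))
```

In Scrapy the response text is normally already decoded, so the rough equivalent (assuming a reasonably recent Scrapy version) would be along the lines of response.xpath(u'//a[contains(text(), "Ďalšie")]/@href').extract_first(), followed by response.urljoin(...) on the result.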
Visit this group at https://groups.google.com/group/scrapy-users.