On Wed, 05 Oct 2011 21:39:17 -0700, Greg wrote:
Here is the final code for those who are struggling with similar
problems:
## open and decode file
# In this case, the encoding comes from the charset argument in a meta tag
# e.g. meta charset=iso-8859-2
fileObj = open(filePath, "r").read()
On 06.10.2011 05:40, Steven D'Aprano wrote:
(4) Do all your processing in Unicode, not bytes.
(5) Encode the text into bytes using UTF-8 encoding.
(6) Write the bytes to a file.
Just wondering, why do you split the latter two parts? I would have used
codecs.open() to open the file and
On Thu, Oct 6, 2011 at 8:29 PM, Ulrich Eckhardt
ulrich.eckha...@dominalaser.com wrote:
Just wondering, why do you split the latter two parts? I would have used
codecs.open() to open the file and define the encoding in a single step. Is
there a downside to this approach?
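A minimal sketch of the two approaches being compared, written so it runs on Python 3 (io.open and codecs.open also exist in Python 2.6+). The sample text and the temporary file names are assumptions made for illustration; they are not taken from the thread. For text without newlines, both routes should put the same bytes on disk:

```python
import codecs
import io
import os
import tempfile

# Hypothetical sample text, already decoded to Unicode.
text = u"Branie zak\u0142adnik\u00f3w"

# Hypothetical output locations in a scratch directory.
tmpdir = tempfile.mkdtemp()
two_step_path = os.path.join(tmpdir, "two_step.txt")
one_step_path = os.path.join(tmpdir, "one_step.txt")

# Two explicit steps: encode to bytes, then write the bytes in binary mode.
data = text.encode("utf-8")
with io.open(two_step_path, "wb") as f:
    f.write(data)

# One step: codecs.open() encodes each write using the given encoding.
with codecs.open(one_step_path, "w", encoding="utf-8") as f:
    f.write(text)

# Read both files back as raw bytes and compare them.
with io.open(two_step_path, "rb") as f:
    two_step_bytes = f.read()
with io.open(one_step_path, "rb") as f:
    one_step_bytes = f.read()
same_bytes = two_step_bytes == one_step_bytes
```

Keeping the encode step separate makes the bytes available for inspection before anything is written; passing the encoding to the open call is shorter when that is not needed.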
Those two steps still
On 6 oct, 06:39, Greg gregor.hochsch...@googlemail.com wrote:
Brilliant! It worked. Thanks!
Here is the final code for those who are struggling with similar
problems:
## open and decode file
# In this case, the encoding comes from the charset argument in a meta tag
# e.g. meta
On Thursday 2011 October 06 10:41, jmfauth wrote:
or (Python2/Python3)
>>> import io
>>> with io.open('abc.txt', 'r', encoding='iso-8859-2') as f:
...     r = f.read()
...
>>> repr(r)
u'a\nb\nc\n'
>>> with io.open('def.txt', 'w', encoding='utf-8-sig') as f:
...     t = f.write(r)
...
In mailman.1785.1317928997.27778.python-l...@python.org xDog Walker
thud...@gmail.com writes:
What is this io of which you speak?
It was introduced in Python 2.6.
--
John Gordon A is for Amy, who fell down the stairs
gor...@panix.com B is for Basil, assaulted
Hi, I am having some encoding problems when I first parse stuff from a
non-english website using BeautifulSoup and then write the results to
a txt file.
I have the text both as a normal (text) and as a unicode string
(utext):
print repr(text)
'Branie zak\xc2\xb3adnik\xc3\xb3w'
print repr(utext)
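One possible reading of the byte string shown for text, offered as an assumption and not as the poster's diagnosis: the page was ISO-8859-2 (the charset mentioned elsewhere in this thread), its bytes were decoded as Latin-1 by mistake, and the result was then encoded as UTF-8. Under that assumption, a Python 3 sketch that reverses the two steps looks like this:

```python
# Byte string copied from the repr(text) output above.
raw = b"Branie zak\xc2\xb3adnik\xc3\xb3w"

# Undo the UTF-8 encoding to get the mis-decoded characters.
as_utf8 = raw.decode("utf-8")

# Assumption: the earlier wrong decode used Latin-1, which maps every
# byte to the code point of the same value, so encoding restores the bytes.
original_bytes = as_utf8.encode("latin-1")

# Assumption: the page was really ISO-8859-2, so decode with that charset.
repaired = original_bytes.decode("iso-8859-2")
```

If the assumption holds, repaired contains the Polish letters that the original page intended; if the page used a different charset, the last decode would need to change accordingly.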
On Wed, 05 Oct 2011 16:35:59 -0700, Greg wrote:
Hi, I am having some encoding problems when I first parse stuff from a
non-english website using BeautifulSoup and then write the results to a
txt file.
If you haven't already read this, you should do so:
Brilliant! It worked. Thanks!
Here is the final code for those who are struggling with similar
problems:
## open and decode file
# In this case, the encoding comes from the charset argument in a meta tag
# e.g. meta charset=iso-8859-2
fileObj = open(filePath, "r").read()
fileContent =
On Thu, Oct 6, 2011 at 3:39 PM, Greg gregor.hochsch...@googlemail.com wrote:
Brilliant! It worked. Thanks!
Here is the final code for those who are struggling with similar
problems:
## open and decode file
# In this case, the encoding comes from the charset argument in a meta tag
# e.g.