So it is just a random sequence of junk.
It will be a matter of finding the real start of the record (in this
case a %) and throwing the junk away. I was misled by the note in the
codecs class that BOMs were being prepended. Should have looked more
carefully.
Mea culpa.
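For what it's worth, that clean-up step can be sketched like this (the helper name and the sample bytes are mine, not from the thread), assuming each record really does start at the first %:

```python
def strip_leading_junk(data: bytes, marker: bytes = b'%') -> bytes:
    """Drop any bytes before the first occurrence of the record marker."""
    i = data.find(marker)
    return data[i:] if i != -1 else data

# Example: a stray BOM (or other junk) before the real record start
raw = b'\xef\xbb\xbf% first record'
print(strip_leading_junk(raw))  # b'% first record'
```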
--
On 25/9/2013 06:38, J. Bagg wrote:
So it is just a random sequence of junk.
It will be a matter of finding the real start of the record (in this
case a %) and throwing the junk away.
Please join the list. Your present habit of starting a new thread for
each of your messages is getting old.
I'm having trouble with the BOM that is now prepended to codecs files.
The files have to be read by java servlets which expect a clean file
without any BOM.
Is there a way to stop the BOM being written?
It is seriously messing up my work as the servlets do not expect it to
be there. I could
On Tue, 24 Sep 2013 10:42:22 +0100, J. Bagg wrote:
I'm having trouble with the BOM that is now prepended to codecs files.
The files have to be read by java servlets which expect a clean file
without any BOM.
Is there a way to stop the BOM being written?
Of course there is :-) but first we
J. Bagg wrote:
I'm having trouble with the BOM that is now prepended to codecs files.
The files have to be read by java servlets which expect a clean file
without any BOM.
Is there a way to stop the BOM being written?
I think if you specify the byte order explicitly with UTF-16-LE or
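Presumably that suggestion continues along these lines; a quick sketch contrasting the two codecs (file names are mine):

```python
import codecs

# Plain 'utf-16' writes a BOM so readers can detect the byte order
with codecs.open('bom_demo.txt', 'w', 'utf-16') as f:
    f.write('abc')
with open('bom_demo.txt', 'rb') as f:
    print(f.read()[:2])  # the BOM: \xff\xfe or \xfe\xff depending on platform

# 'utf-16-le' fixes the byte order explicitly, so no BOM is written
with codecs.open('nobom_demo.txt', 'w', 'utf-16-le') as f:
    f.write('abc')
with open('nobom_demo.txt', 'rb') as f:
    print(f.read()[:2])  # b'a\x00' -- straight into the data
```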
I'm using:
outputfile = codecs.open (fn, 'w+', 'utf-8', errors='strict')
to write as I know that the files are unicode compliant. I run the raw
files that are delivered through a Python script to check the unicode
and report problem characters which are then edited. The files use a
whole
On 24/09/2013 14:01, J. Bagg wrote:
I'm using:
outputfile = codecs.open (fn, 'w+', 'utf-8', errors='strict')
Well for the life of me I can't make that produce a BOM on 2.7 or 3.4.
In other words:
import codecs
with codecs.open('temp.txt', 'w+', 'utf-8', errors='strict') as f:
    f.write('abc')
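A quick way to confirm that, using a throwaway file name of my own choosing, is to read the raw bytes back:

```python
import codecs

with codecs.open('check.txt', 'w+', 'utf-8', errors='strict') as f:
    f.write('abc')

with open('check.txt', 'rb') as f:
    head = f.read(3)
print(head == b'\xef\xbb\xbf')  # False: plain 'utf-8' writes no BOM
```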
On 24/9/2013 09:01, J. Bagg wrote:
Why would you start a new thread? Just do a Reply-List (or Reply-All
and remove the extra names) to the appropriate message on the existing
thread.
I'm using:
outputfile = codecs.open (fn, 'w+', 'utf-8', errors='strict')
That won't be adding a BOM. It
I've checked the original files using od and they don't have BOMs.
I'll remove them in the servlet. The overhead is probably small enough
unless somebody is doing a massive search. We have a limit anyway to
prevent somebody stealing the entire set of data.
I started writing the Python search
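For the removal itself, Python's 'utf-8-sig' codec consumes a leading BOM on read and is harmless when none is present (the Java servlet would need its own equivalent); a sketch with a made-up file name:

```python
import codecs

# Write a file that (unlike the originals) carries a BOM
with open('with_bom.txt', 'wb') as f:
    f.write(codecs.BOM_UTF8 + 'record data'.encode('utf-8'))

# 'utf-8-sig' strips the BOM transparently on read
with codecs.open('with_bom.txt', 'r', 'utf-8-sig') as f:
    text = f.read()
print(repr(text))  # 'record data' -- no U+FEFF at the start
```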
J. Bagg wrote:
I've checked the original files using od and they don't have BOMs.
I'll remove them in the servlet. The overhead is probably small enough
unless somebody is doing a massive search. We have a limit anyway to
prevent somebody stealing the entire set of data.
I started
On Tuesday, 24 September 2013 11:42:22 UTC+2, J. Bagg wrote:
I'm having trouble with the BOM that is now prepended to codecs files.
The files have to be read by java servlets which expect a clean file
without any BOM.
Is there a way to stop the BOM being written?
It is
My editor is JEdit. I use it on a Win 7 machine but have everything set
up for *nix files as that is the machine I'm normally working on.
The files are mailed to me as updates. The library where the indexers
work does use MS computers, but this is restricted to EndNote with an
exporter into the
On Wed, Sep 25, 2013 at 4:43 AM, wxjmfa...@gmail.com wrote:
- The *mark* (per the Unicode.org terminology once used in its FAQ)
indicating a Unicode-encoded raw text file is neither a byte order
mark nor a signature; it is an encoded code point: the encoding of
U+FEFF, 'ZERO WIDTH NO-BREAK SPACE', code
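Concretely, that single code point comes out as different bytes under different codecs, which is why it can double as a byte-order indicator:

```python
# U+FEFF encoded under various codecs
print('\ufeff'.encode('utf-8'))      # b'\xef\xbb\xbf'
print('\ufeff'.encode('utf-16-le'))  # b'\xff\xfe'
print('\ufeff'.encode('utf-16-be'))  # b'\xfe\xff'
```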
J. Bagg j.b...@kent.ac.uk writes:
I've checked the original files using od and they don't have BOMs.
I'll remove them in the servlet. The overhead is probably small enough
unless somebody is doing a massive search. We have a limit anyway to
prevent somebody stealing the entire set of data.