Unicode uses them to indicate to the application reading the text file
which order the following bytes are in. UTF-16 and UTF-32 store each
character in units of two or four bytes, so the order in which those
bytes arrive is of critical importance: such files may be little-endian
or big-endian, and unless you know which order to expect, you can't
decode them reliably. UTF-8 is a plain sequence of single bytes (one to
four per character) and has no byte-order problem; there the BOM bytes
EF BB BF only act as a signature that marks the file as UTF-8.
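To make that concrete, here is a minimal sketch of how a reader might
guess the encoding from the first bytes of a file. The helper name
detect_bom and the BOMS table are made up for illustration, and UTF-32
is left out on purpose (its little-endian BOM starts with the same two
bytes as the UTF-16LE one).

```ruby
# Hypothetical lookup table: BOM byte sequence => encoding name.
# UTF-32 is deliberately omitted to keep the sketch short.
BOMS = {
  "\xEF\xBB\xBF".b => "UTF-8",
  "\xFF\xFE".b     => "UTF-16LE",
  "\xFE\xFF".b     => "UTF-16BE"
}.freeze

# Hypothetical helper: returns the encoding name signalled by a leading
# BOM, or nil when the data does not start with one of the BOMs above.
def detect_bom(bytes)
  data = bytes.b # compare raw bytes, whatever encoding the string claims
  BOMS.each do |bom, name|
    return name if data.start_with?(bom)
  end
  nil
end
```

A file without a BOM simply yields nil here, in which case the reader
has to know or guess the encoding some other way.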
Walter
On Jul 4, 2011, at 3:02 AM, Sebastian wrote:
Thank you for your reply!
Stripping the first chars is possible of course, but I don't
understand why these chars are there.
It was working before! I could just upload the UTF-8 CSV and everything
worked fine. I don't really know what I changed so that these chars are
appearing now.
Sebastian
On 1 Jul., 15:12, Frederick Cheung <[email protected]> wrote:
On Jul 1, 11:48 am, Sebastian <[email protected]> wrote:
OK,
it was working perfectly when I just made sure that my CSV file was in
UTF-8 encoding.
I deleted part of my program, so I had to write a lot of it again.
Now, whenever I upload a CSV file in UTF-8 format, the first three
characters of the first row are: \xEF\xBB\xBF
That's a UTF BOM (byte order mark): a magic Unicode character that tells
whoever is reading the stream what the endianness is and also lets them
tell UTF-8 apart from UTF-16.
You can safely strip them from the file.
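A minimal sketch of stripping the BOM in Ruby, in case it helps. The
helper strip_utf8_bom and the Tempfile standing in for the upload are
assumptions for illustration, not code from this thread. The second
variant relies on the "BOM|UTF-8" external encoding that File.open
accepts, which skips a leading BOM while reading.

```ruby
require "csv"
require "tempfile"

UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: drop a leading UTF-8 BOM from raw file contents
# and tag the result as UTF-8.
def strip_utf8_bom(raw)
  data = raw.b
  data = data.byteslice(UTF8_BOM.bytesize..-1) if data.start_with?(UTF8_BOM)
  data.force_encoding(Encoding::UTF_8)
end

# Stand-in for the uploaded file: a UTF-8 CSV that starts with a BOM.
upload = Tempfile.new(["upload", ".csv"])
upload.binmode
upload.write("\xEF\xBB\xBFname,city\nk\xC3\xBChler,K\xC3\xB6ln\n".b)
upload.flush

# Variant 1: read the raw bytes and strip the BOM by hand.
rows = CSV.parse(strip_utf8_bom(File.binread(upload.path)))

# Variant 2: let Ruby skip the BOM while reading the file.
content = File.open(upload.path, "r:BOM|UTF-8", &:read)
rows_via_mode = CSV.parse(content)

upload.close
upload.unlink
```

Both variants should give the same rows, with the first header cell
free of the three extra bytes.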
I read that this has something to do with Unicode and byte ordering,
but I don't know where these hex chars come from.
Also, every German special character is shown as such a hex code,
e.g. "k\xC3\xBChler" should be "kühler"
That is probably just an output thing if you are seeing this in a
terminal window: \xC3\xBC is the UTF-8 byte sequence for ü
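A small sketch to show that the escaped form and the readable form are
the same bytes. It assumes the string reached the app tagged as binary
(ASCII-8BIT), which is one common reason inspect prints \x escapes; the
variable names are only illustrative.

```ruby
# The seven raw bytes from the example, tagged as binary (ASCII-8BIT).
raw = "k\xC3\xBChler".b

# inspect escapes every non-ASCII byte of a binary string, which is the
# form that shows up in the terminal or the log.
escaped = raw.inspect

# Same bytes relabelled as UTF-8: nothing is converted, only the
# encoding tag changes.
text = raw.dup.force_encoding(Encoding::UTF_8)

# The two bytes that encode the character U+00FC, as hex.
u_umlaut_bytes = "\u00FC".bytes.map { |b| format("%02X", b) }

puts escaped
puts text
puts "#{raw.bytesize} bytes, #{text.length} characters"
```

So the data itself is fine; only the encoding label on the string
decides how it gets displayed.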
Fred
If I use files in other encodings, these three chars are not at the
beginning, but every special char is shown as "?"
Does anyone have an idea where this comes from?
Cheers,
Sebastian
On 22 Jun., 13:26, Sebastian <[email protected]> wrote:
file.tempfile is an object. I have a form where a CSV can be uploaded,
but it is never stored. That's why I use the tempfile. That means that I
probably have no path to use in that method.
BUT the open and foreach methods of the CSV class do work with an
object whenever I don't have a German special character in my CSV file
or when my CSV file is already in UTF-8 encoding.
On 22 Jun., 12:05, Chirag Singhal <[email protected]> wrote:
What does file.tempfile return?
If it is a file object, then we have a problem: we need to pass in a
file path here.
So call path on the file object and pass that as the first argument.
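Roughly like this, as a sketch: a plain Tempfile stands in for whatever
file.tempfile returns in the controller (an assumption, since the
upload code is not shown in this thread), and its path is what gets
passed to CSV.foreach.

```ruby
require "csv"
require "tempfile"

# Stand-in for the uploaded file's tempfile; the real object would come
# from the upload form.
tempfile = Tempfile.new(["upload", ".csv"])
tempfile.write("name,city\nMeier,Berlin\n")
tempfile.flush

# CSV.foreach expects a path string as its first argument, so pass
# tempfile.path rather than the file object itself.
rows = []
CSV.foreach(tempfile.path) { |row| rows << row }

tempfile.close
tempfile.unlink
```

The tempfile still has a path on disk even though the upload is never
stored permanently, so this works without saving the file anywhere.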
--
You received this message because you are subscribed to the Google
Groups "Ruby on Rails: Talk" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at
http://groups.google.com/group/rubyonrails-talk?hl=en.