db wrote:
In this template I write a few MySQL variables. Those variables often
contain German characters, and these come out garbled (GÃ¶sing instead
of Gösing). The German characters in the HTML template itself are shown
correctly.
The problem is then with these variables: apparently, the MySQL
variables are
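If the garbled text looks like GÃ¶sing, that is the classic symptom of UTF-8 bytes being decoded as Latin-1 somewhere between MySQL and the template. A minimal sketch of the effect (the name is just an example):

```python
# Mojibake sketch: the UTF-8 bytes for 'Gösing' re-decoded as Latin-1.
name = u'G\u00f6sing'
garbled = name.encode('utf-8').decode('latin-1')
print(garbled)  # -> GÃ¶sing
```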
John Perks and Sarah Mount wrote:
If the ascii can't be recognized as UTF-16, then surely the codec
shouldn't have allowed it to be encoded in the first place? I could
understand if it was trying to decode ascii into (native) UTF-32.
Please don't call the thing you are trying to decode ascii.
Ivan Voras wrote:
Since the .encoding attribute of file objects is read-only, what is the
proper way to process large UTF-8 text files?
You should use codecs.open, or codecs.getreader to get a StreamReader
for UTF-8.
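A minimal sketch of that approach; the file name and contents here are invented for illustration:

```python
import codecs
import os
import tempfile

# Create a small UTF-8 sample file, standing in for the "large" one.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'wb') as f:
    f.write(u'G\u00f6sing\n'.encode('utf-8'))

# codecs.open wraps the file in a StreamReader that decodes UTF-8
# incrementally, so even large files can be processed line by line.
with codecs.open(path, 'r', encoding='utf-8') as f:
    lines = [line.rstrip('\n') for line in f]

print(lines)
```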
Regards,
Martin
--
http://mail.python.org/mailman/listinfo/python-list
Jeremy Bowers wrote:
Then I'd honor his consistency of belief, but still consider it impolite
in general, as asking someone to do tons of work overall to save you a bit
is almost always impolite.
This is not what he did, though - he did not break the protocol by
sending in patches by email
Xah Lee wrote:
Thanks. Is it true that any Unicode chars can also be used inside a regex
literally?
e.g.
re.search(ur'\u2003+', mystring, re.U)
I tested this case and apparently I can.
Yes. In fact, whether you write u'\u2003' or put the literal character
in a u'' string doesn't matter to re.search. Either way you get a
Unicode object with the same contents.
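A runnable sketch (the sample string is made up):

```python
import re

# An em space written via its escape matches the same character written
# literally in the target string; re.UNICODE makes no difference here.
m = re.search(u'\u2003+', u'a\u2003\u2003b', re.UNICODE)
print(m.group())  # two em spaces
```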
Xah Lee wrote:
how to represent the unicode em space in regex?
You will have to pass a Unicode literal as the regular expression,
e.g.
fracture = re.split(u'\u2003*\\|\u2003*', myline, flags=re.U)
Notice that, in raw Unicode literals, you can still use \u to
escape characters, e.g.
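A runnable sketch of that split; the sample line is invented. Note that the third positional argument of re.split is maxsplit, so the flag is passed by keyword here:

```python
import re

# Split on '|' optionally surrounded by em spaces (U+2003).
myline = u'spam\u2003|\u2003eggs\u2003|ham'
fracture = re.split(u'\u2003*\\|\u2003*', myline, flags=re.U)
print(fracture)  # -> ['spam', 'eggs', 'ham']
```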
JanC wrote:
This is difficult to do right, if you have to consider all the laws in
different countries...
Right. So he points out that his explanations are for US copyright law
only, and notes that legislation may differ even between US states, or
perhaps even between districts.
Stefan Waizmann wrote:
I would like distutils to create a binary-only distribution - that is, a
distribution file with the *.pyc files WITHOUT the *.py files. Any
ideas?
You will need to create your own command. You can either specialize the
build command so that it does not copy the source code
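The underlying mechanism can be sketched with py_compile: byte-compile each module and remove the source. The paths and module here are invented; a real solution would wrap this in a specialized distutils command:

```python
import os
import py_compile
import tempfile

# Create a throwaway package directory with one module.
pkg = tempfile.mkdtemp()
src = os.path.join(pkg, 'mod.py')
with open(src, 'w') as f:
    f.write('ANSWER = 42\n')

# Compile to a legacy-style mod.pyc next to the source, then drop the .py.
py_compile.compile(src, cfile=src + 'c')
os.remove(src)

print(sorted(os.listdir(pkg)))  # -> ['mod.pyc']
```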
Steve Horsley wrote:
It is my understanding that the BOM (U+FEFF) is actually the Unicode
character ZERO WIDTH NO-BREAK SPACE.
My understanding is that this used to be the case. According to
http://www.unicode.org/faq/utf_bom.html#38
the application should now specify specific processing,
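In Python, the 'utf-8-sig' codec implements that processing: it writes the BOM on encoding and strips it on decoding, while plain 'utf-8' treats U+FEFF as an ordinary character:

```python
# 'utf-8-sig' adds and strips the BOM; plain 'utf-8' leaves it in place.
data = u'hello'.encode('utf-8-sig')
print(data[:3])                    # -> b'\xef\xbb\xbf' (the BOM bytes)
print(data.decode('utf-8-sig'))    # -> hello
print(repr(data.decode('utf-8')))  # the U+FEFF survives plain decoding
```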
Mike Brown wrote:
Very strange how it only shows up after the 1st import attempt seems to
succeed, and it doesn't ever show up if I run the code directly or run the
code in the command-line interpreter.
The reason for that is that the Python byte code stores the Unicode
literal in UTF-8. The
Francis Girard wrote:
If I understand correctly, some systems (Windows?) add a BOM at the
beginning of a UTF-8 encoded file, and some (Linux?) don't. Therefore,
the exact same text encoded in the same UTF-8 will result in two
different binary files, and of a
Francis Girard wrote:
Well, no: text files can't be concatenated! Sooner or later, someone will
use cat on the text files your application generated. That will be a lot
of fun for the new Unicode-aware super-cat.
Well, no. For example, Python source code is not typically concatenated,
nor is
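The concatenation problem is easy to demonstrate; a sketch with two BOM-prefixed snippets standing in for files:

```python
# Naive concatenation of two BOM-prefixed UTF-8 files leaves a stray
# U+FEFF in the middle of the combined text.
a = u'first'.encode('utf-8-sig')
b = u'second'.encode('utf-8-sig')
combined = (a + b).decode('utf-8')
print(repr(combined))  # a BOM at the start and another mid-stream
```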
Steven Bethard wrote:
Yeah, I agree it's weird. I suspect if someone supplied a patch for
this behavior it would be accepted -- I don't think this should break
backwards compatibility (much).
Notice that the right thing to do would be to pass encoding and errors
to __unicode__. If the string
Kent Johnson wrote:
Could this be handled with a try / except in unicode()? Something like
this:
Perhaps. However, this would cause a significant performance hit, and
possibly undesired side effects. So due process would require changing
the interface of __unicode__ first, and then the actual
aurora wrote:
What is the process of getting a PEP worked out? Does the work and
discussion carry out in the python-dev mailing list? I would be glad to
help out especially on this particular issue.
See PEP 1 for the PEP process. The main point is that discussion is
*not* carried out on any
aurora wrote:
Lots of errors. Among them are gzip (binary?!) and strftime??
For gzip, this is not surprising. It contains things like
self.fileobj.write('\037\213')
which is not intended to denote characters.
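Those two octal escapes are the gzip magic number, 0x1f 0x8b; with a bytes literal the non-textual intent is explicit:

```python
# The gzip magic number: two bytes, not characters.
magic = b'\037\213'
print(magic == b'\x1f\x8b')  # -> True
```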
How about
b'' - 8-bit string; '' - Unicode string
and no automatic conversion.
This
aurora wrote:
Java has a much more usable model, with Unicode used internally and
encoding/decoding decisions needed only twice, when dealing with input
and output.
In addition to Fredrik's comment (that you should use the same model
in Python) and Walter's comment (that you can enforce it by
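That model, decode once at input, work with Unicode internally, encode once at output, can be sketched as follows (the sample bytes are invented):

```python
# Bytes only at the boundaries; text everywhere in between.
raw = b'G\xc3\xb6sing'       # input: UTF-8 bytes
text = raw.decode('utf-8')   # decode exactly once, on the way in
text = text.upper()          # all processing happens on Unicode text
out = text.encode('utf-8')   # encode exactly once, on the way out
print(out)
```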
Thomas Guettler wrote:
Is there a way to import a file without creating
a .pyc file?
That is part of PEP 304, which is not implemented
yet, and apparently currently stalled due to lack
of interest.
Regards,
Martin
Luis P. Mendes wrote:
From your experience, do you think that if this wrong XML code was meant
to be read only by some kind of Microsoft parser, the error will not
occur?
This is very unlikely. MSXML would never do this incorrectly.
Regards,
Martin
[EMAIL PROTECTED] wrote:
Do you know this for a fact?
I'm going by newsgroup messages from around the time that I was
proposing to put together a standard block cipher module for Python.
Ah, newsgroup messages. Anybody could respond, whether they have insight
or not.
The PSF does comply with the
Luis P. Mendes wrote:
with:
DataSetNode = stringNode.childNodes[0]
print DataSetNode.toxml()
I get:
&lt;DataSet&gt;
~ &lt;Order&gt;
~ &lt;Customer&gt;439&lt;/Customer&gt;
~ &lt;/Order&gt;
&lt;/DataSet&gt;
so far so good, but when I
Irmen de Jong wrote:
The unescaping is usually done for you by the xml parser that you use.
Usually, but not in this case. If you have text that looks like
XML, and you want to put it into an XML element, the XML file uses
&lt; and &gt;. The XML parser unescapes that as < and >. However, it
does not
Irmen de Jong wrote:
Usually, but not in this case. If you have text that looks like
XML, and you want to put it into an XML element, the XML file uses
&lt; and &gt;. The XML parser unescapes that as < and >. However, it
does not then consider the < and > as markup, and it shouldn't.
That's also what
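A sketch with minidom of escaped XML carried as character data; the wrapper document is invented:

```python
from xml.dom.minidom import parseString

# The <string> element carries escaped XML as plain character data.
doc = parseString('<string>&lt;DataSet&gt;&lt;Customer&gt;439'
                  '&lt;/Customer&gt;&lt;/DataSet&gt;</string>')
inner = ''.join(node.data for node in doc.documentElement.childNodes)
print(inner)  # -> <DataSet><Customer>439</Customer></DataSet>

# The unescaped text is still just text; to treat it as markup, parse it
# again.
ds = parseString(inner)
print(ds.documentElement.tagName)  # -> DataSet
```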
Luis P. Mendes wrote:
When I access the url via the Firefox browser and look into the source
code, I also get:
<?xml version="1.0" encoding="utf-8"?>
<string xmlns="http...">&lt;DataSet&gt;
~ &lt;Order&gt;
~ &lt;Customer&gt;439&lt;/Customer&gt;
~ &lt;/Order&gt;
&lt;/DataSet&gt;</string>
Please do try to
Jarek Zgoda wrote:
So why are there non-UNICODE versions of wxPython??? To save memory or
something???
Win95, Win98, WinME have problems with unicode.
This problem can be solved - on W9x, wxPython would have to
pass all Unicode strings to WideCharToMultiByte, using
CP_ACP, and then pass the
Luis P. Mendes wrote:
I get the following result:
<?xml version="1.0" encoding="utf-8"?>
<string xmlns="http://www...">&lt;DataSet&gt;
~ &lt;Order&gt;
Most likely, this result is correct, and your document
really does contain
&lt;Order&gt;
I don't get any elements. But, if I access the same url via a
Ricardo Bugalho wrote:
thanks for the information. But what I was really looking for was
information on when and why Python started doing it (previously, it
always used sys.getdefaultencoding()) and why it was done only for
'print' when stdout is a terminal, instead of always.
It does that since
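The behavior can be inspected directly; what you see for the stdout encoding depends on whether stdout is a terminal and how it is redirected:

```python
import sys

# The interpreter-wide default encoding, and the encoding actually used
# for sys.stdout (terminal-dependent when stdout is a tty; may be absent
# when stdout is replaced by a plain file-like object).
print(sys.getdefaultencoding())
print(getattr(sys.stdout, 'encoding', None))
```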
Torsten Mohr wrote:
Does something like this also work in Python?
Not directly. It is customary for functions that produce results
(return values) to do so via return:
def vokale(string):
    result = [c for c in string if c in "aeiou"]
    return "".join(result)
x = "Hallo, Welt"
x = vokale(x)
If you need several
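The sentence breaks off at "If you need several"; presumably it goes on to tuples. A sketch of returning several values at once, under that assumption (the function name is ours, not from the original):

```python
# Returning several results at once via a tuple.
def vowels_and_rest(s):
    vowels = [c for c in s if c in 'aeiou']
    rest = [c for c in s if c.isalpha() and c not in 'aeiou']
    return ''.join(vowels), ''.join(rest)

v, r = vowels_and_rest('Hallo, Welt')
print(v, r)  # -> aoe HllWlt
```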