On Sun, 12 Jan 2014 17:51:41 +1000, Nick Coghlan ncogh...@gmail.com wrote:
On 12 January 2014 04:38, R. David Murray rdmur...@bitdance.com wrote:
But! Our goal should be to help people convert to Python3. So how can
we find out what the specific problems are that real-world programs are
On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info wrote:
On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
AFAIK (and just for the record), there could be both Latin1 text and UTF-16
in a PDF (and other encodings too), depending on the font used:
[...]
On 12 Jan 2014 21:53, Juraj Sukop juraj.su...@gmail.com wrote:
On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info
wrote:
On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
AFAIK (and just for the record), there could be both Latin1 text and
UTF-16
in a
On Sun, Jan 12, 2014 at 2:16 PM, Nick Coghlan ncogh...@gmail.com wrote:
Why are you proposing to do the *join* in text space? Encode all the parts
separately, concatenate them with b'\n'.join() (or whatever separator is
appropriate). It's only the *text formatting operation* that needs to be
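A minimal sketch of the approach Nick describes, assuming placeholder values for number, binary_image_data and utf16_string (all hypothetical, borrowed from the names used elsewhere in the thread). Each part is formatted as text only where needed and encoded on its own, and the join happens on bytes:

```python
# Hypothetical stand-ins for the values discussed in the thread.
number = 3.14159
binary_image_data = bytes(range(256))   # arbitrary binary payload
utf16_string = "caf\u00e9"              # a str holding the text to embed

parts = [
    b"header",
    ("part 2 %.3f" % number).encode("ascii"),  # text formatting, then encode
    binary_image_data,                          # already bytes, passed through untouched
    utf16_string.encode("utf-16be"),            # encoded with its own codec
    b"trailer",
]

# The join itself never leaves bytes space.
content = b"\n".join(parts)
```

With this layout the binary payload never passes through a codec, so there is nothing to corrupt it.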
On Sun, Jan 12, 2014 at 12:52:18PM +0100, Juraj Sukop wrote:
On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info wrote:
On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
AFAIK (and just for the record), there could be both Latin1 text and
UTF-16
in a
On Sun, Jan 12, 2014 at 11:16:37PM +1000, Nick Coghlan wrote:
content = '\n'.join([
    'header',
    'part 2 %.3f' % number,
    binary_image_data.decode('latin-1'),
    utf16_string.encode('utf-16be').decode('latin-1'),
    'trailer']).encode('latin-1')
Wait a second, this is how I understood it but what Nick said made me think
otherwise...
On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano st...@pearwood.info wrote:
On Sun, Jan 12, 2014 at 12:52:18PM +0100, Juraj Sukop wrote:
On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info
Daniel Holth writes:
-1 on adding more surrogateescapes by default. It's a pain to track
down where the encoding errors came from.
What do you mean by default? It was quite explicit in the code I
posted, and it's the only reasonable thing to do with text data
without known (but ASCII
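For readers unfamiliar with the handler under discussion, here is a small sketch of an explicit surrogateescape round trip. The sample bytes are made up; the point is only that undecodable bytes survive a decode/encode cycle when the handler is named on both sides.

```python
raw = b"Subject: caf\xe9\r\n"   # made-up data: ASCII-compatible, one unknown high byte

text = raw.decode("ascii", errors="surrogateescape")
# The high byte is carried as a lone surrogate (U+DCE9) instead of raising.
assert "\udce9" in text

# Text-style manipulation of the ASCII portion works as usual.
field, _, value = text.partition(": ")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("ascii", errors="surrogateescape")
```

If the handler is left off the encode step, the lone surrogate raises UnicodeEncodeError, which is the kind of late failure Daniel's objection is about.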
On 01/12/2014 12:39 PM, Stephen J. Turnbull wrote:
Daniel Holth writes:
-1 on adding more surrogateescapes by default. It's a pain to track
down where the encoding errors came from.
What do you mean by default? It was quite explicit in the code I
posted, and it's the only reasonable
Why not just use six.byte_format(fmt, *args)?
It works on both Python2 and Python3 and accepts the numerical format
specifiers, plus '%b' for inserting bytes and '%a' for converting text
to ascii.
Admittedly it doesn't exist yet,
but it could and it would save a lot of arguing :)
(Apologies
On 01/12/2014 01:59 PM, Mark Shannon wrote:
Why not just use six.byte_format(fmt, *args)?
It works on both Python2 and Python3 and accepts the numerical format
specifiers, plus '%b' for inserting bytes and '%a'
for converting text to ascii.
Sounds like the second best option!
Admittedly
On Mon, Jan 13, 2014 at 4:57 AM, Juraj Sukop juraj.su...@gmail.com wrote:
On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano st...@pearwood.info
wrote:
First, utf16_string confuses me. What is it? If it is a Unicode
string, i.e.:
It is a Unicode string which happens to contain code points
Steven D'Aprano writes:
then the name is horribly misleading, and it is best handled like this:
content = '\n'.join([
    'header',
    'part 2 %.3f' % number,
    binary_image_data.decode('latin-1'),
    utf16_string, # Misleading name, actually Unicode string
On 01/12/2014 02:31 PM, Stephen J. Turnbull wrote:
This corrupts binary_image_data. Each byte > 127 will be replaced by
two bytes. In the second case, you can use latin1 to encode, if it
gives you what you want.
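The size change Stephen describes can be checked directly. This sketch assumes the final encode is UTF-8 (the quoted code is cut off before its last line, so that is an assumption) and uses a made-up payload:

```python
binary_image_data = bytes([0x41, 0x7F, 0x80, 0xFF])   # made-up payload

as_text = binary_image_data.decode("latin-1")          # one character per byte

# Round trip through Latin-1: lossless.
assert as_text.encode("latin-1") == binary_image_data

# Encoding the same text as UTF-8 instead: every code point above 127
# becomes two bytes, so the payload no longer matches the original.
as_utf8 = as_text.encode("utf-8")
```

Bytes up to 127 pass through unchanged, which is why the problem can go unnoticed with mostly-ASCII test data.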
This kind of subtlety is precisely why MAL warned about use of latin1
to smuggle
On Mon, Jan 13, 2014 at 07:31:16AM +0900, Stephen J. Turnbull wrote:
Steven D'Aprano writes:
then the name is horribly misleading, and it is best handled like this:
content = '\n'.join([
    'header',
    'part 2 %.3f' % number,
Ethan Furman writes:
This kind of subtlety is precisely why MAL warned about use of latin1
to smuggle bytes.
And why I've been fighting Steven D'Aprano on it.
No, I think you haven't been fighting Steven d'A on it. You're
talking about parsing and generating structured binary files,
On 01/12/2014 04:02 PM, Stephen J. Turnbull wrote:
So when you talk about "we", I suspect you are not the "we" everybody
else is arguing with. In particular, AIUI your use case is not
included in the use cases most of us -- including Steven -- are
thinking about.
Ah, so even in the minority I'm
Steven D'Aprano writes:
Of course you're right, but I have understood the above as being a
sketch and not real code. (E.g. does "header" really mean the literal
string "header", or does it stand in for something which is a header?)
In real code, one would need to have some way of telling
On 11 January 2014 08:58, Ethan Furman et...@stoneleaf.us wrote:
On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 17:33:57 -0500
Eric V. Smith e...@trueblade.com wrote:
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 12:56:19 -0500
Eric V. Smith
On 11 January 2014 12:28, Ethan Furman et...@stoneleaf.us wrote:
On 01/10/2014 06:04 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 20:53:09 -0500
Eric V. Smith e...@trueblade.com wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
3982. See for example
For not caring much, your own stubbornness is quite notable throughout this
discussion. Stones and glass houses. :)
That said:
Twisted and Mercurial aren't the only ones who are hurt by this, at all.
I'm aware of at least two other projects who are actively hindered in their
support or migration
On 1/11/2014 1:44 AM, Stephen Hansen wrote:
There's been a number of examples given: PDF, HTTP, network streams
that switch inline from text-ish to binary and back-again.. But, we
can focus that down to a very narrow and not at all uncommon situation
in the latter.
PDF has been mentioned a
+kristjan=ccpgames@python.org]
On Behalf Of Nick Coghlan
Sent: 11 January 2014 08:43
To: Ethan Furman
Cc: python-dev@python.org
Subject: Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args)
to Python 3.5
No, it's the POSIX text model is completely broken and we're not letting
On Sat, Jan 11, 2014 at 5:14 AM, Cameron Simpson c...@zip.com.au wrote:
Hi Juraj,
Hello Cameron.
data = b' '.join( bytify( [ 10, 0, obj, binary_image_data, ... ] ) )
Thanks for the suggestion! The problem with bytify is that some items
might require different formatting than other
On Sat, Jan 11, 2014 at 6:36 AM, Steven D'Aprano st...@pearwood.info wrote:
I'm sorry, I don't understand what you mean here. I'm honestly not
trying to be difficult, but you sound confident that you understand what
you are doing, but your description doesn't make sense to me. To me, it
looks
Am 11.01.2014 09:43, schrieb Nick Coghlan:
On 11 January 2014 12:28, Ethan Furman et...@stoneleaf.us wrote:
On 01/10/2014 06:04 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 20:53:09 -0500
Eric V. Smith e...@trueblade.com wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out
Am 11.01.2014 10:44, schrieb Stephen Hansen:
I mean, it's not like the bytes type lacks knowledge of the subset of bytes
that happen to be 7-bit ascii-compatible and can't perform text-ish operations
on them--
Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 bit
Am 11.01.2014 14:49, schrieb Georg Brandl:
Am 11.01.2014 10:44, schrieb Stephen Hansen:
I mean, it's not like the bytes type lacks knowledge of the subset of bytes
that happen to be 7-bit ascii-compatible and can't perform text-ish
operations
on them--
Python 3.3.3
On 11.01.2014 14:54, Georg Brandl wrote:
Am 11.01.2014 14:49, schrieb Georg Brandl:
Am 11.01.2014 10:44, schrieb Stephen Hansen:
I mean, it's not like the bytes type lacks knowledge of the subset of bytes
that happen to be 7-bit ascii-compatible and can't perform text-ish
operations
on
On Sat, 11 Jan 2014 08:26:57 +0100
Georg Brandl g.bra...@gmx.net wrote:
Am 11.01.2014 03:04, schrieb Antoine Pitrou:
On Fri, 10 Jan 2014 20:53:09 -0500
Eric V. Smith e...@trueblade.com wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
3982. See for example
On 12 January 2014 01:15, M.-A. Lemburg m...@egenix.com wrote:
On 11.01.2014 14:54, Georg Brandl wrote:
Am 11.01.2014 14:49, schrieb Georg Brandl:
Am 11.01.2014 10:44, schrieb Stephen Hansen:
I mean, it's not like the bytes type lacks knowledge of the subset of
bytes
that happen to be 7-bit
On Sat, Jan 11, 2014 at 01:56:56PM +0100, Juraj Sukop wrote:
On Sat, Jan 11, 2014 at 6:36 AM, Steven D'Aprano st...@pearwood.info wrote:
If you consider PDF as binary with occasional pieces of ASCII text, then
working with bytes makes sense. But I wonder whether it might be better
to
On Sun, 12 Jan 2014 01:34:26 +1000
Nick Coghlan ncogh...@gmail.com wrote:
Yes, it bloody well does. The number of people who have told me that
using Python 3 is what allowed them to finally understand how Unicode
works vastly exceeds the number of wire protocol and file format devs
that have
On 01/11/2014 07:38 AM, Steven D'Aprano wrote:
The point that I am making is that many people want to add formatting
operations to bytes so they can put ASCII strings inside bytes. But (as
far as I can tell) they don't need to do this, because they can treat
Unicode strings containing code
On 11.01.2014 16:34, Nick Coghlan wrote:
On 12 January 2014 01:15, M.-A. Lemburg m...@egenix.com wrote:
On 11.01.2014 14:54, Georg Brandl wrote:
Am 11.01.2014 14:49, schrieb Georg Brandl:
Am 11.01.2014 10:44, schrieb Stephen Hansen:
I mean, it's not like the bytes type lacks knowledge of the
On 2014-01-11, 10:56 GMT, you wrote:
I don't know what the fuss is about.
I just cannot resist:
When you are calm while everybody else is in the state of
panic, you haven’t understood the problem.
-- one of many collections of
On 01/11/2014 12:43 AM, Nick Coghlan wrote:
In particular, the bytes type is, and always will be, designed for
pure binary manipulation [...]
I apologize for being blunt, but this is a lie.
Let's take a look at the methods defined by bytes:
dir(b'')
['__add__', '__class__', '__contains__',
On 01/11/2014 07:34 AM, Nick Coghlan wrote:
On 12 January 2014 01:15, M.-A. Lemburg wrote:
We don't have to be pedantic about the bytes/text separation.
It doesn't help in real life.
Yes, it bloody well does. The number of people who have told me that
using Python 3 is what allowed them to
On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
On 01/11/2014 07:38 AM, Steven D'Aprano wrote:
The point that I am making is that many people want to add formatting
operations to bytes so they can put ASCII strings inside bytes. But (as
far as I can tell) they don't need to do
tl;dr: At the end I'm volunteering to look at real code that is having
porting problems.
On Sat, 11 Jan 2014 17:33:17 +0100, M.-A. Lemburg m...@egenix.com wrote:
asciistr is interesting in that it coerces to bytes instead
of to Unicode (as is the case in Python 2).
At the moment it doesn't
M.-A. Lemburg writes:
I completely agree with Stephen, that bytes are in fact often
an encoding of text. If that text is ASCII compatible, I don't
see any reason why we should not continue to expose the C lib
standard string APIs available for text manipulations on bytes.
We already *have*
On Sat, Jan 11, 2014 at 04:15:35PM +0100, M.-A. Lemburg wrote:
I think we need to step back a little from the purist view
of things and give more emphasis on the "practicality beats
purity" Zen.
I completely agree with Stephen, that bytes are in fact often
an encoding of text. If that text is
On Sat, Jan 11, 2014 at 05:33:17PM +0100, M.-A. Lemburg wrote:
FWIW: I quite liked the Python 2 model, but perhaps that's because
I already knew how Unicode works, so could use it to make my
life easier ;-)
/incredulous
I would really love to see you justify that claim. How do you use the
On 2014-01-11 05:36, Steven D'Aprano wrote:
[snip]
Latin-1 has the nice property that every byte decodes into the character
with the same code point, and vice versa. So:
for i in range(256):
    assert bytes([i]).decode('latin-1') == chr(i)
    assert chr(i).encode('latin-1') == bytes([i])
On 01/11/2014 10:36 AM, Steven D'Aprano wrote:
On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
unicode to bytes
bytes to unicode using latin1
unicode to bytes
Where do you get this from? I don't follow your logic. Start with a text
template:
template =
MRAB writes:
with open("outfile.pdf", "w", encoding="latin-1") as f:
    f.write(pdf)
[snip]
The second example won't work because you're forgetting about the
handling of line endings in text mode.
Not so fast! Forgot, yes (me too!), but not work? Not quite:
with
On 01/11/2014 11:49 AM, Stephen J. Turnbull wrote:
MRAB writes:
with open("outfile.pdf", "w", encoding="latin-1") as f:
    f.write(pdf)
[snip]
The second example won't work because you're forgetting about the
handling of line endings in text mode.
Not so fast! Forgot, yes (me
On Sat, 11 Jan 2014 11:54:26 -0800, Ethan Furman et...@stoneleaf.us wrote:
On 01/11/2014 11:49 AM, Stephen J. Turnbull wrote:
MRAB writes:
with open("outfile.pdf", "w", encoding="latin-1") as f:
    f.write(pdf)
[snip]
The second example won't work because you're
On Jan 11, 2014, at 10:34 AM, Nick Coghlan ncogh...@gmail.com wrote:
Yes, it bloody well does. The number of people who have told me that
using Python 3 is what allowed them to finally understand how Unicode
works vastly exceeds the number of wire protocol and file format devs
that have
On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote:
We already *have* a type in Python 3.3 that provides text
manipulations on arrays of 8-bit objects: str (per PEP 393).
BTW: I don't know why so many people keep asking for use cases.
Isn't it obvious that text data without known (but ASCII
On Sat, Jan 11, 2014 at 4:28 PM, Terry Reedy tjre...@udel.edu wrote:
On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote:
We already *have* a type in Python 3.3 that provides text
manipulations on arrays of 8-bit objects: str (per PEP 393).
BTW: I don't know why so many people keep asking for
On 01/11/2014 12:45 PM, Donald Stufft wrote:
FWIW as one of the people who it took Python3 to finally figure out how to
actually use unicode, it was the absence of encode on bytes and decode on
str that actually did it. Giving bytes a format method would not have affected
that either way I
On Fri, Jan 10, 2014 at 9:13 PM, Juraj Sukop juraj.su...@gmail.com wrote:
On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou solip...@pitrou.net wrote:
Also, when you say you've never encountered UTF-16 text in PDFs, it
sounds like those people who've never encountered any non-ASCII data in
On Sat, Jan 11, 2014 at 07:22:30PM +, MRAB wrote:
with open("outfile.pdf", "w", encoding="latin-1") as f:
    f.write(pdf)
[snip]
The second example won't work because you're forgetting about the
handling of line endings in text mode.
So I did! Thank you for the correction.
--
Steven
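A sketch of the line-ending issue MRAB raises, assuming a throwaway temporary file and a made-up payload. In text mode Python rewrites '\n' on output according to the platform unless newline='' is passed; with that argument the Latin-1 text is written byte for byte.

```python
import os
import tempfile

payload = bytes(range(256))        # made-up data; includes b"\n" and b"\r"
pdf = payload.decode("latin-1")    # bytes carried in a str, one char per byte

path = os.path.join(tempfile.mkdtemp(), "outfile.pdf")

# newline="" disables newline translation on write, so "\n" is not
# rewritten to os.linesep (which would alter the data on Windows).
with open(path, "w", encoding="latin-1", newline="") as f:
    f.write(pdf)

# Read back in binary mode to compare against the original bytes.
with open(path, "rb") as f:
    written = f.read()
```

Opening the file in binary mode and writing pdf.encode('latin-1') avoids the question entirely.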
On 2014-01-11, 18:09 GMT, you wrote:
We are NOT going back to the confusing incoherent mess that
is the Python 2 model of bolting Unicode onto the side of
POSIX . . .
We are not asking for that.
Yes, you do. Maybe not you personally, but
On Sat, Jan 11, 2014 at 04:28:34PM -0500, Terry Reedy wrote:
The problem with some criticisms of using 'unicode in Python 3' is that
there really is no such thing. Unicode in 3.0 to 3.2 used the old
internal model inherited from 2.x. Unicode in 3.3+ uses a different
internal model that is
On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
AFAIK (and just for the record), there could be both Latin1 text and UTF-16
in a PDF (and other encodings too), depending on the font used:
[...]
In Python2, txt is just a str, but in Python3 handling everything as latin1
On 11Jan2014 13:15, Juraj Sukop juraj.su...@gmail.com wrote:
On Sat, Jan 11, 2014 at 5:14 AM, Cameron Simpson c...@zip.com.au wrote:
data = b' '.join( bytify( [ 10, 0, obj, binary_image_data, ... ] ) )
Thanks for the suggestion! The problem with bytify is that some items
might require
On 12 Jan 2014 03:29, Ethan Furman et...@stoneleaf.us wrote:
On 01/11/2014 12:43 AM, Nick Coghlan wrote:
In particular, the bytes type is, and always will be, designed for
pure binary manipulation [...]
I apologize for being blunt, but this is a lie.
Let's take a look at the methods
On 01/11/2014 06:29 PM, Steven D'Aprano wrote:
On Sat, Jan 11, 2014 at 11:05:36AM -0800, Ethan Furman wrote:
On 01/11/2014 10:36 AM, Steven D'Aprano wrote:
On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
unicode to bytes
bytes to unicode using latin1
unicode to bytes
On 12 January 2014 02:33, M.-A. Lemburg m...@egenix.com wrote:
On 11.01.2014 16:34, Nick Coghlan wrote:
While that was an *expedient* (and, in fact, necessary) solution at
the time, the fact it is still thoroughly confusing people 13 years
later shows it is not a *comprehensible* solution.
On 12 January 2014 04:38, R. David Murray rdmur...@bitdance.com wrote:
But! Our goal should be to help people convert to Python3. So how can
we find out what the specific problems are that real-world programs are
facing, look at the *actual code*, and help that project figure out the
best
On Fri, 10 Jan 2014 11:32:05 +1000
Nick Coghlan ncogh...@gmail.com wrote:
It's consistent with bytearray.join's behaviour:
>>> x = bytearray()
>>> x.join([b"abc"])
bytearray(b'abc')
>>> x
bytearray(b'')
Yeah, I guess I'm OK with us being consistent on that one. It's still
weird, but also
(Sorry if this messes-up the thread order, it is meant as a reply to the
original RFC.)
Dear list,
newbie here. After much hesitation I decided to put forward a use case
which bothers me about the current proposal. Disclaimer: I happen to write
a library which is directly influenced by this.
As
On 1/10/2014 12:17 PM, Juraj Sukop wrote:
(Sorry if this messes-up the thread order, it is meant as a reply to the
original RFC.)
Dear list,
newbie here. After much hesitation I decided to put forward a use case
which bothers me about the current proposal. Disclaimer: I happen to
write a
On 06/01/2014 13:24, Victor Stinner wrote:
Hi,
bytes % args and bytes.format(args) are requested by Mercurial and
Twisted projects. The issue #3982 was stuck because nobody proposed a
complete definition of the new features. Here is a try as a PEP.
Apologies if this has already been said,
Am 10.01.2014 18:56, schrieb Eric V. Smith:
On 1/10/2014 12:17 PM, Juraj Sukop wrote:
(Sorry if this messes-up the thread order, it is meant as a reply to the
original RFC.)
Dear list,
newbie here. After much hesitation I decided to put forward a use case
which bothers me about the
On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop juraj.su...@gmail.com wrote:
As you may know, PDF operates over bytes and an integer or floating-point
number is written down as-is, for example 100 or 1.23.
Just to be clear here -- is PDF specifically bytes+ascii?
Or could there be
2014/1/10 Juraj Sukop juraj.su...@gmail.com:
In the case of PDF, the embedding of an image into PDF looks like:
10 0 obj
/Type /XObject
/Width 100
/Height 100
/Alternates 15 0 R
/Length 2167
stream
...binary image data...
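A sketch of how such an object could be assembled on Python 3 without bytes formatting: format the ASCII parts as str, encode them, and join them with the binary stream. The helper name image_object and the reduced dictionary are illustrative assumptions, not code from any PDF library, and a real image object needs more keys than shown.

```python
def image_object(obj_num, gen_num, width, height, image_data):
    """Return the bytes of a simplified PDF image object (illustrative only)."""
    header = (
        "%d %d obj\n"
        "<< /Type /XObject\n"
        "   /Width %d\n"
        "   /Height %d\n"
        "   /Length %d\n"
        ">>\n"
        "stream\n"
    ) % (obj_num, gen_num, width, height, len(image_data))
    return b"".join((
        header.encode("ascii"),       # numbers are written as ASCII digits
        image_data,                   # binary payload, passed through untouched
        b"\nendstream\nendobj\n",
    ))


obj = image_object(10, 0, 100, 100, b"\x00\xff" * 4)
```

The inconvenience raised in the thread is that every numeric field has to go through this str-then-encode step, which adds up in a library that writes many such objects.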
On 1/10/2014 5:12 PM, Victor Stinner wrote:
2014/1/10 Juraj Sukop juraj.su...@gmail.com:
In the case of PDF, the embedding of an image into PDF looks like:
10 0 obj
/Type /XObject
/Width 100
/Height 100
/Alternates 15 0 R
/Length 2167
On Fri, 10 Jan 2014 12:56:19 -0500
Eric V. Smith e...@trueblade.com wrote:
I agree. I don't see any reason to exclude int and float. See Guido's
messages http://bugs.python.org/issue3982#msg180423 and
http://bugs.python.org/issue3982#msg180430 for some justification and
discussion.
If you
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 12:56:19 -0500
Eric V. Smith e...@trueblade.com wrote:
I agree. I don't see any reason to exclude int and float. See Guido's
messages http://bugs.python.org/issue3982#msg180423 and
http://bugs.python.org/issue3982#msg180430 for
On Fri, 10 Jan 2014 17:20:32 -0500
Eric V. Smith e...@trueblade.com wrote:
Isn't the point of the PEP to make it easier to port 2.x code to 3.5?
Is
there really existing code like this in 2.x?
No, but so what? The point of the PEP is not to allow arbitrary
Python 2 code to run without
On Fri, 10 Jan 2014 17:33:57 -0500
Eric V. Smith e...@trueblade.com wrote:
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 12:56:19 -0500
Eric V. Smith e...@trueblade.com wrote:
I agree. I don't see any reason to exclude int and float. See Guido's
messages
On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 17:33:57 -0500
Eric V. Smith e...@trueblade.com wrote:
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 12:56:19 -0500
Eric V. Smith e...@trueblade.com wrote:
I agree. I don't see any reason to exclude int and
On Fri, 10 Jan 2014 14:58:15 -0800
Ethan Furman et...@stoneleaf.us wrote:
On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 17:33:57 -0500
Eric V. Smith e...@trueblade.com wrote:
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 12:56:19 -0500
Eric V. Smith
On Fri, 10 Jan 2014 18:14:45 -0500
Eric V. Smith e...@trueblade.com wrote:
Because embedding the ASCII equivalent of ints and floats in byte streams
is a common operation?
Again, if you're representing ASCII, you're representing text and
should use a str object.
Yes, but is there
On Fri, Jan 10, 2014 at 10:52 PM, Chris Barker chris.bar...@noaa.gov wrote:
On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop juraj.su...@gmail.com wrote:
As you may know, PDF operates over bytes and an integer or floating-point
number is written down as-is, for example 100 or 1.23.
Just to be
On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner
victor.stin...@gmail.com wrote:
Why not build "10 0 obj ... stream" and "endstream endobj" in
Unicode and then encode to ASCII? Example:
data = b''.join((
    ("%d %d obj ... stream" % (10, 0)).encode('ascii'),
    binary_image_data,
    ("endstream
On Sat, 11 Jan 2014 00:43:39 +0100
Juraj Sukop juraj.su...@gmail.com wrote:
Basically, to .encode('ascii') every possible
number is not exactly simple or pretty.
Well it strikes me that the PDF format itself is not exactly simple or
pretty. It might be convenient that Python 2 allows you, in
On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop juraj.su...@gmail.com wrote:
What this all means is that the PDF objects are expressed in ASCII,
stream objects like images and fonts may have a binary part and I never
saw those UTF-16 strings.
hmm -- I wonder if they are out there in the wild,
On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou solip...@pitrou.net wrote:
Also, when you say you've never encountered UTF-16 text in PDFs, it
sounds like those people who've never encountered any non-ASCII data in
their programs.
Let me clarify: one does not think in writing text in
On 01/08/2014 02:42 PM, Antoine Pitrou wrote:
With Victor's consent, I overhauled PEP 460 and made the feature set
more restricted and consistent with the bytes/str separation.
From the PEP:
=
Python 3 generally mandates that text be stored and manipulated as
unicode (i.e. str
On Fri, 10 Jan 2014 16:23:53 -0800
Ethan Furman et...@stoneleaf.us wrote:
On 01/08/2014 02:42 PM, Antoine Pitrou wrote:
With Victor's consent, I overhauled PEP 460 and made the feature set
more restricted and consistent with the bytes/str separation.
From the PEP:
=
On 1/10/2014 8:12 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 16:23:53 -0800
Ethan Furman et...@stoneleaf.us wrote:
On 01/08/2014 02:42 PM, Antoine Pitrou wrote:
With Victor's consent, I overhauled PEP 460 and made the feature set
more restricted and consistent with the bytes/str
On Fri, 10 Jan 2014 20:53:09 -0500
Eric V. Smith e...@trueblade.com wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
3982. See for example http://bugs.python.org/issue3982#msg180432 .
Then we might as well not do anything, since any attempt to advance
things is met
On 01/10/2014 06:04 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 20:53:09 -0500
Eric V. Smith e...@trueblade.com wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
3982. See for example http://bugs.python.org/issue3982#msg180432 .
Then we might as well not do
On Fri, 10 Jan 2014 18:28:41 -0800
Ethan Furman et...@stoneleaf.us wrote:
Is it safe to assume you don't use Python for the use-cases under discussion?
You know, I've done quite a bit of network programming. I've also done
an experimental port of Twisted to Python 3. I know what a network
To avoid implicit conversion between str and bytes, I propose adding only a
limited %-format, not .format() or .format_map().
"Limited %-format" means:
%c accepts an integer or a bytes object of length one.
%r is not supported.
%s accepts only bytes.
%a is the only format that accepts an arbitrary object.
And other
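As a reference point for the proposal above, this is how %-formatting on bytes behaves in Python 3.5 and later (PEP 461, which postdates this thread). It is not the author's proposal and differs from it in places: numeric codes are accepted, and %s is kept as an alias for %b.

```python
# Requires Python 3.5 or later.
single = (b"%c" % 65, b"%c" % b"A")   # %c: an int in range(256) or a length-1 bytes
raw = b"%b" % b"abc"                  # %b: bytes-like objects only
shown = b"%a" % "caf\u00e9"           # %a: ascii() of any object, as bytes
numbers = b"%d %.3f" % (10, 1.5)      # numeric codes produce ASCII digits

# A str argument is rejected for %s/%b, so no implicit encoding happens.
try:
    b"%s" % "text"
    str_rejected = False
except TypeError:
    str_rejected = True
```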
On 01/10/2014 06:39 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 18:28:41 -0800
Ethan Furman wrote:
Is it safe to assume you don't use Python for the use-cases under discussion?
You know, I've done quite a bit of network programming.
No, I didn't, that's why I asked.
I've also done an
On 01/10/2014 06:39 PM, Antoine Pitrou wrote:
I know what a network protocol with ill-defined encodings
looks like.
For the record, I've been (and I suspect Eric and some others have also been) talking about well-defined encodings. For
the DBF files that I work with, there is binary,
To avoid implicit conversion between str and bytes, I propose adding only a
limited %-format, not .format() or .format_map().
"Limited %-format" means:
%c accepts an integer or a bytes object of length one.
%r is not supported.
%s accepts only bytes.
%a is the only format that accepts an arbitrary object.
And other
On 11Jan2014 00:43, Juraj Sukop juraj.su...@gmail.com wrote:
On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner
victor.stin...@gmail.com wrote:
Why not build "10 0 obj ... stream" and "endstream endobj" in
Unicode and then encode to ASCII? Example:
data = b''.join((
    ("%d %d obj ...
On Fri, Jan 10, 2014 at 06:17:02PM +0100, Juraj Sukop wrote:
As you may know, PDF operates over bytes and an integer or floating-point
number is written down as-is, for example 100 or 1.23.
I'm sorry, I don't understand what you mean here. I'm honestly not
trying to be difficult, but you
Am 11.01.2014 03:04, schrieb Antoine Pitrou:
On Fri, 10 Jan 2014 20:53:09 -0500
Eric V. Smith e...@trueblade.com wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
3982. See for example http://bugs.python.org/issue3982#msg180432 .
I agree.
Then we might as well
On 9 Jan 2014 11:29, INADA Naoki songofaca...@gmail.com wrote:
And I think everyone was well intentioned - and python3 covers most of the
bases, but working with binary data is not only a wire-protocol programmer's
problem.
If you're working with binary data, use the binary API offered by
On Thu, 09 Jan 2014 03:54:13 +
MRAB pyt...@mrabarnett.plus.com wrote:
I'm thinking that the i format could be used for signed integers and
the u for unsigned integers. The width would be the number of bytes.
You would also need to have a way of specifying the endianness.
For example:
On Jan 08, 2014, at 01:51 PM, Stephen J. Turnbull wrote:
Benjamin Peterson writes:
I agree. This is a very important, much-requested feature for low-level
networking code.
I hear it's much-requested, but is there any description of typical
use cases?
The two unported libraries that are
On 9 Jan 2014 06:43, Antoine Pitrou solip...@pitrou.net wrote:
Hi,
With Victor's consent, I overhauled PEP 460 and made the feature set
more restricted and consistent with the bytes/str separation.
+1
I was initially dubious about the idea, but the proposed semantics look
good to me.
We
On Fri, 10 Jan 2014 05:26:04 +1000
Nick Coghlan ncogh...@gmail.com wrote:
We should probably include format_map for consistency with the str API.
Yes, you're right.
However, I
also added bytearray into the mix, as bytearray objects should
generally support the same operations as bytes