On Fri, Jun 06, 2014 at 12:51:11PM +1200, Greg Ewing wrote:
> Steven D'Aprano wrote:
> >(1) I asked if it would be okay for MicroPython to *optionally* use
> >nominally Unicode strings limited to ASCII. Pretty much the only
> >response to this has been Guido saying "That would be a pretty lousy
>
Steven D'Aprano wrote:
> (1) I asked if it would be okay for MicroPython to *optionally* use
> nominally Unicode strings limited to ASCII. Pretty much the only
> response to this has been Guido saying "That would be a pretty lousy
> option", and since nobody has really defended the suggestion,
Paul Sokolovsky wrote:
All these changes are what let me dream on and speculate on
possibility that Python4 could offer an encoding-neutral string type
(which means based on bytes)
Can you elaborate on exactly what you have in mind?
You seem to want something different from Python 3 str,
Python
Nathaniel Smith writes:
>> > tmp1 = a + b
>> > tmp1 += c
>> > tmp1 /= c
>> > result = tmp1
>>
>> Could this transformation be done in the ast? And would that help?
>
> I don't think it could be done in the ast because I don't think you can
> work with anonymous temporaries there. B
Nathaniel Smith wrote:
I'd be a
little nervous about whether anyone has implemented, say, an iadd with
side effects such that you can tell whether a copy was made, even if the
object being copied is immediately destroyed.
I can think of at least one plausible scenario where
this could occur:
On 05/06/14 22:51, Nathaniel Smith wrote:
This gets evaluated as:
tmp1 = a + b
tmp2 = tmp1 + c
result = tmp2 / c
All these temporaries are very expensive. Suppose that a, b, c are
arrays with N bytes each, and N is large. For simple arithmetic like
this, the costs are dominated by
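To make the temporaries in the excerpt above concrete, here is a minimal pure-Python sketch. It is not numpy code: `Vec`, `naive`, `elided` and `count_allocations` are hypothetical names invented for illustration, and "allocation" just means "a new `Vec` object was built". It only shows that the in-place spelling builds one result buffer where the plain expression builds three.

```python
class Vec:
    """Hypothetical stand-in for a large array (assumption: not numpy)."""

    allocations = 0  # number of Vec buffers created so far

    def __init__(self, data):
        self.data = list(data)
        Vec.allocations += 1

    def __add__(self, other):
        # Out-of-place add: always builds a new buffer.
        return Vec(x + y for x, y in zip(self.data, other.data))

    def __iadd__(self, other):
        # In-place add: reuses the existing buffer.
        for i, y in enumerate(other.data):
            self.data[i] += y
        return self

    def __truediv__(self, other):
        return Vec(x / y for x, y in zip(self.data, other.data))

    def __itruediv__(self, other):
        for i, y in enumerate(other.data):
            self.data[i] /= y
        return self


def naive(a, b, c):
    # Evaluated as tmp1 = a + b; tmp2 = tmp1 + c; result = tmp2 / c
    return (a + b + c) / c


def elided(a, b, c):
    # Hand-written version of the transformation discussed in the thread.
    tmp = a + b
    tmp += c
    tmp /= c
    return tmp


def count_allocations(func, a, b, c):
    """Return (result, number of Vec buffers created while calling func)."""
    before = Vec.allocations
    result = func(a, b, c)
    return result, Vec.allocations - before
```

Calling `count_allocations(naive, a, b, c)` and `count_allocations(elided, a, b, c)` on the same inputs should give equal results with different allocation counts; the inputs `a`, `b`, `c` are left unmodified in both cases because the in-place operators only ever touch the fresh temporary.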
Nathaniel Smith wrote:
I.e., BIN_ADD could do
  if (Py_REFCNT(left) == 1)
      result = PyNumber_InPlaceAdd(left, right);
  else
      result = PyNumber_Add(left, right);
Upside: all packages automagically benefit!
Potential downsides to consider:
- Subtle but real and user-visible change in Python se
On Fri, Jun 6, 2014 at 11:47 AM, Nathaniel Smith wrote:
> Unfortunately we don't actually know whether Cython is the only culprit
> (such code *could* be written by hand), and even if we fixed Cython it would
> take some unknowable amount of time before all downstream users upgraded
> their Cython
On 5 Jun 2014 23:58, "Terry Reedy" wrote:
>
> On 6/5/2014 4:51 PM, Nathaniel Smith wrote:
>
>> In fact, AFAICT it's 100% correct for libraries being called by
>> regular python code (which is why I'm able to quote benchmarks at you
>> :-)). The bytecode eval loop always holds a reference to all op
On 6 Jun 2014 02:16, "Nikolaus Rath" wrote:
>
> Nathaniel Smith writes:
> > Such optimizations are important enough that numpy operations always
> > give the option of explicitly specifying the output array (like
> > in-place operators but more general and with clumsier syntax). Here's
> > an exa
Nathaniel Smith writes:
> Such optimizations are important enough that numpy operations always
> give the option of explicitly specifying the output array (like
> in-place operators but more general and with clumsier syntax). Here's
> an example small-array benchmark that IIUC uses Jacobi iteratio
Steven D'Aprano wrote:
(1) I asked if it would be okay for MicroPython to *optionally* use
nominally Unicode strings limited to ASCII. Pretty much the only
response to this has been Guido saying "That would be a pretty lousy
option",
It would be limiting to have this as the *only* way of
deali
On 6 Jun 2014 05:13, "Glenn Linderman" wrote:
>
> On 6/5/2014 11:41 AM, Daniel Holth wrote:
>>
>> discover new things
>> like dance-encoded strings, bytes decoded using an incorrect encoding
>> intended to be transcoded into the correct encoding later, surrogates
>> that work perfectly until .enco
On Thu, Jun 5, 2014 at 11:12 PM, Paul Moore wrote:
> On 5 June 2014 22:47, Nathaniel Smith wrote:
>> To make sure I understand correctly, you're suggesting something like
>> adding a new set of special method slots, __te_add__, __te_mul__,
>> etc.
>
> I wasn't thinking in that much detail, TBH. I
On 6/5/2014 4:51 PM, Nathaniel Smith wrote:
In fact, AFAICT it's 100% correct for libraries being called by
regular python code (which is why I'm able to quote benchmarks at you
:-)). The bytecode eval loop always holds a reference to all operands,
and then immediately DECREFs them after the ope
On 5 June 2014 22:47, Nathaniel Smith wrote:
> To make sure I understand correctly, you're suggesting something like
> adding a new set of special method slots, __te_add__, __te_mul__,
> etc.
I wasn't thinking in that much detail, TBH. I'm not sure adding a
whole set of new slots is sensible for
On Thu, Jun 5, 2014 at 10:37 PM, Paul Moore wrote:
> On 5 June 2014 21:51, Nathaniel Smith wrote:
>> Is there a better idea I'm missing?
>
> Just a thought, but the temporaries come from the stack manipulation
> done by the likes of the BINARY_ADD opcode. (After all the bytecode
> doesn't use tem
On 5 June 2014 21:51, Nathaniel Smith wrote:
> Is there a better idea I'm missing?
Just a thought, but the temporaries come from the stack manipulation
done by the likes of the BINARY_ADD opcode. (After all the bytecode
doesn't use temporaries, it's a stack machine). Maybe BINARY_ADD and
friends
Hi all,
There's a very valuable optimization -- temporary elision -- which
numpy can *almost* do. It gives something like a 10-30% speedup for
lots of common real-world expressions. It would probably be
useful for non-numpy code too. (In fact it generalizes the str += str
special case that's
On 04/06/2014 02:51, Chris Angelico wrote:
On Wed, Jun 4, 2014 at 3:17 PM, Nick Coghlan wrote:
It would. The downsides of a UTF-8 representation would be slower
iteration and much slower (O(N)) indexing/slicing.
There's no reason for iteration to be slower. Slicing would get
O(slice offset
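A rough sketch of the trade-off in the excerpt above, assuming the string is stored as valid UTF-8 bytes. The helper names (`_width`, `iter_codepoints`, `byte_offset`, `char_at`) are hypothetical and this is not how CPython or MicroPython implement `str`. Iteration is a single forward pass, while indexing by code point has to scan from the start, so it costs O(index).

```python
def _width(lead):
    """Byte length of a UTF-8 sequence, given its lead byte (valid input assumed)."""
    if lead < 0x80:
        return 1
    if lead < 0xE0:
        return 2
    if lead < 0xF0:
        return 3
    return 4


def iter_codepoints(buf):
    """Yield one-character strings from UTF-8 bytes in a single forward pass."""
    i = 0
    while i < len(buf):
        w = _width(buf[i])
        yield buf[i:i + w].decode("utf-8")
        i += w


def byte_offset(buf, index):
    """Byte offset of code point number `index`, found by scanning: O(index)."""
    if index < 0:
        raise IndexError("negative indices are not handled in this sketch")
    offset = 0
    for _ in range(index):
        if offset >= len(buf):
            raise IndexError("string index out of range")
        offset += _width(buf[offset])
    if offset >= len(buf):
        raise IndexError("string index out of range")
    return offset


def char_at(buf, index):
    """Return the code point at `index` as a one-character string."""
    off = byte_offset(buf, index)
    return buf[off:off + _width(buf[off])].decode("utf-8")
```

For example, with `buf = "aé€😀z".encode("utf-8")`, `char_at(buf, 3)` has to step over the 1-, 2- and 3-byte sequences before it reaches the 4-byte one.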
On 6/5/2014 11:41 AM, Daniel Holth wrote:
discover new things
like dance-encoded strings, bytes decoded using an incorrect encoding
intended to be transcoded into the correct encoding later, surrogates
that work perfectly until .encode(), str(bytes), APIs that disagree
with you about whether the
On 6/5/2014 3:10 AM, Paul Sokolovsky wrote:
Hello,
On Wed, 04 Jun 2014 22:15:30 -0400
Terry Reedy wrote:
think you are again batting at a strawman. If you mean 'read from a
file', and all you want to do is read bytes from and write bytes to
external 'files', then there is obviously no need to
On Thu, Jun 5, 2014 at 11:59 AM, Paul Moore wrote:
> On 5 June 2014 14:15, Nick Coghlan wrote:
>> As I've said before in other contexts, find me Windows, Mac OS X and
>> JVM developers, or educators and scientists that are as concerned by
>> the text model changes as folks that are primarily focu
On 5 June 2014 14:15, Nick Coghlan wrote:
> As I've said before in other contexts, find me Windows, Mac OS X and
> JVM developers, or educators and scientists that are as concerned by
> the text model changes as folks that are primarily focused on Linux
> system (including network) programming, an
On Thu, 05 Jun 2014 12:03:15 +0200, Victor Stinner
wrote:
> Would it be possible to add a new "Asyncio" component on
> bugs.python.org? If this component is selected, the default nosy list
> for asyncio would be used (guido, yury and me, there is already such
> list in the nosy list completion).
On Wed, Jun 04, 2014 at 11:17:18AM +1000, Steven D'Aprano wrote:
> There is a discussion over at MicroPython about the internal
> representation of Unicode strings. Micropython is aimed at embedded
> devices, and so minimizing memory use is important, possibly even
> more important than performa
On 5 June 2014 22:37, Paul Sokolovsky wrote:
> On Thu, 5 Jun 2014 22:20:04 +1000
> Nick Coghlan wrote:
>> problems caused by trusting the locale encoding to be correct, but the
>> startup code will need non-trivial changes for that to happen - the
>> C.UTF-8 locale may even become widespread befo
On 5 June 2014 22:10, Stefan Krah wrote:
> Paul Sokolovsky wrote:
>> In this regard, I'm glad to participate in mind-resetting discussion.
>> So, let's reiterate - there's nothing like "the best", "the only right",
>> "the only correct", "righter than", "more correct than" in CPython's
>> impleme
Hello,
On Thu, 5 Jun 2014 22:20:04 +1000
Nick Coghlan wrote:
[]
> problems caused by trusting the locale encoding to be correct, but the
> startup code will need non-trivial changes for that to happen - the
> C.UTF-8 locale may even become widespread before we get there).
... And until those go
On 5 June 2014 22:01, Paul Sokolovsky wrote:
>
> All these changes are what let me dream on and speculate on
> possibility that Python4 could offer an encoding-neutral string type
> (which means based on bytes)
>
To me, an "encoding neutral string type" means roughly "characters are
atomic", and
On 5 June 2014 22:01, Paul Sokolovsky wrote:
>> Aside from
>> some of the POSIX locale handling issues on Linux, many of the
>> concerns are with the usability of bytes and bytearray, not with str -
>> that's why binary interpolation is coming back in 3.5, and there will
>> likely be other usabili
Paul Sokolovsky wrote:
> In this regard, I'm glad to participate in mind-resetting discussion.
> So, let's reiterate - there's nothing like "the best", "the only right",
> "the only correct", "righter than", "more correct than" in CPython's
> implementation of Unicode storage. It is *arbitrary*. W
Hello,
On Thu, 5 Jun 2014 21:43:16 +1000
Nick Coghlan wrote:
> On 5 June 2014 21:25, Paul Sokolovsky wrote:
> > Well, I understand the plan - hoping that people will "get over
> > this". And I'm personally happy to stay away from this "trolling",
> > but any discussion related to Unicode goes i
On 5 June 2014 21:25, Paul Sokolovsky wrote:
> Well, I understand the plan - hoping that people will "get over this".
> And I'm personally happy to stay away from this "trolling", but any
> discussion related to Unicode goes in circles and returns to feeling
> that Unicode at the central role as p
On 5 June 2014 17:54, Stephen J. Turnbull wrote:
> What matters to you is that str (unicode) is an opaque type -- there
> is no specification of the internal representation in the language
> reference, and in fact several different ones coexist happily across
> existing Python implementations -- a
Hello,
On Thu, 05 Jun 2014 16:54:11 +0900
"Stephen J. Turnbull" wrote:
> Paul Sokolovsky writes:
>
> > Please put that in perspective when alarming over O(1) indexing of
> > inherently problematic niche datatype. (Again, it's not my or
> > MicroPython's fault that it was forced as standard s
Hello,
On Wed, 04 Jun 2014 22:15:30 -0400
Terry Reedy wrote:
> On 6/4/2014 6:52 PM, Paul Sokolovsky wrote:
>
> > "Well" is subjective (or should be defined formally based on the
> > requirements). With my MicroPython hat on, an implementation which
> > receives a string, transcodes it, leading
Hi,
Would it be possible to add a new "Asyncio" component on
bugs.python.org? If this component is selected, the default nosy list
for asyncio would be used (guido, yury and me, there is already such
list in the nosy list completion).
Full text search for "asyncio" returns too many results.
Vict
Serhiy Storchaka writes:
> Yes, I remember. I think that hybrid FSR-UTF16 (like FSR, but UTF-16 is
> used instead of UCS4) is the better choice for CPython. I suppose that
> with populating emoticons and other icon characters in nearest 5 or 10
> years, even English text will often contain
05.06.14 05:25, Terry Reedy wrote:
I mentioned it as an alternative during the '393 discussion. I more than
half agree that the FSR is the better choice for CPython, which had no
particular attachment to UTF-16 in the way that I think Jython, for
instance, does.
Yes, I remember. I think t
Paul Sokolovsky writes:
> Please put that in perspective when alarming over O(1) indexing of
> inherently problematic niche datatype. (Again, it's not my or
> MicroPython's fault that it was forced as standard string type. Maybe
> if CPython seriously considered now-standard UTF-8 encoding, re
04.06.14 23:50, Glenn Linderman wrote:
3) (Most space efficient) One cached entry, that caches the last
codepoint/byte position referenced. UTF-8 is able to be traversed in
either direction, so "next/previous" codepoint access would be
relatively fast (and such are very common operations, e
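The one-entry cache described in option 3 above could look roughly like the following. This is only a sketch under stated assumptions: the class name `CachedUtf8` and its `steps` counter are hypothetical, the buffer is assumed to be valid UTF-8, and negative indices are not handled. It remembers the last (code point index, byte offset) pair and walks forward or backward from there, relying on the fact that UTF-8 continuation bytes all match the bit pattern `10xxxxxx`.

```python
class CachedUtf8:
    """Hypothetical UTF-8 backed string with a single cached position."""

    def __init__(self, text):
        self.buf = text.encode("utf-8")
        self._cp = 0     # code point index of the cached position
        self._off = 0    # byte offset of the cached position
        self.steps = 0   # code points stepped over, for illustration only

    def _is_continuation(self, pos):
        return (self.buf[pos] & 0xC0) == 0x80

    def _offset_of(self, index):
        cp, off, n = self._cp, self._off, len(self.buf)
        while cp < index:            # walk forward from the cached position
            if off >= n:
                raise IndexError("string index out of range")
            off += 1
            while off < n and self._is_continuation(off):
                off += 1
            cp += 1
            self.steps += 1
        while cp > index:            # walk backward from the cached position
            off -= 1
            while self._is_continuation(off):
                off -= 1
            cp -= 1
            self.steps += 1
        self._cp, self._off = cp, off  # remember where we ended up
        return off

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("negative indices are not handled in this sketch")
        off = self._offset_of(index)
        if off >= len(self.buf):
            raise IndexError("string index out of range")
        end = off + 1
        while end < len(self.buf) and self._is_continuation(end):
            end += 1
        return self.buf[off:end].decode("utf-8")
```

With this layout, accessing index `i + 1` or `i - 1` right after index `i` costs a single step, while a jump to a distant index still costs time proportional to the distance from the cached position.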
05.06.14 03:03, Greg Ewing wrote:
Serhiy Storchaka wrote:
html.HTMLParser, json.JSONDecoder, re.compile, tokenize.tokenize don't
use iterators. They use indices, str.find and/or regular expressions.
Common use case is quickly find substring starting from current
position using str.find or
05.06.14 03:08, Greg Ewing wrote:
Serhiy Storchaka wrote:
A language which doesn't support O(1) indexing is not Python, it is
only Python-like language.
That's debatable, but even if it's true, I don't think
there's anything wrong with MicroPython being only a
"Python-like language". As
Glenn Linderman writes:
> 3) (Most space efficient) One cached entry, that caches the last
> codepoint/byte position referenced. UTF-8 is able to be traversed in
> either direction, so "next/previous" codepoint access would be
> relatively fast (and such are very common operations, even whe
45 matches