[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-21 Thread Neil Schemenauer
For those who would like to test with something compatible with
Python 3.7.3, I made re-based branches here:

 https://github.com/nascheme/cpython/tree/obmalloc_radix_v37
 https://github.com/nascheme/cpython/tree/obmalloc_big_pools_v37

They should be ABI compatible with Python 3.7.3.  So, if you just
re-build the "python" executable, you don't have to rebuild anything
else.  Both those use the same arena/pool sizes and they both have
Tim's arena thrashing fix.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/K5DCROCGGVNWWLC6XM6XMCTJACESNEYS/


[Python-Dev] Re: python3 -bb and hash collisions

2019-06-21 Thread Daniel Holth
The answer bytes == str is just False. That doesn't put b'' in your
database by accident. It could be useful to separate the two kinds of
warnings.
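
To make the distinction concrete, here is a minimal sketch of the two operations that `-b`/`-bb` flags. It uses only built-ins, and the variable names are illustrative rather than taken from any real code base.

```python
payload = b"abc"

# Kind 1: str() on bytes silently yields the repr, which is how "b'abc'"
# ends up stored where "abc" was intended.
as_text = str(payload)

# Kind 2: comparing bytes with str corrupts nothing; without -b it
# quietly evaluates to False.
same = payload == "abc"

# Decoding explicitly is the way to get the intended text.
decoded = payload.decode("ascii")
```

Only the first kind can write a wrong value somewhere; the second just answers False.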

On Fri, Jun 21, 2019, 18:57 Ivan Pozdeev via Python-Dev <
python-dev@python.org> wrote:

> On 22.06.2019 1:08, Daniel Holth wrote:
>
> Thanks. I think I might like an option to disable str(bytes) without
> disabling str != bytes. Unless the second operation would also corrupt
> output.
>
> You can't compare str to bytes without knowing the encoding the bytes are
> supposed to be in (see
> https://stackoverflow.com/questions/49991870/python-default-string-encoding
> for details).
>
> And if you do know the encoding, you can as well compare
> `str.encode(encoding) != bytes` / `str != bytes.decode(encoding)`.
>
>
> Came across this kind of set in the hyper http library which uses a set to
> accept certain headers with either str or bytes keys.
>
> On Tue, Jun 18, 2019, 13:05 Christian Heimes  wrote:
>
>> On 18/06/2019 18.32, Daniel Holth wrote:
> >> > set([u"foo", b"foo"]) will error because the two kinds of string have the
> >> > same hash, and this causes a comparison. Is that correct?
> >>
> >> Yes, it will fail with -bb, because it turns comparison between str and
> >> bytes into an error. This can also happen with other strings when
> >> hash(u'somestring') & mask == hash(b'otherbytes') & mask. The mask of a
> >> set starts with PySet_MINSIZE - 1 == 7 and increases over time.
>>
>> Christian
>>
>>
> --
> Regards,
> Ivan
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/36DKFLVTBABEZPDX7MYHP7H2TVDZTOHG/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-21 Thread Neil Schemenauer
On 2019-06-21, Tim Peters wrote:
> [Thomas Wouters]
> > Getting rid of address_in_range sounds like a nice idea, and I
> > would love to test how feasible it is -- I can run such a change
> > against a wide selection of code at work, including a lot of
> > third-party extension modules, but I don't see an easy way to do
> > it right now.
> 
> Neil's branch is here:
> 
>  https://github.com/nascheme/cpython/tree/obmalloc_radix_tree

If you can test vs some real-world programs, that would be great.
I was trying to run some tests this afternoon.  Testing with Python
3.8+ is a pain because of the PyCode_New and tp_print changes.  I've
just added two fixes to the head of the obmalloc_radix_tree branch
so that you can compile code generated by old versions of Cython.
Without those fixes, building 3rd party extensions can be a real
pain.

> My PR uses 16K pools and 1M arenas, quadrupling the status quo.
> Because "why not?" ;-)
> 
> Neil's branch has _generally_, but not always, used 16 MiB arenas.
> The larger the arenas in his branch, the smaller the radix tree needs
> to grow.

Currently I have it like your big pool branch (16 KB, 1MB).
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TF2JI7G5ZMCGUMM3AWNSCQDYVFNRPMQ4/


[Python-Dev] Re: python3 -bb and hash collisions

2019-06-21 Thread Ivan Pozdeev via Python-Dev

On 22.06.2019 1:08, Daniel Holth wrote:
> Thanks. I think I might like an option to disable str(bytes) without
> disabling str != bytes. Unless the second operation would also corrupt
> output.


You can't compare str to bytes without knowing the encoding the bytes are supposed to be in (see 
https://stackoverflow.com/questions/49991870/python-default-string-encoding for details).


And if you do know the encoding, you can as well compare `str.encode(encoding) 
!= bytes` / `str != bytes.decode(encoding)`.
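
A minimal sketch of that encoding-aware comparison follows. The helper name `same_text` and the sample header value are assumptions made for illustration only.

```python
def same_text(text: str, raw: bytes, encoding: str = "utf-8") -> bool:
    """Compare str and bytes by encoding the str with a known encoding."""
    return text.encode(encoding) == raw


header = "café"
utf8_bytes = header.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = header.encode("latin-1")  # b'caf\xe9'

matches_utf8 = same_text(header, utf8_bytes)
matches_wrong_encoding = same_text(header, latin1_bytes)
matches_latin1 = same_text(header, latin1_bytes, encoding="latin-1")
```

The same text compares equal or unequal to the same bytes depending on the encoding assumed, which is why the comparison is undefined without one.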



> Came across this kind of set in the hyper http library which uses a set to
> accept certain headers with either str or bytes keys.
>
> On Tue, Jun 18, 2019, 13:05 Christian Heimes wrote:
>
>> On 18/06/2019 18.32, Daniel Holth wrote:
>> > set([u"foo", b"foo"]) will error because the two kinds of string have the
>> > same hash, and this causes a comparison. Is that correct?
>>
>> Yes, it will fail with -bb, because it turns comparison between str and
>> bytes into an error. This can also happen with other strings when
>> hash(u'somestring') & mask == hash(b'otherbytes') & mask. The mask of a
>> set starts with PySet_MINSIZE - 1 == 7 and increases over time.
>>
>> Christian




--
Regards,
Ivan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XAN44UH5X5PYNSHY5ONULXIJF4DLBXF6/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-21 Thread Victor Stinner
On Fri, 21 Jun 2019 at 23:19, Thomas Wouters wrote:
> Is this really feasible in a world where the allocators can be selected (and
> the default changed) at runtime?

The memory allocator must not be changed after the Python
pre-initialization. What's done after pre-initialization is more like
installing a "hook" which executes code before/after an allocation, but
doesn't replace the allocator.

It simply doesn't work to switch from pymalloc to malloc "at runtime".
Calling PyMem_Free(ptr) would call free(ptr). If the memory block was
allocated by pymalloc, free(ptr) does simply crash.
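
As a rough illustration of why the swap fails, here is a toy model in Python. `ToyAllocator` is a hypothetical stand-in that only tracks which allocator handed out which block; it is not how pymalloc or the C library work internally, and a real `free()` performs no such check.

```python
class ToyAllocator:
    """Toy stand-in for a memory allocator; it only tracks block ownership."""

    def __init__(self, name):
        self.name = name
        self._live = set()
        self._counter = 0

    def malloc(self):
        self._counter += 1
        block = (self.name, self._counter)
        self._live.add(block)
        return block

    def free(self, block):
        if block not in self._live:
            # A real free() has no such check: it corrupts the heap or crashes.
            raise RuntimeError(
                "%s asked to free foreign block %r" % (self.name, block))
        self._live.remove(block)


pymalloc = ToyAllocator("pymalloc")
system_malloc = ToyAllocator("malloc")

current = pymalloc
ptr = current.malloc()      # allocated while pymalloc is installed

current = system_malloc     # allocator swapped "at runtime"
try:
    current.free(ptr)       # the free now lands in the wrong allocator
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)
```

The block outlives the swap, so the deallocation is routed to an allocator that never owned it.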

Victor
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/C5AI3SL77AV6QLRNTJ4PZH7MCYR2ZQAC/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-21 Thread Tim Peters
[Tim]
>> I don't think we need to cater anymore to careless code that mixes
>> system memory calls with O calls (e.g., if an extension gets memory
>> via `malloc()`, it's its responsibility to call `free()`), and if not
>> then `address_in_range()` isn't really necessary anymore either, and
>> then we could increase the pool size.  O would, however, need a new
>> way to recognize when its version of malloc punted to the system
>> malloc.

[Thomas Wouters]
> Is this really feasible in a world where the allocators can be selected (and
> the default changed) at runtime?

I think so.  See the "Memory Management" section of the Python/C API
Reference Manual.  It's always been "forbidden" to, e.g., allocate a
thing with PyMem_New() but release it with free().  Ditto mixing a
PyMem_Raw... allocator with a PyMem... deallocator, or PyObject...
one.  Etc.

A type's tp_dealloc implementation should damn well know which memory
family the type's allocator used.

However, no actual proposal on the table changes any "fact on the
ground" here.  They're all as forgiving of slop as the status quo.

> And what would be an efficient way of detecting allocations punted to
> malloc, if not address_in_range?

_The_ most efficient way is the one almost all allocators used long
ago:  use some "hidden" bits right before the address returned to the
user to store info about the block being returned.  Like 1 bit to
distinguish between "obmalloc took this out of one of its pools" and
"obmalloc got this from PyMem_Raw... (whatever that maps to - obmalloc
doesn't care)".  That would be much faster than what we do now.

But on current 64-bit boxes, "1 bit" turns into "16 bytes" to maintain
alignment, so space overhead becomes 100% for the smallest objects
obmalloc can return :-(
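
The arithmetic behind that figure can be sketched as follows. The 16-byte alignment comes from the text above; the 16-byte smallest block is an assumption inferred from the quoted 100% overhead, and the function names are illustrative.

```python
ALIGNMENT = 16  # bytes; the alignment cited above for current 64-bit boxes


def padded_header(header_bits, alignment=ALIGNMENT):
    """Bytes consumed by a hidden header once rounded up to the alignment."""
    header_bytes = (header_bits + 7) // 8
    # Ceiling division, then scale back up to a multiple of the alignment.
    return -(-header_bytes // alignment) * alignment


def overhead_percent(block_size, header_bits=1):
    """Space overhead of the padded header relative to the user's block."""
    return 100.0 * padded_header(header_bits) / block_size


# Assumed block sizes, from the smallest (16 bytes) upward.
table = {size: overhead_percent(size) for size in (16, 32, 64, 512)}
```

A single flag bit costs a full 16 bytes, so the relative overhead is worst for the smallest blocks and shrinks as blocks grow.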

Neil Schemenauer takes a different approach in the recent "radix tree
arena map for obmalloc" thread here.  We exchanged ideas on that until
it got to the point that the tree levels only need to trace out
prefixes of obmalloc arena addresses.  That is, the new space burden
of the radix tree appears quite reasonably small.

It doesn't appear to be possible to make it faster than the current
address_in_range(), but in small-scale testing so far speed appears
comparable.
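
The idea of a tree that traces out prefixes of arena addresses can be sketched in Python as below. This is a simplified model, not Neil's implementation: it assumes arenas are aligned to the arena size, 48 significant address bits, 1 MiB arenas, and a two-level split of 14 bits each, and it uses a dict of sets where real code would use arrays.

```python
ARENA_BITS = 20    # 1 MiB arenas (assumed)
LEVEL_BITS = 14    # assumed split: (48 - 20) bits over two levels of 14


class ArenaMap:
    """Two-level radix tree keyed on the arena-index prefix of an address."""

    def __init__(self):
        self._root = {}

    @staticmethod
    def _split(address):
        index = address >> ARENA_BITS   # drop the offset inside the arena
        return index >> LEVEL_BITS, index & ((1 << LEVEL_BITS) - 1)

    def mark_arena(self, base, in_use=True):
        """Record (or forget) the arena starting at base."""
        top, leaf = self._split(base)
        if in_use:
            self._root.setdefault(top, set()).add(leaf)
        else:
            self._root.get(top, set()).discard(leaf)

    def address_in_range(self, address):
        """True if address falls inside an arena recorded in the map."""
        top, leaf = self._split(address)
        return leaf in self._root.get(top, ())


arenas = ArenaMap()
base = 0x7F1234500000                    # hypothetical, 1 MiB aligned
arenas.mark_arena(base)
inside = arenas.address_in_range(base + 0x1234)
outside = arenas.address_in_range(base + (1 << ARENA_BITS))
```

The lookup never dereferences the address being tested, and larger arenas leave fewer prefix bits for the tree to cover.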


> Getting rid of address_in_range sounds like a nice idea, and I would love to
> test how feasible it is -- I can run such a change against a wide selection of
> code at work, including a lot of third-party extension modules, but I don't see
> an easy way to do it right now.

Neil's branch is here:

 https://github.com/nascheme/cpython/tree/obmalloc_radix_tree

It's effectively a different _implementation_ of the current
address_in_range(), one that doesn't ever need to read possibly
uninitialized memory, and couldn't care less about the OS page size.

For the latter reason, it's by far the clearest way to enable
expanding pool size above 4 KiB.  My PR also eliminates the pool size
limitation:

https://github.com/python/cpython/pull/13934

but at the cost of breaking bigger pools up internally into 4K regions
so the excruciating current address_in_range black magic still works.

Neil and I are both keen _mostly_ to increase pool and arena sizes.
The bigger they are, the more time obmalloc can spend in its fastest
code paths.

A question we can't answer yet (or possibly ever) is how badly that
would hurt Python returning arenas to the system, in long-running apps
that go through phases of low and high memory need.

I don't run anything like that - not a world I've ever lived in.  All
my experiments so far say, for programs that are neither horrible nor
wonderful in this respect:

1. An arena size of 4 KiB is most effective for that.
2. There's significant degradation in moving even to 8 KiB arenas.
3. Which continues getting worse the larger the arenas.
4. Until reaching 128 KiB, at which point the rate of degradation falls a lot.

So the current 256 KiB arenas already suck for such programs.

For "horrible" programs, not even tiny 4K arenas help much.

For "wonderful" programs, not even 16 MiB arenas hurt arena recycling
effectiveness.

So if you have real programs keen to "return memory to the system"
periodically, it would be terrific to get info about how changing
arena size affects their behavior in that respect.

My PR uses 16K pools and 1M arenas, quadrupling the status quo.
Because "why not?" ;-)

Neil's branch has _generally_, but not always, used 16 MiB arenas.
The larger the arenas in his branch, the smaller the radix tree needs
to grow.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7ZIFV2BEL64FQGC35F7QUPK3SHVR3VGT/


[Python-Dev] Re: python3 -bb and hash collisions

2019-06-21 Thread Daniel Holth
Thanks. I think I might like an option to disable str(bytes) without
disabling str != bytes. Unless the second operation would also corrupt
output.

Came across this kind of set in the hyper http library which uses a set to
accept certain headers with either str or bytes keys.
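
The situation can be reproduced with a short sketch that runs the set construction in a child interpreter, once without and once with `-bb`. It assumes CPython, where an ASCII-only str and the bytes with the same content hash identically; the variable names are illustrative.

```python
import subprocess
import sys

# In CPython an ASCII-only str and the equivalent bytes share a hash, so
# building the set forces an equality comparison between them.
same_hash = hash(u"foo") == hash(b"foo")

code = 'set([u"foo", b"foo"])'

# Without -b the comparison quietly returns False and both keys are kept.
plain = subprocess.run([sys.executable, "-c", code],
                       capture_output=True, text=True)

# With -bb the same comparison is turned into an error.
strict = subprocess.run([sys.executable, "-bb", "-c", code],
                        capture_output=True, text=True)
```

Inspecting `strict.stderr` shows the warning that was promoted to an exception.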

On Tue, Jun 18, 2019, 13:05 Christian Heimes  wrote:

> On 18/06/2019 18.32, Daniel Holth wrote:
> > set([u"foo", b"foo"]) will error because the two kinds of string have the
> > same hash, and this causes a comparison. Is that correct?
>
> Yes, it will fail with -bb, because it turns comparison between str and
> bytes into an error. This can also happen with other strings when
> hash(u'somestring') & mask == hash(b'otherbytes') & mask. The mask of a
> set starts with PySet_MINSIZE - 1 == 7 and increases over time.
>
> Christian
>
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R6E7FAR36UO6XHQSIAVF4DIM7G23ADJP/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-21 Thread Thomas Wouters
On Sun, Jun 2, 2019 at 7:57 AM Tim Peters  wrote:

> I don't think we need to cater anymore to careless code that mixes
> system memory calls with O calls (e.g., if an extension gets memory
> via `malloc()`, it's its responsibility to call `free()`), and if not
> then `address_in_range()` isn't really necessary anymore either, and
> then we could increase the pool size.  O would, however, need a new
> way to recognize when its version of malloc punted to the system
> malloc.
>

Is this really feasible in a world where the allocators can be selected
(and the default changed) at runtime? And what would be an efficient way of
detecting allocations punted to malloc, if not address_in_range?

Getting rid of address_in_range sounds like a nice idea, and I would love
to test how feasible it is -- I can run such a change against a wide
selection of code at work, including a lot of third-party extension
modules, but I don't see an easy way to do it right now.

-- 
Thomas Wouters 

Hi! I'm an email virus! Think twice before sending your email to help me
spread!
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OA4Z525KYYFHBTTVI5QXVTJH72ZCGQ2S/


[Python-Dev] Summary of Python tracker Issues

2019-06-21 Thread Python tracker


ACTIVITY SUMMARY (2019-06-14 - 2019-06-21)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    7038 (+16)
  closed 42078 (+65)
  total  49116 (+81)

Open issues with patches: 2828 


Issues opened (63)
==

#24214: UTF-8 incremental decoder doesn't support surrogatepass correc
https://bugs.python.org/issue24214  reopened by vstinner

#32846: Deletion of large sets of strings is extra slow
https://bugs.python.org/issue32846  reopened by inada.naoki

#34162: idlelib/NEWS.txt for 3.8.0 (and backports)
https://bugs.python.org/issue34162  reopened by terry.reedy

#34875: Change .js mime to "text/javascript"
https://bugs.python.org/issue34875  reopened by mylesborins

#35998: test_asyncio: test_start_tls_server_1() TimeoutError on Fedora
https://bugs.python.org/issue35998  reopened by vstinner

#36732: test_asyncio: test_huge_content_recvinto() fails randomly
https://bugs.python.org/issue36732  reopened by vstinner

#37285: Python 2.7 setup.py incorrectly double-joins SDKROOT
https://bugs.python.org/issue37285  opened by mistydemeo

#37287: picke cannot dump Exception subclasses with different super() 
https://bugs.python.org/issue37287  opened by bquinlan

#37289: regression in Cython when pickling objects
https://bugs.python.org/issue37289  opened by tcaswell

#37291: AST - code cleanup
https://bugs.python.org/issue37291  opened by David Carlier

#37292: _xxsubinterpreters: Can't unpickle objects defined in __main__
https://bugs.python.org/issue37292  opened by Crusader Ky

#37293: concurrent.futures.InterpreterPoolExecutor
https://bugs.python.org/issue37293  opened by Crusader Ky

#37294: concurrent.futures.ProcessPoolExecutor and multiprocessing.poo
https://bugs.python.org/issue37294  opened by maggyero

#37295: Possible optimizations for math.comb()
https://bugs.python.org/issue37295  opened by rhettinger

#37296: pdb next vs __next__
https://bugs.python.org/issue37296  opened by tsingi

#37297: function changed when pickle bound method object
https://bugs.python.org/issue37297  opened by georgexsh

#37298: IDLE: Revise html to tkinker converter for help.html
https://bugs.python.org/issue37298  opened by terry.reedy

#37301: CGIHTTPServer doesn't handle long POST requests
https://bugs.python.org/issue37301  opened by vsbogd

#37302: Add an "onerror" callback parameter to the tempfile.TemporaryD
https://bugs.python.org/issue37302  opened by Jeffrey.Kintscher

#37305: Add MIME type for Web App Manifest
https://bugs.python.org/issue37305  opened by filips123

#37307: isinstance/issubclass doc isn't clear on whether it's an AND o
https://bugs.python.org/issue37307  opened by leewz

#37308: Possible mojibake in mmap.mmap() when using the tagname parame
https://bugs.python.org/issue37308  opened by ZackerySpytz

#37309: idlelib/NEWS.txt for 3.9.0 and backports
https://bugs.python.org/issue37309  opened by terry.reedy

#37310: Solaris 11.3 w/ Studio 12.6 test_ctypes fail
https://bugs.python.org/issue37310  opened by gmarler

#37311: Solaris 11.3 w/ Studio 12.6 test_support fail
https://bugs.python.org/issue37311  opened by gmarler

#37313: test_concurrent_futures stopped after 25 hours on AMD64 Window
https://bugs.python.org/issue37313  opened by vstinner

#37314: Compilation failed on AMD64 Debian root 3.8: undefined referen
https://bugs.python.org/issue37314  opened by vstinner

#37317: asyncio gather doesn't handle custom exceptions that inherit f
https://bugs.python.org/issue37317  opened by cmermingas

#37319: Deprecate using random.randrange() with non-integers
https://bugs.python.org/issue37319  opened by serhiy.storchaka

#37322: test_ssl: test_pha_required_nocert() emits a ResourceWarning
https://bugs.python.org/issue37322  opened by vstinner

#37323: test_asyncio: test_debug_mode_interop() fails using -Werror
https://bugs.python.org/issue37323  opened by vstinner

#37324: collections: remove deprecated aliases to ABC classes
https://bugs.python.org/issue37324  opened by vstinner

#37326: Windows LICENSE.txt do not contain libffi license
https://bugs.python.org/issue37326  opened by indygreg

#37328: remove deprecated HTMLParser.unescape
https://bugs.python.org/issue37328  opened by inada.naoki

#37329: [2.7] valgrind python2 -m test.regrtest test___all__: definite
https://bugs.python.org/issue37329  opened by vstinner

#37330: open(): remove 'U' mode, deprecated since Python 3.3
https://bugs.python.org/issue37330  opened by vstinner

#37334: Add a cancel method to asyncio Queues
https://bugs.python.org/issue37334  opened by Martin.Teichmann

#37335: Add 646 ASCII alias to locale coercion tests.
https://bugs.python.org/issue37335  opened by kulikjak

#37336: os.sendfile() support missing for AIX platform
https://bugs.python.org/issue37336  opened by Michael.Felt

#37337: Add _PyObject_VectorcallMethod() function
https://bugs.python.org/issue37337  opened 

[Python-Dev] Re: bug(?) - unexpected frames being skipped in extract_stack with closures

2019-06-21 Thread Ed Peschko
Steven,

Yes and I posted to python-dev for a reason - I'm almost positive that
this is a bug - or at least an inconsistency - in how python handles
stack frames WRT closures in some instances.  In fact, the reason I
posted is because we hit this inconsistency in handling production
code - we need to have a reliable stack trace for all the functions we
call in logs so we can better track down issues when they occur and be
able to tie those issues to underlying code.

If I add a pdb.set_trace() to the location inside the closure, I get a
different - in fact the correct - stack trace. Otherwise, like I said,
the stack trace points back to the place where the closure was
defined, not the actual place the closure was called.

Unfortunately, it looks like this bug does not show up in a simple example
that I can readily reproduce (I just tried). So if I see it again I'll try
to simplify it to a point where it still manifests and post that.

Ed
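
For reference, here is a minimal sketch of the intended behaviour, a closure that logs the file and line of each call site. It is not a diagnosis of the report: it takes the caller's frame relative to the innermost frame (a negative index) instead of an absolute index from the outermost frame, and it writes to an in-memory buffer. The `log` object and the `"checkpoint"` message are hypothetical.

```python
import io
import traceback


def stack_trace_closure(message, fh):
    """Return a callable that logs the file and line of each call site."""

    def _helper():
        # extract_stack() lists frames oldest first, so the last entry is
        # _helper itself and the one before it is whoever called _helper.
        caller = traceback.extract_stack()[-2]
        fh.write("%s:%s - %s\n" % (caller.filename, caller.lineno, message))

    return _helper


log = io.StringIO()   # stand-in for the log file
_st = stack_trace_closure("checkpoint", log)
_st()
_st()
lines = log.getvalue().splitlines()
```

Because the index is counted from the innermost frame, the result does not change when extra outer frames (for example from a debugger) are on the stack.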

On Fri, Jun 21, 2019 at 1:35 AM Steve Holden  wrote:
>
> Hi Ed,
>
> Your note probably won't receive any other reply than this, because the 
> python-dev list is specifically for discussions about the development _of_, 
> rather than _with_, Python.
>
> A more appropriate forum is probably the Python list 
> (python-l...@python.org), about which you can discover more details at 
> Python-list Info Page.
>
> Kind regards,
> Steve Holden
>
>
> On Thu, Jun 20, 2019 at 3:40 AM Ed Peschko  wrote:
>>
>> all,
>>
>> I'm writing a function meant to print out the context of a given
>> function call when executed - for example:
>>
>> 1. def main():
>> 2.
>> 3.     _st = stack_trace_closure("/path/to/log")
>> 4.     _st()
>> 5.     _st()
>>
>> would print out
>>
>> /path/to/file.py:4
>> /path/to/file.py:5
>>
>> for each line when executed. Basic idea is to create a closure and
>> associate that closure with a filename, then run that closure to print
>> to the log without needing to give the filename over and over again.
>>
>> So far so good. But when I write this function, the frames given by
>> getframeinfo or extract_stack skip the actual calling point of the
>> function, instead giving back the *point where the closure was
>> defined*.  (in the above example, it would print /path/to/file.py:3,
>> /path/to/file.py:3 instead of incrementing to show 4 and 5).
>>
>> However, when I insert a pdb statement, it gives me the expected
>> calling frame where _st is actually called.
>>
>> What's going on here? It looks an awful lot like a bug to me, like an
>> extra frame is being optimized out of the closure's stack
>> prematurely.
>>
>> I've tried this in python2.7 and python3.3, both show this.
>>
>> thanks much for any info,
>>
>> Ed
>>
>> code follows:
>> ---
>>
>> import traceback
>>
>> def stack_trace_closure(message, file_name=None, frame=3):
>>     # Open the log once; the returned closure reuses the handle.
>>     fh = open(file_name, "w+")
>>
>>     def _helper():
>>         return stack_trace(message, frame, fh)
>>
>>     return _helper
>>
>> def stack_trace(_message, _frame, fh):
>>     # extract_stack() lists frames oldest first; _frame indexes that list.
>>     _bt = traceback.extract_stack()
>>     fh.write("%s:%s - %s" % (_bt[_frame][0], _bt[_frame][1], _message))
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DQBKRUI5ZMU6F3JHIVIZKT32DBOOQOLQ/


[Python-Dev] Re: _Py_Identifier should support non-ASCII string?

2019-06-21 Thread Antoine Pitrou
On Fri, 21 Jun 2019 12:22:21 +0900
Inada Naoki  wrote:
> On Fri, Jun 21, 2019 at 1:28 AM Victor Stinner  wrote:
> >
> > On Thu, 20 Jun 2019 at 11:15, Inada Naoki wrote:
> > > Can we change _PyUnicode_FromId to use _PyUnicode_FromASCII?  
> >
> > How would a developer detect a mistake (non-ASCII) character? Does
> > _PyUnicode_FromASCII() raise an exception, even in release mode?  
> 
> No.  That's one of the reasons why _PyUnicode_FromASCII is much faster
> than PyUnicode_FromString().

Much faster... how much? And how does it impact overall startup time?

Regards

Antoine.

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ABUSBBTU7RCCOE7636HAHAXWZUO66CFU/


[Python-Dev] Re: _Py_Identifier should support non-ASCII string?

2019-06-21 Thread Inada Naoki
OK.  I've already started optimizing PyUnicode_FromString().

It was 2x slower than _PyUnicode_FromASCII.
But it can be made only 1.5x slower than _PyUnicode_FromASCII.

And as a bonus, `b"foo".decode()` becomes 10% faster too.

-- 
Inada Naoki  
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3HL4M5MLA2KUIZCV6AFFXL67ZKMDSXTV/


[Python-Dev] Re: _Py_Identifier should support non-ASCII string?

2019-06-21 Thread Serhiy Storchaka

20.06.19 19:28, Victor Stinner wrote:
> On Thu, 20 Jun 2019 at 11:15, Inada Naoki wrote:
>> Can we change _PyUnicode_FromId to use _PyUnicode_FromASCII?
>
> How would a developer detect a mistake (non-ASCII) character? Does
> _PyUnicode_FromASCII() raise an exception, even in release mode?
>
> The function is only called once (that's the whole purpose of the
> _Py_IDENTIFIER() API). Is it really worth it?

I concur with Victor. The initialization code of the _Py_IDENTIFIER API
is not performance sensitive.

And looking at the cases where _PyUnicode_FromASCII is currently used, I
think that we can get rid of _PyUnicode_FromASCII in most of them for
performance reasons. For example, when formatting a complex number we first
create two dynamically allocated 8-bit buffers in PyOS_double_to_string, then
convert them to Unicode objects using _PyUnicode_FromASCII, then parse
them, then build the final result in several steps using the
_PyUnicodeWriter API. I think it can be done in a more optimal way.

So it may be that we will remove _PyUnicode_FromASCII in the end.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KAORN7DBHUT5OHAYC7DH24XNVH2T6Q5Y/


[Python-Dev] Re: bug(?) - unexpected frames being skipped in extract_stack with closures

2019-06-21 Thread Steve Holden
Hi Ed,

Your note probably won't receive any other reply than this, because the
python-dev list is specifically for discussions about the development _of_,
rather than _with_, Python.

A more appropriate forum is probably the Python list (python-l...@python.org),
about which you can discover more details at Python-list Info Page.

Kind regards,
Steve Holden


On Thu, Jun 20, 2019 at 3:40 AM Ed Peschko  wrote:

> all,
>
> I'm writing a function meant to print out the context of a given
> function call when executed - for example:
>
> 1. def main():
> 2.
> 3.     _st = stack_trace_closure("/path/to/log")
> 4.     _st()
> 5.     _st()
>
> would print out
>
> /path/to/file.py:4
> /path/to/file.py:5
>
> for each line when executed. Basic idea is to create a closure and
> associate that closure with a filename, then run that closure to print
> to the log without needing to give the filename over and over again.
>
> So far so good. But when I write this function, the frames given by
> getframeinfo or extract_stack skip the actual calling point of the
> function, instead giving back the *point where the closure was
> defined*.  (in the above example, it would print /path/to/file.py:3,
> /path/to/file.py:3 instead of incrementing to show 4 and 5).
>
> However, when I insert a pdb statement, it gives me the expected
> calling frame where _st is actually called.
>
> What's going on here? It looks an awful lot like a bug to me, like an
> extra frame is being optimized out of the closure's stack
> prematurely.
>
> I've tried this in python2.7 and python3.3, both show this.
>
> thanks much for any info,
>
> Ed
>
> code follows:
> ---
>
> import traceback
>
> def stack_trace_closure(message, file_name=None, frame=3):
>     # Open the log once; the returned closure reuses the handle.
>     fh = open(file_name, "w+")
>
>     def _helper():
>         return stack_trace(message, frame, fh)
>
>     return _helper
>
> def stack_trace(_message, _frame, fh):
>     # extract_stack() lists frames oldest first; _frame indexes that list.
>     _bt = traceback.extract_stack()
>     fh.write("%s:%s - %s" % (_bt[_frame][0], _bt[_frame][1], _message))
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RVIY2MGM5DP4R7U2PT4SV5SVCU7NAC7D/