[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-17 Thread Ma Lin


Ma Lin  added the comment:

The last benchmark was wrong: the /Ob3 option was not enabled.

With `pgo_ob3.diff` applied, it is slower, so I am closing this issue.

+-++--+
| Benchmark   | py39_pgo_a | py39_pgo_b   |
+=++==+
| 2to3| 461 ms | 465 ms: 1.01x slower (+1%)   |
+-++--+
| chameleon   | 13.4 ms| 13.7 ms: 1.03x slower (+3%)  |
+-++--+
| chaos   | 138 ms | 141 ms: 1.02x slower (+2%)   |
+-++--+
| crypto_pyaes| 141 ms | 143 ms: 1.01x slower (+1%)   |
+-++--+
| deltablue   | 9.01 ms| 9.20 ms: 1.02x slower (+2%)  |
+-++--+
| django_template | 64.7 ms| 65.4 ms: 1.01x slower (+1%)  |
+-++--+
| dulwich_log | 78.2 ms| 78.8 ms: 1.01x slower (+1%)  |
+-++--+
| fannkuch| 640 ms | 668 ms: 1.04x slower (+4%)   |
+-++--+
| float   | 165 ms | 163 ms: 1.01x faster (-1%)   |
+-++--+
| genshi_text | 40.7 ms| 41.5 ms: 1.02x slower (+2%)  |
+-++--+
| genshi_xml  | 87.2 ms| 88.4 ms: 1.01x slower (+1%)  |
+-++--+
| go  | 309 ms | 314 ms: 1.01x slower (+1%)   |
+-++--+
| hexiom  | 12.3 ms| 12.7 ms: 1.03x slower (+3%)  |
+-++--+
| json_dumps  | 16.7 ms| 16.8 ms: 1.01x slower (+1%)  |
+-++--+
| json_loads  | 32.1 us| 32.5 us: 1.01x slower (+1%)  |
+-++--+
| logging_format  | 14.6 us| 15.0 us: 1.03x slower (+3%)  |
+-++--+
| logging_silent  | 247 ns | 257 ns: 1.04x slower (+4%)   |
+-++--+
| logging_simple  | 13.2 us| 13.6 us: 1.03x slower (+3%)  |
+-++--+
| mako| 22.1 ms| 22.8 ms: 1.03x slower (+3%)  |
+-++--+
| meteor_contest  | 135 ms | 137 ms: 1.01x slower (+1%)   |
+-++--+
| nbody   | 184 ms | 191 ms: 1.04x slower (+4%)   |
+-++--+
| nqueens | 132 ms | 137 ms: 1.04x slower (+4%)   |
+-++--+
| pathlib | 156 ms | 162 ms: 1.04x slower (+4%)   |
+-++--+
| pickle  | 16.3 us| 15.4 us: 1.05x faster (-5%)  |
+-++--+
| pickle_dict | 39.7 us| 40.0 us: 1.01x slower (+1%)  |
+-++--+
| pickle_list | 5.93 us| 6.15 us: 1.04x slower (+4%)  |
+-++--+
| pickle_pure_python  | 581 us | 587 us: 1.01x slower (+1%)   |
+-++--+
| pidigits| 243 ms | 242 ms: 1.00x faster (-0%)   |
+-++--+
| pyflate | 885 ms | 908 ms: 1.03x slower (+3%)   |
+-++--+
| python_startup  | 27.8 ms| 28.0 ms: 1.01x slower (+1%)  |
+-++--+
| python_startup_no_site  | 22.0 ms| 22.1 ms: 1.00x slower (+0%)  |
+-++--+
| raytrace| 630 ms | 632 ms: 1.00x slower (+0%)   |
+-++--+
| regex_compile   | 215

[issue42369] Reading ZipFile not thread-safe

2020-11-16 Thread Ma Lin


Change by Ma Lin :


--
nosy: +malin




[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-16 Thread Ma Lin


Ma Lin  added the comment:

In a PGO build, the improvement is not significant.

(3.9 branch, with PGO, build.bat -p X64 --pgo)

+-+--+--+
| Benchmark   | baseline-pgo | ob3-pgo  |
+=+==+==+
| 2to3| 464 ms   | 462 ms: 1.01x faster (-1%)   |
+-+--+--+
| chameleon   | 14.0 ms  | 13.5 ms: 1.03x faster (-3%)  |
+-+--+--+
| crypto_pyaes| 142 ms   | 143 ms: 1.00x slower (+0%)   |
+-+--+--+
| django_template | 65.0 ms  | 65.4 ms: 1.01x slower (+1%)  |
+-+--+--+
| fannkuch| 665 ms   | 650 ms: 1.02x faster (-2%)   |
+-+--+--+
| float   | 166 ms   | 164 ms: 1.01x faster (-1%)   |
+-+--+--+
| genshi_text | 41.4 ms  | 41.0 ms: 1.01x faster (-1%)  |
+-+--+--+
| genshi_xml  | 88.1 ms  | 87.0 ms: 1.01x faster (-1%)  |
+-+--+--+
| go  | 315 ms   | 311 ms: 1.01x faster (-1%)   |
+-+--+--+
| hexiom  | 12.7 ms  | 12.6 ms: 1.01x faster (-1%)  |
+-+--+--+
| json_dumps  | 16.7 ms  | 16.6 ms: 1.01x faster (-1%)  |
+-+--+--+
| json_loads  | 33.5 us  | 32.1 us: 1.04x faster (-4%)  |
+-+--+--+
| logging_simple  | 13.6 us  | 13.3 us: 1.02x faster (-2%)  |
+-+--+--+
| mako| 22.7 ms  | 22.8 ms: 1.01x slower (+1%)  |
+-+--+--+
| meteor_contest  | 136 ms   | 138 ms: 1.01x slower (+1%)   |
+-+--+--+
| nbody   | 189 ms   | 186 ms: 1.02x faster (-2%)   |
+-+--+--+
| nqueens | 135 ms   | 135 ms: 1.01x faster (-1%)   |
+-+--+--+
| pathlib | 157 ms   | 154 ms: 1.02x faster (-2%)   |
+-+--+--+
| pickle  | 16.8 us  | 16.4 us: 1.02x faster (-2%)  |
+-+--+--+
| pickle_dict | 41.3 us  | 40.4 us: 1.02x faster (-2%)  |
+-+--+--+
| pickle_list | 6.34 us  | 6.42 us: 1.01x slower (+1%)  |
+-+--+--+
| pickle_pure_python  | 588 us   | 584 us: 1.01x faster (-1%)   |
+-+--+--+
| pidigits| 242 ms   | 242 ms: 1.00x faster (-0%)   |
+-+--+--+
| pyflate | 905 ms   | 898 ms: 1.01x faster (-1%)   |
+-+--+--+
| python_startup  | 28.0 ms  | 27.9 ms: 1.00x faster (-0%)  |
+-+--+--+
| regex_compile   | 220 ms   | 218 ms: 1.01x faster (-1%)   |
+-+--+--+
| regex_v8| 33.1 ms  | 32.9 ms: 1.01x faster (-1%)  |
+-+--+--+
| richards| 88.9 ms  | 88.3 ms: 1.01x faster (-1%)  |
+-+--+--+
| scimark_fft | 494 ms   | 486 ms: 1.02x faster (-2%)   |
+-+--+--+
| scimark_lu  | 210 ms   | 207 ms: 1.02x faster (-2%)   |
+-+--+--+
| scimark_monte_carlo | 141 ms   | 137 ms: 1.03x faster (-3%)   |
+-+--+--+
| scimark_sor | 263 ms   | 255 ms: 1.03x faster

[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-16 Thread Ma Lin


Ma Lin  added the comment:

> Could you please try again with PGO?

Please wait.

BTW, this option was suggested in another project.
In that project, even with `/Ob3` enabled, the build is still slower than a GCC 9 build.
If you are interested, see: https://github.com/facebook/zstd/issues/2314

--




[issue42366] Use MSVC2019 and /Ob3 option to compile Windows builds

2020-11-16 Thread Ma Lin


New submission from Ma Lin :

MSVC 2019 has a new option, `/Ob3`, which specifies more aggressive inlining 
than `/Ob2`:
https://docs.microsoft.com/en-us/cpp/build/reference/ob-inline-function-expansion?view=msvc-160

If this option is used with MSVC 2017, it emits a warning:
cl : Command line warning D9002 : ignoring unknown option '/Ob3'

Simply applying `Ob3.diff` gives this improvement:
(Python 3.9 branch, no PGO, build.bat -p X64)

+-+--+--+
| Benchmark   | baseline | ob3  |
+=+==+==+
| 2to3| 563 ms   | 552 ms: 1.02x faster (-2%)   |
+-+--+--+
| chameleon   | 16.5 ms  | 16.1 ms: 1.03x faster (-3%)  |
+-+--+--+
| chaos   | 200 ms   | 197 ms: 1.02x faster (-2%)   |
+-+--+--+
| crypto_pyaes| 186 ms   | 184 ms: 1.01x faster (-1%)   |
+-+--+--+
| deltablue   | 13.0 ms  | 12.6 ms: 1.03x faster (-3%)  |
+-+--+--+
| dulwich_log | 94.5 ms  | 93.9 ms: 1.01x faster (-1%)  |
+-+--+--+
| fannkuch| 806 ms   | 761 ms: 1.06x faster (-6%)   |
+-+--+--+
| float   | 211 ms   | 199 ms: 1.06x faster (-6%)   |
+-+--+--+
| genshi_text | 48.3 ms  | 47.7 ms: 1.01x faster (-1%)  |
+-+--+--+
| go  | 446 ms   | 437 ms: 1.02x faster (-2%)   |
+-+--+--+
| hexiom  | 16.6 ms  | 15.9 ms: 1.04x faster (-4%)  |
+-+--+--+
| json_dumps  | 19.9 ms  | 19.3 ms: 1.03x faster (-3%)  |
+-+--+--+
| json_loads  | 45.5 us  | 43.9 us: 1.04x faster (-3%)  |
+-+--+--+
| logging_format  | 21.4 us  | 20.7 us: 1.03x faster (-3%)  |
+-+--+--+
| logging_silent  | 343 ns   | 319 ns: 1.07x faster (-7%)   |
+-+--+--+
| mako| 29.0 ms  | 27.6 ms: 1.05x faster (-5%)  |
+-+--+--+
| meteor_contest  | 168 ms   | 162 ms: 1.04x faster (-3%)   |
+-+--+--+
| nbody   | 256 ms   | 244 ms: 1.05x faster (-5%)   |
+-+--+--+
| nqueens | 168 ms   | 162 ms: 1.04x faster (-4%)   |
+-+--+--+
| pathlib | 175 ms   | 168 ms: 1.04x faster (-4%)   |
+-+--+--+
| pickle  | 17.9 us  | 17.3 us: 1.04x faster (-4%)  |
+-+--+--+
| pickle_dict | 41.0 us  | 33.2 us: 1.24x faster (-19%) |
+-+--+--+
| pickle_list | 6.73 us  | 5.89 us: 1.14x faster (-12%) |
+-+--+--+
| pickle_pure_python  | 829 us   | 793 us: 1.05x faster (-4%)   |
+-+--+--+
| pidigits| 243 ms   | 243 ms: 1.00x faster (-0%)   |
+-+--+--+
| pyflate | 1.21 sec | 1.18 sec: 1.03x faster (-2%) |
+-+--+--+
| raytrace| 947 ms   | 915 ms: 1.03x faster (-3%)   |
+-+--+--+
| regex_compile   | 291 ms   | 284 ms: 1.03x faster (-2%)   |
+-+--+--+
| regex_dna   | 217 ms   | 222 ms: 1.02x slower (+2%)   |
+-+--+--+
| regex_effbot| 3.97 ms  | 4.13 ms: 1.04x slower (+4%)  |
+-+--+--+
| regex_v8| 35.2 ms  | 34.6 ms: 1.02x faster (-2%)  |
+-+--+--+
| richards

[issue42304] [easy C] long type performance waste in 64-bit Windows build

2020-11-10 Thread Ma Lin


Ma Lin  added the comment:

> I do not think that this is suitable for newcomers because you need to have 
> deep understanding why it was written in such form at first place and what 
> will be changed if you change it.

I agree that contributors need to understand the code, rather than simply 
replacing the type. Maybe two weeks is enough to understand the code.

> And it could negatively affect performance, especially on 32-bit platforms.

The `long` type can be replaced by `ssize_t`, and `unsigned long` by `size_t`.
Then use `PyLong_FromSsize_t`/`PyLong_FromSize_t`, and there is no negative 
impact.

> I don't think that it's worth it to optimize this one.

Although the speedup is small, it's free.
I don't see it as an optimization, just as removing waste.

> I suggest to fix in it bpo-38252.

I forgot about it in that issue; I only searched for "0x80808080" in the code, 
so this one was missed.

--




[issue42304] [easy C] long type performance waste in 64-bit Windows build

2020-11-10 Thread Ma Lin


Ma Lin  added the comment:

> What is the problem exactly?

There are several different problems, such as:
https://github.com/python/cpython/blob/v3.10.0a2/Modules/mathmodule.c#L2033

In addition, `utf16_decode` also has this problem; I forgot about this one:
https://github.com/python/cpython/blob/v3.10.0a2/Objects/stringlib/codecs.h#L465

Maybe these small problems are suitable for newcomers to familiarize themselves 
with the contribution process.

--




[issue42304] [easy C] long type performance waste in 64-bit Windows build

2020-11-09 Thread Ma Lin


New submission from Ma Lin :

The C type `long` is a 4-byte integer in 64-bit Windows builds (MSVC behavior). [1]
With other compilers, `long` is an 8-byte integer in 64-bit builds.

This leads to a bit of unnecessary performance waste; issue38252 fixed this 
problem in one place.

Searching for `SIZEOF_LONG` in the CPython code shows there are still a few 
wasteful uses of the long type.

Newcomers are welcome to try contributing.

[1] https://stackoverflow.com/questions/384502
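A quick way to see the size difference from Python (sizes shown are for common 
64-bit platforms):

import ctypes
print(ctypes.sizeof(ctypes.c_long))     # 4 on 64-bit Windows (LLP64), 8 on 64-bit Linux/macOS (LP64)
print(ctypes.sizeof(ctypes.c_ssize_t))  # 8 on any common 64-bit platform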

--
components: Windows
messages: 380638
nosy: malin, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: [easy C] long type performance waste in 64-bit Windows build
type: performance
versions: Python 3.10




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-10-28 Thread Ma Lin


Ma Lin  added the comment:

I modified the lzma module to use different growth factors; see the attached 
picture different_factors.png.

1.5x should be the growth factor of _PyBytesWriter under Windows.

So if _PyBytesWriter were changed to use memory blocks, maybe there would be no 
performance improvement.

Overallocation factor of _PyBytesWriter:

# ifdef MS_WINDOWS
# define OVERALLOCATE_FACTOR 2
# else
# define OVERALLOCATE_FACTOR 4
# endif

(I'm using Windows 10)

--
Added file: https://bugs.python.org/file49544/different_factors.png




[issue38252] Use 8-byte step to detect ASCII sequence in 64bit Windows builds

2020-10-16 Thread Ma Lin


Ma Lin  added the comment:

Although the improvement is not great, it's a very hot code path.

Could you review the PR?
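For reference, the idea behind the 8-byte step is to test the high bit of eight 
bytes at once. A rough pure-Python illustration of that check (the C code uses a 
size_t load; the byte strings here are just examples):

ASCII_MASK = 0x8080808080808080
ascii_chunk = b"Hello wo"                          # 8 ASCII bytes
word = int.from_bytes(ascii_chunk, "little")
print((word & ASCII_MASK) == 0)                    # True: no byte has its high bit set

non_ascii = "Héllo wo".encode("utf-8")[:8]         # contains 0xC3, high bit set
print((int.from_bytes(non_ascii, "little") & ASCII_MASK) == 0)   # False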

--
components: +Windows
nosy: +paul.moore, tim.golden




[issue41735] Thread locks in zlib module may go wrong in rare case

2020-09-07 Thread Ma Lin


Change by Ma Lin :


--
pull_requests: +21213
pull_request: https://github.com/python/cpython/pull/22132




[issue41735] Thread locks in zlib module may go wrong in rare case

2020-09-07 Thread Ma Lin


Change by Ma Lin :


--
pull_requests: +21211
pull_request: https://github.com/python/cpython/pull/22130




[issue41735] Thread locks in zlib module may go wrong in rare case

2020-09-06 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +21208
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/22126




[issue41735] Thread locks in zlib module may go wrong in rare case

2020-09-06 Thread Ma Lin


New submission from Ma Lin :

The code in the zlib module:

self->zst.next_in = data->buf;  // set next_in
...
ENTER_ZLIB(self);   // acquire thread lock

`self->zst` is a `z_stream` struct defined in zlib, used to record the state of 
a compress/decompress stream:

typedef struct z_stream_s {
    Bytef    *next_in;   /* next input byte */
    uInt     avail_in;   /* number of bytes available at next_in */
    uLong    total_in;   /* total number of input bytes read so far */

    Bytef    *next_out;  /* next output byte will go here */
    uInt     avail_out;  /* remaining free space at next_out */
    uLong    total_out;  /* total number of bytes output so far */

    ...                  /* other states */
} z_stream;

Setting `next_in` before acquiring the thread lock may mix up the 
compress/decompress state of other threads.

Moreover, the PR modifies the `ENTER_ZLIB` macro so it doesn't release the GIL 
when the thread lock can be acquired immediately. This is the same behavior as 
the bz2/lzma modules.
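A rough Python sketch of the two points above (illustrative only, not the C 
implementation): per-call stream state is touched only after the lock is held, 
and an uncontended lock is taken with a non-blocking acquire first; the 
contended path is where the C code would release the GIL.

import threading, zlib

lock = threading.Lock()
compressor = zlib.compressobj()

def locked_compress(data):
    if not lock.acquire(blocking=False):   # uncontended fast path
        lock.acquire()                     # contended: block (C would release the GIL here)
    try:
        # only modify shared stream state (next_in etc.) while holding the lock
        return compressor.compress(data)
    finally:
        lock.release()

print(len(locked_compress(b"a" * 10000)))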

--
components: Library (Lib)
messages: 376473
nosy: malin
priority: normal
severity: normal
status: open
title: Thread locks in zlib module may go wrong in rare case
type: behavior
versions: Python 3.10, Python 3.8, Python 3.9




[issue37095] [Feature Request]: Add zstd support in tarfile

2020-08-29 Thread Ma Lin


Ma Lin  added the comment:

I have spent two weeks and the code is almost complete; here is a preview:
https://github.com/animalize/cpython/pull/8/files

It is written directly for the stdlib, since there are already zstd modules on PyPI.
In addition, the zstd API is simple, not as complicated as lzma's.

It can also use these:
1. Argument Clinic
2. multi-phase init
3. the internal function _PyLong_AsInt

--




[issue35228] Index search in CHM help crashes viewer

2020-08-28 Thread Ma Lin


Ma Lin  added the comment:

> when I delete the file %APPDATA%\Microsoft\HTML Help\hh.dat,
> the problem seems to go away.

It doesn't work for me.
Moreover, `Binary Index=Yes` no longer works on my PC.

A few days ago, I installed a clean Windows 10 2004; since then, the CHM index 
cannot be used.

--




[issue35228] Index search in CHM help crashes viewer

2020-08-27 Thread Ma Lin


Ma Lin  added the comment:

> More realistically, including the docs as unbundled HTML files
> and relying on the default browser is probably an all-around better idea.

The CHM index function is very convenient; I almost always use this feature 
when I use CHM.

How about using tkinter to write a doc indexing tool that reads the index 
entries from this page:
https://docs.python.org/3/genindex-all.html

Its behavior would be the same as the CHM index, except that the link is opened 
in the browser.

This indexing tool could be packaged with the Python installer, just like IDLE; 
then macOS/Linux users could also use the index.
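A minimal command-line sketch of that idea (no tkinter UI yet; the URL and the 
crude HTML parsing are assumptions for illustration):

import sys, urllib.request, webbrowser
from html.parser import HTMLParser

INDEX_URL = "https://docs.python.org/3/genindex-all.html"

class IndexParser(HTMLParser):
    # Collects (link text, href) pairs from the index page.
    def __init__(self):
        super().__init__()
        self.links, self._href, self._text = [], None, []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href, self._text = dict(attrs).get("href"), []
    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)
    def handle_endtag(self, tag):
        if tag == "a" and self._href:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

def lookup(term):
    parser = IndexParser()
    parser.feed(urllib.request.urlopen(INDEX_URL).read().decode("utf-8"))
    matches = [(t, h) for t, h in parser.links if term.lower() in t.lower()]
    for text, href in matches[:20]:
        print(text, "->", href)
    if matches:                                   # open the first hit in the browser
        webbrowser.open("https://docs.python.org/3/" + matches[0][1])

if __name__ == "__main__":
    lookup(sys.argv[1] if len(sys.argv) > 1 else "zipfile")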

--




[issue37095] [Feature Request]: Add zstd support in tarfile

2020-08-15 Thread Ma Lin


Ma Lin  added the comment:

There are two zstd modules on pypi:

https://pypi.org/project/zstd/
https://pypi.org/project/zstandard/

The first one is too simple.

The second one is powerful, but has too many APIs:
ZstdCompressorIterator
ZstdDecompressorIterator
ZstdCompressionReader
ZstdCompressionWriter
ZstdCompressionChunkerIterator
(multi-thread compression)

IMO these are not necessary for the stdlib.

In addition, some things need to be added, such as a `max_length` parameter and 
a `ZstdFile` class that can be integrated with the tarfile module. This is not 
a big workload.

I looked at the zstd API; it's a bit simpler than lzma/bz2/zlib. With a month of 
work, it should be possible to make a zstd module for the stdlib, then discuss 
the detailed API on Python-Ideas.

I once wanted to do this job, but my time doesn't seem to allow it. If anyone 
wants to take this on, please reply here.

FYI, Python 3.10 schedule:
3.10.0 beta 1: 2021-05-03 (No new features beyond this point.)

--




[issue41555] re.sub replaces twice

2020-08-15 Thread Ma Lin


Ma Lin  added the comment:

There can be at most one empty match at a given position. IIRC, Perl's regex 
engine has very similar behavior.
If you don't want empty matches, using + instead of * is fine.
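A small example of the difference (output shown for 3.7+):

import re
print(re.sub('x*', '-', 'abxd'))   # '-a-b--d-': the empty match right after 'x' is also replaced
print(re.sub('x+', '-', 'abxd'))   # 'ab-d': no empty matches at all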

--




[issue41555] re.sub replaces twice

2020-08-14 Thread Ma Lin


Ma Lin  added the comment:

The re.sub() documentation says:
Changed in version 3.7: Empty matches for the pattern are replaced when 
adjacent to a previous non-empty match.

IMO the 3.7+ behavior is more reasonable, and it fixed a bug; see issue25054.

--
nosy: +malin




[issue41265] lzma/bz2 module: inefficient buffer growth algorithm

2020-08-05 Thread Ma Lin


Ma Lin  added the comment:

A more thorough solution is used instead; see issue41486.

So I am closing this issue.

--
stage:  -> resolved
status: open -> closed




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-08-05 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +20886
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/21740




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-08-05 Thread Ma Lin


Change by Ma Lin :


Added file: https://bugs.python.org/file49368/benchmark_real.py




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-08-05 Thread Ma Lin


Change by Ma Lin :


Added file: https://bugs.python.org/file49367/benchmark.py




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-08-05 Thread Ma Lin


Change by Ma Lin :


Added file: https://bugs.python.org/file49365/0to200MB_step2MB.png




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-08-05 Thread Ma Lin


Change by Ma Lin :


Added file: https://bugs.python.org/file49366/0to20MB_step64KB.png




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-08-05 Thread Ma Lin


Change by Ma Lin :


Added file: https://bugs.python.org/file49364/0to2GB_step30MB.png




[issue41486] Add _BlocksOutputBuffer for bz2/lzma/zlib module

2020-08-05 Thread Ma Lin

New submission from Ma Lin :

  bz2/lzma module's current growth algorithm

bz2/lzma module's initial output buffer size is 8KB [1][2], and they are using 
this output buffer growth algorithm [3][4]:

newsize = size + (size >> 3) + 6

[1] https://github.com/python/cpython/blob/v3.9.0b4/Modules/_bz2module.c#L109
[2] https://github.com/python/cpython/blob/v3.9.0b4/Modules/_lzmamodule.c#L124
[3] https://github.com/python/cpython/blob/v3.9.0b4/Modules/_lzmamodule.c#L133
[4] https://github.com/python/cpython/blob/v3.9.0b4/Modules/_bz2module.c#L121

For many cases, the output buffer is resized too many times.
You can paste this code into a REPL to see the growth steps:

size = 8*1024
for i in range(1, 120):
    print('Step %d ' % i, format(size, ','), 'bytes')
    size = size + (size >> 3) + 6

Step 1  8,192 bytes
Step 2  9,222 bytes
Step 3  10,380 bytes
Step 4  11,683 bytes
Step 5  13,149 bytes
Step 6  14,798 bytes
...

  zlib module's current growth algorithm

The zlib module's initial output buffer size is 16 KB [5]; in each growth step 
the buffer size doubles [6].

[5] https://github.com/python/cpython/blob/v3.9.0b4/Modules/zlibmodule.c#L32
[6] https://github.com/python/cpython/blob/v3.9.0b4/Modules/zlibmodule.c#L174

This algorithm has a higher risk of running out of memory:

...
Step 14  256 MB
Step 15  512 MB
Step 16  1 GB
Step 17  2 GB
Step 18  4 GB
Step 19  8 GB
Step 20  16 GB
Step 21  32 GB
Step 22  64 GB
...

  Add _BlocksOutputBuffer for bz2/lzma/zlib module

The proposed PR uses a list of bytes objects to represent the output buffer.
It eliminates the overhead of resizing (bz2/lzma) and prevents an excessive 
memory footprint (zlib).
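A rough pure-Python analogy of the idea, using the zlib module (this only 
illustrates "collect fixed-size blocks, join once"; the block size and demo data 
are arbitrary, and the real patch works at the C level):

import zlib

def decompress_in_blocks(payload, block_size=64 * 1024):
    d = zlib.decompressobj()
    blocks = []                         # fixed-size output chunks, no resizing
    data = payload
    while not d.eof:
        blocks.append(d.decompress(data, block_size))   # at most block_size bytes out
        data = d.unconsumed_tail                        # feed back unconsumed input
    return b"".join(blocks)             # single final copy

original = b"a" * (10 * 1024 * 1024)
assert decompress_in_blocks(zlib.compress(original)) == original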

I only tested decompression, because the result is more obvious than 
compression.

For a benchmark with special data (all of the data consists of b'a'), see the 
attached pictures; _BlocksOutputBuffer has linear performance:
(Benchmark by attached file benchmark.py)

0to2GB_step30MB.png(Decompress from 0 to 2GB, 30MB step)
0to200MB_step2MB.png   (Decompress from 0 to 200MB, 2MB step)
0to20MB_step64KB.png   (Decompress from 0 to 20MB, 64KB step)

After switching to _BlocksOutputBuffer, the bz2/lzma code is more concise, and 
the zlib code is basically translated statement by statement, so IMO it's safe 
and easy to review.

  Real data benchmark

For real data, resizing the output buffer accounts for a smaller share of the 
total time, so the performance improvement is not as big as in the pictures above:
(Benchmark by attached file benchmark_real.py)

- bz2 -

linux-2.6.39.4.tar.bz2
input size: 76,097,195, output size: 449,638,400
best of 5: [baseline_raw] 12.954 sec -> [patched_raw] 11.600 sec, 1.12x faster 
(-10%)

firefox-79.0.linux-i686.tar.bz2
input size: 74,109,706, output size: 228,055,040
best of 5: [baseline_raw] 8.511 sec -> [patched_raw] 7.829 sec, 1.09x faster 
(-8%)

ffmpeg-4.3.1.tar.bz2
input size: 11,301,038, output size: 74,567,680
best of 5: [baseline_raw] 1.915 sec -> [patched_raw] 1.671 sec, 1.15x faster 
(-13%)

gimp-2.10.20.tar.bz2
input size: 33,108,938, output size: 214,179,840
best of 5: [baseline_raw] 5.794 sec -> [patched_raw] 4.964 sec, 1.17x faster 
(-14%)

sde-external-8.56.0-2020-07-05-lin.tar.bz2
input size: 26,746,086, output size: 92,129,280
best of 5: [baseline_raw] 3.153 sec -> [patched_raw] 2.835 sec, 1.11x faster 
(-10%)

- lzma -

linux-5.7.10.tar.xz
input size: 112,722,840, output size: 966,062,080
best of 5: [baseline_raw] 9.813 sec -> [patched_raw] 7.434 sec, 1.32x faster 
(-24%)

linux-2.6.39.4.tar.xz
input size: 63,243,812, output size: 449,638,400
best of 5: [baseline_raw] 5.256 sec -> [patched_raw] 4.200 sec, 1.25x faster 
(-20%)

gcc-9.3.0.tar.xz
input size: 70,533,868, output size: 618,608,640
best of 5: [baseline_raw] 6.398 sec -> [patched_raw] 4.878 sec, 1.31x faster 
(-24%)

Python-3.8.5.tar.xz
input size: 18,019,640, output size: 87,531,520
best of 5: [baseline_raw] 1.315 sec -> [patched_raw] 1.098 sec, 1.20x faster 
(-16%)

firefox-79.0.source.tar.xz
input size: 333,220,776, output size: 2,240,573,440
best of 5: [baseline_raw] 25.339 sec -> [patched_raw] 19.661 sec, 1.29x faster 
(-22%)

- zlib -

linux-5.7.10.tar.gz
input size: 175,493,557, output size: 966,062,080
best of 5: [baseline_raw] 2.360 sec -> [patched_raw] 2.401 sec, 1.02x slower 
(+2%)

linux-2.6.39.4.tar.gz
input size: 96,011,459, output size: 449,638,400
best of 5: [baseline_raw] 1.215 sec -> [patched_raw] 1.216 sec, 1.00x slower 
(+0%)

gcc-9.3.0.tar.gz
input size: 124,140,228, output size: 618,608,640
best of 5: [baseline_raw] 1.668 sec -> [patched_raw] 1.555 sec, 1.07x faster 
(-7%)

Python-3.8.5.tgz
input size: 24,149,093, output size: 87,531,520
best of 5: [baseline_raw] 0.263 sec -> [patched_raw] 0.253 sec, 1.04x faster 
(-4%)

openjdk-14.0.2_linux-x64_bin.tar.gz
input size: 198,606,190, output size: 335,175,680
best of 5: [baseline_raw] 1

[issue41330] Inefficient error-handle for CJK encodings

2020-08-03 Thread Ma Lin


Ma Lin  added the comment:

I'm working on issue41265.
If nothing happens, I would also like to write a zstd module for the stdlib 
before the end of the year, but I dare not promise this.

If anyone wants to work on this issue, I would be very grateful.

--




[issue41452] Inefficient BufferedReader.read(-1)

2020-08-01 Thread Ma Lin


Ma Lin  added the comment:

Some underlying streams have a fast path for .readall().
So I am closing this issue.

--
stage: patch review -> resolved
status: open -> closed




[issue41330] Inefficient error-handle for CJK encodings

2020-07-31 Thread Ma Lin


Ma Lin  added the comment:

At least this bug should be fixed:

the error-handler object is not cached, it needs to be
looked up from a dict every time, which is very inefficient.

The code:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/cjkcodecs/multibytecodec.c#L81-L98

I will submit a PR at some point.

--




[issue41452] Inefficient BufferedReader.read(-1)

2020-07-31 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +20842
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/21698




[issue41452] Inefficient BufferedReader.read(-1)

2020-07-31 Thread Ma Lin


New submission from Ma Lin :

BufferedReader's constructor has a `buffer_size` parameter; it's the size of 
this buffer:

When reading data from BufferedReader object, a larger
amount of data may be requested from the underlying raw
stream, and kept in an internal buffer.

The doc of BufferedReader[1]


When calling the BufferedReader.read(size) function:

1. When `size` is a positive number, it reads `buffer_size`
   bytes at a time from the underlying stream. This is the expected behavior.

2. When `size` is -1, it tries to call the underlying stream's
   readall() function [2]. In this case `buffer_size` is not
   respected.

   The underlying stream may be a `RawIOBase`; its readall()
   function reads `DEFAULT_BUFFER_SIZE` bytes per call [3].

   `DEFAULT_BUFFER_SIZE` is currently only 8 KB, which is very
   inefficient for BufferedReader.read(-1). If `buffer_size`
   bytes were read each time, performance would be as expected.

The attached file demonstrates this problem.
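A self-contained sketch of the same effect (`CountingRaw` is a made-up helper; 
the exact call count depends on the implementation, but it is on the order of a 
thousand small reads instead of ~8 large ones):

import io

class CountingRaw(io.RawIOBase):
    # Serves `size` bytes of b"x" and counts how many read calls reach it.
    def __init__(self, size):
        self.remaining, self.calls = size, 0
    def readable(self):
        return True
    def readinto(self, b):
        self.calls += 1
        n = min(len(b), self.remaining)
        b[:n] = b"x" * n
        self.remaining -= n
        return n

raw = CountingRaw(8 * 1024 * 1024)
reader = io.BufferedReader(raw, buffer_size=1024 * 1024)
data = reader.read(-1)        # goes through RawIOBase.readall(), 8 KB at a time
print(len(data), raw.calls)   # buffer_size is effectively ignored for this call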


[1] doc of BufferedReader:
https://docs.python.org/3/library/io.html#io.BufferedReader

[2] BufferedReader.read(-1) tries to call underlying stream's readall() 
function:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/bufferedio.c#L1538-L1542

[3] RawIOBase.readall() read DEFAULT_BUFFER_SIZE each time:
https://github.com/python/cpython/blob/v3.9.0b5/Modules/_io/iobase.c#L968-L969

--
components: IO
files: demo.py
messages: 374652
nosy: malin
priority: normal
severity: normal
status: open
title: Inefficient BufferedReader.read(-1)
type: performance
versions: Python 3.10
Added file: https://bugs.python.org/file49354/demo.py




[issue41265] lzma/bz2 module: inefficient buffer growth algorithm

2020-07-22 Thread Ma Lin


Ma Lin  added the comment:

I'm working on a patch.
lzma decompression speed increases:

baseline: 0.275722 sec
patched:  0.140405 sec
(Uncompressed data size 52.57 MB)


The new algorithm looks like this:

#define INITIAL_BUFFER_SIZE (16*1024)

static inline Py_ssize_t
get_newsize(Py_ssize_t size)
{
const Py_ssize_t MB = 1024*1024;
const Py_ssize_t GB = 1024*1024*1024;

if (size <= 1*MB) {
return size << 2;   // x4
} else if (size <= 128*MB) {
return size << 1;   // x2
} else if (size <= 1*GB) {
return size + (size >> 1);  // x1.5
} else if (size <= 2*GB) {
return size + (size >> 2);  // x1.25
} else {
return size + (size >> 3);  // x1.125
}
}

--




[issue41330] Inefficient error-handle for CJK encodings

2020-07-18 Thread Ma Lin


Ma Lin  added the comment:

> But how many new Python web application use CJK codec instead of UTF-8?

A CJK character usually takes 2 bytes in CJK encodings, but 3 bytes in UTF-8.

I tested a Chinese book:
in GBK: 853,025 bytes
in UTF-8: 1,267,523 bytes

For CJK content, UTF-8 is wasteful, so maybe CJK encodings will not be eliminated.
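For example (the sample text is arbitrary):

text = "汉" * 1000
print(len(text.encode("gbk")))     # 2000 bytes: 2 bytes per CJK character
print(len(text.encode("utf-8")))   # 3000 bytes: 3 bytes per CJK character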

--




[issue41330] Inefficient error-handle for CJK encodings

2020-07-18 Thread Ma Lin


Ma Lin  added the comment:

IMO "xmlcharrefreplace" is useful for Web application.

For example, if the page's charset is "gbk", this statement can generate the 
bytes content easily and safely:

s.encode('gbk', 'xmlcharrefreplace')

Maybe some HTML-related frameworks use this to escape characters, for example 
Sphinx [1].
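A concrete example (the sample text is arbitrary; the emoji has no GBK mapping, 
so it is emitted as a decimal character reference instead of raising):

print("你好 😀".encode("gbk", "xmlcharrefreplace"))
# b'\xc4\xe3\xba\xc3 &#128512;'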


Attached file `error_handers_fast_paths.txt` summarized all current 
error-handler fast-paths.

[1] Sphinx use 'xmlcharrefreplace' to escape
https://github.com/sphinx-doc/sphinx/blob/e65021fb9b0286f373f01dc19a5777e5eed49576/sphinx/builders/html/__init__.py#L1029

--
Added file: https://bugs.python.org/file49324/error_handers_fast_paths.txt




[issue41330] Inefficient error-handle for CJK encodings

2020-07-17 Thread Ma Lin


New submission from Ma Lin :

CJK encode/decode functions only have three error-handler fast-paths:
replace
ignore
strict  
See the code: [1][2]

If other built-in error handlers are used, the error-handler object needs to be 
fetched and called with a Unicode exception argument. See the code: [3]

But the error-handler object is not cached; it needs to be looked up in a dict 
every time, which is very inefficient.


Another possible optimization is to write fast paths for common error handlers. 
Python has these built-in error handlers:

strict
replace
ignore
backslashreplace
xmlcharrefreplace
namereplace
surrogateescape
surrogatepass (only for utf-8/utf-16/utf-32 family)

For example, `xmlcharrefreplace` may be heavily used in Web applications; it 
could be implemented as a fast path so that there is no need to call the 
error-handler object every time, just like the `xmlcharrefreplace` fast path in 
`PyUnicode_EncodeCharmap` [4].

[1] encode function:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/cjkcodecs/multibytecodec.c#L192

[2] decode function:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/cjkcodecs/multibytecodec.c#L347

[3] `call_error_callback` function:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/cjkcodecs/multibytecodec.c#L82

[4] `xmlcharrefreplace` fast-path in `PyUnicode_EncodeCharmap`:
https://github.com/python/cpython/blob/v3.9.0b4/Objects/unicodeobject.c#L8662

--
components: Unicode
messages: 373871
nosy: ezio.melotti, malin, vstinner
priority: normal
severity: normal
status: open
title: Inefficient error-handle for CJK encodings
type: performance
versions: Python 3.10




[issue37095] [Feature Request]: Add zstd support in tarfile

2020-07-14 Thread Ma Lin


Ma Lin  added the comment:

> Add zstd support in tarfile

This requires the stdlib to contain a Zstandard module.

You can ask on the Ideas forum:
https://discuss.python.org/c/ideas

--
nosy: +malin




[issue41210] Docs: More description of reason about LZMA1 data handling with FORMAT_ALONE

2020-07-13 Thread Ma Lin


Ma Lin  added the comment:

It would be better to raise a warning when a problematic combination is used.

But IMO both "raising a warning" and "adding more description to the doc" depend 
too much on implementation details of liblzma.

--




[issue41265] lzma/bz2 module: inefficient buffer growth algorithm

2020-07-10 Thread Ma Lin


Ma Lin  added the comment:

Maybe the zlib module can also use the same algorithm.

The zlib module's initial buffer size is 16 KB [1]; each time, the size doubles [2].

[1] zlib module's initial buffer size:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/zlibmodule.c#L32

[2] zlib module buffer growth algorithm:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/zlibmodule.c#L174

--
nosy: +inada.naoki, rhettinger




[issue41265] lzma/bz2 module: inefficient buffer growth algorithm

2020-07-10 Thread Ma Lin


New submission from Ma Lin :

lzma/bz2 modules are using the same buffer growth algorithm: [1][2]

newsize = size + (size >> 3) + 6;

The lzma/bz2 modules' default output buffer is 8192 bytes [3][4], so the growth 
steps are shown below.

In many cases, the buffer may be resized too many times.
Is it possible to design a new growth algorithm that grows faster when the size 
is not very large?

  1: 8,196 bytes
  2: 9,226 bytes
  3: 10,385 bytes
  4: 11,689 bytes
  5: 13,156 bytes
  6: 14,806 bytes
  7: 16,662 bytes
  8: 18,750 bytes
  9: 21,099 bytes
 10: 23,742 bytes
 11: 26,715 bytes
 12: 30,060 bytes
 13: 33,823 bytes
 14: 38,056 bytes
 15: 42,819 bytes
 16: 48,177 bytes
 17: 54,205 bytes
 18: 60,986 bytes
 19: 68,615 bytes
 20: 77,197 bytes
 21: 86,852 bytes
 22: 97,714 bytes
 23: 109,934 bytes
 24: 123,681 bytes
 25: 139,147 bytes
 26: 156,546 bytes
 27: 176,120 bytes
 28: 198,141 bytes
 29: 222,914 bytes
 30: 250,784 bytes
 31: 282,138 bytes
 32: 317,411 bytes
 33: 357,093 bytes
 34: 401,735 bytes
 35: 451,957 bytes
 36: 508,457 bytes
 37: 572,020 bytes
 38: 643,528 bytes
 39: 723,975 bytes
 40: 814,477 bytes
 41: 916,292 bytes
 42: 1,030,834 bytes
 43: 1,159,694 bytes
 44: 1,304,661 bytes
 45: 1,467,749 bytes
 46: 1,651,223 bytes
 47: 1,857,631 bytes
 48: 2,089,840 bytes
 49: 2,351,076 bytes
 50: 2,644,966 bytes
 51: 2,975,592 bytes
 52: 3,347,547 bytes
 53: 3,765,996 bytes
 54: 4,236,751 bytes
 55: 4,766,350 bytes
 56: 5,362,149 bytes
 57: 6,032,423 bytes
 58: 6,786,481 bytes
 59: 7,634,797 bytes
 60: 8,589,152 bytes
 61: 9,662,802 bytes
 62: 10,870,658 bytes
 63: 12,229,496 bytes
 64: 13,758,189 bytes
 65: 15,477,968 bytes
 66: 17,412,720 bytes
 67: 19,589,316 bytes
 68: 22,037,986 bytes
 69: 24,792,740 bytes
 70: 27,891,838 bytes
 71: 31,378,323 bytes
 72: 35,300,619 bytes
 73: 39,713,202 bytes
 74: 44,677,358 bytes
 75: 50,262,033 bytes
 76: 56,544,793 bytes
 77: 63,612,898 bytes
 78: 71,564,516 bytes
 79: 80,510,086 bytes
 80: 90,573,852 bytes
 81: 101,895,589 bytes
 82: 114,632,543 bytes
 83: 128,961,616 bytes
 84: 145,081,824 bytes
 85: 163,217,058 bytes
 86: 183,619,196 bytes
 87: 206,571,601 bytes
 88: 232,393,057 bytes
 89: 261,442,195 bytes
 90: 294,122,475 bytes
 91: 330,887,790 bytes
 92: 372,248,769 bytes
 93: 418,779,871 bytes
 94: 471,127,360 bytes
 95: 530,018,286 bytes
 96: 596,270,577 bytes
 97: 670,804,405 bytes
 98: 754,654,961 bytes
 99: 848,986,837 bytes
100: 955,110,197 bytes

[1] lzma buffer growth algorithm:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/_lzmamodule.c#L133

[2] bz2 buffer growth algorithm:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/_bz2module.c#L121

[3] lzma default buffer size:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/_lzmamodule.c#L124

[4] bz2 default buffer size:
https://github.com/python/cpython/blob/v3.9.0b4/Modules/_bz2module.c#L109

--
components: Library (Lib)
messages: 373454
nosy: malin
priority: normal
severity: normal
status: open
title: lzma/bz2 module: inefficient buffer growth algorithm
type: performance
versions: Python 3.10




[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-07 Thread Ma Lin


Ma Lin  added the comment:

There was a similar issue (issue21872).

When decompressing lzma.FORMAT_ALONE data that doesn't have the end marker (but 
has the correct "Uncompressed Size" in the .lzma header), sometimes the last one 
to several dozen bytes can't be output.

issue21872 fixed the problem in `_lzmamodule.c`. But if liblzma strictly 
follows zlib's API (IMO it should), there should be no such problem.


I debugged your code with the attached file `lzmabcj.bin`; when it had output 
12796 bytes, the output buffer still had 353 bytes of space. So it seems to be 
a problem in liblzma.

IMHO, we should first wait for a reply from the liblzma maintainer; if Lasse 
Collin thinks this is a bug, let us wait for the upstream fix. I will also 
report the issue21872 case to see if he can fix that problem upstream as well.

--




[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-06 Thread Ma Lin


Ma Lin  added the comment:

The docs [1] say:

Compression filters:
FILTER_LZMA1 (for use with FORMAT_ALONE)
FILTER_LZMA2 (for use with FORMAT_XZ and FORMAT_RAW)

But your code uses a combination of `FILTER_LZMA1` and `FORMAT_RAW`; is this OK?

[1] https://docs.python.org/3/library/lzma.html#specifying-custom-filter-chains

--




[issue41210] LZMADecompressor.decompress(FORMAT_RAW) truncate output when input is paticular LZMA+BCJ data

2020-07-05 Thread Ma Lin


Change by Ma Lin :


--
components: +Library (Lib) -Extension Modules
nosy: +malin




[issue35859] Capture behavior depends on the order of an alternation

2020-06-29 Thread Ma Lin


Ma Lin  added the comment:

Do I need to write a detailed review guide? I suppose that after reading it 
from beginning to end, it would be easy to understand PR 12427 with no need to 
read anything else.

Or is the plan to replace the sre module with the regex module in a future version?

--




[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin


Ma Lin  added the comment:

Why do you always want to use a "utf-8" encoded identifier as a group name in a 
`bytes` pattern?

The direction is: a group name written in a `bytes` pattern is converted to 
`str`.
Not this direction: `str` group name -(utf8)-> `bytes` pattern -> `str` group 
name.

--




[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin

Ma Lin  added the comment:

Please look at these:

>>> orig_name = "Ř"
>>> orig_ch = orig_name.encode("cp1250") # Because why not?
>>> orig_ch
b'\xd8'
>>> name = list(re.match(b"(?P<" + orig_ch + b">)", b"").groupdict().keys())[0]
>>> name
'Ø'  # '\xd8'
>>> name == orig_name
False
>>> name.encode("latin-1")
b'\xd8'
>>> name.encode("latin-1") == orig_ch
True

"Ř" (\u0158) --cp1250--> b'\xd8'
"Ø" (\u00d8) --latin-1--> b'\xd8'

--




[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin


Ma Lin  added the comment:

> this limitation to the latin-1 subset is not compatible with the 
> documentation, which says that valid Python identifiers are valid group names.

Not all latin-1 characters are valid identifiers, for example:

>>> '\x94'.encode('latin1')
b'\x94'
>>> '\x94'.isidentifier()
False

There is a workaround: you can convert the `bytes` to `str` with the "latin-1" 
codec before processing. IIRC there is no extra overhead (memory/speed) during 
processing, and then the names and the content are the same type. :)
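A minimal sketch of that workaround (the pattern and group name are just examples):

import re
pattern = b"(?P<\xd8>x+)".decode("latin-1")   # pattern and group name are now str
text = b"xxx".decode("latin-1")
print(re.match(pattern, text).groupdict())    # {'Ø': 'xxx'}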

--




[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin


Ma Lin  added the comment:

It seems you are missing some knowledge about encodings.

Naturally, `bytes` cannot contain a character whose Unicode code point is 
greater than \u00ff. So you can only use the "latin1" encoding, which maps 
characters to bytes (and back) directly.

"utf-8", "utf-16" and "utf-32" are all encoding codecs; "utf-8" should not have 
a special status here.

--
nosy:  -ezio.melotti, mrabarnett




[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin

Ma Lin  added the comment:

In this case, you can only use 'latin1', which directly maps one character 
(\u0000-\u00FF) to/from one byte.

If 'utf-8' is used, it may map one character to multiple bytes, such as 'Δ' -> 
b'\xce\x94'.

'\x94' is an invalid identifier, so it will raise an error:

>>> '\xce'.isidentifier()   # '\xce' is 'Î'
True
>>> '\x94'.isidentifier()
False

You may close this issue (I can't close it); we can continue the discussion.

--




[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin

Ma Lin  added the comment:

`latin1` is the character set whose Unicode code points run from \u0000 to 
\u00ff, and its characters are directly mapped from/to bytes.

So b'\xe9' is mapped to \u00e9, which is `é`.

Of course, characters with a Unicode code point greater than 0xff cannot appear 
in `bytes`.

--




[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin

Ma Lin  added the comment:

> a non-ascii group name will raise an error in bytes, even if encoded

Looks like this is a language limitation:

>>> b'é'
  File "", line 1
SyntaxError: bytes can only contain ASCII literal characters.

No problem if you use escaped character:

>>> re.match(b'(?P<\xe9>)', b'').groupdict()
{'é': b''}

There may be some inconvenience for your program, but IMO nothing is wrong; 
maybe this issue can be closed.

--




[issue40980] group names of bytes regexes are strings

2020-06-15 Thread Ma Lin


Ma Lin  added the comment:

Group names being `str` is very reasonable. Essentially a group name is just a 
name; it has nothing to do with `bytes`.

Other names in Python are also of `str` type, such as codec names and hashlib names.

--
nosy: +Ma Lin




[issue29242] Crash on GC when compiling PyPy

2020-06-09 Thread Ma Lin


Ma Lin  added the comment:

I suggest not closing this issue; it is an opportunity to investigate whether 
Python 3 has this problem as well.

--
nosy: +Ma Lin




[issue40861] On Windows, liblzma is always built without optimization

2020-06-06 Thread Ma Lin


Ma Lin  added the comment:

Good catch.

You can submit a PR to fix this. If you start from zero and do it slowly, it 
will take about a week or two.

--
components: +Windows -Build
nosy: +Ma Lin, paul.moore, steve.dower, tim.golden, zach.ware




[issue40859] Update Windows build to use xz-5.2.5

2020-06-03 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +19847
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/20622




[issue40859] Update Windows build to use xz-5.2.5

2020-06-03 Thread Ma Lin


New submission from Ma Lin :

The Windows build is using xz-5.2.2, which was released on 2015-09-29.
xz-5.2.5 was released recently; maybe we can update this library.

When preparing cpython-source-deps, don't forget to copy 
`xz-5.2.5\windows\vs2019\config.h` to the `xz-5.2.5\windows\` folder.

`\vs2019\config.h` and `\vs2017\config.h` are the same, except for the comment 
on the first line.

I tested xz-5.2.5 on my local machine; it passes test_lzma.py.

XZ Utils Release Notes:
https://git.tukaani.org/?p=xz.git;a=blob;f=NEWS;hb=HEAD

--
components: Windows
messages: 370693
nosy: Ma Lin, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: Update Windows build to use xz-5.2.5
versions: Python 3.10, Python 3.9




[issue35859] Capture behavior depends on the order of an alternation

2020-05-31 Thread Ma Lin


Ma Lin  added the comment:

Is there any hope of merging this into the 3.9 branch?

--




[issue40416] Calling TextIOWrapper.tell() in the middle of reading a gb2312-encoded file causes UnicodeDecodeError

2020-05-02 Thread Ma Lin


Ma Lin  added the comment:

I did a git bisect; this commit fixed the bug:

https://github.com/python/cpython/commit/ac22f6aa989f18c33c12615af1c66c73cf75d5e7

--




[issue40416] Calling TextIOWrapper.tell() in the middle of reading a gb2312-encoded file causes UnicodeDecodeError

2020-05-02 Thread Ma Lin


Ma Lin  added the comment:

On Windows 10 with Python 3.7, I get the same message as the reply above.

With Python 3.8, it works well.

--
nosy: +Ma Lin




[issue40060] socket.TCP_NOTSENT_LOWAT is missing in official macOS builds

2020-04-07 Thread Ma Lin


Ma Lin  added the comment:

It seems that people usually use the socket module like this, and I think it's 
best to respect this habit:

if hasattr(socket, "FLAG_NAME"):
do_something
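For example, with the constant from this issue (the low-water-mark value is 
just illustrative):

import socket

sock = socket.socket()
if hasattr(socket, "TCP_NOTSENT_LOWAT"):
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NOTSENT_LOWAT, 16 * 1024)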

With PR 19402, a program would still have problems on an older system, so this 
is about more than "don't break existing code".

So I think deleting the constants at runtime is a suitable approach.

--




[issue40060] socket.TCP_NOTSENT_LOWAT is missing in official macOS builds

2020-04-07 Thread Ma Lin


Ma Lin  added the comment:

The Windows build encountered a similar problem; see issue32394.
The solution is to check the runtime system version when importing the socket 
module, and if it is an older system, delete the constants. [1]

issue32394 has a small script (winsdk_watchdog.py) to help find such constants. Usage:
1. Build CPython with the old SDK.
2. Use winsdk_watchdog.py to dump possibly affected constants to a file 
   `winsdk_dump.json`.
3. Build CPython with the new SDK.
4. Use winsdk_watchdog.py to compare the constants between the two builds.

If a new constant is introduced by the new SDK/API, we remove it on older 
systems at runtime.
Otherwise we can ignore the new constant; it has nothing to do with the new SDK.
(msg311858 is a demo.)

We don't need to use winsdk_watchdog.py routinely, just when updating the build 
SDK; the process only takes about 10-20 minutes.

I think the macOS build can also use this process.

[1]
The commit:
https://github.com/python/cpython/commit/19e7d48ce89422091f9af93038b9fee075d46e9e

Note that there was a minor fix later:
https://github.com/python/cpython/commit/8905fcc85a6fc3ac394bc89b0bbf40897e9497a#diff-a47fd74731aeb547ad780900bb8e6953

--
nosy: +Ma Lin




[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Ma Lin


Ma Lin  added the comment:

I also planned to review this commit at some point; I feel a bit uneasy about it.

If an optimization needs to be fine-tuned and may introduce pitfalls for future 
code maintenance, IMHO it is best to avoid doing this kind of optimization.

--
nosy: +Ma Lin




[issue39033] zipimport raises NameError: name '_boostrap_external' is not defined

2019-12-12 Thread Ma Lin


Ma Lin  added the comment:

Is it possible to scan the stdlib to find similar bugs?

--
nosy: +Ma Lin




[issue37527] Timestamp conversion on windows fails with timestamps close to EPOCH

2019-11-01 Thread Ma Lin


Ma Lin  added the comment:

issue29097 fixed the bug in `datetime.fromtimestamp()`.
But this issue is about `datetime.timestamp()`, which is not fixed yet.

--




[issue23692] Undocumented feature prevents re module from finding certain matches

2019-10-27 Thread Ma Lin


Change by Ma Lin :


--
nosy: +Ma Lin




[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Ma Lin


Ma Lin  added the comment:

> I'd still retain \0 as a special case, since it really is useful.

Yes, maybe \0 is widely used; I didn't think of that.
Changing it would be troublesome, so let's keep it as is.

--

___
Python tracker 
<https://bugs.python.org/issue38582>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38582] re: backreference number in replace string can't >= 100

2019-10-25 Thread Ma Lin


Ma Lin  added the comment:

Octal escape:
\ooo    Character with octal value ooo
As in Standard C, up to three octal digits are accepted.

It only accepts UCS1 characters (ooo <= 0o377):
>>> ord('\377')
255
>>> len('\378')
2
>>> '\378' == '\37' + '8'
True

IMHO this is not useful and creates confusion.
Maybe it can be deprecated at the language level.

--

___
Python tracker 
<https://bugs.python.org/issue38582>
___



[issue38582] re: backreference number in replace string can't >= 100

2019-10-24 Thread Ma Lin


Ma Lin  added the comment:

@veaba 
Posting only in English is fine.

> Is this actually needed?
Probably only a very few people dynamically generate such large patterns.

> However, \g<...> is not accepted in a pattern.
> in the "regex" module I added support for it in a pattern too.
Yes, backreference numbers in a pattern also can't be >= 100.
Supporting \g<...> in a pattern is a good idea.

Fixing this issue may create a backward compatibility problem: the parser would 
confuse backreference numbers with octal escape numbers.
Maybe clarifying the limit (<= 99) in the documentation is enough.
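
For what it's worth, a small sketch of the existing workaround: in the replacement 
string, the `\g<number>` form is not limited to two digits, so templates referring to 
groups >= 100 can still be written unambiguously (the pattern below is just the 
reproduce code with a couple more groups):

```
import re

# 101 alternatives, each in its own group: (001)|(002)|...|(101)
pattern = '|'.join(r'(%03d)' % i for i in range(1, 102))

# '\g<101>' is unambiguous, unlike '\101', which parses as an octal escape.
repl = ''.join(r'\g<%d>' % i for i in range(101, 0, -1))

print(re.sub(pattern, repl, '(101)'))   # (101): only group 101 matched, the rest expand to ''
```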

--

___
Python tracker 
<https://bugs.python.org/issue38582>
___



[issue38582] re: backreference number in replace string can't >= 100

2019-10-24 Thread Ma Lin


Ma Lin  added the comment:

Backreference numbers in the replace string can't be >= 100:
https://github.com/python/cpython/blob/v3.8.0/Lib/sre_parse.py#L1022-L1036

If no one takes this, I will try to fix it tomorrow.

--
nosy: +serhiy.storchaka
title: Regular match overflow -> re: backreference number in replace string 
can't >= 100
versions: +Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue38582>
___



[issue38582] Regular match overflow

2019-10-24 Thread Ma Lin


Ma Lin  added the comment:

A simpler reproduction:

```
import re

NUM = 99

# items = [ '(001)', '(002)', '(003)', ..., '(NUM)']
items = [r'(%03d)' % i for i in range(1, 1+NUM)]
pattern = '|'.join(items)

# repl = '\1\2\3...\NUM'
temp = ('\\' + str(i) for i in range(1, 1+NUM))
repl = ''.join(temp)

text = re.sub(pattern, repl, '(001)')
print(text)

# if NUM == 99
# output: (001)
# if NUM == 100
# output: (001@)
# if NUM == 101
# output: (001@A)
```

--
components: +Regular Expressions
nosy: +ezio.melotti, mrabarnett

___
Python tracker 
<https://bugs.python.org/issue38582>
___



[issue38582] Regular match overflow

2019-10-24 Thread Ma Lin


Change by Ma Lin :


--
nosy: +Ma Lin
type: security -> 

___
Python tracker 
<https://bugs.python.org/issue38582>
___



[issue38056] Overhaul Error Handlers section in codecs documentation

2019-10-12 Thread Ma Lin


Ma Lin  added the comment:

PR 15732 became an overhaul:

- replace/backslashreplace/surrogateescape were wrongly described as encoding-only; 
in fact they can also be used in decoding (see the decoding sketch below).
- clarify the description of surrogatepass.
- add more descriptions to each handler.
- add two REPL examples.
- add indexes for each Error Handler's name.
- add default parameter values in codecs.rst.
- improve the term "text encoding".

PR 15732 has a screenshot of the Error Handlers section.
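
To illustrate the decoding point from the first bullet, a quick sketch with an invalid 
UTF-8 byte sequence (ascii() is only used here to make the results printable):

```
data = b'\xdf\xf8'   # not valid UTF-8

print(ascii(data.decode('utf-8', 'replace')))           # '\ufffd\ufffd'
print(ascii(data.decode('utf-8', 'backslashreplace')))  # '\\xdf\\xf8'
print(ascii(data.decode('utf-8', 'surrogateescape')))   # '\udcdf\udcf8'
```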

--
components: +Unicode
nosy: +ezio.melotti, vstinner
title: Add examples for common text encoding Error Handlers -> Overhaul Error 
Handlers section in codecs documentation

___
Python tracker 
<https://bugs.python.org/issue38056>
___



[issue13153] IDLE 3.x on Windows exits when pasting non-BMP unicode

2019-10-03 Thread Ma Lin


Ma Lin  added the comment:

> Thus this breaks editing the physical line past the astral character. We 
> cannot do anything with this.

I tried; sadly, the experience is not very good.

--
nosy: +Ma Lin

___
Python tracker 
<https://bugs.python.org/issue13153>
___



[issue38321] Compiler warnings when building Python 3.8

2019-10-01 Thread Ma Lin


Ma Lin  added the comment:

> This file is copied directly from https://github.com/libexpat/libexpat/
> project. Would you mind to propose your patch there?

OK, I will report it there.

--

___
Python tracker 
<https://bugs.python.org/issue38321>
___



[issue38321] Compiler warnings when building Python 3.8

2019-09-30 Thread Ma Lin


Ma Lin  added the comment:

Other warnings:

c:\vstinner\python\master\objects\longobject.c(420): warning C4244: 'function': 
conversion from 'unsigned __int64' to 'sdigit', possible loss of data

c:\vstinner\python\master\objects\longobject.c(428): warning C4267: 'function': 
conversion from 'size_t' to 'sdigit', possible loss of data
-
These warnings only appear in the master branch; I will fix them at some point.
(https://bugs.python.org/issue35696#msg352903)

--

___
Python tracker 
<https://bugs.python.org/issue38321>
___



[issue38321] Compiler warnings when building Python 3.8

2019-09-30 Thread Ma Lin


Ma Lin  added the comment:

On my Windows, some non-ASCII characters cause this warning:

d:\dev\cpython\modules\expat\xmltok.c : warning C4819: 
The file contains a character that cannot be represented in
the current code page (936). Save the file in Unicode format
to prevent data loss.

This patch fixes the warnings; it's applicable to the master and 3.8 branches.
https://github.com/animalize/cpython/commit/daced7575ec70ef1f888c6854760e230cda5ea64

Maybe this trivial problem is not worth a new commit; it can be fixed along 
with the other warnings.

--
nosy: +Ma Lin

___
Python tracker 
<https://bugs.python.org/issue38321>
___



[issue38252] Use 8-byte step to detect ASCII sequence in 64bit Windows builds

2019-09-23 Thread Ma Lin


Ma Lin  added the comment:

There are 4 functions with similar code; see PR 16334.
Just replacing the `unsigned long` type with the `size_t` type gives these benchmarks.
Can this be backported to the 3.8 branch?

1.  bytes.isascii()

D:\dev\cpython\PCbuild\amd64\python.exe -m pyperf timeit -s "b = b'x' * 
100_000_000; f = b.isascii;" "f()"

+---+---+--+
| Benchmark | isascii_a | isascii_b|
+===+===+==+
| timeit| 11.7 ms   | 7.84 ms: 1.50x faster (-33%) |
+---+---+--+

2.  bytes.decode('latin1')

D:\dev\cpython\PCbuild\amd64\python.exe -m pyperf timeit -s "b = b'x' * 
100_000_000; f = b.decode;" "f('latin1')"

+---+--+-+
| Benchmark | latin1_a | latin1_b|
+===+==+=+
| timeit| 60.3 ms  | 57.4 ms: 1.05x faster (-5%) |
+---+--+-+

3.  bytes.decode('ascii')

D:\dev\cpython\PCbuild\amd64\python.exe -m pyperf timeit -s "b = b'x' * 
100_000_000; f = b.decode;" "f('ascii')"

+---+-+-+
| Benchmark | ascii_a | ascii_b |
+===+=+=+
| timeit| 48.5 ms | 47.1 ms: 1.03x faster (-3%) |
+---+-+-+

4.  bytes.decode('utf8')

D:\dev\cpython\PCbuild\amd64\python.exe -m pyperf timeit -s "b = b'x' * 
100_000_000; f = b.decode;" "f('utf8')"

+---+-+-+
| Benchmark | utf8_a  | utf8_b  |
+===+=+=+
| timeit| 48.3 ms | 47.1 ms: 1.03x faster (-3%) |
+---+-+-+

--

___
Python tracker 
<https://bugs.python.org/issue38252>
___



[issue38252] Use 8-byte step to detect ASCII sequence in 64bit Windows builds

2019-09-23 Thread Ma Lin


Change by Ma Lin :


--
title: micro-optimize ucs1lib_find_max_char in Windows 64-bit build -> Use 
8-byte step to detect ASCII sequence in 64bit Windows builds

___
Python tracker 
<https://bugs.python.org/issue38252>
___



[issue38252] micro-optimize ucs1lib_find_max_char in Windows 64-bit build

2019-09-23 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +15911
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/16334

___
Python tracker 
<https://bugs.python.org/issue38252>
___



[issue38252] micro-optimize ucs1lib_find_max_char in Windows 64-bit build

2019-09-23 Thread Ma Lin


Ma Lin  added the comment:

Maybe @sir-sigurd can find more optimizations.

FYI, the `_Py_bytes_isascii()` function [1] also has similar code.
[1] https://github.com/python/cpython/blob/v3.8.0b4/Objects/bytes_methods.c#L104

--

___
Python tracker 
<https://bugs.python.org/issue38252>
___



[issue38252] micro-optimize ucs1lib_find_max_char in Windows 64-bit build

2019-09-22 Thread Ma Lin


New submission from Ma Lin :

The C type `long` is a 4-byte integer in 64-bit Windows builds. [1]

But the `ucs1lib_find_max_char()` function [2] uses SIZEOF_LONG, so it loses a 
little performance in 64-bit Windows builds.

Below is the benchmark of using SIZEOF_SIZE_T and this change:

-   unsigned long value = *(unsigned long *) _p;
+   size_t value = *(size_t *) _p;

D:\dev\cpython\PCbuild\amd64\python.exe -m pyperf timeit -s "b=b'a'*10_000_000; 
f=b.decode;" "f('latin1')"

before: 5.83 ms +- 0.05 ms
after : 5.58 ms +- 0.06 ms

[1] https://stackoverflow.com/questions/384502

[2] 
https://github.com/python/cpython/blob/v3.8.0b4/Objects/stringlib/find_max_char.h#L9

Maybe there can be more optimizations, so I didn't prepare a PR for this.
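
For readers unfamiliar with the trick, here is a rough Python illustration of what the 
C loop does: read the buffer one machine word at a time and test the high bit of every 
byte with a single mask. The real code does this with pointer casts in C; the function 
below is only a sketch of the idea:

```
ASCII_CHAR_MASK = 0x8080808080808080   # high bit of each byte in a 64-bit word

def find_non_ascii(data: bytes) -> int:
    """Return the index of the first non-ASCII byte, or -1 if all bytes are ASCII."""
    i = 0
    # Fast path: test 8 bytes per iteration.
    while i + 8 <= len(data):
        word = int.from_bytes(data[i:i + 8], 'little')
        if word & ASCII_CHAR_MASK:
            break    # some byte in this word has its high bit set
        i += 8
    # Slow path: locate the offending byte (or finish the tail) one byte at a time.
    while i < len(data):
        if data[i] >= 0x80:
            return i
        i += 1
    return -1

print(find_non_ascii(b'hello world'))         # -1
print(find_non_ascii(b'hello \xc3\xa9tat'))   # 6
```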

--
components: Interpreter Core
messages: 352970
nosy: Ma Lin, inada.naoki, serhiy.storchaka, sir-sigurd
priority: normal
severity: normal
status: open
title: micro-optimize ucs1lib_find_max_char in Windows 64-bit build
type: performance
versions: Python 3.9

___
Python tracker 
<https://bugs.python.org/issue38252>
___



[issue35696] remove unnecessary operation in long_compare()

2019-09-20 Thread Ma Lin

Ma Lin  added the comment:

> I'd fix them, but I'm not sure if we are going to restore CHECK_SMALL_INT() 
> ¯\_(ツ)_/¯

I suggest we slow down and carefully sort out the recent commits to longobject.c:
https://bugs.python.org/issue37812#msg352837

Make the code style consistent, improve readability...

--

___
Python tracker 
<https://bugs.python.org/issue35696>
___



[issue37812] Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT)

2019-09-20 Thread Ma Lin

Ma Lin  added the comment:

Recent commits for longobject.c

Revision: 5e63ab05f114987478a21612d918a1c0276fe9d2
Author: Greg Price 
Date: 19-8-25 1:19:37
Message:
bpo-37812: Convert CHECK_SMALL_INT macro to a function so the return is 
explicit. (GH-15216)

The concern in this issue is the implicit return from the macro.
We can add a comment before the call sites of the CHECK_SMALL_INT macro, to explain 
that there is a possible return.

Revision: 6b519985d23bd0f0bd072b5d5d5f2c60a81a19f2
Author: animalize 
Date: 19-9-6 14:00:56
Message:
replace inline function `is_small_int` with a macro version (GH-15710)

Then this commit is not necessary.

Revision: c6734ee7c55add5fdc2c821729ed5f67e237a096
Author: Sergey Fedoseev 
Date: 19-9-12 22:41:14
Message:
bpo-37802: Slightly improve perfomance of PyLong_FromUnsigned*() (GH-15192)

This commit introduced a compiler warning due to this line [1]:
d:\dev\cpython\objects\longobject.c(412): warning C4244: 'function': conversion 
from 'unsigned long' to 'sdigit', possible loss of data

[1] the line:
return get_small_int((ival)); \
https://github.com/python/cpython/blob/master/Objects/longobject.c#L386

Revision: 42acb7b8d29d078bc97b0cfd7c4911b2266b26b9
Author: HongWeipeng <961365...@qq.com>
Date: 19-9-18 23:10:15
Message:
bpo-35696: Simplify long_compare() (GH-16146)

IMO this commit reduces readability a bit.

We can sort out these problems.

--

___
Python tracker 
<https://bugs.python.org/issue37812>
___



[issue38205] Python no longer compiles without small integer singletons

2019-09-18 Thread Ma Lin


Ma Lin  added the comment:

PR 16270 uses Py_UNREACHABLE() in a single line.
It solves this particular issue.

--

___
Python tracker 
<https://bugs.python.org/issue38205>
___



[issue38205] Python no longer compiles without small integer singletons

2019-09-18 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +15860
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/16270

___
Python tracker 
<https://bugs.python.org/issue38205>
___



[issue38205] Python no longer compiles without small integer singletons

2019-09-18 Thread Ma Lin


Ma Lin  added the comment:

If a static inline function uses Py_UNREACHABLE() inside an if-else branch that 
should return a value, the compiler may emit a warning:
https://godbolt.org/z/YtcNSf

MSVC v19.14:
warning C4715: 'test': not all control paths return a value

clang 8.0.0:
warning: control may reach end of non-void function [-Wreturn-type]

Other compilers (gcc, icc) don't emit this warning.

This situation in real code:
https://github.com/python/cpython/blob/v3.8.0b4/Include/object.h#L600
https://github.com/python/cpython/blob/v3.8.0b4/Objects/longobject.c#L3088

--

___
Python tracker 
<https://bugs.python.org/issue38205>
___



[issue37812] Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT)

2019-09-17 Thread Ma Lin


Ma Lin  added the comment:

> I agree that both changes should be reverted.

There is another commit on top of those two commits:
https://github.com/python/cpython/commit/c6734ee7c55add5fdc2c821729ed5f67e237a096

It would be troublesome to revert them.

PR 16146 is ongoing; maybe we can ask the author to replace 
`Py_UNREACHABLE()` with `assert(0)`.

--

___
Python tracker 
<https://bugs.python.org/issue37812>
___



[issue37812] Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT)

2019-09-17 Thread Ma Lin


Ma Lin  added the comment:

> It's not clear to me if anyone benchmarked to see if the
> conversion to a macro had any measurable performance benefit.

I did test it on that day, also using this command: 

python.exe -m pyperf timeit -s "from collections import deque; consume = 
deque(maxlen=0).extend; r = range(256)" "consume(r)"  --duplicate=1000

I remember the results were:
inline function: 1.6  us
macro version  : 1.27 us
(32-bit release build by MSVC 2017)

Since the difference was so obvious, I tested each version only once.

--

___
Python tracker 
<https://bugs.python.org/issue37812>
___



[issue38205] Python no longer compiles without small integer singletons

2019-09-17 Thread Ma Lin


Ma Lin  added the comment:

We can change Py_UNREACHABLE() to assert(0) in longobject.c
Or remove the article in Py_UNREACHABLE()

--

___
Python tracker 
<https://bugs.python.org/issue38205>
___



[issue38205] Python no longer compiles without small integer singletons

2019-09-17 Thread Ma Lin


Ma Lin  added the comment:

This commit changed Py_UNREACHABLE() five days ago:

https://github.com/python/cpython/commit/3ab61473ba7f3dca32d779ec2766a4faa0657923

If that change is reverted, it compiles successfully.

--
nosy: +Ma Lin

___
Python tracker 
<https://bugs.python.org/issue38205>
___



[issue21872] LZMA library sometimes fails to decompress a file

2019-09-13 Thread Ma Lin


Ma Lin  added the comment:

Some memos:

1. In liblzma, these missing bytes were copied inside the `dict_repeat` function:

 788 case SEQ_COPY:
 789 // Repeat len bytes from distance of rep0.
 790 if (unlikely(dict_repeat(&dict, rep0, &len))) {

See liblzma's source code (xz-5.2 branch):
https://git.tukaani.org/?p=xz.git;a=blob;f=src/liblzma/lzma/lzma_decoder.c

2. The replies above say that xz's command line tools can extract the problematic 
files successfully.

This is because xz checks `if (avail_out == 0)` first, then checks `if (avail_in == 0)`.
See the `uncompress` function in this source file (xz-5.2 branch):
https://git.tukaani.org/?p=xz.git;a=blob;f=src/xzdec/xzdec.c;hb=refs/heads/v5.2

This check order happens to avoid the problem.
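
As a side note, the same "drain the output before asking for more input" order can be 
expressed with Python's lzma API; a rough sketch (the function name, chunk size and 
max_length value are illustrative):

```
import lzma

def decompress_file(path, chunk_size=8192, max_length=65536):
    out = bytearray()
    d = lzma.LZMADecompressor()
    with open(path, 'rb') as f:
        while not d.eof:
            if d.needs_input:
                chunk = f.read(chunk_size)
                if not chunk:
                    raise EOFError('compressed stream ended prematurely')
            else:
                # Output from the previous call is still pending: drain it
                # before feeding more input (avail_out first, avail_in second).
                chunk = b''
            out += d.decompress(chunk, max_length=max_length)
    return bytes(out)
```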

--

___
Python tracker 
<https://bugs.python.org/issue21872>
___



[issue38015] inline function generates slightly inefficient machine code

2019-09-09 Thread Ma Lin


Ma Lin  added the comment:

PR 15710 has been merged into master, but the merge message is not shown here.
Commit: 
https://github.com/python/cpython/commit/6b519985d23bd0f0bd072b5d5d5f2c60a81a19f2

Maybe this issue can be closed.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue38015>
___



[issue38037] reference counter issue in signal module

2019-09-09 Thread Ma Lin


Change by Ma Lin :


--
pull_requests: +15407
pull_request: https://github.com/python/cpython/pull/15753

___
Python tracker 
<https://bugs.python.org/issue38037>
___



[issue38056] Add examples for common text encoding Error Handlers

2019-09-08 Thread Ma Lin


Change by Ma Lin :


--
keywords: +patch
pull_requests: +15386
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/15732

___
Python tracker 
<https://bugs.python.org/issue38056>
___



[issue38056] Add examples for common text encoding Error Handlers

2019-09-08 Thread Ma Lin

New submission from Ma Lin :

The text descriptions of the `Error Handlers` are not very friendly to novices.
https://docs.python.org/3/library/codecs.html#error-handlers

For example:

'xmlcharrefreplace'
Replace with the appropriate XML character reference (only for encoding).  
Implemented in :func:`xmlcharrefreplace_errors`. 

'backslashreplace'
Replace with backslashed escape sequences. Implemented in 
:func:`backslashreplace_errors`.

'namereplace'
Replace with ``\N{...}`` escape sequences (only for encoding).  Implemented 
in :func:`namereplace_errors`.

Novices may not know what these are.
Giving some examples may help readers understand more intuitively.
A picture showing the effect is attached.

I picked two characters:
ß  https://www.compart.com/en/unicode/U+00DF
♬ https://www.compart.com/en/unicode/U+266C
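
For the record, a sketch along those lines (not necessarily the exact examples that 
ended up in the attached picture):

```
s = '\u00df\u266c'   # 'ß♬': LATIN SMALL LETTER SHARP S, BEAMED SIXTEENTH NOTES

print(s.encode('ascii', 'xmlcharrefreplace'))   # b'&#223;&#9836;'
print(s.encode('ascii', 'backslashreplace'))    # b'\\xdf\\u266c'
print(s.encode('ascii', 'namereplace'))
# b'\\N{LATIN SMALL LETTER SHARP S}\\N{BEAMED SIXTEENTH NOTES}'
```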

--
assignee: docs@python
components: Documentation
files: effect.png
messages: 351329
nosy: Ma Lin, docs@python
priority: normal
severity: normal
status: open
title: Add examples for common text encoding Error Handlers
versions: Python 3.7, Python 3.8, Python 3.9
Added file: https://bugs.python.org/file48599/effect.png

___
Python tracker 
<https://bugs.python.org/issue38056>
___



[issue38015] inline function generates slightly inefficient machine code

2019-09-07 Thread Ma Lin


Ma Lin  added the comment:

> This change produces tiny, but measurable speed-up for handling small ints

I didn't get a measurable change; I ran this command a dozen times and took the 
best result:

D:\dev\cpython\PCbuild\amd64\python.exe -m pyperf timeit -s "from collections 
import deque; consume = deque(maxlen=0).extend; r = range(256)" "consume(r)"  
--duplicate=1000

before: Mean +- std dev: 771 ns +- 16 ns
after:  Mean +- std dev: 770 ns +- 10 ns

Environment:
64-bit release build by MSVC 2017
CPU: i3 4160, System: latest Windows 10 64-bit

Checking the machine code on godbolt.org, x64 MSVC v19.14 only saves one 
instruction:
movsxd  rax, ecx

x86-64 GCC 9.2 saves two instructions:
lea eax, [rdi+5]
cdqe

--

___
Python tracker 
<https://bugs.python.org/issue38015>
___
___



[issue26868] Document PyModule_AddObject's behavior on error

2019-09-07 Thread Ma Lin


Change by Ma Lin :


--
nosy: +Ma Lin

___
Python tracker 
<https://bugs.python.org/issue26868>
___
___



[issue38037] reference counter issue in signal module

2019-09-06 Thread Ma Lin


Change by Ma Lin :


--
title: Assertion failed: object has negative ref count -> reference counter 
issue in signal module

___
Python tracker 
<https://bugs.python.org/issue38037>
___
___



[issue38015] inline function generates slightly inefficient machine code

2019-09-06 Thread Ma Lin


Ma Lin  added the comment:

This range has not been changed since "preallocated small integer pool" was 
introduced:

#define NSMALLPOSINTS   257
#define NSMALLNEGINTS   5

The commit (Jan 2007):
https://github.com/python/cpython/commit/ddefaf31b366ea84250fc5090837c2b764a04102


Is it worth increasing the range?
FYI, built with MSVC 2017, the `small_ints` sizes are:

32-bit build:
sizeof(PyLongObject)    16 bytes
sizeof(small_ints)    4192 bytes

64-bit build:
sizeof(PyLongObject)    32 bytes
sizeof(small_ints)    8384 bytes
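
To make the effect of the pool visible from Python (a rough demonstration; the sharing 
is a CPython implementation detail, and the values are computed at run time to avoid 
constant folding):

```
x = 7

a = x + 249   # 256: inside the cached range, so both names share one object
b = x + 249
print(a is b)   # True on CPython

c = x + 250   # 257: outside the range, so each addition allocates a new int
d = x + 250
print(c is d)   # False on CPython
```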

--

___
Python tracker 
<https://bugs.python.org/issue38015>
___
___


