[issue39867] randrange(N) for N's in same dyadic blocs have excessive correlations when sharing identical seeds

2020-03-05 Thread jfbu


jfbu  added the comment:

"bug" is a strong word, which I never used myself, apart from using this 
channel to report. I rather think of an (undocumented) "deficiency", but I 
expect the consensus will again be that I am using a "raw expression". 
However, having read more than once that "the correlations are indeed 
unsurprising", it is my turn to see a raw expression there. The correlations 
are unsurprising *only* if one looks at the source code and understands how a 
(to a very high degree) uniform distribution on a power-of-2 range is reduced 
to a distribution on the smaller range (keeping the extremely high 
uniformity). So, sorry, the correlations are on the contrary *very surprising* 
to the end user who has no knowledge of the internals.

Donald Knuth for example many decades ago in his work on MetaPost used an RNG 
which is a kind of primitive ancestor (in the family of those discussed in 
TAOCP) of the much more sophisticated one used nowadays by Python. He used the 
rescaling-with-rounding method to go from a power-of-2 range to a 
non-power-of-2 range. That method induces some non-uniformity and, if I 
understand correctly (without having checked), it was the one used by 
Python < 3.2. At some point Python switched to another method, to cure the 
non-uniformity (and other artefacts of the simple-minded rescaling method). 
But there are other ways to reduce the non-uniformity to negligible levels, 
which are not the choice currently made in Python code (which, for people not 
having _randbelow_with_getrandbits before their eyes, is simply to draw a 
random integer with at most the number of bits of the limit N until one is 
found that is less than N).
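
For readers without the source at hand, the rejection loop described above can 
be sketched roughly as follows. This is a simplified illustration, not the 
CPython code itself; the name randbelow_sketch and the explicit rng argument 
are made up for this example.

```python
import random

def randbelow_sketch(rng, n):
    """Return an int in [0, n) by rejection sampling (illustrative sketch)."""
    if n <= 0:
        raise ValueError("n must be positive")
    k = n.bit_length()          # number of bits needed to write n
    r = rng.getrandbits(k)      # uniform on [0, 2**k)
    while r >= n:               # reject values outside [0, n) and redraw
        r = rng.getrandbits(k)
    return r

# Same seed, two limits with the same bit length (7 bits): both calls consume
# the same 7-bit draws, so they agree unless the draw accepted for 101 is 100.
x = randbelow_sketch(random.Random(12345), 100)
y = randbelow_sketch(random.Random(12345), 101)
print(x, y)
```

With limits of the same bit length the two calls see the same stream of k-bit 
values, which is where the same-seed agreement comes from.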

--

___
Python tracker 
<https://bugs.python.org/issue39867>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39867] randrange(N) for N's in same dyadic blocs have excessive correlations when sharing identical seeds

2020-03-05 Thread jfbu


jfbu  added the comment:

@tim.peters yes, a uniform random variable rescaled to two nearby scales N and 
M will display strong correlations. The CPython randrange(), however, exhibits 
such correlations at orders of magnitude higher levels, but only between 
ranges sharing a common bit length. A randrange() function should a priori not 
be so strongly tied to base 2.

The example you show would not be counted as a hit by my test for the 
random seed 12.

>>> import random
>>> s = 0
>>> for t in range(10):
...     random.seed(t)
...     x = [round(random.random() * 100) for i in range(10)]
...     random.seed(t)
...     y = [round(random.random() * 101) for i in range(10)]
...     if x == y:
...         s += 1
... 
>>> s
94
>>> s = 0
>>> for t in range(10):
...     random.seed(t)
...     x = [random.randrange(100) for i in range(10)]
...     random.seed(t)
...     y = [random.randrange(101) for i in range(10)]
...     if x == y:
...         s += 1
... 
>>> s
90432

--

___
Python tracker 
<https://bugs.python.org/issue39867>
___



[issue39867] randrange(N) for N's in same dyadic blocs have excessive correlations when sharing identical seeds

2020-03-05 Thread jfbu


jfbu  added the comment:

Yes indeed the source code of _randbelow_with_getrandbits generates this 
effect. And I understand the puzzlement about my test file setting the same 
random seed and then complaining about correlations. But there is an enormous 
discrepancy between what happens for, say, n=99 and m=127 versus n=99 and 
m=128.
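
The jump between m=127 and m=128 coincides with a change in bit length, which 
is presumably what matters here if the rejection loop draws n.bit_length() 
bits. A small check (the helper name same_dyadic_block is only for 
illustration):

```python
def same_dyadic_block(n, m):
    """True when n and m need the same number of bits in base 2."""
    return n.bit_length() == m.bit_length()

print(same_dyadic_block(99, 127))   # 99 and 127 both need 7 bits
print(same_dyadic_block(99, 128))   # 128 needs 8 bits, 99 only 7
```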

Now that I have looked at the actual source code: I have available, from other 
programming contexts, ways of reducing from a power of 2 to an arbitrary range 
which should not show this artefact.  I will need to translate them into 
Python and may submit a PR for evaluation after testing.

My test file illustrates that if randrange() proceeded from a pseudo-random 
variable emulating the uniform distribution in (0,1) and rescaled and rounded 
to an integer range, correlations between distinct ranges would be of a 
completely different nature than what the CPython method leads to.
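
A minimal sketch of the rescale-and-round approach, assuming the same 
underlying uniform value u is used for both ranges (which is what sharing a 
seed amounts to). The function name rescale is hypothetical. Agreement then 
depends on how close u*n and u*m are, not on the bit lengths of n and m.

```python
def rescale(u, n):
    """Map u in [0, 1) to an integer in 0..n by scaling and rounding.

    Illustrative only: note that the result can be n itself when u is
    close to 1.
    """
    return round(u * n)

# Nearby ranges agree only when u*n and u*m round to the same integer.
print(rescale(0.25, 100), rescale(0.25, 101))   # 25.0 and 25.25 round alike
print(rescale(0.9, 100), rescale(0.9, 101))     # 90.0 and 90.9 round apart
```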

--

___
Python tracker 
<https://bugs.python.org/issue39867>
___



[issue39867] randrange(N) for N's in same dyadic blocs have excessive correlations when sharing identical seeds

2020-03-05 Thread jfbu


New submission from jfbu :

We generate quadruples of random integers using randrange(n) and randrange(m) 
and count how many times the quadruples are identical, using the same random 
seed. Of course for nearby n and m (the real life example was with n==95 and 
m==97) we do expect matches. But we found orders of magnitude more than was 
expected.

The attached file demonstrates this by comparison with random()*n (with 
rounding) as alternative method to generate the random integers (we are aware 
this gives less uniformity for a given range, but these effects are completely 
negligible in comparison to the effect we test). For the latter the probability 
of matches is non-vanishing but orders of magnitude smaller than using 
randrange(n).

Here is an excerpt of our testing result. Each trial uses a random seed 
(selected via randrange(1)). Then 4 random integers in two given ranges 
are generated and compared. A hit is when all 4 match.

- with randrange():

n = 99, m = 124, 4135 hits among 1 trials
n = 99, m = 125, 3804 hits among 1 trials
n = 99, m = 126, 3803 hits among 1 trials
n = 99, m = 127, 3892 hits among 1 trials
n = 99, m = 128, 0 hits among 1 trials
n = 99, m = 129, 0 hits among 1 trials
n = 99, m = 130, 0 hits among 1 trials
n = 99, m = 131, 0 hits among 1 trials

- with random():

n = 99, m = 124, 0 hits among 1 trials
n = 99, m = 125, 0 hits among 1 trials
n = 99, m = 126, 0 hits among 1 trials
n = 99, m = 127, 0 hits among 1 trials
n = 99, m = 128, 0 hits among 1 trials
n = 99, m = 129, 0 hits among 1 trials
n = 99, m = 130, 0 hits among 1 trials
n = 99, m = 131, 0 hits among 1 trials
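
The counting procedure described above can be sketched as follows. This is not 
the attached testrandrange.py; the function names, the default of 4 draws, and 
the way seeds are supplied are assumptions made for illustration.

```python
import random

def draws_randrange(seed, n, count=4):
    """Seed a fresh generator and draw `count` integers with randrange(n)."""
    rng = random.Random(seed)
    return [rng.randrange(n) for _ in range(count)]

def draws_rescaled(seed, n, count=4):
    """Same, but using random() * n with rounding (result may equal n)."""
    rng = random.Random(seed)
    return [round(rng.random() * n) for _ in range(count)]

def count_hits(draw, n, m, seeds):
    """Count the seeds for which the draws for n and for m are identical."""
    return sum(1 for s in seeds if draw(s, n) == draw(s, m))

seeds = range(1000)
for m in (127, 128):
    print(m,
          count_hits(draws_randrange, 99, m, seeds),
          count_hits(draws_rescaled, 99, m, seeds))
```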

The test file has some hard-coded random seeds for reproducibility.

Although I did only limited testing, it is flagrant that there is a completely 
abnormal correlation between randrange(n) and randrange(m) when the two 
integers have the same length in base 2.

Tested with 3.6 and 3.8.

--
files: testrandrange.py
messages: 363451
nosy: jfbu
priority: normal
severity: normal
status: open
title: randrange(N) for N's in same dyadic blocs have excessive correlations 
when sharing identical seeds
versions: Python 3.8
Added file: https://bugs.python.org/file48955/testrandrange.py

___
Python tracker 
<https://bugs.python.org/issue39867>
___



[issue35564] [DOC] Sphinx 2.0 will require master_doc variable set in conf.py

2018-12-23 Thread jfbu


jfbu  added the comment:

Sorry for the previous message, whose text mentioned the GitHub pull request 
number: the tracker links it to the bpo issue of that number, which is of 
course completely unrelated.

--
pull_requests: +10525

___
Python tracker 
<https://bugs.python.org/issue35564>
___



[issue35564] [DOC] Sphinx 2.0 will require master_doc variable set in conf.py

2018-12-23 Thread jfbu


Change by jfbu :


--
pull_requests:  -10524

___
Python tracker 
<https://bugs.python.org/issue35564>
___



[issue35564] [DOC] Sphinx 2.0 will require master_doc variable set in conf.py

2018-12-23 Thread jfbu


jfbu  added the comment:

GitHub PR #11290 has been merged into master

--
keywords: +patch
pull_requests: +10524
stage:  -> patch review

___
Python tracker 
<https://bugs.python.org/issue35564>
___



[issue35564] [DOC] Sphinx 2.0 will require master_doc variable set in conf.py

2018-12-22 Thread jfbu


New submission from jfbu :

When building CPython doc with master branch of dev repo of Sphinx (future 
Sphinx 2.0) one gets this warning:

WARNING: Since v2.0, Sphinx uses "index" as master_doc by default. Please add 
"master_doc = 'contents'" to your conf.py.

Fix will be to do as Sphinx says :)
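
Concretely, following the warning text, the change would amount to a single 
line in the documentation's conf.py (a sketch only; where exactly it goes in 
the file is up to the eventual PR):

```python
# conf.py (fragment): name the root document explicitly so that
# Sphinx 2.0's new default of "index" does not apply.
master_doc = 'contents'
```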

--
assignee: docs@python
components: Documentation
messages: 332371
nosy: docs@python, jfbu
priority: normal
severity: normal
status: open
title: [DOC] Sphinx 2.0 will require master_doc variable set in conf.py

___
Python tracker 
<https://bugs.python.org/issue35564>
___



[issue35528] [DOC] [LaTeX] Sphinx 2.0 uses GNU FreeFont as default for xelatex

2018-12-18 Thread jfbu


New submission from jfbu :

Not sure whether this is an issue at all, but as said in the title, starting 
with Sphinx 2.0 (Spring 2019), XeLaTeX will be configured to use GNU FreeFont 
by default (see https://github.com/sphinx-doc/sphinx/blob/master/CHANGES), and 
this means a new dependency (for documentation builds on Ubuntu, package 
fonts-freefont-otf; for builds on Fedora 29 it is texlive-gnu-freefont).  
Indeed, CPython PDFs are currently built using ``xelatex``.

--
assignee: docs@python
components: Documentation
messages: 332092
nosy: docs@python, jfbu
priority: normal
severity: normal
status: open
title: [DOC] [LaTeX] Sphinx 2.0 uses GNU FreeFont as default for xelatex

___
Python tracker 
<https://bugs.python.org/issue35528>
___



[issue34293] DOC: Makefile inherits a Sphinx 1.5 bug regarding PAPER envvar

2018-07-31 Thread jfbu


jfbu  added the comment:

https://github.com/python/cpython/pull/8585

--

___
Python tracker 
<https://bugs.python.org/issue34293>
___



[issue34293] DOC: Makefile inherits a Sphinx 1.5 bug regarding PAPER envvar

2018-07-31 Thread jfbu


Change by jfbu :


--
keywords: +patch
pull_requests: +8094
stage:  -> patch review

___
Python tracker 
<https://bugs.python.org/issue34293>
___



[issue34293] DOC: Makefile inherits a Sphinx 1.5 bug regarding PAPER envvar

2018-07-31 Thread jfbu


New submission from jfbu :

There has been a bug in Sphinx since release 1.5 
(https://github.com/sphinx-doc/sphinx/issues/5234) about wrong handling of the 
PAPER environment variable. The Makefile in Doc/ reproduces the error. As a 
result the "A4 latex" and "letter latex" sub-targets of "dist" misbehave.

A bugfix will be released in Sphinx 1.7.7 or 1.8, but the CPython Doc/Makefile 
needs an update, because the bugfix can only solve problems for new projects 
created with sphinx-quickstart (whether using the "make-mode" small Makefile, 
or the "no-make-mode" bigger Makefile which was the default up to Sphinx 1.5).

I will send a PR next.

--
assignee: docs@python
components: Documentation
messages: 322770
nosy: docs@python, jfbu
priority: normal
severity: normal
status: open
title: DOC: Makefile inherits a Sphinx 1.5 bug regarding PAPER envvar
type: behavior

___
Python tracker 
<https://bugs.python.org/issue34293>
___



[issue31589] Links for French documentation PDF is broken: LaTeX issue with non-ASCII characters?

2017-12-03 Thread jfbu

jfbu <j...@free.fr> added the comment:

Related https://github.com/sphinx-doc/sphinx/issues/4272

It is stated there that using babel-french in place of polyglossia-french 
avoids the "Improper discretionary list" xetex problem starting with xetex 
0.2 (i.e. TeXLive 2015), whereas with polyglossia-french the earliest xetex 
version I could test with success is 0.6 (TL2016). But starting with 
TL2016, polyglossia-french has the issue 
https://github.com/sphinx-doc/sphinx/issues/4272

With TeXLive 2014, using babel-french does not avoid the "Improper 
discretionary list" xetex problem. I don't know how this maps to Debian 
packaging. One needs xetex 0.2 at minimum.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31589>
___



[issue31589] Links for French documentation PDF is broken: LaTeX issue with non-ASCII characters?

2017-12-03 Thread jfbu

jfbu <j...@free.fr> added the comment:

The ongoing discussion at http://tug.org/pipermail/xetex/2017-December/027212.html 
has brought up the new element that polyglossia's French module is broken with 
xetex since TeXLive 2016. We had only one problem, we now have two on our hands.

Possibly Sphinx could by default use babel + French, not polyglossia + French, 
as the former is maintained but apparently less so the latter.

I tested with TeXLive 2015 (fully updated) that the test document showing the 
https://github.com/sphinx-doc/sphinx/issues/3546 problem now compiles fine if 
using 

latex_elements = {
    'babel': r'\usepackage{babel}',
}

in conf.py file, to override polyglossia which is default for Sphinx with 
xelatex.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31589>
___



[issue31589] Links for French documentation PDF is broken: LaTeX issue with non-ASCII characters?

2017-12-03 Thread jfbu

jfbu <j...@free.fr> added the comment:

I can confirm the "Improper discretionary list" error from xetex build is a 
xetex bug which is present at XeTeX 0.2 and absent at XeTeX 0.6 and 
presumably all more recent releases.

It was seen at https://github.com/sphinx-doc/sphinx/issues/3546 and reported to 
XeTeX mailing list at http://tug.org/pipermail/xetex/2017-March/027056.html

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31589>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32200] Full docs build of 3.6 and 3.7 failing since 2017-10-15

2017-12-03 Thread jfbu

jfbu <j...@free.fr> added the comment:

For info, the xetex problem "Improper discretionary list" is presumably the one 
seen at  https://github.com/sphinx-doc/sphinx/issues/3546. I asked on xetex 
mailing list at http://tug.org/pipermail/xetex/2017-March/027056.html to which 
xetex bug it was related but it appears I got no reply. Hence I don't know the 
precise xetex release which fixed it, but it is ok with `XeTeX 0.6`. No 
action was taken on Sphinx side as this appeared to be a XeTeX bug only.

--

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32200>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue31589] Links for French documentation PDF is broken: LaTeX issue with non-ASCII characters?

2017-10-21 Thread jfbu

jfbu <j...@free.fr> added the comment:

I have made a PR at https://github.com/python/cpython/pull/4069 which enhances 
`conf.py` with some pdflatex extra Unicode configuration. I tested it with 
building PDF English documentation at master (at 
https://github.com/python/cpython/tree/db60a5bfa5d5f7a6f1538cc1fe76f0fda57b524e)
 and at 3.6 branch (at 
https://github.com/python/cpython/tree/1e78ed6825701029aa45a68f9e62dd3bb8d4e928)
 and also French documentation at 3.6 (at 
https://github.com/python/python-docs-fr/commit/76b522b79c3caa26658920c714acf8fac0c20eeb).
 The changes are only for ``pdflatex`` builds: if `latex_engine` is set to 
`xelatex`, `lualatex`, or `platex` (automatic if language is `ja`), nothing is 
modified.

--
nosy: +jfbu

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31589>
___



[issue31589] Links for French documentation PDF is broken: LaTeX issue with non-ASCII characters?

2017-10-21 Thread jfbu

Change by jfbu <j...@free.fr>:


--
pull_requests: +4040

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31589>
___