Re: [Python-Dev] [Python-checkins] bpo-33522: Enable CI builds on Visual Studio Team Services (GH-6865) (GH-6925)

2018-05-17 Thread Gregory P. Smith
Why did this commit modify .py files, unittests, and test.support?

That is inappropriate for something claiming to merely enable a CI platform.

-gps

On Thu, May 17, 2018 at 6:50 AM Steve Dower wrote:

>
> https://github.com/python/cpython/commit/0d8f83f59c8f4cc7fe125434ca4ecdcac111810f
> commit: 0d8f83f59c8f4cc7fe125434ca4ecdcac111810f
> branch: 3.6
> author: Steve Dower 
> committer: GitHub 
> date: 2018-05-17T09:46:00-04:00
> summary:
>
> bpo-33522: Enable CI builds on Visual Studio Team Services (GH-6865)
> (GH-6925)
>
> files:
> A .vsts/docs-release.yml
> A .vsts/docs.yml
> A .vsts/linux-buildbot.yml
> A .vsts/linux-coverage.yml
> A .vsts/linux-deps.yml
> A .vsts/linux-pr.yml
> A .vsts/macos-buildbot.yml
> A .vsts/macos-pr.yml
> A .vsts/windows-buildbot.yml
> A .vsts/windows-pr.yml
> A Misc/NEWS.d/next/Build/2018-05-15-12-44-50.bpo-33522.mJoNcA.rst
> A Misc/NEWS.d/next/Library/2018-05-16-17-05-48.bpo-33548.xWslmx.rst
> M Doc/make.bat
> M Lib/tempfile.py
> M Lib/test/support/__init__.py
> M Lib/test/test_asyncio/test_base_events.py
> M Lib/test/test_bdb.py
> M Lib/test/test_pathlib.py
> M Lib/test/test_poplib.py
> M Lib/test/test_selectors.py
> M PCbuild/rt.bat
> M Tools/ssl/multissltests.py
>
> diff --git a/.vsts/docs-release.yml b/.vsts/docs-release.yml
> new file mode 100644
> index ..e90428a42494
> --- /dev/null
> +++ b/.vsts/docs-release.yml
> @@ -0,0 +1,43 @@
> +# Current docs for the syntax of this file are at:
> +# https://github.com/Microsoft/vsts-agent/blob/master/docs/preview/yamlgettingstarted.md
> +
> +name: $(BuildDefinitionName)_$(Date:MMdd)$(Rev:.rr)
> +
> +queue:
> +  name: Hosted Linux Preview
> +
> +#variables:
> +
> +steps:
> +- checkout: self
> +  clean: true
> +  fetchDepth: 5
> +
> +- script: sudo apt-get update && sudo apt-get install -qy --force-yes texlive-full
> +  displayName: 'Install LaTeX'
> +
> +- task: UsePythonVersion@0
> +  displayName: 'Use Python 3.6 or later'
> +  inputs:
> +    versionSpec: '>=3.6'
> +
> +- script: python -m pip install sphinx blurb python-docs-theme
> +  displayName: 'Install build dependencies'
> +
> +- script: make dist PYTHON=python SPHINXBUILD='python -m sphinx' BLURB='python -m blurb'
> +  workingDirectory: '$(build.sourcesDirectory)/Doc'
> +  displayName: 'Build documentation'
> +
> +- task: PublishBuildArtifacts@1
> +  displayName: 'Publish build'
> +  inputs:
> +    PathToPublish: '$(build.sourcesDirectory)/Doc/build'
> +    ArtifactName: build
> +    publishLocation: Container
> +
> +- task: PublishBuildArtifacts@1
> +  displayName: 'Publish dist'
> +  inputs:
> +    PathToPublish: '$(build.sourcesDirectory)/Doc/dist'
> +    ArtifactName: dist
> +    publishLocation: Container
> diff --git a/.vsts/docs.yml b/.vsts/docs.yml
> new file mode 100644
> index ..efa1e871656d
> --- /dev/null
> +++ b/.vsts/docs.yml
> @@ -0,0 +1,43 @@
> +# Current docs for the syntax of this file are at:
> +# https://github.com/Microsoft/vsts-agent/blob/master/docs/preview/yamlgettingstarted.md
> +
> +name: $(BuildDefinitionName)_$(Date:MMdd)$(Rev:.rr)
> +
> +queue:
> +  name: Hosted Linux Preview
> +
> +trigger:
> +  branches:
> +    include:
> +    - master
> +    - 3.7
> +    - 3.6
> +  paths:
> +    include:
> +    - Doc/*
> +
> +#variables:
> +
> +steps:
> +- checkout: self
> +  clean: true
> +  fetchDepth: 5
> +
> +- task: UsePythonVersion@0
> +  displayName: 'Use Python 3.6 or later'
> +  inputs:
> +    versionSpec: '>=3.6'
> +
> +- script: python -m pip install sphinx~=1.6.1 blurb python-docs-theme
> +  displayName: 'Install build dependencies'
> +
> +- script: make check suspicious html PYTHON=python
> +  workingDirectory: '$(build.sourcesDirectory)/Doc'
> +  displayName: 'Build documentation'
> +
> +- task: PublishBuildArtifacts@1
> +  displayName: 'Publish build'
> +  inputs:
> +    PathToPublish: '$(build.sourcesDirectory)/Doc/build'
> +    ArtifactName: build
> +    publishLocation: Container
> diff --git a/.vsts/linux-buildbot.yml b/.vsts/linux-buildbot.yml
> new file mode 100644
> index ..d75d7f57650e
> --- /dev/null
> +++ b/.vsts/linux-buildbot.yml
> @@ -0,0 +1,71 @@
> +# Current docs for the syntax of this file are at:
> +# https://github.com/Microsoft/vsts-agent/blob/master/docs/preview/yamlgettingstarted.md
> +
> +name: $(BuildDefinitionName)_$(Date:MMdd)$(Rev:.rr)
> +
> +queue:
> +  name: Hosted Linux Preview
> +
> +trigger:
> +  branches:
> +    include:
> +    - master
> +    - 3.7
> +    - 3.6
> +  paths:
> +    exclude:
> +    - Doc/*
> +    - Tools/*
> +
> +variables:
> +  # Copy-pasted from linux-deps.yml until template support arrives
> +  OPENSSL: 1.1.0g
> +  OPENSSL_DIR: "$(build.sourcesDirectory)/multissl/openssl/$(OPENSSL)"
> +
> +
> +steps:
> +- checkout: self
> +  clean: true
> +  fetchDepth: 5
> +
> +#- template: linux-deps.yml
> +
> +# See
> 

Re: [Python-Dev] Why aren't escape sequences in literal strings handled by the tokenizer?

2018-05-17 Thread Guido van Rossum
To answer Larry's question, there's an overwhelming number of different
options -- bytes/unicode, raw/cooked, and (in Py2) `from __future__ import
unicode_literals`. So it's easier to do the actual semantic conversion in a
later stage -- then the lexer only has to worry about hopping over
backslashes.
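
A rough illustration of that point, not taken from CPython's own sources: the same characters between the quotes mean three different things depending on the prefix, so the conversion can only happen once the prefix (and, in Py2, the __future__ flags) is known. The sample body \xe9 is an arbitrary choice, and ast.literal_eval stands in here for the later compilation stage.

```python
import ast

# The same four characters between the quotes, exactly as they would
# appear in a source file: backslash, x, e, 9.
body = r"\xe9"

cooked = ast.literal_eval('"' + body + '"')     # str: escape becomes U+00E9
raw = ast.literal_eval('r"' + body + '"')       # raw str: backslash is kept
as_bytes = ast.literal_eval('b"' + body + '"')  # bytes: escape becomes one byte

print(repr(cooked), len(cooked))
print(repr(raw), len(raw))
print(repr(as_bytes), len(as_bytes))
```

A lexer that only hops over backslashes can hand the same token text to all three code paths.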

On Thu, May 17, 2018 at 3:38 PM, Eric V. Smith  wrote:

> On 5/17/2018 3:01 PM, Larry Hastings wrote:
>
>>
>>
>> I fed this into tokenize.tokenize():
>>
>> b''' x = "\u1234" '''
>>
>> I was a bit surprised to see \U in the output.  Particularly because
>> the output (t.string) was a *string* and not *bytes*.
>>
>
> For those (like me) who have no idea how to use tokenize.tokenize's wacky
> interface, the test code is:
>
> list(tokenize.tokenize(io.BytesIO(b''' x = "\u1234" ''').readline))
>
>> Maybe I'm making a parade of my ignorance, but I assumed that string
>> literals were parsed by the parser--just like everything else is parsed by
>> the parser, hey it seems like a good place for it--and in particular that
>> the escape sequence substitutions would be done in the tokenizer.  Having
>> stared at it a little, I now detect a whiff of "this design solved a real
>> problem".  So... what was the problem, and how does this design solve it?
>>
>
> I assume the intent is to not throw away any information in the lexer, and
> give the parser full access to the original string. But that's just a guess.
>
>> BTW, my use case is that I hoped to use CPython's tokenizer to parse some
>> Python-ish-looking text and handle double-quoted strings for me.
>> *Especially* all the escape sequences--leveraging all CPython's support for
>> funny things like \U{penguin}.  The current behavior of the tokenizer makes
>> me think it'd be easier to roll my own!
>>
>
> Can you feed the token text to the ast?
>
> >>> ast.literal_eval('"\u1234"')
> 'ሴ'
>
> Eric
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org
>



-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Why aren't escape sequences in literal strings handled by the tokenizer?

2018-05-17 Thread Eric V. Smith

On 5/17/2018 3:01 PM, Larry Hastings wrote:

> I fed this into tokenize.tokenize():
>
> b''' x = "\u1234" '''
>
> I was a bit surprised to see \U in the output.  Particularly because
> the output (t.string) was a *string* and not *bytes*.

For those (like me) who have no idea how to use tokenize.tokenize's 
wacky interface, the test code is:

list(tokenize.tokenize(io.BytesIO(b''' x = "\u1234" ''').readline))

> Maybe I'm making a parade of my ignorance, but I assumed that string
> literals were parsed by the parser--just like everything else is parsed
> by the parser, hey it seems like a good place for it--and in particular
> that the escape sequence substitutions would be done in the tokenizer.
> Having stared at it a little, I now detect a whiff of "this design
> solved a real problem".  So... what was the problem, and how does this
> design solve it?

I assume the intent is to not throw away any information in the lexer, 
and give the parser full access to the original string. But that's just 
a guess.

> BTW, my use case is that I hoped to use CPython's tokenizer to parse
> some Python-ish-looking text and handle double-quoted strings for me.
> *Especially* all the escape sequences--leveraging all CPython's support
> for funny things like \U{penguin}.  The current behavior of the
> tokenizer makes me think it'd be easier to roll my own!

Can you feed the token text to the ast?

    >>> ast.literal_eval('"\u1234"')
    'ሴ'
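
A minimal sketch of that suggestion, assuming the goal is just the decoded value of each string literal: take the STRING tokens from tokenize and pass their text to ast.literal_eval. The helper name string_values and the sample input are made up for illustration. Note that the one-liner above, typed without an r prefix, has its \u1234 converted by the interpreter before literal_eval ever sees it; the sketch doubles the backslashes so the escape really does reach the tokenizer unprocessed.

```python
import ast
import io
import tokenize


def string_values(source):
    """Yield the evaluated value of every string literal token in `source` (bytes)."""
    for tok in tokenize.tokenize(io.BytesIO(source).readline):
        if tok.type == tokenize.STRING:
            # tok.string is the literal exactly as written, quotes and
            # backslashes included; literal_eval applies the escape rules.
            yield ast.literal_eval(tok.string)


# Hypothetical input: the backslashes are doubled so that the source bytes
# contain the escape sequences themselves, not the characters they denote.
values = list(string_values(b'x = "\\u1234" + "\\N{PENGUIN}"\n'))
print(values)
```

Adjacent literals such as "a" "b" arrive as separate tokens, so a caller that wants Python's implicit concatenation has to join the pieces itself.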

Eric


Re: [Python-Dev] [python-committers] FINAL WEEK FOR 3.7.0 CHANGES!

2018-05-17 Thread Brett Cannon
On Thu, 17 May 2018 at 14:31 Serhiy Storchaka  wrote:

> 15.05.18 14:51, Ned Deily wrote:
> > This is it! We are down to THE FINAL WEEK for 3.7.0! Please get your
> > feature fixes, bug fixes, and documentation updates in before
> > 2018-05-21 ~23:59 Anywhere on Earth (UTC-12:00). That's about 7 days
> > from now. We will then tag and produce the 3.7.0 release candidate.
> > Our goal continues to be to have no changes between the release
> > candidate and final; AFTER NEXT WEEK'S RC1, CHANGES APPLIED TO THE 3.7
> > BRANCH WILL BE RELEASED IN 3.7.1. Please double-check that there are
> > no critical problems outstanding and that documentation for new
> > features in 3.7 is complete (including NEWS and What's New items), and
> > that 3.7 is getting exposure and tested with our various platforms and
> > third-party distributions and applications. Those of us who are
> > participating in the development sprints at PyCon US 2018 here in
> > Cleveland can feel the excitement building as we work through the
> > remaining issues, including completing the "What's New in 3.7"
> > document and final feature documentation. (We wish you could all be
> > here.)
>
> The "What's New in 3.7" document is still not complete. Actually it is
> far from complete. In previous releases somebody made a thoughtful
> review of the NEWS file and added all significant changes to What's New,
> and also removed insignificant entries, reorganized entries, fixed
> errors, improved wording and formatting. Many thanks to Martin Panter,
> Elvis Pranskevichus, Yury Selivanov, R. David Murray, Nick Coghlan,
> Antoine Pitrou, Victor Stinner and others for their great work! But
> it seems that in 3.7 this document doesn't have an editor.
>

Maybe we should start thinking about flagging PRs or issues as needing a
What's New entry to help track when they need one, or always expect it in a
PR and ignore that requirement when a 'skip whats new' label is applied.
That would at least make it easier to keep track of what needs to be done.


Re: [Python-Dev] [python-committers] FINAL WEEK FOR 3.7.0 CHANGES!

2018-05-17 Thread Serhiy Storchaka

17.05.18 21:43, Elvis Pranskevichus wrote:

> I'm working on the What's New document.  Will start putting PRs in the
> next few days.


Great!



[Python-Dev] Why aren't escape sequences in literal strings handled by the tokenizer?

2018-05-17 Thread Larry Hastings



I fed this into tokenize.tokenize():

   b''' x = "\u1234" '''

I was a bit surprised to see \U in the output.  Particularly because 
the output (t.string) was a *string* and not *bytes*.


It turns out, Python's tokenizer ignores escape sequences.  All it does 
is ignore the next character so that \" does the proper thing. But it 
doesn't do any substitutions.  The escape sequences are only handled 
when the AST node is created for the literal string!
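
A small sketch of that observation, for anyone who wants to see it directly. The source bytes below are a made-up one-line example; the backslash is doubled so the six characters \u1234 are really present in the input, as they would be in a file on disk.

```python
import io
import tokenize

# One line of "source code" as bytes; it contains the six characters \u1234.
source = b'x = "\\u1234"\n'

for tok in tokenize.tokenize(io.BytesIO(source).readline):
    if tok.type == tokenize.STRING:
        string_token = tok.string
        # The token text is a str (tokenize decodes the bytes first), and
        # the escape sequence is still there, untouched.
        print(type(string_token).__name__, repr(string_token), len(string_token))
```

The token is eight characters long (two quotes plus the six-character escape), not three, which is what one would expect had the substitution already happened.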


Maybe I'm making a parade of my ignorance, but I assumed that string 
literals were parsed by the parser--just like everything else is parsed 
by the parser, hey it seems like a good place for it--and in particular 
that the escape sequence substitutions would be done in the tokenizer.  
Having stared at it a little, I now detect a whiff of "this design 
solved a real problem".  So... what was the problem, and how does this 
design solve it?


BTW, my use case is that I hoped to use CPython's tokenizer to parse 
some Python-ish-looking text and handle double-quoted strings for me.  
*Especially* all the escape sequences--leveraging all CPython's support 
for funny things like \N{PENGUIN}.  The current behavior of the 
tokenizer makes me think it'd be easier to roll my own!



//arry/


Re: [Python-Dev] [python-committers] FINAL WEEK FOR 3.7.0 CHANGES!

2018-05-17 Thread Elvis Pranskevichus
On Thursday, May 17, 2018 2:31:37 PM EDT Serhiy Storchaka wrote:
> 15.05.18 14:51, Ned Deily wrote:
> > This is it! We are down to THE FINAL WEEK for 3.7.0! Please get your
> > feature fixes, bug fixes, and documentation updates in before
> > 2018-05-21 ~23:59 Anywhere on Earth (UTC-12:00). That's about 7 days
> > from now. We will then tag and produce the 3.7.0 release candidate.
> > Our goal continues to be to have no changes between the release
> > candidate and final; AFTER NEXT WEEK'S RC1, CHANGES APPLIED TO THE
> > 3.7 BRANCH WILL BE RELEASED IN 3.7.1. Please double-check that
> > there are no critical problems outstanding and that documentation
> > for new features in 3.7 is complete (including NEWS and What's New
> > items), and that 3.7 is getting exposure and tested with our
> > various platforms and third-party distributions and applications.
> > Those of us who are participating in the development sprints at
> > PyCon US 2018 here in Cleveland can feel the excitement building as
> > we work through the remaining issues, including completing the
> > "What's New in 3.7" document and final feature documentation. (We
> > wish you could all be here.)
> 
> The "What's New in 3.7" document is still not complete. Actually it is
> far from complete. In previous releases somebody made a thoughtful
> review of the NEWS file and added all significant changes to What's
> New, and also removed insignificant entries, reorganized entries,
> fixed errors, improved wording and formatting. Many thanks to Martin
> Panter, Elvis Pranskevichus, Yury Selivanov, R. David Murray, Nick
> Coghlan, Antoine Pitrou, Victor Stinner and others for their great
> work! But it seems that in 3.7 this document doesn't have an editor.

I'm working on the What's New document.  Will start putting PRs in the 
next few days.

Elvis




Re: [Python-Dev] [python-committers] FINAL WEEK FOR 3.7.0 CHANGES!

2018-05-17 Thread Ned Deily
Elvis has been working on the What’s New doc at the sprints this week. He 
should be checking in his edits soon.  Stay tuned!

  --
Ned Deily
n...@python.org



> On May 17, 2018, at 14:31, Serhiy Storchaka  wrote:
> 
> 15.05.18 14:51, Ned Deily wrote:
>> This is it! We are down to THE FINAL WEEK for 3.7.0! Please get your
>> feature fixes, bug fixes, and documentation updates in before
>> 2018-05-21 ~23:59 Anywhere on Earth (UTC-12:00). That's about 7 days
>> from now. We will then tag and produce the 3.7.0 release candidate.
>> Our goal continues to be to have no changes between the release
>> candidate and final; AFTER NEXT WEEK'S RC1, CHANGES APPLIED TO THE 3.7
>> BRANCH WILL BE RELEASED IN 3.7.1. Please double-check that there are
>> no critical problems outstanding and that documentation for new
>> features in 3.7 is complete (including NEWS and What's New items), and
>> that 3.7 is getting exposure and tested with our various platforms and
>> third-party distributions and applications. Those of us who are
>> participating in the development sprints at PyCon US 2018 here in
>> Cleveland can feel the excitement building as we work through the
>> remaining issues, including completing the "What's New in 3.7"
>> document and final feature documentation. (We wish you could all be
>> here.)
> 
> The "What's New in 3.7" document is still not complete. Actually it is far 
> from complete. In previous releases somebody made a thoughtful review of the 
> NEWS file and added all significant changes to What's New, and also removed 
> insignificant entries, reorganized entries, fixed errors, improved wording 
> and formatting. Many thanks to Martin Panter, Elvis Pranskevichus, Yury 
> Selivanov, R. David Murray, Nick Coghlan, Antoine Pitrou, Victor Stinner and 
> others for their great work! But it seems that in 3.7 this document doesn't 
> have an editor.
> 



Re: [Python-Dev] [python-committers] FINAL WEEK FOR 3.7.0 CHANGES!

2018-05-17 Thread Serhiy Storchaka

15.05.18 14:51, Ned Deily wrote:

> This is it! We are down to THE FINAL WEEK for 3.7.0! Please get your
> feature fixes, bug fixes, and documentation updates in before
> 2018-05-21 ~23:59 Anywhere on Earth (UTC-12:00). That's about 7 days
> from now. We will then tag and produce the 3.7.0 release candidate.
> Our goal continues to be to have no changes between the release
> candidate and final; AFTER NEXT WEEK'S RC1, CHANGES APPLIED TO THE 3.7
> BRANCH WILL BE RELEASED IN 3.7.1. Please double-check that there are
> no critical problems outstanding and that documentation for new
> features in 3.7 is complete (including NEWS and What's New items), and
> that 3.7 is getting exposure and tested with our various platforms and
> third-party distributions and applications. Those of us who are
> participating in the development sprints at PyCon US 2018 here in
> Cleveland can feel the excitement building as we work through the
> remaining issues, including completing the "What's New in 3.7"
> document and final feature documentation. (We wish you could all be
> here.)


The "What's New in 3.7" document is still not complete. Actually it is 
far from complete. In previous releases somebody made a thoughtful 
review of the NEWS file and added all significant changes to What's New, 
and also removed insignificant entries, reorganized entries, fixed 
errors, improved wording and formatting. Many thanks to Martin Panter, 
Elvis Pranskevichus, Yury Selivanov, R. David Murray, Nick Coghlan, 
Antoine Pitrou, Victor Stinner and others for their great work! But 
it seems that in 3.7 this document doesn't have an editor.




Re: [Python-Dev] [Webmaster] Possible virus in Win32 build of python?

2018-05-17 Thread Steve Holden
On Thu, May 17, 2018 at 5:26 AM, Ryan Saunders wrote:

> Hello webmaster,
>
>
>
> A little over a week ago, I got hit by a rather nasty virus…one of those
> “ransomware” viruses that encrypts everything on your disk and then demands
> bitcoin payment in exchange for the decryption key. Yuck.
>
>
>
> One potential way in which this virus might have gotten onto my system is
> via a version of Python I downloaded, as I was working on a script to
> auto-download Python around that time. It’s a bit difficult to be sure,
> since (a) my antivirus (Windows Defender) didn’t notice the virus at all
> and (b) most files on my HDD are now hopelessly encrypted, including the
> copies of Python I downloaded, which makes postmortem analysis…difficult.
>
>
>
> I plan to do some more investigation to try to determine exactly how I got
> this bug, but I thought it prudent to bring this to your attention quickly,
> just in case Python actually *was* the infection vector, so that you can
> remove any infected files from your download site.
>
>
>
> If I recall correctly, the versions of Python that I was working with were
> the following:
>
> - https://www.python.org/ftp/python/3.7.0/python-3.7.0b4-amd64.exe
> - https://www.python.org/ftp/python/3.7.0/python-3.7.0b4-embed-amd64.zip
> - https://www.python.org/ftp/python/3.7.0/python-3.7.0b3-amd64.exe
> - https://www.python.org/ftp/python/3.7.0/python-3.7.0b3-embed-amd64.zip
> - https://www.python.org/ftp/python/3.6.5/python-3.6.5-amd64.exe
> - https://www.python.org/ftp/python/3.6.5/python-3.6.5-embed-amd64.zip
>
>
>
> The virus is the “Arrow” virus, which most antivirus sites identify as a
> variant of the “Dharma/CrySiS” family of malware. Unfortunately, Windows
> Defender did not catch it, so I’m not sure what AV tools to recommend. But
> I do suggest scanning the above files with whatever AV tools are at your
> disposal, just to be on the safe side, so that no one else contracts this
> thing.
>
>
>
> If I am later able to determine conclusively the source of my infection, I
> will let you know.
>
>
>
> Ryan
>
>
>
> Sent from Mail for Windows 10
>
>
>
> ___
> Webmaster mailing list
> webmas...@python.org
> https://mail.python.org/mailman/listinfo/webmaster
>

Hi Ryan,

Thanks for your note, and I'm sorry to hear that you have fallen victim to
malware.

I suspect the probability of a virus in the official installer
distributions is very low. I understand that the release process for
Windows does involve anti-virus scans, and I am not personally aware of
even any false positives on 3.6.

Since 3.7.0 is a pre-release I am notifying the developers list as a
precaution. You will hear from them if they require any further information.

Good luck restoring your system.

regards
 Steve


Re: [Python-Dev] (Looking for) A Retrospective on the Move to Python 3

2018-05-17 Thread Nick Coghlan
On 14 May 2018 at 12:34, Chris Barker via Python-Dev wrote:

> On Sat, May 12, 2018 at 8:14 AM, Skip Montanaro wrote:
>
>> > I have found 2to3 conversion to be remarkably easy and painless.
>>
>> > And the whole Unicode thing is much easier.
>>
>
> Another point here:
>
> between 3.0 and 3.6 (.5?) -- py3 grew a lot of minor features that made it
> easier to write py2/py3 compatible code. u"string", b'bytes %i' %
> something -- and when were the various __future__ imports made available?
>
> If these had been in place in 3.0, the whole process would have been
> easier :-(
>

The __future__ imports were already there in 2.6/3.0.

The other ones weren't there initially because we didn't know which things
we were tempted to add back because they were actually useful, and which
ones we just thought we wanted because we were used to the way the Python 2
text model worked (or failed to work, as the case may be). (The build time
source code translation step was also far less effective than we hoped it
was going to be, since we completely failed to account for the problem of
mapping tracebacks for converted code back to the original pre-translation
code)
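
For readers who did not live through it, here is a small sketch of the kind of single-source code those later additions enable. The variable names are invented for the example; the version notes in the comments refer to PEP 414 (u"" prefix restored in 3.3) and PEP 461 (bytes %-formatting restored in 3.5).

```python
# The __future__ imports below exist in both 2.6+ and 3.0+; on Python 3
# they are accepted and simply have no further effect.
from __future__ import absolute_import, division, print_function, unicode_literals

# u"" is a syntax error on 3.0-3.2 and accepted again from 3.3 (PEP 414).
text = u"caf\u00e9"

# %-formatting on bytes is missing on 3.0-3.4 and back from 3.5 (PEP 461).
payload = b"bytes %i" % 42

print(text, payload)
```

On 3.0 through 3.2 this file does not even compile, which is roughly the pain being described above.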

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Hashes in Python3.5 for tuples and frozensets

2018-05-17 Thread Chris Angelico
On Fri, May 18, 2018 at 12:15 AM, Anthony Flury via Python-Dev wrote:
> Chris,
> I entirely agree. The same questioner also asked about the fastest data type
> to use as a key in a dictionary; and which data structure is fastest. I get
> the impression the person is very into micro-optimization, without profiling
> their application. It seems every choice is made based on the speed of that
> operation; without consideration of how often that operation is used.

Sounds like we're on the same page here.

> On 17/05/18 09:16, Chris Angelico wrote:
>> The hash values of Python objects are calculated by the __hash__
>> method, so arbitrary objects can do what they like, including
>> degenerate algorithms such as:
>>
>> class X:
>>  def __hash__(self): return 7
>
> Agreed - I should have said the default hash algorithm. Hashes for custom
> objects are entirely application dependent.

There isn't a single "default hash algorithm"; in fact, I'm not sure
that there's even a single algorithm used for all strings. Certainly
the algorithm used for integers is completely different from the
one(s) used for strings; we have a guarantee that ints and floats
representing the same real number are going to have the same hash
(even if that hash isn't equal to the number -
hash(1e22)==hash(10**22)!=10**22 is True), since they compare equal.
The algorithms used and the resulting hashes may change between Python
versions, when you change interpreters (PyPy vs Jython vs CPython vs
Brython...), or even when you change word sizes, I believe (32-bit vs
64-bit).
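
A quick check of the numeric guarantee, as a sketch: it relies only on the documented rule that numbers comparing equal hash equal, and adds Fraction and Decimal as further examples beyond the int/float pair mentioned above.

```python
from decimal import Decimal
from fractions import Fraction

# 1e22 is exactly representable as a float, so it compares equal to 10**22.
values_equal = 1e22 == 10**22
hashes_equal = hash(1e22) == hash(10**22)

# The hash is a reduced value, not the (too large) number itself.
hash_is_not_the_number = hash(10**22) != 10**22

# The same rule holds across the other numeric types.
all_ones_agree = hash(1) == hash(1.0) == hash(Fraction(1, 1)) == hash(Decimal(1))

print(values_equal, hashes_equal, hash_is_not_the_number, all_ones_agree)
```

None of this says anything about what the actual hash values are, only that equal numbers agree with each other inside one process.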

So, this is (a) a premature optimization, (b) depending on something
that's not guaranteed, and (c) is a great way to paint yourself into a
corner. Perfect! :)

ChrisA


Re: [Python-Dev] What is the rationale behind source only releases?

2018-05-17 Thread Steve Dower
On 17May2018 1004, Brett Cannon wrote:
> On Thu, 17 May 2018 at 09:57 Paul Moore wrote:
>
> > On 17 May 2018 at 14:42, Brett Cannon wrote:
> > >
> > > If I understand things correctly, our planned migration to VSTS
> > > will include eventually automating the signing of the Windows
> > > releases so that part won't be an issue (which are currently
> > > signed manually by Steve).
> >
> > Somewhat off-topic for this discussion, but is there any background on
> > the "planned migration to VSTS" that I can go and read up on? I've
> > seen the comments on the committers mailing list and it looks cool,
> > but I got the impression it was an addition, rather than a migration.
> > (If it is just an additional CI service we'll be using, then no
> > worries - more is always better!)
>
> To be a bit more specific, it's "planned assuming the testing of VSTS
> works out as we expect it to". :) IOW it isn't definite quite yet, but I
> am not expecting any blockers or people objecting, so in my head I'm
> optimistic that it's going to happen. (Any more discussion can be
> brought up on core-workflow.)

I just posted another email, but it looks like it's working out :)

The migration hasn't really been planned as such, which is why so few
people have heard about it. I've just spent the PyCon US sprints proving
that it's a viable option to migrate to, and it can certainly help
relieve the burden on AppVeyor and Travis.

As Brett says, it'll be up to core-workflow as to whether we switch
completely and when.

On doing release builds through it, that is somewhat orthogonal. Right
now, the Windows build still requires using my secure VM, which doesn't
really let just anyone do the release, but we could easily get to a
point where specifically authorised people can produce a complete build.
Similarly, the macOS build probably shouldn't be done on the provided
(up-to-date) CI machine, as I believe that would impact our
compatibility. So perhaps having a well-powered and flexible build
service available will help, but no promises.

Cheers,
Steve


[Python-Dev] Visual Studio Team Services checks on pull requests

2018-05-17 Thread Steve Dower
Hi python-dev

Just drawing your attention to a change we're currently working through
on github. There are more details on my post on python-committers at
https://mail.python.org/pipermail/python-committers/2018-May/005404.html
but this is the short version.

Microsoft has donated a significant amount of macOS, Windows and Linux
build time on Visual Studio Team Services for CPython that we can use
for PR and commit builds on github. We've hooked these up already, so
you will see new checks on github pull requests (e.g.
https://github.com/python/cpython/pull/6937 ). These are currently not
required, but apart from some asyncio tests they appear to be totally
stable and considerably faster than our current ones.

There are a few limitations still, which I'll be working with the VSTS
team to resolve. Feel free to email me with any questions or suggestions.

And for complete openness, Microsoft hopes that this will be good
publicity for VSTS. If you have any examples of this working (e.g. you
adopt it for other projects, start using it at work, etc.) then please
pass those on to me as well. It will help convince them to keep giving
us free resources :)

Cheers,
Steve


Re: [Python-Dev] Hashes in Python3.5 for tuples and frozensets

2018-05-17 Thread Anthony Flury via Python-Dev

Chris,
I entirely agree. The same questioner also asked about the fastest data 
type to use as a key in a dictionary; and which data structure is 
fastest. I get the impression the person is very into 
micro-optimization, without profiling their application. It seems every 
choice is made based on the speed of that operation; without 
consideration of how often that operation is used.


On 17/05/18 09:16, Chris Angelico wrote:

> On Thu, May 17, 2018 at 5:21 PM, Anthony Flury via Python-Dev wrote:
>
>> Victor,
>> Thanks for the link, but to be honest it will just confuse people - neither
>> the link nor the related bpo entries state that the fix is limited to
>> strings. They simply talk about hash randomization - which in my opinion
>> implies ALL hash algorithms; which is why I asked the question.
>>
>> I am not sure how much should be exposed about the scope of security fixes
>> but you can understand my (and others') confusion.
>>
>> I am aware that applications shouldn't make assumptions about the value of
>> any given hash value - apart from some simple assumptions based on hash value
>> equality (i.e. if two objects have different hash values they can't be the
>> same value).
>
> The hash values of Python objects are calculated by the __hash__
> method, so arbitrary objects can do what they like, including
> degenerate algorithms such as:
>
> class X:
>     def __hash__(self): return 7

Agreed - I should have said the default hash algorithm. Hashes for 
custom objects are entirely application dependent.
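
To make the degenerate case concrete, here is a sketch that extends the class X from the quoted example with a hypothetical name attribute and an __eq__ method (neither is in the original). A constant hash is legal: the dict still works, because it falls back on equality to tell colliding keys apart, it just does more comparisons.

```python
class X:
    """Every instance hashes to 7; equality uses a hypothetical `name` field."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, X) and self.name == other.name

    def __hash__(self):
        return 7


# All three keys land in the same hash bucket, yet stay distinct entries.
table = {X("a"): 1, X("b"): 2, X("c"): 3}
entry_count = len(table)
looked_up = table[X("b")]
print(entry_count, looked_up)
```

So the hash only narrows the search; the key itself decides the match.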


> So it's impossible to randomize ALL hashes at the language level. Only
> str and bytes hashes are randomized, because they're the ones most
> likely to be exploitable - for instance, a web server will receive a
> query like "http://spam.example/target?a=1&b=2&c=3" and provide a
> dictionary {"a":1, "b":2, "c":3}. Similarly, a JSON decoder is always
> going to create string keys in its dictionaries (JSON objects). Do you
> know of any situation in which an attacker can provide the keys for a
> dict/set as integers?

I was just asking the question - rather than critiquing the fault-fix. I 
am actually more concerned that the documentation relating to the fix 
doesn't make it clear that only strings have their hashes randomised.



>> BTW:
>>
>> This question was prompted by a question on a social media platform about
>> whether hash values are transferable across platforms.
>> Everything I could find stated that after Python 3.3 ALL hash values were
>> randomized - but that clearly isn't the case; and the original questioner
>> identified that some hash values are randomized and others aren't.
>
> That's actually immaterial. Even if the hashes weren't actually
> randomized, you shouldn't be making assumptions about anything
> specific in the hash, save that *within one Python process*, two equal
> values will have equal hashes (and therefore two objects with unequal
> hashes will not be equal).

Entirely agree - I was just trying to get to the bottom of the
difference - especially considering that the documentation I could find
implied that all hash algorithms had been randomized.

>> I did suggest strongly to the original questioner that relying on the same
>> hash value across different platforms wasn't a clever solution - their
>> original plan was to store hash values in a cross system database to enable
>> quick retrieval of data (!!!). I did remind the OP that a hash value wasn't
>> guaranteed to be unique anyway - and they might come across two different
>> values with the same hash - and no way to distinguish between them if all
>> they have is the hash. Hopefully their revised design will store the key,
>> not the hash.
>
> Uhh if you're using a database, let the database do the work of
> being a database. I don't know what this "cross system database" would
> be implemented in, but if it's a proper multi-user relational database
> engine like PostgreSQL, it's already going to have way better indexing
> than anything you'd do manually. I think there are WAY better
> solutions than worrying about Python's inbuilt hashing.

Agreed

> If you MUST hash your data for sharing and storage, the easiest
> solution is to just use a cryptographic hash straight out of
> hashlib.py.

As stated before - I think the original questioner was intent on micro
optimizations - and they had hit on the idea that storing an integer
would be quicker than storing a string - entirely ignoring both the
practicality of trying to code all strings into a value (since hashes
aren't guaranteed not to collide), and the issues of trying to reverse
that translation once the stored key had been retrieved.

> ChrisA


Thanks for your comments :-)

--
--
Anthony Flury
email : *anthony.fl...@btinternet.com*
Twitter : *@TonyFlury *


Re: [Python-Dev] What is the rationale behind source only releases?

2018-05-17 Thread Brett Cannon
On Thu, 17 May 2018 at 09:57 Paul Moore  wrote:

> On 17 May 2018 at 14:42, Brett Cannon  wrote:
> >
> > If I understand things correctly, our planned migration to VSTS will
> > include eventually automating the signing of the Windows releases so that
> > part won't be an issue (which are currently signed manually by Steve).
>
> Somewhat off-topic for this discussion, but is there any background on
> the "planned migration to VSTS" that I can go and read up on? I've
> seen the comments on the committers mailing list and it looks cool,
> but I got the impression it was an addition, rather than a migration.
> (If it is just an additional CI service we'll be using, then no
> worries - more is always better!)
>

To be a bit more specific, it's "planned assuming the testing of VSTS works
out as we expect it to". :) IOW it isn't definite quite yet, but I am not
expecting any blockers or people objecting, so in my head I'm optimistic
that it's going to happen. (Any more discussion can be brought up on
core-workflow.)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What is the rationale behind source only releases?

2018-05-17 Thread Paul Moore
On 17 May 2018 at 14:42, Brett Cannon  wrote:
>
> If I understand things correctly, our planned migration to VSTS will include
> eventually automating the signing of the Windows releases so that part won't
> be an issue (which are currently signed manually by Steve).

Somewhat off-topic for this discussion, but is there any background on
the "planned migration to VSTS" that I can go and read up on? I've
seen the comments on the committers mailing list and it looks cool,
but I got the impression it was an addition, rather than a migration.
(If it is just an additional CI service we'll be using, then no
worries - more is always better!)

Paul


Re: [Python-Dev] What is the rationale behind source only releases?

2018-05-17 Thread Antoine Pitrou
On Thu, 17 May 2018 09:42:38 -0400
Brett Cannon  wrote:
> 
> If I understand things correctly, our planned migration to VSTS will
> include eventually automating the signing of the Windows releases so that
> part won't be an issue (which are currently signed manually by Steve).

What part is being "planned" to be migrated?  I only heard about it on
a bug tracker entry.

Regards

Antoine.




Re: [Python-Dev] What is the rationale behind source only releases?

2018-05-17 Thread Brett Cannon
On Thu, 17 May 2018 at 04:25 Paul Moore  wrote:

> On 17 May 2018 at 04:46, Alex Walters  wrote:
> >> 1. Producing binaries (to the quality we normally deliver - I'm not
> >> talking about auto-built binaries produced from a CI system) is a
> >> chunk of extra work for the release managers.
> >
> > This is actually the heart of the reason I asked the question.  CI tools
> > are fairly good now.  If the CI tools could be used in such a way to make
> > the building of binary artifacts less of a burden on the release managers,
> > would there be interest in doing that, and in the process, releasing binary
> > artifact installers for all security update releases?
> >
> > My rationale for asking if it's possible is... well.. security releases
> > are important, and it's hard to ask Windows users to install Visual Studio
> > and build python to use the most secure version of python that will run
> > your python program.  Yes there are better ideal solutions (porting your
> > code to the latest and greatest feature release version), but that’s not a
> > zero burden option either.
> >
> > If CI tools just aren't up to the task, then so be it, and this isn't
> > something I would darken -ideas' door with.
>
> I honestly don't know if we're at a point where an auto-built security
> release would be sufficient and/or useful. That's mostly a question
> for the release manager(s). One sticking point might be that I believe
> the Windows installers (at least) are signed, and only the release
> managers have the signing key. It's probably *not* OK to leave the
> security releases unsigned ;-) So there would be a key management
> issue to address there.
>

If I understand things correctly, our planned migration to VSTS will
include eventually automating the signing of the Windows releases so that
part won't be an issue (which are currently signed manually by Steve).


[Python-Dev] The history of PyXML

2018-05-17 Thread Serhiy Storchaka
Does anyone have a full copy of the PyXML repository, with the complete
history?

This library was included in Python 2.1 as the xml package and has not
been maintained as a separate project since 2004. Its home on SourceForge
was removed. I have found sources of the last PyXML version (0.8.4), but
without history.


I'm trying to figure out some intentions and fix possible bugs in the 
xml package. The history of all commits could help.




Re: [Python-Dev] PEP 575 (Unifying function/method classes) update

2018-05-17 Thread Jeroen Demeyer

On 2018-05-16 17:31, Petr Viktorin wrote:
> Less disruptive changes tend to have a better backwards compatibility story.
> A less intertwined change makes it easier to revert just a single part,
> in case that becomes necessary.


I'll just repeat what I said in a different post on this thread: we can 
still *implement* the PEP in a less intertwined and more gradual way. 
The PEP deals with several classes and each class can be changed separately.


However, there is not much point in starting this process if you don't 
intend to go all the way. The power of PEP 575 is really using this 
base_function class in many places.


A PEP just adding the class base_function as base class of
builtin_function_or_method without using it anywhere else would make no
sense by itself. Still, that could be a first isolated step in the
implementation.


If PEP 575 is accepted, I would like to follow it up with PEPs to add 
more classes to the base_function hierarchy (candidates: staticmethod, 
classmethod, classmethod_descriptor, method-wrapper, slot wrapper, 
functools.lru_cache).
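
For readers who have not looked at these types, here is a minimal sketch that only inspects what CPython exposes today; it does not implement anything from PEP 575. The type names printed are CPython implementation details and may differ on other interpreters, and `cached` is a hypothetical helper defined only to obtain an lru_cache wrapper object.

```python
import functools


@functools.lru_cache(maxsize=None)
def cached(x):
    """Hypothetical helper, only here to obtain an lru_cache wrapper."""
    return x


# A few of the callables mentioned as candidates for the hierarchy.
callables = {
    "len": len,
    "dict.fromkeys (raw descriptor)": dict.__dict__["fromkeys"],
    "int.__add__": int.__add__,
    "(1).__add__": (1).__add__,
    "staticmethod(len)": staticmethod(len),
    "classmethod(len)": classmethod(len),
    "cached": cached,
}

# Map each label to the name of its current type.
type_names = {label: type(obj).__name__ for label, obj in callables.items()}

for label, name in type_names.items():
    print(f"{label:32} {name}")
```

Each of these objects currently has its own unrelated type, which is the situation a common base class would change.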



Jeroen.


Re: [Python-Dev] What is the rationale behind source only releases?

2018-05-17 Thread Ned Deily
On May 17, 2018, at 04:24, Paul Moore  wrote:
> On 17 May 2018 at 04:46, Alex Walters  wrote:
>>> 1. Producing binaries (to the quality we normally deliver - I'm not
>>> talking about auto-built binaries produced from a CI system) is a
>>> chunk of extra work for the release managers.
>> 
>> This is actually the heart of the reason I asked the question.  CI tools are 
>> fairly good now.  If the CI tools could be used in such a way to make the 
>> building of binary artifacts less of a burden on the release managers, would 
>> there be interest in doing that, and in the process, releasing binary 
>> artifact installers for all security update releases?
>>
>> My rationale for asking if it's possible is... well.. security releases are
>> important, and it's hard to ask Windows users to install Visual Studio and 
>> build python to use the most secure version of python that will run your 
>> python program.  Yes there are better ideal solutions (porting your code to 
>> the latest and greatest feature release version), but that’s not a zero 
>> burden option either.
>> 
>> If CI tools just aren't up to the task, then so be it, and this isn't 
>> something I would darken -ideas' door with.
> 
> I honestly don't know if we're at a point where an auto-built security
> release would be sufficient and/or useful. That's mostly a question
> for the release manager(s). One sticking point might be that I believe
> the Windows installers (at least) are signed, and only the release
> managers have the signing key. It's probably *not* OK to leave the
> security releases unsigned ;-) So there would be a key management
> issue to address there.

IMO, the idea of having either the current CI system or a third party produce 
binary artifacts for Python releases to be downloadable from python.org is a 
non-starter for lots of reasons, primarily because of the security risks.

The release team *could* produce those artifacts for releases in security mode 
and, while it would be some extra work, there are so few of them.  The question 
is should we.  Once a release moves from bugfix/maintenance mode to security 
mode, in some ways we are doing a disservice to our users to encourage them to 
not upgrade to a more recent maintained release.  Release branches in security 
mode do not get any fixes other than, based on past experience, at most a small 
number of security issues that might arise.  In particular, security mode 
release branches receive no platform-support fixes to support newer OS releases 
and/or newer hardware support and receive no buildbot testing.  Security mode 
releases today are really for downstream distributors and DIYers who are 
comfortable building and maintaining their own versions of software.

--
  Ned Deily
  n...@python.org -- []



Re: [Python-Dev] What is the rationale behind source only releases?

2018-05-17 Thread Paul Moore
On 17 May 2018 at 04:46, Alex Walters  wrote:
>> 1. Producing binaries (to the quality we normally deliver - I'm not
>> talking about auto-built binaries produced from a CI system) is a
>> chunk of extra work for the release managers.
>
> This is actually the heart of the reason I asked the question.  CI tools are 
> fairly good now.  If the CI tools could be used in such a way to make the 
> building of binary artifacts less of a burden on the release managers, would 
> there be interest in doing that, and in the process, releasing binary 
> artifact installers for all security update releases?
>
> My rationale for asking if it's possible is... well.. security releases are
> important, and it's hard to ask Windows users to install Visual Studio and 
> build python to use the most secure version of python that will run your 
> python program.  Yes there are better ideal solutions (porting your code to 
> the latest and greatest feature release version), but that’s not a zero 
> burden option either.
>
> If CI tools just aren't up to the task, then so be it, and this isn't 
> something I would darken -ideas' door with.

I honestly don't know if we're at a point where an auto-built security
release would be sufficient and/or useful. That's mostly a question
for the release manager(s). One sticking point might be that I believe
the Windows installers (at least) are signed, and only the release
managers have the signing key. It's probably *not* OK to leave the
security releases unsigned ;-) So there would be a key management
issue to address there.

Paul.


Re: [Python-Dev] Hashes in Python3.5 for tuples and frozensets

2018-05-17 Thread Chris Angelico
On Thu, May 17, 2018 at 5:21 PM, Anthony Flury via Python-Dev
 wrote:
> Victor,
> Thanks for the link, but to be honest it will just confuse people - neither
> the link nor the related bpo entries state that the fix is limited to
> strings. They simply talk about hash randomization - which in my opinion
> implies ALL hash algorithms; which is why I asked the question.
>
> I am not sure how much should be exposed about the scope of security fixes
> but you can understand my (and others') confusion.
>
> I am aware that applications shouldn't make assumptions about the value of
> any given hash value - apart from some simple assumptions based on hash value
> equality (i.e. if two objects have different hash values they can't be the
> same value).

The hash values of Python objects are calculated by the __hash__
method, so arbitrary objects can do what they like, including
degenerate algorithms such as:

class X:
    def __hash__(self): return 7

So it's impossible to randomize ALL hashes at the language level. Only
str and bytes hashes are randomized, because they're the ones most
likely to be exploitable - for instance, a web server will receive a
query like "http://spam.example/target?a=1&b=2&c=3" and provide a
dictionary {"a":1, "b":2, "c":3}. Similarly, a JSON decoder is always
going to create string keys in its dictionaries (JSON objects). Do you
know of any situation in which an attacker can provide the keys for a
dict/set as integers?
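
The degenerate __hash__ above can be tried out directly. This is only a sketch: the `name` attribute and the `__eq__` method are additions for illustration (they are not part of the two-line class in this message), included so that distinct instances can coexist as dictionary keys.

```python
class X:
    """Illustrative class whose hash is the same constant for every instance."""

    def __init__(self, name):
        self.name = name  # hypothetical attribute, added for the demo

    def __hash__(self):
        return 7

    def __eq__(self, other):
        return isinstance(other, X) and self.name == other.name


# All three keys collide on hash 7, so the dict has to fall back on __eq__
# to tell them apart. It still works, only more slowly as the table grows.
table = {X("a"): 1, X("b"): 2, X("c"): 3}
hashes = {hash(key) for key in table}

print(len(table), hashes)
print(table[X("b")])
```

No interpreter-level randomization can change the value 7 here, which is the point being made about user-defined hashes.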

> BTW:
>
> This question was prompted by a question on a social media platform about
> whether hash values are transferable across platforms.
> Everything I could find stated that after Python 3.3 ALL hash values were
> randomized - but that clearly isn't the case; and the original questioner
> identified that some hash values are randomized and others aren't.

That's actually immaterial. Even if the hashes weren't actually
randomized, you shouldn't be making assumptions about anything
specific in the hash, save that *within one Python process*, two equal
values will have equal hashes (and therefore two objects with unequal
hashes will not be equal).
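
A small sketch of that one guarantee, using numeric types from the standard library that compare equal to 1. The variable names are illustrative only.

```python
from decimal import Decimal
from fractions import Fraction

# Five objects of different types that all compare equal to 1.
equal_values = [1, 1.0, True, Fraction(1), Decimal(1)]

all_equal = all(v == 1 for v in equal_values)
distinct_hashes = {hash(v) for v in equal_values}

# Because equal values hash equally, they all land on the same dict key.
counts = {}
for v in equal_values:
    counts[v] = counts.get(v, 0) + 1

print(all_equal, len(distinct_hashes), counts)
```

The converse does not hold: equal hashes say nothing about equal values, which is why the hash alone cannot stand in for the key.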

> I did suggest strongly to the original questioner that relying on the same
> hash value across different platforms wasn't a clever solution - their
> original plan was to store hash values in a cross system database to enable
> quick retrieval of data (!!!). I did remind the OP that a hash value wasn't
> guaranteed to be unique anyway - and they might come across two different
> values with the same hash - and no way to distinguish between them if all
> they have is the hash. Hopefully their revised design will store the key,
> not the hash.

Uhh if you're using a database, let the database do the work of
being a database. I don't know what this "cross system database" would
be implemented in, but if it's a proper multi-user relational database
engine like PostgreSQL, it's already going to have way better indexing
than anything you'd do manually. I think there are WAY better
solutions than worrying about Python's inbuilt hashing.

If you MUST hash your data for sharing and storage, the easiest
solution is to just use a cryptographic hash straight out of
hashlib.py.
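
A minimal sketch of that suggestion, assuming SHA-256 and UTF-8 encoding are acceptable choices; `stable_key` is a hypothetical helper name, not an existing API.

```python
import hashlib


def stable_key(text: str) -> str:
    """Return a digest that is the same in every process and on every platform.

    Hypothetical helper: SHA-256 and UTF-8 are assumptions for this sketch.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


digest = stable_key("Hello World")
print(digest)

# Unlike hash("Hello World"), this value does not depend on PYTHONHASHSEED,
# the Python version, or the word size of the machine.
```

Even so, the digest is an identifier rather than a reversible encoding, so the original key still has to be stored somewhere if it needs to be recovered.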

ChrisA


Re: [Python-Dev] Hashes in Python3.5 for tuples and frozensets

2018-05-17 Thread Greg Ewing

Anthony Flury via Python-Dev wrote:
> I did suggest strongly to the original questioner that relying on the
> same hash value across different platforms wasn't a clever solution


Even without randomisation, I wouldn't rely on hash values
staying the same between different Python versions. Storing
them in a database sounds like a really bad idea.

--
Greg



Re: [Python-Dev] Hashes in Python3.5 for tuples and frozensets

2018-05-17 Thread Anthony Flury via Python-Dev

Victor,
Thanks for the link, but to be honest it will just confuse people -
neither the link nor the related bpo entries state that the fix is
limited to strings. They simply talk about hash randomization - which in
my opinion implies ALL hash algorithms; which is why I asked the question.

I am not sure how much should be exposed about the scope of security
fixes but you can understand my (and others') confusion.

I am aware that applications shouldn't make assumptions about the value
of any given hash value - apart from some simple assumptions based on hash
value equality (i.e. if two objects have different hash values they
can't be the same value).


BTW:

This question was prompted by a question on a social media platform
about whether hash values are transferable across platforms.
Everything I could find stated that after Python 3.3 ALL hash values
were randomized - but that clearly isn't the case; and the original
questioner identified that some hash values are randomized and others
aren't.

I did suggest strongly to the original questioner that relying on the
same hash value across different platforms wasn't a clever solution -
their original plan was to store hash values in a cross system database
to enable quick retrieval of data (!!!). I did remind the OP that a hash
value wasn't guaranteed to be unique anyway - and they might come across
two different values with the same hash - and no way to distinguish
between them if all they have is the hash. Hopefully their revised
design will store the key, not the hash.
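
As a concrete illustration of two different values sharing a hash, here is a sketch that relies on a CPython implementation detail: -1 is reserved as an error code inside the interpreter, so hash(-1) is reported as -2 and collides with hash(-2). Other interpreters may behave differently; `by_hash` and `by_key` are illustrative names.

```python
a, b = -1, -2

by_hash = {}  # what the original plan would store: hash -> record
by_key = {}   # the revised design: key -> record

for value in (a, b):
    by_hash[hash(value)] = f"record for {value}"
    by_key[value] = f"record for {value}"

# On CPython the two values differ but their hashes do not, so the
# hash-indexed table silently loses one record.
print(a == b, hash(a) == hash(b))
print(len(by_hash), len(by_key))
```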



On 17/05/18 07:38, Victor Stinner wrote:
> Hi,
>
> String hash is randomized, but not the integer hash:
>
> $ python3.5 -c 'print(hash("abc"))'
> -8844814677999896014
> $ python3.5 -c 'print(hash("abc"))'
> -7757160699952389646
>
> $ python3.5 -c 'print(hash(1))'
> 1
> $ python3.5 -c 'print(hash(1))'
> 1
>
> frozenset hash is combined from values of the set. So it's only
> randomized if the values' hashes are randomized.
>
> The denial of service is more likely to occur with strings as keys,
> than with integers.
>
> See the following link for more information:
> http://python-security.readthedocs.io/vuln/cve-2012-1150_hash_dos.html
>
> Victor

> 2018-05-16 17:48 GMT-04:00 Anthony Flury via Python-Dev :
>> This may be known but I wanted to ask this esteemed body first.
>>
>> I understand that from Python3.3 there was a security fix to ensure that
>> different python processes would generate different hash values for the same
>> input - to prevent denial of service based on crafted hash conflicts.
>>
>> I opened two python REPLs on my Linux 64bit PC and did the following
>>
>> Terminal 1:
>>
>> >>> hash('Hello World')
>> -1010252950208276719
>>
>> >>> hash( frozenset({1,9}) )
>> -7625378979602737914
>> >>> hash(frozenset({300,301}))
>> -8571255922896611313
>>
>> >>> hash((1,9))
>> 3713081631926832981
>> >>> hash((875,932))
>> 3712694086932196356
>>
>> Terminal 2:
>>
>> >>> hash('Hello World')
>> -8267767374510285039
>>
>> >>> hash( frozenset({1,9}) )
>> -7625378979602737914
>> >>> hash(frozenset({300,301}))
>> -8571255922896611313
>>
>> >>> hash((1,9))
>> 3713081631926832981
>> >>> hash((875,932))
>> 3712694086932196356
>>
>> As can be seen - taking a hash of a string does indeed create a different
>> value between the two processes (as expected).
>>
>> However the frozenset hash is the same in both cases, as is the hash of the
>> tuples - suggesting that the vulnerability resolved in Python 3.3 wasn't
>> resolved across all potentially hashable values. I even used different
>> large numbers to ensure that the integers weren't being interned.
>>
>> I can imagine that frozensets aren't used frequently as hash keys - but I
>> would think that tuples are regularly used. Given that their hashes are not
>> salted, does the vulnerability still exist in some form?
>>
>> --
>> Anthony Flury
>> email : *anthony.fl...@btinternet.com*
>> Twitter : *@TonyFlury *




--
--
Anthony Flury
email : *anthony.fl...@btinternet.com*
Twitter : *@TonyFlury *



Re: [Python-Dev] Hashes in Python3.5 for tuples and frozensets

2018-05-17 Thread Victor Stinner
Hi,

String hash is randomized, but not the integer hash:

$ python3.5 -c 'print(hash("abc"))'
-8844814677999896014
$ python3.5 -c 'print(hash("abc"))'
-7757160699952389646

$ python3.5 -c 'print(hash(1))'
1
$ python3.5 -c 'print(hash(1))'
1

frozenset hash is combined from values of the set. So it's only
randomized if the values' hashes are randomized.
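
The same observation can be reproduced without opening two terminals. This sketch starts child interpreters with different PYTHONHASHSEED values and compares the results; `hash_in_child` is a hypothetical helper, and the seeds "1" and "2" are arbitrary choices.

```python
import os
import subprocess
import sys


def hash_in_child(expr: str, seed: str) -> int:
    """Evaluate hash(<expr>) in a fresh interpreter using the given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({expr}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


expressions = ['"abc"', "1", "(1, 9)", "frozenset({1, 9})", 'frozenset({"a", "b"})']

for expr in expressions:
    first = hash_in_child(expr, "1")
    second = hash_in_child(expr, "2")
    verdict = "same" if first == second else "differs"
    print(f"{expr:24} {verdict}")
```

Containers of integers should come out the same across seeds, while anything whose hash depends on a str should differ.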

The denial of service is more likely to occur with strings as keys,
than with integers.

See the following link for more information:
http://python-security.readthedocs.io/vuln/cve-2012-1150_hash_dos.html

Victor

2018-05-16 17:48 GMT-04:00 Anthony Flury via Python-Dev :
> This may be known but I wanted to ask this esteemed body first.
>
> I understand that from Python3.3 there was a security fix to ensure that
> different python processes would generate different hash value for the same
> input - to prevent denial of service based on crafted hash conflicts.
>
> I opened two python REPLs on my Linux 64bit PC and did the following
>
> Terminal 1:
>
> >>> hash('Hello World')
>-1010252950208276719
>
> >>> hash( frozenset({1,9}) )
>  -7625378979602737914
> >>> hash(frozenset({300,301}))
>-8571255922896611313
>
> >>> hash((1,9))
>3713081631926832981
> >>> hash((875,932))
>3712694086932196356
>
>
>
> Terminal 2:
>
> >>> hash('Hello World')
>-8267767374510285039
>
> >>> hash( frozenset({1,9}) )
>  -7625378979602737914
> >>> hash(frozenset({300,301}))
>-8571255922896611313
>
> >>> hash((1,9))
>3713081631926832981
> >>> hash((875,932))
>3712694086932196356
>
> As can be seen - taking a hash of a string does indeed create a different
> value between the two processes (as expected).
>
> However the frozenset hash is the same in both cases, as is the hash of the
> tuples - suggesting that the vulnerability resolved in Python 3.3 wasn't
> resolved across all potentially hashable values. I even used different
> large numbers to ensure that the integers weren't being interned.
>
> I can imagine that frozensets aren't used frequently as hash keys - but I
> would think that tuples are regularly used. Given that their hashes are not
> salted, does the vulnerability still exist in some form?
>
> --
> --
> Anthony Flury
> email : *anthony.fl...@btinternet.com*
> Twitter : *@TonyFlury *
>