[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Karthikeyan
On Fri, Oct 16, 2020, 12:45 AM Serhiy Storchaka  wrote:

> 14.10.20 20:56, Brett Cannon wrote:
> > I think if the project is not maintained externally and thus synced into
> > the stdlib we can drop the attributes.
>
> I have found only one exception. decimal.__version__ refers to the
> version of the external specification, which has not changed since 2009.
> I think it should be kept, although it might be better to use a different
> name for it (like "spec_version").
>
> I do not know of any current projects maintained externally and
> synced into the stdlib. simplejson and ElementTree are too different now
> from the stdlib versions. Some features flow in both directions, but
> selectively on a case-by-case basis, not as a full sync. External argparse
> is outdated now.
>
I guess zipp, which is maintained externally, has code adopted into
zipfile.Path regularly: https://github.com/jaraco/zipp

__version__ was removed from mock and it broke a package in Fedora. The PR
has a discussion and also links to the bpo issue to remove __version__ from all
of the stdlib: https://github.com/python/cpython/pull/17977

I am also in favor of removing it, since it causes confusion when the package
is not maintained externally and synced into the stdlib.

Thanks

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3NY5JIFUP5674Q3FR2DOMLXGBE6D4XJD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Neil Schemenauer
On 2020-10-15, Serhiy Storchaka wrote:
> [..] it seems that there are no usages of the __version__ variable in
> the top 4K PyPI packages.

Given that, I think it's fine to remove them.  If we find broken
code during the alpha release we still have a chance to revert.
However, it would seem quite unlikely there would be a problem.
Thanks to Batuhan for the useful search tool.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/7BX4J3LSTFFGQ4GCB5EGN552ZLVOBCSR/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Serhiy Storchaka
14.10.20 20:56, Brett Cannon wrote:
> I think if the project is not maintained externally and thus synced into
> the stdlib we can drop the attributes.

I have found only one exception. decimal.__version__ refers to the
version of the external specification, which has not changed since 2009.
I think it should be kept, although it might be better to use a different
name for it (like "spec_version").

I do not know of any current projects maintained externally and
synced into the stdlib. simplejson and ElementTree are too different now
from the stdlib versions. Some features flow in both directions, but
selectively on a case-by-case basis, not as a full sync. External argparse
is outdated now.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/MIEMWWC5W2WKV25WTARXACQOIUBUUSLS/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Serhiy Storchaka
15.10.20 14:59, Victor Stinner wrote:
> If the __version__ variable is used, I suggest starting with a
> deprecation period using a module __getattr__(): emit
> DeprecationWarning, and only remove these variables in 2 Python
> releases (PEP 387).

This is a good idea, I thought about it. But it seems that there are no
usages of the __version__ variable in the top 4K PyPI packages.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/LBHVJIXGG3U5FPOYYOTS6AG3KPSLHBER/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Serhiy Storchaka
14.10.20 23:25, Batuhan Taskaya wrote:
> I've indexed a vast majority of the files from the top 4K PyPI packages into
> this system, and here are the results for __version__ usage on
> argparse, cgi, csv, decimal, imaplib, ipaddress, optparse, pickle,
> platform, re, smtpd, socketserver, tabnanny (result of a quick grep)
> 
> 
> rawdata/clean/argparse/setup.py
> 
> |argparse.__version__|

If it refers to the third-party argparse module, which uses
argparse.__version__ in its setup.py, it is __version__ of that
third-party module, not the one from the stdlib.

> rawdata/pypi/junitparser-1.4.1/bin/junitparser
> 
> |argparse.__version__|

argparse.__version__ is used for displaying the version of the
junitparser script. Of course the version of argparse (1.1 in the
stdlib) does not have any relation with the version of junitparser
(currently 1.4.1), so this is purely a misuse.
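
A minimal sketch of how a script can report its own version rather than argparse.__version__, using the "version" action that argparse provides. The program name "demo-tool" and the SCRIPT_VERSION constant are hypothetical and are not taken from junitparser.

```python
import argparse

# Hypothetical version of the script itself; it is unrelated to the
# version of the argparse module that parses the command line.
SCRIPT_VERSION = "1.4.1"


def build_parser():
    """Return a parser whose --version flag prints the script's version."""
    parser = argparse.ArgumentParser(prog="demo-tool")
    parser.add_argument(
        "--version",
        action="version",
        version=f"%(prog)s {SCRIPT_VERSION}",
    )
    return parser
```

Calling build_parser().parse_args(["--version"]) prints the program name and the script's version, then exits.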


> rawdata/pypi/interpret_community-0.15.1/interpret_community/mlflow/mlflow.py
> 
> 
> |pickle.__version__|
> 
> The pickle in the last example looks like a result of import cloudpickle
> as pickle, so we are safe to eliminate that.

So it seems that there is only one usage of __version__ from the stdlib
modules, and that one is a bug. Reported.

It seems pretty safe to just remove __version__ variables.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GRJTXCRUDEUQZE2IYSBKFJYKGS7AXZXA/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Serhiy Storchaka
14.10.20 23:25, Batuhan Taskaya wrote:
> I've indexed a vast majority of the files from the top 4K PyPI packages into
> this system, and here are the results for __version__ usage on
> argparse, cgi, csv, decimal, imaplib, ipaddress, optparse, pickle,
> platform, re, smtpd, socketserver, tabnanny (result of a quick grep)
> 
> 
> rawdata/clean/argparse/setup.py
> 
> |argparse.__version__|
> 
> rawdata/pypi/junitparser-1.4.1/bin/junitparser
> 
> |argparse.__version__|
> 
> rawdata/pypi/interpret_community-0.15.1/interpret_community/mlflow/mlflow.py

As for argparse, it was perhaps the last third-party module added to the
stdlib without a name change or significant rewriting. It was added in
Python 2.7/3.2, and older Python versions have not been maintained for a long
time. There is a third-party argparse module on PyPI for older Python
versions; its version (1.4) is higher than the version in the stdlib
(1.1), but I think that the stdlib version has more features. The
version of the module is just not informative.
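
Since the module version says nothing about which features are present, code that needs a particular feature can probe for it directly or check the interpreter version. A minimal sketch, assuming the feature of interest is argparse.BooleanOptionalAction (added to the stdlib in Python 3.9); the two helper names are hypothetical.

```python
import argparse
import sys


def has_boolean_optional_action():
    """Feature detection: ask the module whether the attribute exists."""
    return hasattr(argparse, "BooleanOptionalAction")


def runs_on_at_least(major, minor):
    """Interpreter version check based on sys.version_info."""
    return sys.version_info >= (major, minor)
```

Either check works where argparse.__version__ cannot: the attribute has stayed the same across releases that differ in features.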

> |pickle.__version__|
> 
> The pickle in the last example looks like a result of import cloudpickle
> as pickle, so we are safe to eliminate that.
> 
> Here is the query if you want to try by yourself on different
> parameters:
> https://search.tree.science/?query=Attribute%28Name%28%27argparse%27%7C%27cgi%27%7C%27csv%27%7C%27decimal%27%7C%27imaplib%27%7C%27ipaddress%27%7C%27optparse%27%7C%27platform%27%7C%27pickle%27%7C%27re%27%7C%27smtpd%27%7C%27socketserver%27%7C%27tabnanny%27%29%2C+%22__version__%22%29

Thank you Batuhan. It will help to decide what to do with the __version__
attributes: keep them, upgrade to sys.version or sys.version_info, or remove.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/XAHSO7AUX5J7MDXFDXUAHYAAWF7WH4JZ/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Ethan Furman

On 10/15/20 5:45 AM, Erlend Aasland wrote:

> FYI, sqlite3 has a /pysqlite/ "version" attribute iso. "__version__", stemming from its days outside of stdlib. It has
> held the value "2.6.0" since commit f9cee22, 2010-03-05.

The proposal is to remove dunder version and friends, not plain version and friends.  Or did you mean it should also be
removed?


--
~Ethan~


* iso.  => instead of
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/W2VCBOOS7NSUMEDBO7UOB22Z3CBVZIV5/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Paul Moore
On Thu, 15 Oct 2020 at 16:31, Erlend Aasland  wrote:
>
> Actually both sqlite3.version and sqlite3.version_info, the former as a 
> string, the latter as a tuple.

However, sqlite3.sqlite_version and sqlite3.sqlite_version_info should
definitely be retained, as they give the version of the sqlite library
Python is using.
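
A short sketch of reading the attributes that describe the underlying SQLite library (as opposed to the old pysqlite version). The helper name sqlite_library_version is hypothetical; the two sqlite3 attributes are the ones named above.

```python
import sqlite3


def sqlite_library_version():
    """Return the SQLite library version as (string, tuple of ints)."""
    return sqlite3.sqlite_version, sqlite3.sqlite_version_info
```

The tuple form is convenient for comparisons such as sqlite3.sqlite_version_info >= (3, 35).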

(In general, I'm ambivalent about removing version attributes - I
agree that they are basically useless, but there's little gain from
removing them, and there's the risk of breaking code for essentially
no reason. If we're looking to tidy things up, I'm fairly sure there
are better candidates than this...)

Paul
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/6PUP4FORLNX4TAEET42C6YN4SGR22SSU/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Erlend Aasland
Actually both sqlite3.version and sqlite3.version_info, the former as a string, 
the latter as a tuple.

E

On 15 Oct 2020, at 14:45, Erlend Aasland <erlen...@innova.no> wrote:

> FYI, sqlite3 has a pysqlite "version" attribute iso. "__version__", stemming
> from its days outside of stdlib. It has held the value "2.6.0" since commit
> f9cee22, 2010-03-05.
>
> Erlend E. Aasland
>
> On 14 Oct 2020, at 15:53, Serhiy Storchaka <storch...@gmail.com> wrote:
>
> > Some modules in the stdlib have a __version__ attribute. It
> > makes sense if the module is developed independently from Python, but
> > after inclusion in the stdlib it no longer has separate releases which
> > should be identified by version. New changes usually go into the module
> > without changing the value of __version__. Different versions of the
> > module for different Python versions can have different features but the
> > same __version__.
> >
> > I propose to remove __version__ in all stdlib modules. Are there any
> > exceptions?
> >
> > Also, what do you think about other meta attributes like __author__,
> > __credits__, __email__, __copyright__, __about__, __date__?


___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/4BSUFZSQY4OSZGGCEYP67NG6IWNSP2EA/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Erlend Aasland
FYI, sqlite3 has a pysqlite "version" attribute iso. "__version__", stemming
from its days outside of stdlib. It has held the value "2.6.0" since commit
f9cee22, 2010-03-05.


Erlend E. Aasland

On 14 Oct 2020, at 15:53, Serhiy Storchaka <storch...@gmail.com> wrote:

> Some modules in the stdlib have a __version__ attribute. It
> makes sense if the module is developed independently from Python, but
> after inclusion in the stdlib it no longer has separate releases which
> should be identified by version. New changes usually go into the module
> without changing the value of __version__. Different versions of the
> module for different Python versions can have different features but the
> same __version__.
>
> I propose to remove __version__ in all stdlib modules. Are there any
> exceptions?
>
> Also, what do you think about other meta attributes like __author__,
> __credits__, __email__, __copyright__, __about__, __date__?

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/DAB2TCPTLLH7ZUU2B2VRKQYN7SWUT6RN/


[Python-Dev] Re: [python-committers] Re: Performance benchmarks for 3.9

2020-10-15 Thread M.-A. Lemburg
On 15.10.2020 15:50, Victor Stinner wrote:
> On Wed, 14 Oct 2020 at 17:59, Antoine Pitrou wrote:
>> unpack-sequence is a micro-benchmark. (...)
> 
> I suggest removing it.
> 
> I removed other similar micro-benchmarks from pyperformance in the
> past, since they can easily be misunderstood and misleading. For
> curious people, I'm keeping a collection of Python micro-benchmarks
> at:
> https://github.com/vstinner/pymicrobench

As mentioned, those micro-benchmarks are more helpful in identifying
performance regressions than macro-benchmarks, esp. when you find that
a macro-benchmark is showing issues.

When you find that a macro-benchmark isn't performing well anymore,
it's very difficult to understand the cause, and micro-benchmarks
help identify the reasons.

So instead of removing them, I'd suggest adding them back to
the suite or having them run in a separate suite, specifically
called "micro benchmarks", to address your concern about people
misinterpreting them.

-- 
Marc-Andre Lemburg
eGenix.com

___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/F6TXJOXW2HNA6ZB6PNIHCVHQLFAO5JWD/


[Python-Dev] Re: Changing Python's string search algorithms

2020-10-15 Thread Tim Peters
[Guido]
> I am not able to dream up any hard cases -- like other posters,
> my own use of substring search is usually looking for a short
> string in a relatively short piece of text. I doubt even the current
> optimizations matter to my uses.

I should  have responded to this part differently.   What I said was
fine ;-) , but it's a mistake to focus exclusively on pathologies
here.  What Fredrik did was with the aim of significantly speeding
utterly ordinary searches.  For example, search for "xyz" in
"abcdefgh...xyz".

Brute force starts comparing at the first location:

abcdefgh...
xyz

The current code compares "c" with "z" first.  They don't match.  Now
what?  It looks at the _next_ character in the haystack, "d".  Thanks
to preprocessing the needle, it knows that "d" appears nowhere in the
needle (actually, the code is so keen on "tiny constant extra space"
that preprocessing only saves enough info to get a _probabilistic_
guess about whether "d" is in the needle, but one that's always
correct when "not in the needle" is its guess).

For that reason, there's no possible match when starting the attempt
at _any_ position that leaves "d" overlapping with the needle.  So it
can immediately skip even trying starting at text indices 1, 2, or 3.
It can jump immediately to try next at index 4:

abcdefgh...
0123xyz

Then the same kind of thing, seeing that "g" and "z" don't match, and
that "h" isn't in the needle.  And so on, jumping 4 positions at a
time until finally hitting "xyz" at the end of the haystack.
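
The skip trick described above can be sketched in Python. This is a simplified illustration, not CPython's implementation: it uses an exact set of needle characters where the real code keeps a small probabilistic filter, and it compares the whole window with a slice instead of checking the last character first. The function name find_with_skip is hypothetical.

```python
def find_with_skip(haystack, needle):
    """Return the lowest index of needle in haystack, or -1 if absent."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    needle_chars = set(needle)  # preprocessing: which characters occur at all
    i = 0
    while i <= n - m:
        if haystack[i:i + m] == needle:
            return i
        after = i + m  # index of the character just past the current window
        if after < n and haystack[after] not in needle_chars:
            # Every alignment starting at i+1 .. i+m would overlap that
            # character, so none of them can match: jump past it.
            i = after + 1
        else:
            i += 1
    return -1
```

For the example in the text, searching for "xyz" in the lowercase alphabet tries index 0, sees that "d" is not in the needle, and resumes at index 4, as described.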

The new code gets similar kinds of benefits in _many_ similarly
ordinary searches too, but they're harder to explain because they're
not based on intuitive tricks explicitly designed to speed ordinary
cases.  They're more happy consequences of making pathological cases
impossible.

From randomized tests so far, it's already clear that the new code is
finding more of this nature to exploit in non-pathological cases than
the current code.  Although that's partly (but only partly) a
consequence of Dennis augmenting the new algorithm with a more
powerful version of the specific trick explained above (which is an
extreme simplification of Daniel Sunday's algorithm, which was also
aimed at speeding ordinary searches).
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/Y3DFFBHNMHDGRE2GIEMH7XLY5YR6BMKR/


[Python-Dev] Re: Changing Python's string search algorithms

2020-10-15 Thread Tim Peters
[Dennis Sweeney ]
> Here's my attempt at some heuristic motivation:

Thanks, Dennis!  It helps.  One gloss:

> 
> The key insight though is that the worst strings are still
> "periodic enough", and if we have two different patterns going on,
> then we can intentionally split them apart.

The amazing (to me) thing is that splitting into JUST two parts is
always enough to guarantee linearity.  What if there are a million
different patterns going on?  Doesn't matter!  I assume this
remarkable outcome is a consequence of the Critical Factorization
Theorem.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/IKPKLR43U522PC55JB7GQZMQSGJHNCKF/


[Python-Dev] Re: [python-committers] Re: Performance benchmarks for 3.9

2020-10-15 Thread Victor Stinner
On Wed, 14 Oct 2020 at 17:59, Antoine Pitrou wrote:
> unpack-sequence is a micro-benchmark. (...)

I suggest removing it.

I removed other similar micro-benchmarks from pyperformance in the
past, since they can easily be misunderstood and misleading. For
curious people, I'm keeping a collection of Python micro-benchmarks
at:
https://github.com/vstinner/pymicrobench

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VX6DPYNQUSZSMFYRNMGTKBJIGGX6O7UE/


[Python-Dev] Re: Remove module's __version__ attributes in the stdlib

2020-10-15 Thread Victor Stinner
If the __version__ variable is used, I suggest starting with a
deprecation period using a module __getattr__(): emit
DeprecationWarning, and only remove these variables in 2 Python
releases (PEP 387).

sys.version or sys.version_info can be used instead of argparse.__version__, no?
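
A minimal sketch of the deprecation approach, assuming a module-level __getattr__() as described in PEP 562. To keep the example self-contained it builds a throwaway module object; in a real stdlib module the __getattr__ function would simply be defined at module level. The names make_module_with_deprecated_version and "demo_mod" are hypothetical.

```python
import types
import warnings


def make_module_with_deprecated_version(name, version):
    """Build a module whose __version__ still works but warns on access."""
    mod = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        if attr == "__version__":
            warnings.warn(
                f"{name}.__version__ is deprecated; "
                "use sys.version_info instead",
                DeprecationWarning,
                stacklevel=2,
            )
            return version
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    # Stored in the module's namespace, so the import system's module
    # attribute lookup finds it as the fallback hook.
    mod.__getattr__ = __getattr__
    return mod
```

Reading demo.__version__ on such a module returns the old value and emits a DeprecationWarning, so existing callers keep working during the deprecation period.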

Victor

On Wed, 14 Oct 2020 at 22:30, Batuhan Taskaya wrote:
>
> I've indexed a vast majority of the files from the top 4K PyPI packages into this
> system, and here are the results for __version__ usage on argparse, cgi,
> csv, decimal, imaplib, ipaddress, optparse, pickle, platform, re, smtpd,
> socketserver, tabnanny (result of a quick grep)
>
>
> rawdata/clean/argparse/setup.py
>
> argparse.__version__
>
> rawdata/pypi/junitparser-1.4.1/bin/junitparser
>
> argparse.__version__
>
> rawdata/pypi/interpret_community-0.15.1/interpret_community/mlflow/mlflow.py
>
> pickle.__version__
>
> The pickle in the last example looks like a result of import cloudpickle as 
> pickle, so we are safe to eliminate that.
>
> Here is the query if you want to try by yourself on different parameters: 
> https://search.tree.science/?query=Attribute%28Name%28%27argparse%27%7C%27cgi%27%7C%27csv%27%7C%27decimal%27%7C%27imaplib%27%7C%27ipaddress%27%7C%27optparse%27%7C%27platform%27%7C%27pickle%27%7C%27re%27%7C%27smtpd%27%7C%27socketserver%27%7C%27tabnanny%27%29%2C+%22__version__%22%29
> On 14.10.2020 21:23, Neil Schemenauer wrote:
>
> On 2020-10-14, Serhiy Storchaka wrote:
>
> I propose to remove __version__ in all stdlib modules. Are there any
> exceptions?
>
> I agree that these kinds of meta attributes are not useful and it
> would be nice to clean them up.  However, IMHO, maybe the cleanup is
> not worth breaking Python programs.  We could remove them from the
> documentation, add comments (or deprecation warnings) telling people
> not to use them.
>
> I think it would be okay to remove them if we could show that the
> top N PyPI packages don't use these attributes or at least very few
> of them do.  As someone who regularly tests alpha releases, I've
> found it quite painful to do since nearly every release is breaking
> 3rd party packages that my code depends on.  I feel we should try
> hard to avoid breaking things unless there is a strong reason and
> there is no easy way to provide backwards compatibility.



-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/WTTUGC7RQZLV6XQPVQAV4RVELNLAIATM/


[Python-Dev] Re: Changing Python's string search algorithms

2020-10-15 Thread Dennis Sweeney
Here's my attempt at some heuristic motivation:

Try to construct a needle that will perform as poorly as possible when
using the naive two-nested-for-loops algorithm. You'll find that if
there isn't some sort of vague periodicity in your needle, then you
won't ever get *that* unlucky; each particular alignment will fail
early, and if it doesn't then some future alignment would be pigeonholed 
to fail early.
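
To make the intuition concrete, here is a sketch of the naive two-nested-for-loops search with a comparison counter added. The function name naive_find_counting and the sample inputs are hypothetical.

```python
def naive_find_counting(haystack, needle):
    """Naive search; returns (index or -1, number of character comparisons)."""
    comparisons = 0
    for start in range(len(haystack) - len(needle) + 1):
        for offset in range(len(needle)):
            comparisons += 1
            if haystack[start + offset] != needle[offset]:
                break  # this alignment failed; try the next one
        else:
            return start, comparisons  # inner loop finished: full match
    return -1, comparisons


# A needle with periodic structure makes every alignment fail late ...
print(naive_find_counting("a" * 100, "a" * 9 + "b"))
# ... while a needle that mismatches on its first character fails at once.
print(naive_find_counting("z" * 100, "abcdefghij"))
```

With these inputs the periodic needle costs ten comparisons at each of the 91 alignments, while the other costs one per alignment.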

So Crochemore and Perrin's algorithm explicitly handles this "worst case"
of periodic strings. Once we've identified in the haystack some period
from the needle, there's no need to re-match it. We can keep a memory
of how many periods we currently remember matching up, and never re-match
them. This is what gives the O(n) behavior for periodic strings.

But wait! There are some bad needles that aren't quite periodic.
For instance:

>>> 'ABCABCAABCABC' in 'ABC'*1_000_000

The key insight though is that the worst strings are still
"periodic enough", and if we have two different patterns going on,
then we can intentionally split them apart. For example,
`"xyxyxyxyabcabc" --> "xyxyxyxy" + "abcabc"`. I believe the goal is to
line it up so that if the right half matches but not the left then we
can be sure to skip somewhat far ahead. This might not correspond
exactly with splitting up two patterns. This is glossing over some
details that I'm admittedly still a little hazy on as well, but
hopefully that gives at least a nudge of intuition.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/MXMS5XIV6WJFFRHTH7TBHAO3TC4QIHBZ/


[Python-Dev] Re: PEP 638: Syntactic macros

2020-10-15 Thread Dima Tisnek
My 2c as a Python user (mostly) and someone who dabbled in ES2020:

The shouting syntax! does not sit well with me.
The $hygienic is also cumbersome.

By contrast, babel macros:
* look like regular code, without special syntax: existing tooling
works, less mental strain
* have access to the call site environment, so not strictly hygienic(?):
allow for greater expressive power

I think the two points above really helped the adoption of babel macros in the
js community and should, at the very least, be seriously considered by
the py community.

Cheers,
d.

On Sat, 26 Sep 2020 at 21:16, Mark Shannon  wrote:
>
> Hi everyone,
>
> I've submitted my PEP on syntactic macros as PEP 638.
> https://www.python.org/dev/peps/pep-0638/
>
> All comments and suggestions are welcome.
>
> Cheers,
> Mark
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/VEC7VWY5TJJGBXWFQUX3XO43SQAZ7FMR/


[Python-Dev] Re: Changing Python's string search algorithms

2020-10-15 Thread Greg Ewing

On 15/10/20 1:45 pm, Chris Angelico wrote:

> So it'd
> be heuristics in the core language that choose a good default for most
> situations, and then a str method that returns a preprocessed needle.

Or maybe cache the results of the preprocessing?
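
One way to picture the caching idea at the Python level, as a rough sketch only: memoize a preprocessing function keyed on the needle. Here preprocess_needle is a hypothetical stand-in that just records which characters occur in the needle; the preprocessing in the proposed algorithm is more involved and lives in C.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def preprocess_needle(needle):
    """Hypothetical preprocessing step; the result is cached per needle."""
    return frozenset(needle)
```

Repeated searches for the same needle would then reuse the cached result instead of preprocessing again.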

--
Greg
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/EH26IH753UX3PSURGN5GUKEKX6QDANEZ/