Re: Proposed new syntax

2017-08-14 Thread Jussi Piitulainen
Paul Rubin writes:

>> Jussi Piitulainen writes:

>>> But what is "set comprehension" in French, German, or Finnish?
>
> The idea comes from set theory: for some reason English Wikipedia
> doesn't have Finnish cross-wiki links for most of the relevant terms,
> but Google translates "axiom of specification" as "Määritelmän axiom".
> Maybe you can find something from there.

Most of the relevant pages are missing in the Finnish Wikipedia.

However, I did find there a nice name for this axiom schema based on yet
another of its many names: "erotteluskeema" ("schema of separation").
<https://fi.wikipedia.org/wiki/Joukko-oppi> I like that.

Marko's Finnish suggestion, "koonta", is related to the relevant sense
of "comprehension", while his "gleaning" and "culling" relate to
"separation".

Python's "comprehensions" do both "separation" and "replacement"
(another axiom schema), aka filtering and mapping.
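
A small sketch of the two roles, using made-up data that is not from the thread: the "if" clause does the separating, the leading expression does the replacing, and one comprehension can do both.

```python
numbers = [0, 1, 2, 999, 3, 4]   # hypothetical sample data

# "Separation" (filtering): keep only the elements that satisfy a condition.
small = [x for x in numbers if x < 5]

# "Replacement" (mapping): substitute an image for each element.
bumped = [x + 1 for x in numbers]

# A set comprehension doing both at once.
both = {x + 1 for x in numbers if x < 5}

print(small, bumped, both)
```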
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Proposed new syntax

2017-08-12 Thread Jussi Piitulainen
Marko Rauhamaa writes:

> Marko Rauhamaa <ma...@pacujo.net>:
>
>> Jussi Piitulainen <jussi.piitulai...@helsinki.fi>:
>>
>>> But what is "set comprehension" in French, German, or Finnish?
>>
>> [...]
>>
>> Myself, I might propose the word "koonta" as a simple Finnish
>> translation for "comprehension".
>
> And maybe "culling" or "gleaning" could work in English.

Nice.


Re: Proposed new syntax

2017-08-12 Thread Jussi Piitulainen
Marko Rauhamaa writes:

> Jussi Piitulainen writes:
>
>> Rustom Mody writes:
>>> [ My conjecture: The word ‘comprehension’ used this way in English is
>>> meaningless and is probably an infelicitous translation of something
>>> which makes sense in German]
>>
>> From a Latin word for "taking together", through Middle French,
>
> Metaphors galore:
>
> English: understand < stand under something
> French:  comprendre < take something in
> German:  verstehen  < stand in front of something
> Finnish: ymmärtää   < surround something
>
> all mean the same thing.

English also has "comprehend" (in English it seems to me opaque).

Finnish also has "käsittää" 'understand' < 'get a hold of', from "käsi"
'hand' (at least it looks to me like it might be so derived).

But what is "set comprehension" in French, German, or Finnish?


Re: Proposed new syntax

2017-08-12 Thread Jussi Piitulainen
Rustom Mody writes:

> [ My conjecture: The word ‘comprehension’ used this way in English is
> meaningless and is probably an infelicitous translation of something
> which makes sense in German]

From a Latin word for "taking together", through Middle French,
according to this source, which has further details:

https://en.wiktionary.org/wiki/comprehension
https://en.wiktionary.org/wiki/comprehensio#Latin
https://en.wiktionary.org/wiki/comprehendo#Latin


Re: Proposed new syntax

2017-08-10 Thread Jussi Piitulainen
MRAB writes:

> On 2017-08-10 15:28, Steve D'Aprano wrote:

>> Every few years, the following syntax comes up for discussion, with
>> some people saying it isn't obvious what it would do, and others
>> disagreeing and saying that it is obvious. So I thought I'd do an
>> informal survey.
>>
>> What would you expect this syntax to return?
>>
>> [x + 1 for x in (0, 1, 2, 999, 3, 4) while x < 5]
>>
>>
>> For comparison, what would you expect this to return? (Without
>> actually trying it, thank you.)
>>
>> [x + 1 for x in (0, 1, 2, 999, 3, 4) if x < 5]
>>
>>
>>
>> How about these?
>>
>> [x + y for x in (0, 1, 2, 999, 3, 4) while x < 5 for y in (100, 200)]
>>
>> [x + y for x in (0, 1, 2, 999, 3, 4) if x < 5 for y in (100, 200)]
>>
>>
>>
>> Thanks for your comments!
>>
> There's a subtlety there.
>
> Initially I would've thought that the 'while' would terminate the
> iteration of the preceding 'for', but then when I thought about how I
> would expand it into multiple lines, I realised that the 'while' would
> have to be expanded to "if x < 5: break", not an inner 'while' loop.

I wonder how such expansions would work in general.

[x + y for x in (0, 1, 2, 999, 3, 4) for y in (100, 200) while x < 5]
[x + y for x in (0, 1, 2, 999, 3, 4) for y in (100, 200) while x < y]
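
One possible hand expansion, only a sketch: it assumes MRAB's reading, where "while cond" turns into "if not cond: break" inside the innermost loop. The helper name expand is made up for the illustration. Under that assumption the break leaves only the loop over y, so the outer loop carries on past 999 and both expressions come out the same.

```python
def expand(cond):
    """Hand expansion of [x + y for x in xs for y in ys while cond(x, y)],
    assuming 'while' becomes 'if not cond: break' in the innermost loop."""
    result = []
    for x in (0, 1, 2, 999, 3, 4):
        for y in (100, 200):
            if not cond(x, y):
                break              # leaves only the loop over y
            result.append(x + y)
    return result

first = expand(lambda x, y: x < 5)    # ... while x < 5
second = expand(lambda x, y: x < y)   # ... while x < y
print(first)
print(second)
```

Other readings (for example, a break that ends the whole comprehension) would give different results, which is rather the point of the question.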


Re: Proposed new syntax

2017-08-10 Thread Jussi Piitulainen
Steve D'Aprano writes:

> Every few years, the following syntax comes up for discussion, with
> some people saying it isn't obvious what it would do, and others
> disagreeing and saying that it is obvious. So I thought I'd do an
> informal survey.
>
> What would you expect this syntax to return?
>
> [x + 1 for x in (0, 1, 2, 999, 3, 4) while x < 5]

[1, 2, 3]

> For comparison, what would you expect this to return? (Without
> actually trying it, thank you.)
>
> [x + 1 for x in (0, 1, 2, 999, 3, 4) if x < 5]

[1, 2, 3, 4, 5]

> How about these?
>
> [x + y for x in (0, 1, 2, 999, 3, 4) while x < 5 for y in (100, 200)]

[100, 200, 101, 201, 102, 202]
>
> [x + y for x in (0, 1, 2, 999, 3, 4) if x < 5 for y in (100, 200)]

[100, 200, 101, 201, 102, 202, 103, 203, 104, 204]
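
For comparison, the answers above can be reproduced in current Python. This is a sketch that assumes the "stop at the first failure" reading of the proposed while clause and spells it with itertools.takewhile; the name below5 is made up.

```python
from itertools import takewhile

xs = (0, 1, 2, 999, 3, 4)
below5 = lambda x: x < 5

# Assumed meaning of "for x in xs while x < 5": stop at the first failure.
a = [x + 1 for x in takewhile(below5, xs)]
# Existing syntax: skip failures, keep going.
b = [x + 1 for x in xs if x < 5]

c = [x + y for x in takewhile(below5, xs) for y in (100, 200)]
d = [x + y for x in xs if x < 5 for y in (100, 200)]

print(a, b, c, d, sep='\n')
```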


Re: Installing matplotlib on python3

2017-07-27 Thread Jussi Piitulainen
FS writes:

> I just installed matplotlib on debian and I tried to import it on
> python3. It cannot be found however it can be found on python 2.x. No
> surprise:
>   A 'find -name matplotlib' reveals:
> /usr/share/matplotlib
> /usr/lib/python2.7/dist-packages/matplotlib
>
> I am not sure how the apt-get elected to place matplotlib in the
> python2.7 directory but I want to "properly" install it so it can
> import under python3.
>
> There are probably commands from python3 to point its import to the
> 2.7 directory, but I expect that is just a workaround and I am
> uneasy about whether I have somehow installed a 2.7 compatible
> version only of matplotlib.
> Possibly there is some recommended way to re-install matplotlib for
> python3 sending it to the appropriate directories.
>
> Any advice here?

You may find it's a different package, one named python3-* instead of
python-*:

$ apt-cache search matplotlib
idl - Interactive Data Language IDL
python-matplotlib - Python based plotting system in a style similar to Matlab
python-matplotlib-data - Python based plotting system (data package)
python-matplotlib-dbg - Python based plotting system (debug extension)
python-matplotlib-doc - Python based plotting system (documentation package)
python-mpltoolkits.basemap - matplotlib toolkit to plot on map projections
python-mpltoolkits.basemap-data - matplotlib toolkit to plot on map projections (data package)
python-mpltoolkits.basemap-doc - matplotlib toolkit to plot on map projections (documentation)
python-mpmath - library for arbitrary-precision floating-point arithmetic
python-mpmath-doc - library for arbitrary-precision floating-point arithmetic - Documentation
python-scitools - Python library for scientific computing
python-wxmpl - Painless matplotlib embedding in wxPython
python3-matplotlib - Python based plotting system in a style similar to Matlab (Python 3)
python3-matplotlib-dbg - Python based plotting system (debug extension, Python 3)
python3-mpmath - library for arbitrary-precision floating-point arithmetic (Python3)
$
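
After installing the python3-matplotlib package from the listing, a quick check run under python3 shows which interpreter is in use and whether it can locate the module. This is a minimal sketch; the helper name is_importable is made up, and the result naturally depends on the machine.

```python
import importlib.util
import sys

def is_importable(name):
    """True if the running interpreter can locate a top-level module."""
    return importlib.util.find_spec(name) is not None

# Which Python is this, and does it see matplotlib?
print(sys.version_info[:2], sys.executable)
print("matplotlib importable:", is_importable("matplotlib"))
```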


Re: Regular expression

2017-07-26 Thread Jussi Piitulainen
Kunal Jamdade writes:

> There is a filename say:- 'first-324-True-rms-kjhg-Meterc639.html' .
>
> I want to extract the last 4 characters. I tried different regex. but
> i am not getting it right.
>
> Can anyone suggest me how should i proceed.?

os.path.splitext(name)  # most likely; also: os.path.split, os.path.join

name.rsplit('.', 1) # might do

name[-4:]   # "last 4 characters"

name.endswith('.html')  # is this what you really want?

This is not a regex job.
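
Spelled out on the file name from the question, the four suggestions give the following. This is a sketch; which one fits depends on what is really wanted.

```python
import os.path

name = 'first-324-True-rms-kjhg-Meterc639.html'

root, ext = os.path.splitext(name)   # ext keeps the dot
parts = name.rsplit('.', 1)          # split once, from the right
tail = name[-4:]                     # literally the last 4 characters
is_html = name.endswith('.html')     # a yes/no answer

print(root, ext)
print(parts)
print(tail, is_html)
```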


Re: About the implementation of del in Python 3

2017-07-06 Thread Jussi Piitulainen
Marko Rauhamaa writes:

> Jussi Piitulainen:
>
>> For me it's enough to know that it's the object itself that is passed
>> around as an argument, as a returned value, as a stored value, as a
>> value of a variable. This is the basic fact that lets me understand
>> the behaviour and performance of programs.
>
> That "definition" is very circular. You haven't yet defined what is
> "object itself". The word "self", in particular, looks like yet
> another synonym of "identity".

Yes, I regard the identity of an object as the most *basic* thing.

> Anyway, it would be nice to have an explicit statement in the language
> definition that says that passing an argument and returning a value
> preserve the identity.

Isn't there? I think it's at least very strongly implied.


Re: About the implementation of del in Python 3

2017-07-06 Thread Jussi Piitulainen
MRAB <pyt...@mrabarnett.plus.com> writes:

> On 2017-07-06 15:29, Jussi Piitulainen wrote:
>> Marko Rauhamaa writes:
>>
>>> While talking about addresses might or might not be constructive,
>>> let me just point out that there is no outwardly visible distinction
>>> between "address" or "identity".
>>
>> With a generational or otherwise compacting garbage collector there
>> would be. I believe that to be a valid implementation strategy.
>>
>> Or you are using "address" in some abstract sense so that the
>> "address" does not change when the internal representation of the
>> object is moved to another location.
>>
>>> Ignoring the word that is used to talk about object identity, it
>>> would be nice to have a precise formal definition for it. For
>>> example, I know that any sound implementation of Python would
>>> guarantee:
>>>
>>> >>> def f(a): return a
>>> ...
>>> >>> a = object()
>>> >>> a is f(a)
>>> True
>>>
>>> But how do I know it?
>>
>> For me it's enough to know that it's the object itself that is passed
>> around as an argument, as a returned value, as a stored value, as a
>> value of a variable. This is the basic fact that lets me understand
>> the behaviour and performance of programs.
>>
> Perhaps you should be thinking of it as passing around the end of a
> piece of string, the other end being tied to the object itself. :-)

I don't find that helpful, and I don't find myself in need of such help.
Most of the time that piece of string is (those pieces of string are)
just a distraction to me. They get in the way. So I *don't*.


Re: About the implementation of del in Python 3

2017-07-06 Thread Jussi Piitulainen
Marko Rauhamaa writes:

> While talking about addresses might or might not be constructive, let
> me just point out that there is no outwardly visible distinction
> between "address" or "identity".

With a generational or otherwise compacting garbage collector there
would be. I believe that to be a valid implementation strategy.

Or you are using "address" in some abstract sense so that the "address"
does not change when the internal representation of the object is moved
to another location.

> Ignoring the word that is used to talk about object identity, it would
> be nice to have a precise formal definition for it. For example, I
> know that any sound implementation of Python would guarantee:
>
> >>> def f(a): return a
> ...
> >>> a = object()
> >>> a is f(a)
> True
>
> But how do I know it?

For me it's enough to know that it's the object itself that is passed
around as an argument, as a returned value, as a stored value, as a
value of a variable. This is the basic fact that lets me understand the
behaviour and performance of programs.
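
A sketch of what that means in practice: the names stored and b are made up for the illustration, and "is" tests identity without any reference to addresses.

```python
def f(a):
    return a

a = object()
stored = [a, {'key': a}]   # stored values
b = a                      # value of another variable

checks = [
    a is f(a),               # passed as an argument and returned
    stored[0] is a,          # stored in a list
    stored[1]['key'] is a,   # stored in a dict inside the list
    b is a,                  # bound to a second name
]
print(checks)
```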


Re: About the implementation of del in Python 3

2017-07-06 Thread Jussi Piitulainen
Chris Angelico writes:

> On Thu, Jul 6, 2017 at 5:35 PM, Jussi Piitulainen
> <jussi.piitulai...@helsinki.fi> wrote:
>> Incidentally, let no one point out that ids are not memory addresses.
>> It says in the interactive help that they are (Python 3.4.0):
>>
>> Help on built-in function id in module builtins:
>>
>> id(...)
>> id(object) -> integer
>>
>> Return the identity of an object. This is guaranteed to be unique
>> among simultaneously existing objects. (Hint: it's the object's
>> memory address.)
>
> Sorry, not the case.
>
>
> Help on built-in function id in module builtins:
>
>>>> help(id)
> id(obj, /)
> Return the identity of an object.
>
> This is guaranteed to be unique among simultaneously existing objects.
> (CPython uses the object's memory address.)
>
>>>> help(id)
> Help on built-in function id in module __builtin__:
>
> id(...)
>
>>>>> help(id)
> Help on built-in function id in module __builtin__:
>
> id(...)
> Return the identity of an object: id(x) == id(y) if and only if x is y.
>
>
> The interactive help does not say that in any version newer than the
> 3.4 that you tested. The function does not return an address, it
> returns an identity.

Excellent. I'm happy to withdraw the prohibition.


Re: About the implementation of del in Python 3

2017-07-06 Thread Jussi Piitulainen
Dan Wissme writes:

> I thought that del L[i] would slide L[i+1:] one place to the left,
> filling the hole, but :
>
> >>> L
> [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
> >>> id(L)
> 4321967496
> >>> id(L[5])    # address of 50 ?
> 4297625504
> >>> del L[2]
> >>> id(L[4])    # new address of 50 ?
> 4297625504
> >>> id(L)
> 4321967496
>
> So the element 50 is still at the same memory location.
> What does del L[i] do exactly, and what is its complexity? O(1) or O(n)?

id identifies the object that is stored at that index, not the location.
Locations are not objects.

Consider [L[5], L[5]] where the same object is stored in two different
places. At the implementation level there is some kind of reference in
the internal representation of the list to the representation of the
object somewhere else in memory. At the language level, the object
simply is stored in two places, and that's nothing unusual. Storing or
fetching or passing or returning objects around does not make copies.
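
A sketch of the same experiment with fresh object() instances instead of small integers, so that nothing depends on how an implementation caches numbers. The names fifth, pair and so on are made up. As far as I know, CPython keeps a list as an array of references, so del L[i] shifts the references after position i (O(n) in the worst case) while the objects themselves stay where they are.

```python
L = [object() for _ in range(11)]
fifth = L[5]
list_id = id(L)

del L[2]                        # the later references move one slot left

same_object = L[4] is fifth     # the object formerly at index 5
same_list = id(L) == list_id    # the list object itself is unchanged
pair = [L[4], L[4]]             # one object stored in two places
shared = pair[0] is pair[1]

print(len(L), same_object, same_list, shared)
```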

Incidentally, let no one point out that ids are not memory addresses.
It says in the interactive help that they are (Python 3.4.0):

Help on built-in function id in module builtins:

id(...)
id(object) -> integer

Return the identity of an object. This is guaranteed to be unique
among simultaneously existing objects. (Hint: it's the object's
memory address.)


Re: How to write raw strings to Python

2017-07-05 Thread Jussi Piitulainen
Binary Boy writes:

> On Wed, 05 Jul 2017 20:37:38 +0300, Jussi Piitulainen wrote:
>> Sam Chats writes:
>> 
>> > On Wednesday, July 5, 2017 at 9:09:18 PM UTC+5:30, Grant Edwards wrote:
>> >> On 2017-07-05, Sam Chats <blahb...@blah.org> wrote:
>> >> 
>> >> > I want to write, say, 'hello\tworld' as-is to a file, but doing
>> >> > f.write('hello\tworld') makes the file look like:
>> >> [...]
>> >> > How can I fix this?
>> >> 
>> >> That depends on what you mean by "as-is".
>> >> 
>> >> Seriously.
>> >> 
>> >> Do you want the single quotes in the file?  Do you want the backslash
>> >> and 't' character in the file?
>> >> 
>> >> When you post a question like this it helps immensely to provide an
>> >> example of the output you desire.
>> >
>> > I would like to add the following couple of lines to a file:
>> >
>> > for i in range(5):
>> >     print('Hello\tWorld')
>> >
>> > Consider the leading whitespace to be a tab.
>> 
>> import sys
>> 
>> lines = r'''
>> for line in range(5):
>>     print('hello\tworld')
>> '''
>> 
>> print(lines.strip())
>> 
>> sys.stdout.write(lines.strip())
>> sys.stdout.write('\n')
>
> Thanks! But will this work if I already have a string through a string
> variable, rather than using it directly like you did (by declaring
> the lines variable)?  And, will this work while writing to files?

Yes, it will work the same. Writing does not interpret the contents of
the string. Try it - replace sys.stdout above with your file object.

If you see a different result in your actual program, your string may be
different than you think. Investigate that.
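
A sketch of that experiment with a real file: the file name out.txt is made up, and the temporary directory is only there to keep the example self-contained. What comes back is what went in, backslash and all.

```python
import os
import tempfile

text = r"print('hello\tworld')"   # a backslash and a 't', not a tab

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'out.txt')   # hypothetical file name
    with open(path, 'w') as f:
        f.write(text)                     # written as is, nothing interpreted
        f.write('\n')
    with open(path) as f:
        content = f.read()

print(content, end='')
```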


Re: How to write raw strings to Python

2017-07-05 Thread Jussi Piitulainen
Sam Chats writes:

> On Wednesday, July 5, 2017 at 9:09:18 PM UTC+5:30, Grant Edwards wrote:
>> On 2017-07-05, Sam Chats  wrote:
>> 
>> > I want to write, say, 'hello\tworld' as-is to a file, but doing
>> > f.write('hello\tworld') makes the file look like:
>> [...]
>> > How can I fix this?
>> 
>> That depends on what you mean by "as-is".
>> 
>> Seriously.
>> 
>> Do you want the single quotes in the file?  Do you want the backslash
>> and 't' character in the file?
>> 
>> When you post a question like this it helps immensely to provide an
>> example of the output you desire.
>
> I would like to add the following couple of lines to a file:
>
> for i in range(5):
>     print('Hello\tWorld')
>
> Consider the leading whitespace to be a tab.

import sys

lines = r'''
for line in range(5):
    print('hello\tworld')
'''

print(lines.strip())

sys.stdout.write(lines.strip())
sys.stdout.write('\n')


Re: how to write add frequency in particular file by reading a csv file and then making a new file of multiple csv file by adding frequency

2017-06-23 Thread Jussi Piitulainen
Dennis Lee Bieber writes:

> On Thu, 22 Jun 2017 22:46:28 +0300, Jussi Piitulainen declaimed the
> following:
>
>>
>> A pair of methods, str.maketrans to make a translation table and then
>> .translate on every string, lets you do all that in one step:
>>
>> spacy = r'\/-.[]{}()'
>> tr = str.maketrans(dict.fromkeys(spacy, ' '))
>>
>> ...
>>
>> ln = ln.translate(tr)
>>
>> But those seem to be only in Python 3.
>>
>
>   Well -- I wasn't trying for "production ready" either; mostly
> focusing on the SQLite side of things.

I know, and that's a sound suggestion if the OP is ready for that.

I just like those character translation methods, and I didn't like it
when you first took the time to call a simple regex "line noise" and
then proceeded to post something that looked much more noisy yourself.

>> However, if the OP really is getting their input from a CSV file,
>> they shouldn't need methods like these. Because surely it's then
>> already an unambiguous list of words, to be read in with the csv
>> module? Or else it's not yet CSV at all after all? I think they need
>> to sit down with someone who can walk them through the whole
>> exercise.
>
>   The OP file extensions had CSV, but there was no sign of the csv
> module being used; worse, it looks like the write of the results file
> has no formatting -- it is the repr of a tuple of (word, count)!

Exactly. Too many things like that make me think they are not ready for
more advanced methods.

>   I'm going out on a limb and guessing the regex being used to
> find words is accepting anything separated by leading/trailing space
> containing a minimum of 3 and maximum of 15 characters in the set
> a..z. So could be missing first and last words on a line if they don't
> have the leading or trailing space, and ignoring "a", "an", "me",
> etc., along with "mrs." [due to .] In contrast, I didn't limit on
> length, and tried to split "look-alike" into "look" and "alike" (and
> given time, would have tried to accept "people's" as a possessive).

I'm not sure I like the splitting of look-alike (I'm not sure that I
like not splitting it either) but note that the regex does that for
free.

The \b in the original regex matches the empty string at a position
where there is a "word character" on only one side. It recognizes a
boundary at the beginning of a line and at whitespace, but also at all
the punctuation marks.
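
A sketch of that behaviour. The pattern below is only a stand-in reconstructed from the description in this thread (3 to 15 characters from a..z between \b markers); it is an assumption, not the OP's actual regex, and the sample line is made up.

```python
import re

# Assumption: a stand-in for the original regex, per the guess above.
guessed = re.compile(r'\b[a-z]{3,15}\b')

line = "a look-alike of mrs. smith"
found = guessed.findall(line)

# \b matches next to '-' and '.', so "look-alike" splits for free and
# "mrs" is found; "a" and "of" are too short for {3,15}.
print(found)
```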

You guess right about the length limits. I wouldn't use them, and then
there's no need for the boundary markers any more: my \w+ matches
maximal sequences of word characters (even in foreign languages like
Finnish or French, and even in upper case, also digits).

To also match "people's" and "didn't", use \w+'\w+, and to match with
and without the ' make the trailing part optional \w+('\w+)? except the
notation really does start to become noisy because one must prevent the
parentheses from "capturing" the group:

import re
wordy = re.compile(r'''  \w+  (?: ' \w+ )? ''', re.VERBOSE)
text = '''
Oliver N'Goma, dit Noli, né le 23 mars 1959 à Mayumba et mort le 7 juin
2010, est un chanteur et guitariste gabonais d'Afro-zouk.
'''

print(wordy.findall(text))

# ['Oliver', "N'Goma", 'dit', 'Noli', 'né', 'le', '23', 'mars', '1959',
# 'à', 'Mayumba', 'et', 'mort', 'le', '7', 'juin', '2010', 'est', 'un',
# 'chanteur', 'et', 'guitariste', 'gabonais', "d'Afro", 'zouk']

Not too bad?

But some punctuation really belongs in words. And some doesn't. And
everything becomes hard and every new heuristic turns out too strict or
too lenient and things that are not words at all may look like words, or
it may not be clear whether something is a word or is more than one word
or is less than a word or not like a word at all. Should one be amused?
Should one despair?

:)


Re: how to write add frequency in particular file by reading a csv file and then making a new file of multiple csv file by adding frequency

2017-06-22 Thread Jussi Piitulainen
Dennis Lee Bieber writes:

>   # lowercase all, open hyphenated and / separated words, parens,
>   # etc.
>   ln = ln.lower().replace("/", " ").replace("-", " ").replace(".", " ")
>   ln = ln.replace("\\", " ").replace("[", " ").replace("]", " ")
>   ln = ln.replace("{", " ").replace("}", " ")
>   wds = ln.replace("(", " ").replace(")", " ").replace("\t", " ").split()

A pair of methods, str.maketrans to make a translation table and then
.translate on every string, lets you do all that in one step:

spacy = r'\/-.[]{}()'
tr = str.maketrans(dict.fromkeys(spacy, ' '))

...

ln = ln.translate(tr)

But those seem to be only in Python 3.
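
The same idea as a complete Python 3 sketch, run on a made-up line:

```python
spacy = r'\/-.[]{}()'
# Map every character in spacy to a space.
tr = str.maketrans(dict.fromkeys(spacy, ' '))

ln = 'Look-alike (see a/b) [etc.]'   # hypothetical input line
wds = ln.lower().translate(tr).split()

print(wds)
```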

>   # for each word in the line
>   for wd in wds:
>       # strip off leading/trailing punctuation
>       wd = wd.strip("\\|'\";'[]{},<>?~!@#$%^&*_+= ")

You have already replaced several of those characters with spaces.

>       # do we still have a word? Skip any with still embedded
>       # punctuation
>       if wd and wd.isalpha():
>           # attempt to update the count for this word

But for quick and dirty work I might use a very simple regex, probably
literally this regex:

import re

wordy = re.compile(r'\w+')

...

for wd in wordy.findall(ln): # or .finditer, but I think it's newer
    ...
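
Filled in, the quick and dirty version might look like this. It is a sketch: collections.Counter does the counting, and the input lines are made up.

```python
import re
from collections import Counter

wordy = re.compile(r'\w+')

lines = ["The cat saw the other cat.", "Look-alike cats!"]   # made-up input

counts = Counter()
for ln in lines:
    # findall returns the maximal runs of word characters in the line
    counts.update(wordy.findall(ln.lower()))

print(counts.most_common(3))
```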


However, if the OP really is getting their input from a CSV file, they
shouldn't need methods like these. Because surely it's then already an
unambiguous list of words, to be read in with the csv module? Or else
it's not yet CSV at all after all? I think they need to sit down with
someone who can walk them through the whole exercise.
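
For the CSV side, a minimal sketch with the csv module. The counts are made up, and io.StringIO stands in for a real file, which would be opened with newline=''. Writing rows this way gives proper CSV rather than the repr of a tuple.

```python
import csv
import io

counts = {'cat': 2, 'the': 2, 'saw': 1}   # made-up data

out = io.StringIO()                       # stands in for a real file
writer = csv.writer(out)
writer.writerow(['word', 'count'])        # header row
writer.writerows(sorted(counts.items()))  # one (word, count) row each

# Reading it back yields lists of strings, one list per row.
rows = list(csv.reader(io.StringIO(out.getvalue())))
print(rows)
```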


Re: Standard lib version of something like enumerate() that takes a max count iteration parameter?

2017-06-14 Thread Jussi Piitulainen
Andre Müller writes:

> I'm a fan of infinite sequences. Try out itertools.islice.
> You should not underestimate this very important module.
>
> Please read also the documentation:
> https://docs.python.org/3.6/library/itertools.html
>
> from itertools import islice
>
> iterable = range(100)
> # since Python 3 range is a lazy evaluated object
> # using this just as a dummy
> # if you're using legacy Python (2.x), then use the xrange function for it
> # or you'll get a memory error
>
> max_count = 10
> step = 1
>
> for i, element in enumerate(islice(iterable, 0, max_count, step), start=1):
>     print(i, element)

I like to test this kind of thing with iter("abracadabra") and look at
the remaining elements, just to be sure that they are still there.

from itertools import islice

s = iter("abracadabra")
for i, element in enumerate(islice(s, 3)):
    print(i, element)

print(''.join(s))

Prints this:

0 a
1 b
2 r
acadabra

One can do a similar check with iter(range(1000)). The range object
itself does not change when its elements are accessed.
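
The range variant of the check, as a sketch: the iterator is consumed, the range is not.

```python
from itertools import islice

r = range(1000)
it = iter(r)

head = list(islice(it, 3))     # takes 0, 1, 2 from the iterator
following = next(it)           # the iterator continues where islice stopped
again = list(islice(r, 3))     # the range itself starts over every time

print(head, following, again, len(r))
```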


Re: Standard lib version of something like enumerate() that takes a max count iteration parameter?

2017-06-14 Thread Jussi Piitulainen
Malcolm Greene writes:

> Wondering if there's a standard lib version of something like
> enumerate() that takes a max count value?
> Use case: When you want to enumerate through an iterable, but want to
> limit the number of iterations without introducing if-condition-break
> blocks in code.
> Something like:
>
> for counter, key in enumerate( some_iterable, max_count=10 ):
> 
>
> Thank you,
> Malcolm

for counter, key in zip(range(10), some_iterable):
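
A sketch of why the range goes first, checked with the same iter("abracadabra") trick. In CPython, zip asks its arguments in order, so with the range first it stops before touching the other iterator again; with the arguments swapped, one extra element is fetched and then dropped.

```python
s = iter("abracadabra")
pairs = list(zip(range(3), s))     # range runs out before s is asked again
rest = ''.join(s)

t = iter("abracadabra")
swapped = list(zip(t, range(3)))   # a fourth element of t is fetched, then lost
rest_swapped = ''.join(t)

print(pairs, rest)
print(swapped, rest_swapped)
```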



Re: How to store some elements from a list into another

2017-06-13 Thread Jussi Piitulainen
Peter Otten writes:

...

> def edges(items):
>     first = last = next(items)
>     for last in items:
>         pass
>     return [first, last]

...

> However, this is infested with for loops. Therefore

...

> I don't immediately see what to do about the for loop in edges(), so
> I'll use the traditional cop-out: Removing the last loop is left as an
> exercise...

In the spirit of the exercise:

import functools

def sekond(x, y):
    return y

def edges(items): # where items is a non-empty iterator
    first = next(items)
    last = functools.reduce(sekond, items, first)
    return [first, last]

Of course, right?


Re: How to store some elements from a list into another

2017-06-13 Thread Jussi Piitulainen
breamore...@gmail.com writes:

> On Monday, June 12, 2017 at 7:33:03 PM UTC+1, José Manuel Suárez Sierra wrote:
>> Hello,
>> I am stuck with a (perhaps) easy problem, I hope someone can help me:
>> 
>> My problem is:
>> I have a list of lists like this one:
>> [[55, 56, 57, 58, 83, 84, 85, 86, 89, 90, 91, 92, 107, 108, 109, 110,
>> 111, 117, 118, 119, 120, 128, 129, 130, 131, 135, 136, 137, 138, 184,
>> 185, 186, 187, 216, 217, 218, 219, 232, 233, 234, 235, 236, 237, 238,
>> 267, 268, 269, 270, 272, 273, 274, 275], [2, 3, 4, 5, 9, 10, 11, 12,
>> 21, 22, 23, 24, 29, 30, 31, 32, 56, 57, 58, 59, 65, 66, 67, 68, 74,
>> 75, 76, 77, 78, 89, 90, 91, 92, 98, 99, 100, 101, 102, 125, 126, 127,
>> 128, 131, 132, 133, 134, 135]]
>> 
>> And what I want is to store some of these datum into another list
>> according to the next conditions:
>>
>> 1. I just want to store data if these values are consecutive (four in
>> four), for instance, for first element I would like to store into the
>> new list: [[[55,58],[83,86],[n,n+3]]] and so on.
>>
>>  I tried something like this:
>> 
>> x=0
>> y=0
>> while list6[x][y] == list6[x][y+1]-1 and list6[x][y] == 
>> list6[x][y+1]-2 and list6[x][y] == list6[x][y+1]-3 or list6[x][0]:
>> 
>> list7.append(list6[x][y])
>> list7.append(list6[x][y+3])
>> y = y+1
>> if (list6[x][y])%(list6[x][len(list6[x])-1]) == 0:
>> x= x+1
>> 
>> if len(list6[x]) == x and len(list6[x][y]) == y:
>> break
>> 
>> 
>> It does not work
>> 
>> I appreciate your help 
>> Thank you
>
> Perhaps use the recipe for consecutive numbers from here
> https://docs.python.org/2.6/library/itertools.html#examples It will
> have to be modified for Python 3, I'll leave that as an exercise.

What a clever idea. Pity it's gone in newer documentation. (By the "it"
in the previous sentence I refer only to the idea of grouping by the
difference to the index in the original sequence, and by "gone" only to
the fact that I didn't see this example at the corresponding location
for Python 3.6, which I found by replacing the 2 in the URL with 3.
Perhaps the idea is preserved somewhere else?)

Anyway, I've adapted it to Python 3, and to an analysis of the problem
at hand - mainly the problem that the OP finds themselves _stuck_ with
their spec and their code, as quoted above. Hope it helps.

What follows, follows.

# The big idea is to define (auxiliary) functions. It's not an
# advanced idea. It's the most basic of all ideas. The experience of
# being stuck comes from trying to see the whole problem at once.

# Ok, familiarity with standard ways of viewing things helps. But that
# is just the flip side of breaking problems into manageable parts:
# past experience suggests that some kinds of parts are more useful,
# more composable into a solution, so they are in standard libraries.

# One subproblem is to group just one list of numbers, then it is easy
# to group every list in a list of such lists. But another subproblem
# is to deal with one group of numbers. There seem to be two concerns:
# a group should consist of consecutive numbers, and a group should
# consist of four numbers - the latter at least is easy enough if the
# group is stored as a list, but what should be done if there are five
# or seven numbers? No idea, but that can be clarified later once the
# components of a solution are untangled into their own functions.

# The battle cry is: Use def!

import itertools as it
import operator as op

def applied(f):
    '''Reification of that asterisk - like a really advanced
    computer-sciency kind of thing. But see no lambda!'''
    def F(args): return f(*args)
    return F

def consequences(data):
    '''Lists of consecutive datami, clever idea adapted from
    https://docs.python.org/2.6/library/itertools.html#examples'''
    for k, group in it.groupby(enumerate(data), applied(op.sub)):
        yield [datum for index, datum in group]

def isfourlong(sequence):
    '''True if sequence is of length 4.'''
    return len(sequence) == 4

def ends(sequences):
    '''Collect the endpoints of sequences in a list of 2-lists.'''
    return [[sequence[0], sequence[-1]] for sequence in sequences]

data = [[55, 56, 57, 58, 83, 84, 85, 86, 89, 90, 91, 92, 107, 108,
 109, 110, 111, 117, 118, 119, 120, 128, 129, 130, 131, 135,
 136, 137, 138, 184, 185, 186, 187, 216, 217, 218, 219, 232,
 233, 234, 235, 236, 237, 238, 267, 268, 269, 270, 272, 273,
 274, 275],

[2, 3, 4, 5, 9, 10, 11, 12, 21, 22, 23, 24, 29, 30, 31, 32,
 56, 57, 58, 59, 65, 66, 67, 68, 74, 75, 76, 77, 78, 89, 90,
 91, 92, 98, 99, 100, 101, 102, 125, 126, 127, 128, 131, 132,
 133, 134, 135]]

def testrun():
'''See how many variations can be composed out of the few auxiliary
functions - the problem becomes tame, or at least a bit tamer.
This kind of ad-hoc test suite is very useful, during 

Re: Generator and return value

2017-06-07 Thread Jussi Piitulainen
Frank Millman writes:

> It would be nice to write a generator in such a way that, in addition
> to 'yielding' each value, it performs some additional work and then
> 'returns' a final result at the end.
>
> From Python 3.3, anything 'returned' becomes the value of the
> StopIteration exception, so it is possible, but not pretty.
>
> Instead of -
>    my_gen = generator()
>    for item in my_gen:
>        do_something(item)
>    [how to get the final result?]
>
> you can write -
>    my_gen = generator()
>    while True:
>        try:
>            item = next(my_gen)
>            do_something(item)
>        except StopIteration as e:
>            final_result = e.value
>            break
>
> Is this the best way to achieve it, or is there a nicer alternative?

Like this, and imagination is the limit:

def generator(box):
    yield 1
    box.append('won')
    yield 2
    box.append('too')
    yield 3
    box.append('tree')

mabox = []
for item in generator(mabox): pass
print(*mabox)
# prints: won too tree
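
Another possibility, only a sketch: wrap the generator so that a plain for loop still works and the returned value is kept for later. The class name Keep is made up; it relies on "yield from" evaluating to the return value of the inner generator (Python 3.3 and later).

```python
def generator():
    yield 1
    yield 2
    yield 3
    return 'won too tree'      # becomes the value of StopIteration

class Keep:
    """Hypothetical helper: iterate a generator, remember what it returned."""
    def __init__(self, gen):
        self.gen = gen
        self.value = None
    def __iter__(self):
        # 'yield from' passes the items through and evaluates to the
        # return value of the inner generator when it finishes.
        self.value = yield from self.gen

kept = Keep(generator())
items = [item for item in kept]
print(items, kept.value)
```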


Re: Is An Element of a Sequence an Object?

2017-06-05 Thread Jussi Piitulainen
Peter Otten writes:

> Thomas Jollans wrote:
>
>> On 04/06/17 09:52, Rustom Mody wrote:
>>> On Sunday, June 4, 2017 at 12:45:23 AM UTC+5:30, Jon Forrest wrote:
>>>> I'm learning about Python. A book I'm reading about it
>>>> says "... a string in Python is a sequence. A sequence is an ordered
>>>> collection of objects". This implies that each character in a string
>>>> is itself an object.
>>>>
>>>> This doesn't seem right to me, but since I'm just learning Python
>>>> I questioned the author about this. He gave an example that displays
>>>> the ids of string slices. These ids are all different, but I think
>>>> that's because the slicing operation creates objects.
>>>>
>>>> I'd like to suggest an explanation of what a sequence is
>>>> that doesn't use the word 'object' because an object has
>>>> a specific meaning in Python.
>>>>
>>>> Am I on the right track here?
>>> 
>>> It's a good sign that you are confused
>>> If you were not (feeling) confused, it would mean you are actually more
>>> so… Following is not exactly what you are disturbed by... Still closely
>>> related
>>> 
>>> >>> s="a string"
>>> >>> s[0]
>>> 'a'
>>> >>> s[0][0]
>>> 'a'
>>> >>> s[0][0][0][0][0]
>>> 'a'
>>
>> Also:
>>
>> >>> s[0] is s[0][0][0][0][0][0][0]
>> True
>
>
> However, this is an implementation detail:
>
> >>> def is_cached(c):
> ...     return c[0] is c[0][0]
> ...

I think this works the same, and looks more dramatic to me:

    ...     return c[0] is c[0]

> >>> is_cached(chr(255))
> True
> >>> is_cached(chr(256))
> False

Also same thing, as far as I can see:

>>> s = "\u00ff\u0100" ; (s[0] is s[0], s[1] is s[1])
(True, False)
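
To separate what the language promises from what one implementation happens to do, here is a small sketch. Equality, type and length are guaranteed; whether two indexing operations hand back the very same object is not, so the identity result is only printed, not relied upon.

```python
s = "\u00ff\u0100"

results = []
for i in range(len(s)):
    one, other = s[i], s[i]
    # Guaranteed: indexing a str gives an equal str of length 1.
    results.append((one == other, type(one) is str, len(one)))
    # Not guaranteed: identity depends on the implementation's caching.
    print(repr(one), 'same object:', one is other)

print(results)
```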
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bug or intended behavior?

2017-06-02 Thread Jussi Piitulainen
sean.diza...@gmail.com writes:

> Can someone please explain this to me?  Thanks in advance!
>
> ~Sean
>
>
> Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 12:39:47) 
> [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> print "foo %s" % 1-2
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: unsupported operand type(s) for -: 'str' and 'int'


The per cent operator has precedence over minus. Spacing is not
relevant. Use parentheses.
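
A short sketch of the difference, written in Python 3 syntax (the precedence rule is the same as in 2.7). The variable names are only for illustration.

```python
template = "foo %s"

try:
    template % 1 - 2            # parsed as (template % 1) - 2
except TypeError as error:
    message = str(error)

fixed = template % (1 - 2)      # parentheses force the subtraction first

print(message)
print(fixed)
```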
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: zeep, infinite recursion

2017-05-29 Thread Jussi Piitulainen
Nagy László Zsolt writes:

> Running this command:
>
> python3.6 -m zeep exmaple.wsdl 
>
> I get this (this is only the end of the traceback):
>
...
> from zeep.xsd import ComplexType
> RecursionError: maximum recursion depth exceeded
>
> Looks like an infinite recursion to me. Due to a non-disclosure
> agreement, I'm not able to send you the example wsdl. But I can tell
> that the very same WSDL works with Oracle Java Web Services. So the
> WSDL itself is fine.
>
> Could this be a bug in zeep?

It could be some sort of bug, of course, but (not knowing anything about
WSDLs or zeeps) it could be that the data is deeper than Python's
default recursion depth, which is rather small. You could try setting a
higher limit and see if the call succeeds then.

(Search for Python's maximum recursion depth. I don't remember the
incantation but it should be easy to find.)
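
For the record, the incantation lives in the sys module. A minimal sketch, with an arbitrary new limit of 5000 and a made-up recursive function; whether a higher limit actually helps with zeep is an open question.

```python
import sys

def depth(n):
    """Recurse n levels deep and report how deep we got."""
    return 0 if n == 0 else 1 + depth(n - 1)

old_limit = sys.getrecursionlimit()          # 1000 by default
sys.setrecursionlimit(max(old_limit, 5000))  # 5000 is an arbitrary choice
try:
    reached = depth(3000)                    # too deep for the default limit
finally:
    sys.setrecursionlimit(old_limit)         # put things back afterwards

print(old_limit, reached)
```

If the recursion is truly infinite, a higher limit only postpones the error, which is itself a useful diagnostic.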
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Check for regular expression in a list

2017-05-26 Thread Jussi Piitulainen
Jussi Piitulainen writes:

> Or use a regex match if the condition becomes more complex. Even then,
> there is re.match to attemp a match at the start of the string, which
> helps to keep the expression simple.

Soz: attemp' a match.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Check for regular expression in a list

2017-05-26 Thread Jussi Piitulainen
Rustom Mody  writes:

> On Friday, May 26, 2017 at 5:02:55 PM UTC+5:30, Cecil Westerhof wrote:
>> To check if Firefox is running I use:
>> if not 'firefox' in [i.name() for i in list(process_iter())]:
>> 
>> It probably could be made more efficient, because it can stop when it
>> finds the first instance.
>> 
>> But now I switched to Debian and there firefox is called firefox-esr.
>> So I should use:
>> re.search('^firefox', 'firefox-esr')
>> 
>> Is there a way to rewrite
>> [i.name() for i in list(process_iter())]
>> 
>> so that it returns True when there is a i.name() that matches and
>> False otherwise?
>> And is it possible to stop processing the list when it found a match?
>
> 'in' operator is lazily evaluated if its rhs is an iterable (it looks)
> So I expect you can replace
> if not 'firefox' in [i.name() for i in list(process_iter())]:
> with
> if not 'firefox' in (i.name() for i in list(process_iter())]):

Surely that should be:

if not 'firefox' in (i.name() for i in process_iter()):

And that again should be:

if any((i.name() == 'firefox') for i in process_iter()):

Which can then be made into:

if any(i.name().startswith('firefox') for i in process_iter()):

Or use a regex match if the condition becomes more complex. Even then,
there is re.match to attemp a match at the start of the string, which
helps to keep the expression simple.

Redundancy of parentheses is a bit subtle above - the generator
expression as the sole argument to any() does not need them, and the
parentheses of (i.name() == 'firefox') are not necessary.
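
For the regex variant mentioned earlier, a sketch that uses a plain list of invented names as a stand-in for the process list, so that it runs without psutil:

```python
import re

# Stand-in for (i.name() for i in process_iter()); the names are invented.
names = ['systemd', 'bash', 'firefox-esr', 'sshd']

# re.match anchors at the start of the string, so no '^' is needed.
is_firefox = re.compile(r'firefox(-esr)?$').match

running = any(is_firefox(name) for name in names)
print(running)
```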

As to early exit:

>>> s = iter('abracadabra')
>>> any(c == 'c' for c in s)
True
>>> list(s)
['a', 'd', 'a', 'b', 'r', 'a']
>>> 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Scala considering significant indentation like Python

2017-05-22 Thread Jussi Piitulainen
Cholo Lennon writes:

> On 22/05/17 00:53, Steve D'Aprano wrote:
>> The creator of Scala, Martin Odersky, has proposed introducing
>> Python-like significant indentation to Scala and getting rid of
>> braces:
>>
>>  I was playing for a while now with ways to make Scala's syntax
>>  indentation-based. I always admired the neatness of Python
>>  syntax and also found that F# has benefited greatly from its
>>  optional indentation-based syntax, so much so that nobody seems
>>  to use the original syntax anymore.
>>
>> https://github.com/lampepfl/dotty/issues/2491
>>
>>
>>
>
> From the link:
>
> "Impediments
>
> What are the reasons for preferring braces over indentations?
>
> Provide visual clues where constructs end. With pure indentation based
> syntax it is sometimes hard to tell how many levels of indentation are
> terminated at some point... "
>
> I am a huge python fan (but also a C++ and Java fan) and I agree with
> Scala creator, sometimes the readability is complicated. So, more
> often than I would like to, I end up missing the braces :-O

I am the inventor of multiple ends on the same line. This way, in a
language where all of several nested constructs end with an end - not
going to name the language but it's Julia - instead of

                end
            end
        end
    end
end,

one combines the uninformative lines into one by writing

end end end end end,

and with four-space indentation the ends align neatly with the starts.
Technically, the ends on the remaining line of ends are backwards.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to install Python package from source on Windows

2017-05-22 Thread Jussi Piitulainen
Gregory Ewing writes:

> Jussi Piitulainen wrote:
>> I was surprised that Git (or GitHub Desktop) simply failed so badly.
>> Not sure what it could have done instead.
>
> It's not really git's fault, it's a consequence of differing
> filename conventions on different platforms. The only way to

I understand that, but the way it failed was worse than I expected.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to install Python package from source on Windows

2017-05-21 Thread Jussi Piitulainen
Chris Angelico writes:

> On Sun, May 21, 2017 at 9:43 PM, Jussi Piitulainen wrote:
>> It happened to me recently when cloning a git repository from GitHub,
>> using GitHub Desktop, to a Mac OS file system. Some filenames differed
>> only in case, like "INFO" and "info" in the same directory. Mac OS
>> considered them the same file, Git tried to associate them with
>> different objects, or something like that. Cat-astrophic :)
>>
>> Something similar happened on Windows with filenames that ended in a
>> period, in the same repository after the problems on Mac OS had been
>> fixed. Apparently Windows considers "log-3.1." the same as "log-3.1" -
>> is that right? (This cloning attempt was not made by me but by a
>> colleague who has access to a Windows computer. It was mostly calendar
>> dates in a format that uses periods.)
>>
>> I was surprised that Git (or GitHub Desktop) simply failed so badly.
>> Not sure what it could have done instead.
>
> Why are you surprised that it failed, if you agree that there's
> nothing else it could have done? The fault is with the file system
> that is unable to distinguish file names that are logically distinct.
> It's not the file system's place to fold case; if anything, it's the
> UI's job. And git has been given content for both file names, so
> what's it going to do other than attempt to create both files?

I was surprised that it didn't notice and report that things went wrong
while it was creating the clone that it was unable to handle afterwards.
It apparently created an inconsistent index and thought everything was
fine at that point. Then failed to understand its own state.

It could have noticed that it's about to create a file that is already
there when it shouldn't be.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to install Python package from source on Windows

2017-05-21 Thread Jussi Piitulainen
Chris Angelico writes:

> On Sun, May 21, 2017 at 8:23 PM, bartc wrote:
>> On 20/05/2017 19:37, Chris Angelico wrote:
>>
>>> rosuav@sikorsky:~/linux$ find -name \*.c -or -name \*.h | wc -l
>>> 44546
>>>
>>> These repositories, by the way, correspond to git URLs
>>> https://github.com/python/cpython,
>>> git://pike-git.lysator.liu.se/pike.git,
>>> git://source.winehq.org/git/wine, and
>>> https://github.com/torvalds/linux respectively, if you want to check
>>> my numbers. Two language interpreters, a popular execution subsystem,
>>> and an OS kernel.
>>>
>>> I'd like to see you create a single-file version of the Linux kernel
>>> that compiles flawlessly on any modern compiler and has no configure
>>> script.
>>
>>
>> I've had a look at the Linux stuff. (BTW, when copying to Windows,
>> the file "aux" occurs several times, which causes problems as it's a
>> reserved filename I think. Also there were a dozen conflicts where
>> different versions of the same file wanted to be stored at the same
>> location.)
>
> I don't understand where you would have obtained the sources that
> there are duplicate files. It's easiest just to clone someone's git
> repo (eg Linus Torvalds's).

It happened to me recently when cloning a git repository from GitHub,
using GitHub Desktop, to a Mac OS file system. Some filenames differed
only in case, like "INFO" and "info" in the same directory. Mac OS
considered them the same file, Git tried to associate them with
different objects, or something like that. Cat-astrophic :)

Something similar happened on Windows with filenames that ended in a
period, in the same repository after the problems on Mac OS had been
fixed. Apparently Windows considers "log-3.1." the same as "log-3.1" -
is that right? (This cloning attempt was not made by me but by a
colleague who has access to a Windows computer. It was mostly calendar
dates in a format that uses periods.)

I was surprised that Git (or GitHub Desktop) simply failed so badly.
Not sure what it could have done instead.

(There were also filenames that Mac OS or Windows rejected. Also
filenames that were truly horrible - a couple contained backspaces.)

Incidentally, I used Python scripts to find and fix issues in those
filenames, including character encoding issues. Smooth sailing :)
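
Not the actual scripts, but a sketch of the kind of check involved: group names that differ only in case, and pick out names that end in a period. The function names and the sample listing are made up.

```python
from collections import defaultdict

def case_collisions(names):
    """Return groups of names that are equal when case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]

def trailing_periods(names):
    """Return names that end in a period."""
    return [name for name in names if name.endswith('.')]

listing = ['INFO', 'info', 'README', 'log-3.1.', 'log-3.1']
print(case_collisions(listing))
print(trailing_periods(listing))
```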
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Rawest raw string literals

2017-04-21 Thread Jussi Piitulainen
Tim Chase <python.l...@tim.thechases.com> writes:

> On 2017-04-21 08:23, Jussi Piitulainen wrote:
>> Tim Chase writes:
>>> Bash:
>>>   cat <<EOT
>>>   "single and double" with \ and /
>>>   EOT
>>>
>>> PS: yes, bash's does interpolate strings, so you still need to do
>>> escaping within, but the arbitrary-user-specified-delimiter idea
>>> still holds.
>> 
>> If you put any quote characters in the initial EOT, it doesn't.
>> Quote removal on the EOT determines the actual EOT at the end.
>> 
>>   cat <<"EOT"
>>   Not expanding any $amount here
>>   EOT
>
> Huh, I just tested it and you're 100% right on that.  But I just
> re-read over that section of my `man bash` page and don't see anything
> that stands out as detailing this.  Is there something I missed in the
> docs?

It's in this snippet, yanked straight from the man page:

   The format of here-documents is:

  <<[-]word
  here-document
  delimiter

   No  parameter expansion, command substitution, arithmetic expansion,
   or pathname expansion is performed on word.  If  any  characters  in
   word  are  quoted,  the  delimiter is the result of quote removal on
   word, and the lines in the here-document are not expanded.  If  word
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Rawest raw string literals

2017-04-20 Thread Jussi Piitulainen
Tim Chase writes:

> A number of tools use a custom quote-string:
>
> Bash:
>
>   cat <<EOT
>   "single and double" with \ and /
>   EOT

[snip]

> PS: yes, bash's does interpolate strings, so you still need to do
> escaping within, but the arbitrary-user-specified-delimiter idea still
> holds.

If you put any quote characters in the initial EOT, it doesn't. Quote
removal on the EOT determines the actual EOT at the end.

  cat <<"EOT"
  Not expanding any $amount here
  EOT
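
To see both behaviours side by side from Python, here is a sketch that hands a small script to the shell. It assumes a POSIX shell named sh is on the PATH; the variable amount and its value are made up.

```python
import subprocess

# Same here-document twice: once with a plain delimiter, once quoted.
script = r'''
amount=42
cat <<EOT
plain: $amount
EOT
cat <<"EOT"
quoted: $amount
EOT
'''

result = subprocess.run(['sh', '-c', script], capture_output=True,
                        text=True, check=True)
lines = result.stdout.splitlines()
print(lines)
```

Only the first here-document has $amount expanded; the quoted delimiter leaves the text alone.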
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bigotry and hate speech on the python mailing list

2017-04-18 Thread Jussi Piitulainen
Joel Goldstick writes:

> On Tue, Apr 18, 2017 at 5:30 AM, James McMahon wrote:
>> Can the moderators please get involved here and remind people to
>> address python related topics and questions on the python mailing
>> list? While I can only speak to my interest when joining this list,
>> isn't python why most people joined this list? Others have different
>> and polarizing views on many subjects. This just isn't the right
>> place to voice your views on subjects other than python. I delete
>> this same tired thread every day, and every day it reappears. With
>> all manners and due respect, stay on topic.
>>
>> Isn't this list content moderated by anyone?
>>
>> -Jim
>>
>
> Plus 1 to Jim.  Come on gang! Back to python!  Spaces vs. tabs anyone?

If there's anything about Python that is less interesting than spaces
vs. tabs, I don't even want to hear what it is.

I agree with Steven that it's a highly offensive thing to say that ASCII
is good enough to "most people".

Python lets me work in UTF-8 *and* has helped me fix rather strange
mixups of legacy encodings. I appreciate that.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Looping [was Re: Python and the need for speed]

2017-04-18 Thread Jussi Piitulainen
Christian Gollwitzer writes:

> On 18.04.17 at 08:21, Chris Angelico wrote:
>> On Tue, Apr 18, 2017 at 4:06 PM, Christian Gollwitzer  
>> wrote:
>>> On 18.04.17 at 02:18, Ben Bacarisse wrote:
>>>
>>>> Thanks (and to Grant).  IO seems to be the canonical example.  Where
>>>> some languages would force one to write
>>>>
>>>>   c = sys.stdin.read(1)
>>>>   while c == ' ':
>>>>       c = sys.stdin.read(1)
>>>
>>> repeat
>>>     c  = sys.stdin.read(1)
>>> until c != ' '
>>
>> Except that there's processing code after it.
>>
>
> Sorry, I misread it then - Ben's code did NOT have it, it looks like a
> "skip the whitespace" loop.

It also reads the first character that is not whitespace, so it's not
usable to *merely* skip the whitespace.

>> while True:
>>     c = sys.stdin.read(1)
>>     if not c: break
>>     if c.isprintable(): text += c
>>     elif c == "\x08": text = text[:-1]
>>     # etc
>>
>> Can you write _that_ as a do-while?
>
> No. This case OTOH looks like an iteration to me and it would be most
> logical to write
>
> for c in sys.stdin:
>  if c.isprintable(): text += c
>  elif c == "\x08": text = text[:-1]
>  # etc
>
> except that this iterates over lines. Is there an analogous iterator
> for chars? For "lines" terminated by something else than "\n"?
> "for c in get_chars(sys.stdin)" and
> "for c in get_string(sys.stdin, terminate=':')" would be nicely
> readable IMHO. Or AWK-like processing:
>
> for fields in get_fields(open('/etc/passwd'), RS='\n', FS=':'):
>     if fields[2]=='0':
>         print 'Super-User found:', fields[0]

I don't know if those exist in some standard library, but they are easy
to write, and I do it all the time. I don't need the chars one, but I do
tab-separated fields and line-separated groups of tab-separated fields,
and variations.

for s, sentence in enumerate(sentences(sys.stdin)):
    for k, token in enumerate(sentence):
        ...
        token[LEMMA] or warn('empty LEMMA', s, k, sentence)
        ...

The wrapper function around sys.stdin or other text source is different
depending on the data format. Sometimes it's messy, sometimes not. Any
messy details are hidden in the wrapper.
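
A sketch of what such wrappers can look like. These are not the functions I actually use; the names, the tab separator and the blank-line grouping are assumptions chosen for the illustration.

```python
import io

def get_chars(stream):
    """Yield one character at a time until end of input."""
    # iter(callable, sentinel) keeps calling until '' comes back.
    return iter(lambda: stream.read(1), '')

def sentences(stream, FS='\t'):
    """Yield groups of FS-separated records; a blank line ends a group."""
    group = []
    for line in stream:
        line = line.rstrip('\n')
        if line:
            group.append(line.split(FS))
        elif group:
            yield group
            group = []
    if group:
        yield group

source = io.StringIO('the\tDET\ncat\tNOUN\n\nsat\tVERB\n')
for s, sentence in enumerate(sentences(source)):
    for k, token in enumerate(sentence):
        print(s, k, token)

print(list(get_chars(io.StringIO('a b'))))
```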
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python and the need for speed

2017-04-12 Thread Jussi Piitulainen
bart4...@gmail.com writes:

> On Wednesday, 12 April 2017 16:50:01 UTC+1, Jussi Piitulainen  wrote:
>> bart4...@gmail.com writes:
>> 
>> > On Wednesday, 12 April 2017 12:56:32 UTC+1, Jussi Piitulainen  wrote:
>> >> bartc writes:
>> >> 
>> >
>> >> > These are straightforward language enhancements.
>> >> 
>> >> FYI, the question is not how to optimize the code but how to prevent
>> >> the programmer from writing stupid code in the first place. Someone
>> >> suggested that a language should do that.
>> >
>> > The 'stupid code' thing is a red herring. I assume the code people
>> > write is there for a reason.
>> 
>> So you walked in to a conversation about something that does not
>> interest you and simply started talking about your own thing.
>> 
>> Because of course you did.
>> 
>> I get confused when you do that.
>
> Huh? The subject is making Python faster. It's generally agreed that
> being very dynamic makes that harder, everything else being equal.

Amazingly, the references in this part of the thread work all the way
back to the point where you joined in. Let me quote extensively from the
message to which you originally followed up. I lay it out so that I can
add some commentary where I point out that the topic of the discussion
at that point is whether a programming language should prevent people
from writing bad code.

[Rick Johnson, quoted by Steven D'Aprano]
| high level languages like Python should make it difficult, if not
| impossible, to write sub-optimal code (at least in the blatantly
| obvious cases).

That's the topic of the discussion at that point: whether it should be
difficult to *write* bad code in the language. I don't know what else
Rick may have said in his message, but Steven chose that topic for that
message.

Not whether a language should be such that the compiler could produce
efficient code, or whether the compiler should be such that it could
produce efficient code even if the language makes it a challenge, but
whether it should be difficult to write blatantly sub-optimal code.

[Steven D'Aprano, in response]
| You mean that Python should make it impossible to write:
|
|    near_limit = []
|    near_limit.append(1)
|    near_limit = len(near_limit)
|
| instead of:
|
|    near_limit = 1
|
| ? I look forward to seeing your version of RickPython that can
| enforce that rule.

You snipped that. Steven asks whether Rick really thinks that Python
should prevent that code from being written. Steven also pointed out
that this example came from some actual code :) [*I* snipped that.]

It's not a question how to produce efficient code for that, or how
Python prevents the compiler from producing optimal code. It's a
question of the compiler rejecting the code.

Traceback (most recent call last):
  File "/dev/fd/63", line 37, in <module>
SanityClauseException: code is blatantly sub-optimal

As far as I know, no language does that. Because reasons?

[Steven D'Aprano, in response]
| Here's another example:
|
|    answer = 0
|    for i in range(10):
|        answer += 1
|
| instead of
|
|    answer = 10
|
| So... how exactly does the compiler prohibit stupid code?

That was Steven's second example, which Steven again used to ask Rick
whether or how he really thinks the compiler should prohibit such code.
(I have no idea whether that discussion has continued in other branches
of the thread, or whether Rick really thinks that, but he seemed to be
saying something like that.)

You chose to comment on that example, but I think you chose to ignore
the question of prohibition altogether, even though that was the
question in this particular exchange between Rick and Steven. You went
on to address the actual compilation challenges instead.

It's a valid topic but it's a different topic. It fits under the subject
line, but not in response to that particular message. Those example
snippets were intended as a *different kind of challenge*.

That's why your original response had me confused, briefly, and then
irritated, slightly. That's all.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python and the need for speed

2017-04-12 Thread Jussi Piitulainen
bart4...@gmail.com writes:

> On Wednesday, 12 April 2017 12:56:32 UTC+1, Jussi Piitulainen  wrote:
>> bartc writes:
>> 
>
>> > These are straightforward language enhancements.
>> 
>> FYI, the question is not how to optimize the code but how to prevent
>> the programmer from writing stupid code in the first place. Someone
>> suggested that a language should do that.
>
> The 'stupid code' thing is a red herring. I assume the code people
> write is there for a reason.

So you walked in to a conversation about something that does not
interest you and simply started talking about your own thing.

Because of course you did.

I get confused when you do that.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python and the need for speed

2017-04-12 Thread Jussi Piitulainen
bart4...@gmail.com writes:

> On Wednesday, 12 April 2017 10:57:10 UTC+1, bart...@gmail.com  wrote:
>> On Wednesday, 12 April 2017 07:48:57 UTC+1, Steven D'Aprano  wrote:
>
>> > for i in range(10):
>> >     answer += 1
>
>> > So... how exactly does the compiler prohibit stupid code?
>
>> And this lookup happens for every loop iteration.
>
> I've just looked at byte-code for that loop (using an on-line Python
> as I don't have it on this machine). I counted 7 byte-codes that need
> to be executed per iteration, plus five to set up the loop, one of
> which needs to call a function.
>
> My language does the same loop with only 4 byte-codes. Loop setup
> needs 2 (to store '10' into the loop counter).
>
> It also has the option of using a loop with no control variable (as
> it's not used here). Still four byte-codes, but the looping byte-code
> is a bit faster.
>
> Plus there is the option of using ++answer instead of answer += 1. Now
> there are only two byte-codes! (NB don't try ++ in Python.)
>
> These are straightforward language enhancements.

FYI, the question is not how to optimize the code but how to prevent the
programmer from writing stupid code in the first place. Someone
suggested that a language should do that.

But you appear to be talking about the original topic of the thread, as
seen on the subject line, so ok :)
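
For anyone who wants to look at the byte-code themselves, the dis module lists it. A sketch; the instruction names and counts differ between CPython versions, so no particular numbers are claimed here. It also shows why ++ is best not tried in Python.

```python
import dis

def loop():
    answer = 0
    for i in range(10):
        answer += 1
    return answer

# Collect the instruction names for the whole function.
opnames = [ins.opname for ins in dis.get_instructions(loop)]
print(len(opnames), opnames)

answer = 10
same = ++answer   # parsed as +(+answer): nothing is incremented
print(same, answer)
```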
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Temporary variables in list comprehensions

2017-04-07 Thread Jussi Piitulainen
Roel Schroeven writes:

> Lele Gaifax wrote on 6/04/2017 20:07:
>> Piet van Oostrum  writes:
>>
>>> It is a poor man's 'let'. It would be nice if python had a real 'let'
>>> construction. Or for example:
>>>
>>> [(tmp, tmp + 1) for x in data with tmp = expensive_calculation(x)]
>>>
>>> Alas!
>>
>> It would be nice indeed!
>>
>> Or even
>>
>>   [(tmp, tmp + 1) for x in data
>>with expensive_calculation(x) as tmp
>>if tmp is not None]
>>
>
> Perhaps this:
>
> [(tmp, tmp + 1) for tmp in
>  (expensive_calculation(x) for x in data)
>  if tmp is not None]
>
> A bit less elegant, but works right now.

The "poor man's let" works right now.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Quick questions about globals and database connections

2017-04-06 Thread Jussi Piitulainen
DFS  writes:

> On 4/6/2017 10:54 AM, Python wrote:
>> On 06/04/2017 at 16:46, DFS wrote:
>>> On 4/5/2017 10:52 PM, Dan Sommers wrote:
>>>> On Wed, 05 Apr 2017 22:00:46 -0400, DFS wrote:
>>>>
>>>>> I have a simple hard-coded check in place before even trying to
>>>>> connect:
>>>>>
>>>>> if dbtype not in ('sqlite postgres'):
>>>>>    print "db type must be sqlite or postgres"
>>>>>    exit()
>>>>
>>>> That's not doing what you think it is.
>>>>
>>>> Hint:  What is ('sqlite postgres')?
>>>
>>> ?
>>>
>>> dbtype is a string, and the check works perfectly.  No typos make it
>>> past the guard.
>>
>> except that it would be True for dbtype = 'lite post' or dbtype = 'stgr'
>
>
> Except before I even get there I have another checkpoint.
>
> ---
> if len(sys.argv) != 4:
>   print "Enter group name, verbose setting (-s or -v), and db type"
>   exit(0)
>
> GRP = sys.argv[1]   (a few lines later I check this value against a db)   
>   
> verbose = 'x'
> if sys.argv[2] == '-s': verbose = False
> if sys.argv[2] == '-v': verbose = True
> if verbose not in (True,False):
>   print "Enter -s (silent) or -v (verbose)"
>   exit()
>   
> dbtype = sys.argv[3]  
> if dbtype not in ('sqlite postgres'):
>   print "db type must be sqlite or postgres"
>   exit()
> ---
>
> $1,000,000 virtual if you can get a bad command past those.
>
> These are good commands:
> $python progname.py comp.lang.python -v postgres
> $python progname.py comp.lang.c -s sqlite

So these are meant to be bad commands:
$python progname.py comp.lang.python -v post
$python progname.py comp.lang.c -s sql
$python progname.py comp.lang.cobol -v 'lite post'
$python progname.py comp.lang.perl -s stgr

What happens when you try them with the above code?
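
A sketch of the two membership tests side by side, for comparison. The function names are made up; the candidate values are the ones from the commands above.

```python
def in_string(dbtype):
    # ('sqlite postgres') is just a parenthesized string, so 'in'
    # performs a substring search.
    return dbtype in ('sqlite postgres')

def in_tuple(dbtype):
    # A real tuple: 'in' compares against each element for equality.
    return dbtype in ('sqlite', 'postgres')

candidates = ['postgres', 'sqlite', 'post', 'sql', 'lite post', 'stgr']
for candidate in candidates:
    print(repr(candidate), in_string(candidate), in_tuple(candidate))
```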
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Quick questions about globals and database connections

2017-04-06 Thread Jussi Piitulainen
DFS writes:

> On 4/5/2017 10:52 PM, Dan Sommers wrote:
>> On Wed, 05 Apr 2017 22:00:46 -0400, DFS wrote:
>>
>>> I have a simple hard-coded check in place before even trying to connect:
>>>
>>> if dbtype not in ('sqlite postgres'):
>>>    print "db type must be sqlite or postgres"
>>>    exit()
>>
>> That's not doing what you think it is.
>>
>> Hint:  What is ('sqlite postgres')?
>
> ?
>
> dbtype is a string, and the check works perfectly.  No typos make it
> past the guard.

"lite post" in "sqlite postgres"
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Temporary variables in list comprehensions

2017-04-06 Thread Jussi Piitulainen
Vincent Vande Vyvre writes:

> On 06/04/17 at 14:25, Piet van Oostrum wrote:
>> Steven D'Aprano  writes:
>>
>>> Suppose you have an expensive calculation that gets used two or more
>>> times in a loop. The obvious way to avoid calculating it twice in an
>>> ordinary loop is with a temporary variable:
>>>
>>> result = []
>>> for x in data:
>>>  tmp = expensive_calculation(x)
>>>  result.append((tmp, tmp+1))
>>>
>>>
>>> But what if you are using a list comprehension? Alas, list comps
>>> don't let you have temporary variables, so you have to write this:
>>>
>>>
>>> [(expensive_calculation(x), expensive_calculation(x) + 1) for x in data]
>>>
>>>
>>> Or do you? ... no, you don't!
>>>
>>>
>>> [(tmp, tmp + 1) for x in data for tmp in [expensive_calculation(x)]]
>>>
>>>
>>> I can't decide whether that's an awesome trick or a horrible hack...
>> It is a poor man's 'let'. It would be nice if python had a real 'let'
>> construction. Or for example:
>>
>> [(tmp, tmp + 1) for x in data with tmp = expensive_calculation(x)]
>>
>> Alas!
>
> With two passes
>
> e = [expensive_calculation(x) for x in data]
> final = [(x, y+1) for x, y in zip(e, e)]
>
> Vincent

Imagine some crazy combinatory question - how many ways can one choose
two subsets of the ten decimal digits so that the size of the first is
the minimum of the second and the size of the second is the maximum of
the first _or_ the minima and maxima of the two are the same?

Comprehensions lend themselves readily to such explorations. It happens
that some expensively computed value is needed twice, like the minima
and maxima of the two combinations in this exercise (because this
exercise was carefully crafted to be just so, but anyway), and then it
saves time to do the computations once: let the values have names.

from itertools import combinations as choose

print(sum(1 for m in range(1,10) for n in range(1,10)
          for a in choose(range(1,10), m)
          for b in choose(range(1,10), n)
          if ((len(a) == min(b) and len(b) == max(a)) or
              (min(a) == min(b) and max(a) == max(b)))))

print(sum(1 for m in range(1,10) for n in range(1,10)
          for a in choose(range(1,10), m)
          for b in choose(range(1,10), n)
          for lena, mina, maxa in [[len(a), min(a), max(a)]]
          for lenb, minb, maxb in [[len(b), min(b), max(b)]]
          if ((lena == minb and lenb == maxa) or
              (mina == minb and maxa == maxb))))

I realized afterwards that the sizes, len(a) and len(b), already had
names, m and n, and were only used once in the condition anyway, but let
that illustrate the point: this kind of expression lends itself to
analysis and modification, which is what one wants in explorative code.

(But the "for x in [foo(u,w)]" works, so, shrug, I guess? I'd welcome a
proper let construction, but then I find that I can live without.)
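
A smaller sketch that makes the saving visible by logging every call. The function here is a made-up stand-in for the expensive one. The second form uses an assignment expression, which needs Python 3.8 or later (PEP 572) and so is not available in the versions discussed in this thread.

```python
log = []

def expensive_calculation(x):
    """Hypothetical stand-in that records how often it is called."""
    log.append(x)
    return x * x

data = [1, 2, 3]

# The single-element-list trick: one call per element.
with_for = [(tmp, tmp + 1)
            for x in data
            for tmp in [expensive_calculation(x)]]

# Assignment expression (Python 3.8+): also one call per element.
with_walrus = [((tmp := expensive_calculation(x)), tmp + 1) for x in data]

print(with_for, with_walrus, len(log))
```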
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Appending data to a json file

2017-04-04 Thread Jussi Piitulainen
Michael Torrie writes:

> On 04/03/2017 11:31 PM, dieter wrote:
>> Dave  writes:
>> 
>>> I created a python program that gets data from a user, stores the data
>>> as a dictionary in a list of dictionaries.  When the program quits, it
>>> saves the data file.  My desire is to append the new data to the
>>> existing data file as is done with purely text files.
>> 
>> Usually, you cannot do that:
>> "JSON" stands for "JavaScript Object Notation": 
>
> That's assuming he's using JSON; he never specified what he's using to
> represent data as plain text.  Though I suspect you're correct, for all
> we know he could just be writing data using his own text representation
> or writing to an ini file.  And in his case it sounds like JSON is not
> an ideal method for saving his data since he's wanting to only append
> data, not read it in his program.
>
> In the future, Dave, please provide all the information pertaining to
> the problem so we can give accurate advice. Don't make us guess or
> assume we can all infer this information.

The clue is on the subject line.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python under PowerShell adds characters

2017-03-29 Thread Jussi Piitulainen
lyngw...@gmail.com writes:

> I wrote a Python script, which executed as intended on Linux and from
> cmd.exe on Windows.  Then, I ran it from the PowerShell command line,
> all print statements added ^@ after every character.
>
> Have you seen this?  Do you know how to prevent this?

Script is printing UTF-16 or something, viewer is expecting ASCII or
some eight bit code and making null bytes visible as ^@.

Python gets some default encoding from its environment. There are ways
to set the default, and ways to override the default in the script. For
example, you can specify an encoding when you open a file.
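
A sketch of both points: what UTF-16 text looks like as bytes, and how to name the encoding when opening a file. The file name out.txt is made up, and whether UTF-16 is really what PowerShell produced in this case is a guess.

```python
import os
import tempfile

text = 'hi'
as_utf16 = text.encode('utf-16-le')   # a null byte follows each ASCII character
as_utf8 = text.encode('utf-8')
print(as_utf16, as_utf8)

# Being explicit about the encoding when opening a file.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')
with open(path, 'w', encoding='utf-8') as out:
    out.write(text)
with open(path, 'rb') as raw:
    written = raw.read()
print(written)
```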
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Escaping confusion with Python 3 + MySQL

2017-03-27 Thread Jussi Piitulainen
Gregory Ewing writes:

> Νίκος Βέργος wrote:
>
>> It's still a mystery to me why this fails syntactically when at the
>> same time INSERT works like that.
>
> I don't think *anyone* understands why SQL was designed with
> INSERT and UPDATE having completely different syntaxes.
> But it was, and we have to live with it.

A story I heard is that IBM had two competing teams working to design a
database system. One team understood programming languages. The other
team understood storage system layouts. The latter team won, and their
system grew up to be SQL.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Escaping confusion with Python 3 + MySQL

2017-03-26 Thread Jussi Piitulainen
Νίκος Βέργος writes:

> print('''UPDATE visitors SET (pagesID, host, ref, location, useros,
> browser, visits) VALUES (%s, %s, %s, %s, %s, %s, %s) WHERE host LIKE
> "%s"''', (pID, domain, ref, location, useros, browser, lastvisit,
> domain) )
>
> prints out:
>
> UPDATE visitors SET (pagesID, host, ref, location, useros, browser,
> visits) VALUES (%s, %s, %s, %s, %s, %s, %s) WHERE host LIKE "%s" (1,
> 'cyta.gr', 'Άμεση Πρόσβαση', 'Greece', 'Windows', 'Chrome', '17-03-24
> 22:04:24', 'cyta.gr')
>
> How should i write the cursor.execute in order to be parsed properly?
> As i have it now %s does not get substituted.
> i use PyMySQL by the way and i have tried every possible combination
> even with % instead of a comma but still produces errors.

You should include the actual statement that produces the errors, not a
very different statement. And you should include the actual text of the
error messages.

To learn about PyMySQL cursor.execute, put "pymysql cursor execute"
(without the quotes) to a search engine and find something like the
following document (the first hit from google.fi for me).

https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html

That looks helpful to me.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Who are the "spacists"?

2017-03-21 Thread Jussi Piitulainen
Grant Edwards writes:

> Well written code _is_ ASCII-art.

:)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The ternaery operator

2017-03-16 Thread Jussi Piitulainen
William Mayor writes:

>> I think it would be nice to have a way of getting the 'true' value as
>> the return with an optional value if false.  The desire comes about
>> when the thing I'm comparing is an element of a collection:
>> 
>>drugs['choice'] if drugs['choice'] else 'pot'
>> 
>> Then I'm tempted to do:
>> 
>>  chosen = drugs['choice']
>>  chosen if chosen else 'pot'
>> 
>> I sometimes feel like doing:
>> 
>>  drugs['choice'] else 'pot'
>> 
>
> For the case where the element in the collection exists, but might be
> falsey you could do:
>
>   drugs['choice'] or 'pot'

drugs.get('choice') or 'pot'

drugs.get('choice', 'pot')
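
The two are not equivalent. A small sketch of the difference, with
made-up dictionaries that are not from the original post:

```python
missing = {}
empty = {'choice': ''}

# "or" replaces any falsy value, including the None that get returns
# for a missing key.
a = missing.get('choice') or 'pot'
b = empty.get('choice') or 'pot'

# The default argument of get is used only when the key is missing.
c = missing.get('choice', 'pot')
d = empty.get('choice', 'pot')

print(repr(a), repr(b), repr(c), repr(d))
```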

[snip]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Regular expression query

2017-03-12 Thread Jussi Piitulainen
rahulra...@gmail.com writes:

> Hi All,
>
> I have a string which looks like
>
> a,b,c "4873898374", d, ee "3343,23,23,5,,5,45", f 
> "5546,3434,345,34,34,5,34,543,7"
>
> It is a comma-separated string, but some of the fields have a double
> quoted string as part of them (and that double quoted string can have
> commas).  The above string has only 6 fields. First is a, second is
> b and last is f "5546,3434,345,34,34,5,34,543,7".  How can I
> split this string into its fields using a regular expression? Or if
> there is any other way to do this, please speak out.

If you have any control over the source of this data, try to change the
source so that it writes proper CSV. Then you can use the csv module to
parse the data.

As it is, csv.reader failed me. Perhaps someone else knows how it should
be parameterized to deal with this?

len(next(csv.reader(io.StringIO(s)))) == 20
len(next(csv.reader(io.StringIO(s), doublequote = False))) == 20

Here's a regex solution that assumes that there is something in a field
before the doublequoted part, then at most one doublequoted part and
nothing after the doublequoted part.

len(re.findall(r'([^",]+(?:"[^"]*")?)', s)) == 6

re.findall(r'([^",]+(?:"[^"]*")?)', s)
['a',
'b',
'c "4873898374"',
' d',
' ee "3343,23,23,5,,5,45"',
' f "5546,3434,345,34,34,5,34,543,7"']

The outermost parentheses in the pattern make the whole pattern a
capturing group. They are redundant above (with re.findall) but
important in the following alternative (with re.split).

re.split(r'([^",]+(?:"[^"]*")?)', s)
['', 'a',
',', 'b',
',', 'c "4873898374"',
',', ' d',
',', ' ee "3343,23,23,5,,5,45"',
',', ' f "5546,3434,345,34,34,5,34,543,7"',
'']

This splits the string with the pattern that matches the actual data.
With the capturing group it also returns the actual data. One could then
check that the assumptions hold and every other value is just a comma.

I would make that check and throw an exception on failure.
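
A sketch of that check, under the same assumptions about the field
format as above. The names FIELD and split_fields are made up for
this illustration.

```python
import re

# Same pattern as above: some text, then at most one doublequoted part.
FIELD = re.compile(r'([^",]+(?:"[^"]*")?)')

def split_fields(s):
    parts = FIELD.split(s)
    # With the capturing group, even positions hold what was between
    # the fields and odd positions hold the fields themselves.
    separators, fields = parts[0::2], parts[1::2]
    ok = (separators[0] == '' and separators[-1] == ''
          and all(sep == ',' for sep in separators[1:-1]))
    if not ok:
        raise ValueError('unexpected text between fields: %r' % separators)
    return fields

s = ('a,b,c "4873898374", d, ee "3343,23,23,5,,5,45", '
     'f "5546,3434,345,34,34,5,34,543,7"')
fields = split_fields(s)
print(len(fields))
```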
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What's the neatest way of getting dictionary entries in a specified order?

2017-03-08 Thread Jussi Piitulainen
Chris Green writes:

> I have a fairly simple application that populates a GUI window with
> fields from a database table.  The fields are defined/configured by a
> dictionary as follows:-
>
> # 
> # 
> # Address Book field details, dictionary key is the database column 
> # 
> dbcol = {}
> dbcol['firstname'] = col('First Name', True, False)
> dbcol['lastname'] = col('Last Name', True, False)
> dbcol['email'] = col('E-Mail', True, True)
> dbcol['phone'] = col('Phone', True, True)
> dbcol['mobile'] = col('Mobile', True, True)
> dbcol['address'] = col('Address', True, False)
> dbcol['town'] = col('Town/City', True, False)
> dbcol['county'] = col('County/Region', True, False)
> dbcol['postcode'] = col('PostCode', True, False)
> dbcol['country'] = col('Country', True, False)
> dbcol['notes'] = col('Notes', True, False)
> dbcol['www'] = col('Web Page', True, True)
> dbcol['categories'] = col('Categories', True, True)
>
> How can I get the fields in the GUI window in the order I want rather
> than the fairly random order that they appear in at the moment?

Look up OrderedDict in the collections module.
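
A minimal sketch: an OrderedDict iterates in insertion order. Plain
label strings stand in here for the col(...) objects of the question,
whose definition was not shown.

```python
from collections import OrderedDict

# Keys come back in the order in which they were first inserted.
dbcol = OrderedDict()
dbcol['firstname'] = 'First Name'
dbcol['lastname'] = 'Last Name'
dbcol['email'] = 'E-Mail'
dbcol['phone'] = 'Phone'

for key, label in dbcol.items():
    print(key, label)
```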
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Better way to do this dict comprehesion

2017-03-08 Thread Jussi Piitulainen
Sayth Renshaw writes:

>> To find an unpaired number in linear time with minimal space, try
>> stepping through the list and either adding to a set or removing from
>> it. At the end, your set should contain exactly one element. I'll let
>> you write the actual code :)
>> 
>> ChrisA
>
> ChrisA the way it sounds through the list is like filter with map and
> a lambda. http://www.python-course.eu/lambda.php Haven't had a good go
> yet... will see how it goes.

You mean reduce(xor, map(lambda o : {o}, seq)). With suitable imports.

I think Chris is suggesting the same thing but in the piecemeal way of
updating a set at each element. You can test for membership, o in res,
and then res.add(o) or res.remove(o) depending on the test.

And you need to say set() to make an empty set, because {} is dict().
Your input sequence is guaranteed non-empty, but it's simply easier to
start with an empty res.
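
Both ways might be sketched like this, assuming exactly one element
occurs an odd number of times. The function names are made up for the
illustration.

```python
from functools import reduce
from operator import xor

def find_unpaired(seq):
    # Piecemeal: toggle the membership of each element in turn.
    res = set()
    for o in seq:
        if o in res:
            res.remove(o)
        else:
            res.add(o)
    [unpaired] = res      # fails unless exactly one element remains
    return unpaired

def find_unpaired_reduce(seq):
    # xor of sets is symmetric difference, so pairs cancel out.
    [unpaired] = reduce(xor, map(lambda o: {o}, seq))
    return unpaired

print(find_unpaired([1, 2, 1, 3, 2]))
```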
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Better way to do this dict comprehesion

2017-03-08 Thread Jussi Piitulainen
Sayth Renshaw writes:

> Peter I really like this
>
> The complete code: 
>
> >>> from collections import Counter
> >>> def find_it(seq):
> ...     [result] = [k for k, v in Counter(seq).items() if v % 3 == 0]
> ...     return result

You confirmed to Chris that you want the item that occurs an odd number
of times. The test for that is v % 2 == 1.

Or just v % 2, given that 0 and 1 are considered false and true, resp.

But v % 3 == 0 is something else.
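
With the odd-count test in place, Peter's function might read as
follows. This is a sketch that assumes exactly one such item exists.

```python
from collections import Counter

def find_it(seq):
    # v % 2 is 1 (true) for odd counts and 0 (false) for even counts.
    [result] = [k for k, v in Counter(seq).items() if v % 2]
    return result

print(find_it([20, 1, 1, 2, 2, 20, 20]))
```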
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: str.title() fails with words containing apostrophes

2017-03-06 Thread Jussi Piitulainen
Marko Rauhamaa writes:

> Steve D'Aprano wrote:

>> I came across this book title:
>>
>> Täällä Pohjantähden alla (‘Here beneath the North Star’)
>>
>> http://www.booksfromfinland.fi/1980/12/the-strike/
>>
>> which is partly title case, but I'm not sure what rule is being
>> applied there. My guess is that "Täällä Pohjantähden" means "North
>> Star" and it counts as a proper noun, like countries and people's
>> names, and so takes initial caps for each word. Am I close?
>
> Correct.

Not quite.

"Täällä" is "here", "Pohjantähden" is "North Star". So it's just a first
word and a name.

("Pohja" and "tähti" correspond to "North" and "Star".)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: str.title() fails with words containing apostrophes

2017-03-06 Thread Jussi Piitulainen
Steve D'Aprano writes:

> On Tue, 7 Mar 2017 03:28 am, Grant Edwards wrote:
>> 
>> Besides locale-aware, it'll need to be style-guide-aware so that it
>> knows whether you want MLA, Chicago, Strunk & White, NYT, Gregg,
>> Mrs. Johnson from 9th grade English class, or any of a dozen or two
>> others.  And that's just for US English.  [For all I know, most of
>> the ones I listed agree completely on "title case", but I doubt it.]
>
> As far as I am aware, there are only two conventions for title case in
> English:
>
> Initial Capitals For All The Words In A Sentence.
>
> Initial Capitals For All the Significant Words in a Sentence.
>
> For some unstated, subjective rule for "significant" which usually
> means "three or more letters, excluding the definite article ('the')".

That's where the variation is hidden. I browsed three sites to see what
they do. One doesn't title-capitalize anything. One capitalizes
everything.

One was more interesting. I think it has human editors who pay attention
to these matters. They do not capitalize these short words: 'a', 'an',
'at', 'the', 'in', 'of', 'on', 'for', 'to', 'and', 'vs.'; they
capitalize longer prepositions: 'From', 'Into', 'With', 'Through'. Also
auxiliary verbs and copulas even when short.

A 'Nor' was capitalized in the middle of a title, but there was a
sentence boundary just before the 'Nor'. I'd classify 'nor' with 'and'
otherwise, but they might base the non-capitalization on frequency for
all I know.

Some two-letter words: 'Is', 'Am', 'Do', 'So', 'No', 'He', 'We', 'It',
'My', 'Up'; also 'Au Revoir', 'Oi Oi Oi', 'Ay Ay Ay'.

Then there is 'Grown-Ups' and 'Contrary-to-Fact' but 'X-ing'. Sometimes
a hyphen makes a word boundary, sometimes not.

> But of course there are exceptions: words which are necessarily in
> all-caps should stay in all-caps (e.g. NASA) and names.

There may be lots of these if you are handling something like a tech
news site that talks about people and companies and institutions from
all over the world. Names are tricky.
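
A rough sketch of the "significant words" convention, using the list
of short words I observed above. The names SMALL_WORDS and title_case
are made up for this illustration, and the sketch makes no attempt at
hyphens or at recognizing names.

```python
SMALL_WORDS = {'a', 'an', 'at', 'the', 'in', 'of', 'on', 'for', 'to',
               'and', 'vs.'}

def title_case(text):
    result = []
    for index, word in enumerate(text.split()):
        if word.isupper() and len(word) > 1:
            result.append(word)                 # keep NASA as it is
        elif index > 0 and word.lower() in SMALL_WORDS:
            result.append(word.lower())         # short word inside a title
        else:
            # Capitalize only the first character; leave the rest alone.
            result.append(word[:1].upper() + word[1:])
    return ' '.join(result)

print(title_case("i can't be having with this"))
print(title_case("the return of the NASA probe"))
```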
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: str.title() fails with words containing apostrophes

2017-03-06 Thread Jussi Piitulainen
gvm...@gmail.com writes:

> On Monday, March 6, 2017 at 2:37:11 PM UTC+5:30, Jussi Piitulainen wrote:
>> gvm...@gmail.com writes:
>> 
>> > On Sunday, March 5, 2017 at 11:25:04 PM UTC+5:30, Steve D'Aprano wrote:
>> >> I'm trying to convert strings to Title Case, but getting ugly results
>> >> if the words contain an apostrophe:
>> >> 
>> >> 
>> >> py> 'hello world'.title()  # okay
>> >> 'Hello World'
>> >> py> "i can't be having with this".title()  # not okay
>> >> "I Can'T Be Having With This"
>> >> 
>> >> 
>> >> Anyone have any suggestions for working around this?
>> 
>> [snip sig]
>> 
>> > import string
>> >
>> > txt = "i can't be having with this"
>> > string.capwords(txt)
>> >
>> > That gives you "I Can't Be Having With This"
>> >
>> > Hope that helps.
>> 
>> Won't Steve D'aprano And D'arcy Cain Be Happy Now :)
>
>
> I found it at https://docs.python.org/3/library/string.html#string.capwords :)

Sure, it's there, and that's a good point. It still mangles their names.

It also mangles any whitespace in the string. That is probably mostly
harmless.

It also will capitalize all the little words in the string that are
usually not capitalized in titles, even in the usual headlinese English
variants. And all the acronyms and such that are usually written in all
caps, or in even odder patterns.

I guess it's a somewhat practical approximation to an AI-hard problem.
(Mumble mumble str.swapcase, er, never mind me :)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: str.title() fails with words containing apostrophes

2017-03-06 Thread Jussi Piitulainen
gvm...@gmail.com writes:

> On Sunday, March 5, 2017 at 11:25:04 PM UTC+5:30, Steve D'Aprano wrote:
>> I'm trying to convert strings to Title Case, but getting ugly results
>> if the words contain an apostrophe:
>> 
>> 
>> py> 'hello world'.title()  # okay
>> 'Hello World'
>> py> "i can't be having with this".title()  # not okay
>> "I Can'T Be Having With This"
>> 
>> 
>> Anyone have any suggestions for working around this?

[snip sig]

> import string
>
> txt = "i can't be having with this"
> string.capwords(txt)
>
> That gives you "I Can't Be Having With This"
>
> Hope that helps.

Won't Steve D'aprano And D'arcy Cain Be Happy Now :)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: list of the lists - append after search

2017-03-02 Thread Jussi Piitulainen
Andrew Zyman writes:

> On Thursday, March 2, 2017 at 2:57:02 PM UTC-5, Jussi Piitulainen wrote:
>> Peter Otten <__pete...@web.de> writes:
>> 
>> > Andrew Zyman wrote:
>> >
>> >> On Thursday, March 2, 2017 at 11:27:34 AM UTC-5, Peter Otten wrote:
>> >>> Andrew Zyman wrote:
>> >>> .
>> >>> .
>> >>> > End result:
>> >>> >  ll =[ [a,1], [b,2], [c,3], [blah, 1000, 'new value'] ]
>> >>> 
>> >>> >>> outer = [["a", 1], ["b", 2], ["c", 3], ["blah", 1000]]
>> >>> >>> for inner in outer:
>> >>> ...     if inner[0] == "blah":
>> >>> ...         inner.append("new value")
>> >> 
>> >> thank you. this will do.
>> >> Just curious, can the above loop be done in a one-liner?
>> >
>> > Ah, that newbie obsession ;)
>> >
>> > >>> outer = [["a", 1], ["b", 2], ["c", 3], ["blah", 1000]]
>> > >>> [inner + ["new value"] if inner[0] == "blah" else inner for inner in outer]
>> > [['a', 1], ['b', 2], ['c', 3], ['blah', 1000, 'new value']]
>> >
>> > Note that there is a technical difference to be aware of -- matching
>> > lists are replaced rather than modified.
>> 
>> I take it you are too sane, or too kind, to suggest the obvious
>> solution:
>> 
>> >>> outer = [["a", 1], ["b", 2], ["c", 3], ["blah", 1000]]
>> >>> [inner.append("new value") for inner in outer if inner[0] == "blah"]
>> [None]
>> >>> outer
>> [['a', 1], ['b', 2], ['c', 3], ['blah', 1000, 'new value']]
>> 
>> [snip]
>
> Arh!!! this is it :)
>
> I'm sure i'll regret this line of code in 2 weeks - after i
> successfully forget what i wanted to achieve :)

Jokes aside, you should strive to express your intention in your code.
Be kind to your future self. Write the three-line loop if you want
in-place modification.

I use comprehensions a lot, but not to save lines. I might make Peter's
expression above a four-liner to make its structure more visible:

  res = [ ( inner + ["new value"]
            if inner[0] == "blah"
            else inner )
          for inner in outer ]

Maybe.
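
The technical difference that Peter mentioned can be seen in a small
sketch. The name alias is made up here to hold a second reference to
the matching inner list.

```python
outer = [["a", 1], ["b", 2], ["blah", 1000]]
alias = outer[2]          # another reference to the matching inner list

# The comprehension builds a new list for the match; alias is unaffected.
res = [ ( inner + ["new value"]
          if inner[0] == "blah"
          else inner )
        for inner in outer ]
print(res[2], alias)

# The loop modifies the matching list in place; alias sees the change.
for inner in outer:
    if inner[0] == "blah":
        inner.append("new value")
print(outer[2], alias)
```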
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: list of the lists - append after search

2017-03-02 Thread Jussi Piitulainen
Peter Otten <__pete...@web.de> writes:

> Andrew Zyman wrote:
>
>> On Thursday, March 2, 2017 at 11:27:34 AM UTC-5, Peter Otten wrote:
>>> Andrew Zyman wrote:
>>> .
>>> .
>>> > End result:
>>> >  ll =[ [a,1], [b,2], [c,3], [blah, 1000, 'new value'] ]
>>> 
>>> >>> outer = [["a", 1], ["b", 2], ["c", 3], ["blah", 1000]]
>>> >>> for inner in outer:
>>> ...     if inner[0] == "blah":
>>> ...         inner.append("new value")
>> 
>> thank you. this will do.
>> Just curious, can the above loop be done in a one-liner?
>
> Ah, that newbie obsession ;)
>
> >>> outer = [["a", 1], ["b", 2], ["c", 3], ["blah", 1000]]
> >>> [inner + ["new value"] if inner[0] == "blah" else inner for inner in outer]
> [['a', 1], ['b', 2], ['c', 3], ['blah', 1000, 'new value']]
>
> Note that there is a technical difference to be aware of -- matching
> lists are replaced rather than modified.

I take it you are too sane, or too kind, to suggest the obvious
solution:

>>> outer = [["a", 1], ["b", 2], ["c", 3], ["blah", 1000]]
>>> [inner.append("new value") for inner in outer if inner[0] == "blah"]
[None]
>>> outer
[['a', 1], ['b', 2], ['c', 3], ['blah', 1000, 'new value']]

[snip]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Looking for documentation on how Python assigns to function parameters

2017-03-01 Thread Jussi Piitulainen
Steve D'Aprano writes:

> Given a function like this:
>
>
> def func(alpha, beta, gamma, delta=4, *args, **kw):
>     ...
>
>
> which is called in some fashion:
>
> # say
> func(1, 2, gamma=3, epsilon=5)
>
> which may or may not be valid:
>
> func(1, 2, alpha=0)
>
> how does Python match up the formal parameters in the `def` statement with
> the arguments given in the call to `func`?
>
> I'm looking for official docs, if possible. So far I've had no luck finding
> anything.

Possibly https://docs.python.org/3/reference/expressions.html#calls

Python 3.6.0 documentation, Language Reference
6. Expressions
6.3 Primaries
6.3.4 Calls
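
To see what that section means for the two calls in the question, one
can make the function return its bindings. The body below is only for
illustration; the original function body was elided.

```python
def func(alpha, beta, gamma, delta=4, *args, **kw):
    return alpha, beta, gamma, delta, args, kw

# Positional arguments fill alpha and beta, gamma is bound by keyword,
# delta takes its default, and the unknown keyword ends up in kw.
bound = func(1, 2, gamma=3, epsilon=5)
print(bound)

# Here alpha would be bound twice, so the call is not valid.
try:
    func(1, 2, alpha=0)
except TypeError as exc:
    error = exc
    print('TypeError:', exc)
```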
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How to flatten only one sub list of list of lists

2017-02-28 Thread Jussi Piitulainen
Sayth Renshaw writes:

> How can I flatten just a specific sublist of each list in a list of lists?
>
> So if I had this data
>
>
> [   ['46295', 'Montauk', '3', '60', '85', ['19', '5', '1', '0 $277790.00']],
> ['46295', 'Dark Eyes', '5', '59', '83', ['6', '4', '1', '0 $105625.00']],
> ['46295', 'Machinegun Jubs', '6', '53', '77', ['6', '2', '1', '1 $71685.00']],
> ['46295', 'Zara Bay', '1', '53', '77', ['12', '2', '3', '3 $112645.00']]]
>
>
> How can I make it be
>
>
> [   ['46295', 'Montauk', '3', '60', '85', '19', '5', '1', '0 $277790.00'],
> ['46295', 'Dark Eyes', '5', '59', '83', '6', '4', '1', '0 $105625.00'],
> ['46295', 'Machinegun Jubs', '6', '53', '77', '6', '2', '1', '1 $71685.00'],
> ['46295', 'Zara Bay', '1', '53', '77', '12', '2', '3', '3 $112645.00']]

Here's two ways. First makes a copy, second flattens each list in place,
both assume it's the last member (at index -1) of the list that needs
flattening.

datami = [   ['46295', 'Montauk', '3', '60', '85', ['19', '5', '1', '0 $277790.00']],
             ['46295', 'Dark Eyes', '5', '59', '83', ['6', '4', '1', '0 $105625.00']],
             ['46295', 'Machinegun Jubs', '6', '53', '77', ['6', '2', '1', '1 $71685.00']],
             ['46295', 'Zara Bay', '1', '53', '77', ['12', '2', '3', '3 $112645.00']]]

flat = [ data[:-1] + data[-1] for data in datami ]

for data in datami: data.extend(data.pop())

print('flat copy of datami:', *flat, sep = '\n')
print('flattened datami:', *datami, sep = '\n')
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Usage of ast.

2017-02-27 Thread Jussi Piitulainen
Vincent Vande Vyvre writes:

> I've this strange error:
>
> Python 3.4.3 (default, Nov 17 2016, 01:08:31)
> [GCC 4.8.4] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import ast
> >>> l = "print('hello world')"
> >>> ast.literal_eval(l)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python3.4/ast.py", line 84, in literal_eval
>     return _convert(node_or_string)
>   File "/usr/lib/python3.4/ast.py", line 83, in _convert
>     raise ValueError('malformed node or string: ' + repr(node))
> ValueError: malformed node or string: <_ast.Call object at 0x7fcf955871d0>
>
>
> Is it an import question ?

print('hello world') is not a literal.

Literals are expressions that somehow stand for themselves. Try
help(ast.literal_eval) for a more detailed definition.

literal_eval('x') # not ok: x is not a literal
literal_eval('"x"') # ok: "x" is a literal
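
A few more cases as a runnable sketch. The helper is_literal is made
up for this illustration.

```python
from ast import literal_eval

print(literal_eval('"x"'))
print(literal_eval('[1, 2.5, {"k": None}]'))

def is_literal(text):
    # Names and calls are rejected with ValueError; text that does not
    # even parse as an expression is rejected with SyntaxError.
    try:
        literal_eval(text)
    except (ValueError, SyntaxError):
        return False
    return True

print(is_literal("print('hello world')"))
```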
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: CSV

2017-02-22 Thread Jussi Piitulainen
Braxton Alfred writes:

> Why does this not run?  It is right out of the CSV file in the Standard Lib.
>
>
>  
>
> Python ver 3.4.4, 64 bit.
>
>  
>
>  
>
>  
>
> import csv
> """ READ EXCEL FILE """
> filename = 'c:\users\user\my documents\Braxton\Excel\personal\bp.csv'

'\b' is backspace. A couple of months ago I actually met a filename with
a backspace in it. I renamed the file. Or maybe I removed it, I forget.

But don't you get a SyntaxError from '\user'?
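
A sketch of the usual remedies, a raw string or doubled backslashes,
using the path from the question:

```python
# '\b' in an ordinary string literal is one character, backspace.
print(len('\b'), len(r'\b'))

# A raw string keeps every backslash as it is written.
filename = r'c:\users\user\my documents\Braxton\Excel\personal\bp.csv'
print(filename)

# Doubling the backslashes in an ordinary literal gives the same string.
same = 'c:\\users\\user\\my documents\\Braxton\\Excel\\personal\\bp.csv'
print(filename == same)
```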

[snip]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: str.format fails with JSON?

2017-02-21 Thread Jussi Piitulainen
carlopi...@gmail.com writes:

> Hi,
>
> When I run this piece of code:
>
> 'From {"value": 1}, value={value}'.format(value=1)
>
> Python complains about the missing "value" parameter (2.7.12 and
> 3.6.x):

Perhaps you know this, but just to be sure, and for the benefit of any
reader who doesn't: double the braces in the format string when you
don't mean them to be interpreted as a parameter.

> But according to the format string syntax
> (https://docs.python.org/2/library/string.html):

[- -]

> So according to the specification, {value} should be recognized as a
> valid format string identifier and {"value"} should be ignored.
>
> Python seems to not follow the specification in the documentation.
> Anything inside the keys is accepted as identifier.

I think raising an error is more helpful than ignoring it. I think.
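
For the record, a sketch of the format string from the question with
the literal braces doubled, next to the original that raises:

```python
# Doubled braces come out as single literal braces.
text = 'From {{"value": 1}}, value={value}'.format(value=1)
print(text)

# Undoubled, the first pair of braces is taken as a replacement field.
try:
    'From {"value": 1}, value={value}'.format(value=1)
except KeyError as exc:
    caught = exc
    print('KeyError:', exc)
```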
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: WANT: bad code in python (for refactoring example)

2017-02-17 Thread Jussi Piitulainen
Makoto Kuwata  writes:

> On Thu, Feb 16, 2017 at 6:53 AM, Erik  wrote:
>>
>> (Python code examples of what you think is "bad" vs "good" would be
>> useful).
>
> You are right.
>
> Bad code Example:
>
> # https://codewords.recurse.com/issues/one/an-introduction-to-functional-programming
>
> from random import random
>
> def move_cars(car_positions):
>     return map(lambda x: x + 1 if random() > 0.3 else x,
>                car_positions)
>
> def output_car(car_position):
>     return '-' * car_position
>
> def run_step_of_race(state):
>     return {'time': state['time'] - 1,
>             'car_positions': move_cars(state['car_positions'])}
>
> def draw(state):
>     print ''
>     print '\n'.join(map(output_car, state['car_positions']))
>
> def race(state):
>     draw(state)
>     if state['time']:
>         race(run_step_of_race(state))
>
> race({'time': 5,
>       'car_positions': [1, 1, 1]})

Here's a rewrite in functional style, which I consider somewhat better
than the original:

from random import random

def move(positions):
    return [ pos + bool(random() > 0.3) for pos in positions ]

def race(positions, steps):
    for step in range(steps):
        positions = move(positions)
        yield positions

def draw(positions):
    print(*('-' * pos for pos in positions), sep = '\n')

if __name__ == '__main__':
    for step, positions in enumerate(race([1,1,1], 5)):
        step and print()
        draw(positions)

I did a number of things, but mainly I reconceptualized the race itself
explicitly as a sequence of positions. While a Python generator is an
extremely stateful object, with some care it works well in functional
style.

> Refactoring example:
>
> from random import random
>
> class Car(object):
>
>     def __init__(self):
>         self.position = 1
>
>     def move(self):
>         if random() > 0.3:
>             self.position += 1
>         return self.position
>
> class Race(object):
>
>     def __init__(self, n_cars=3):
>         self._cars = [ Car() for _ in range(n_cars) ]
>
>     def round(self):
>         for car in self._cars:
>             car.move()
>
>     def report(self):
>         print("")
>         for car in self._cars:
>             print('-' * car.position)
>
>     def run(self, n_rounds=5):
>         self.report()
>         for _ in range(n_rounds):
>             self.round()
>             self.report()
>
> if __name__ == '__main__':
>     Race(3).run(5)

If you want to refactor bad code into better code, it would be more
honest to start with code that is already in your preferred style. That
would be a good lesson.

Now you've taken questionable code in a somewhat functional style and
refactored it into object-oriented style. But you don't call them that.
You call them bad code and good code. That's a bad lesson. It conflates
issues.

Take good functional code, refactor it into your preferred style. Also
do the reverse. That would be a good lesson, assuming your students are
ready for such discussion.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Extract sigle file from zip file based on file extension

2017-02-10 Thread Jussi Piitulainen
loial writes:

> I need to be able to extract a single file from a .zip file in python.
>
> The zip file will contain many files. The file to extract will be the
> only .csv file in the zip, but the full name of the csv file will not
> be known.
>
> Can this be done in python?

Find the one member name that ends with ".csv". If the following
assignment crashes, it wasn't true that there is exactly one such.

with zipfile.ZipFile(path, "r") as f:
    [member] = [name for name in f.namelist() if name.endswith(".csv")]
    # extract member here now that you know its full name
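
Filled in, that might look like the following sketch. The function
name extract_only_csv and its target_dir parameter are made up for
the illustration.

```python
import zipfile

def extract_only_csv(path, target_dir):
    with zipfile.ZipFile(path, "r") as f:
        # Fails with ValueError unless exactly one name ends with ".csv".
        [member] = [name for name in f.namelist() if name.endswith(".csv")]
        # extract returns the path of the file that it created.
        return f.extract(member, target_dir)
```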
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Rename file without overwriting existing files

2017-02-09 Thread Jussi Piitulainen
Steve D'Aprano writes:

> On Mon, 30 Jan 2017 09:39 pm, Peter Otten wrote:
>
>> >>> def rename(source, dest):
>> ...     os.link(source, dest)
>> ...     os.unlink(source)
>> ...
>> >>> rename("foo", "baz")
>> >>> os.listdir()
>> ['bar', 'baz']
>> >>> rename("bar", "baz")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "<stdin>", line 2, in rename
>> FileExistsError: [Errno 17] File exists: 'bar' -> 'baz'
>
>
> Thanks Peter!
>
> That's not quite ideal, as it isn't atomic: it is possible that the link
> will succeed, but the unlink won't. But I prefer that over the alternative,
> which is over-writing a file and causing data loss.
>
> So to summarise, os.rename(source, destination):
>
> - is atomic on POSIX systems, if source and destination are both on the 
>   same file system;
>
> - may not be atomic on Windows?
>
> - may over-write an existing destination on POSIX systems, but not on
>   Windows;
>
> - and it doesn't work across file systems.
>
> os.replace(source, destination) is similar, except that it may over-write an
> existing destination on Windows as well as on POSIX systems.
>
>
> The link/unlink trick:
>
> - avoids over-writing existing files on POSIX systems at least;
>
> - but maybe not Windows?
>
> - isn't atomic, so in the worst case you end up with two links to
>   the one file;
>
> - but os.link may not be available on all platforms;
>
> - and it won't work across file systems.
>
>
> Putting that all together, here's my attempt at a version of file rename
> which doesn't over-write existing files:
>
>
> import os
> import shutil
>
> def rename(src, dest):
>     """Rename src to dest only if dest doesn't already exist (almost)."""
>     if hasattr(os, 'link'):
>         try:
>             os.link(src, dest)
>         except OSError:
>             pass
>         else:
>             os.unlink(src)
>             return
>     # Fallback to an implementation which is vulnerable to a
>     # Time Of Check to Time Of Use bug.
>     # Try to reduce the window for this race condition by minimizing
>     # the number of lookups needed between one call and the next.
>     move = shutil.move
>     if not os.path.exists(dest):
>         move(src, dest)
>     else:
>         raise shutil.Error("Destination path '%s' already exists" % dest)
>
>
>
> Any comments? Any bugs? Any cross-platform way to slay this TOCTOU bug once
> and for all?

To claim the filename before crossing a filesystem boundary, how about:

1) create a temporary file in the target directory (tempfile.mkstemp)

2) link the temporary file to the target name (in the same directory)

3) unlink the temporary name

4) now it should be safe to move the source file to the target name

5) set permissions and whatever other attributes there are?

Or maybe copy the source file to the temporary name, link the copy to
the target name, unlink the temporary name, unlink the source file;
failing the link step: unlink the temporary name but do not unlink the
source file.
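
The second variant might be sketched as follows. This is untested
across platforms and assumes that os.link is available and works in
the target directory. The name rename_no_clobber is made up.

```python
import os
import shutil
import tempfile

def rename_no_clobber(src, dest):
    # Create the temporary name in the same directory as dest.
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(dir=dest_dir)
    os.close(fd)
    try:
        shutil.copy2(src, tmp)   # the copy may cross a filesystem boundary
        os.link(tmp, dest)       # raises FileExistsError if dest exists
    finally:
        os.unlink(tmp)           # the temporary name goes away either way
    os.unlink(src)               # reached only if the link step succeeded
```

If the link step fails, the exception propagates and the source file
is left where it was.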
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: best way to ensure './' is at beginning of sys.path?

2017-02-04 Thread Jussi Piitulainen
Wildman writes:

> On Sat, 04 Feb 2017 11:27:01 +0200, Jussi Piitulainen wrote:
>
>> Wildman writes:
>> 
>> [snip]
>> 
>>> If anyone is interested the correct way is to add this to
>>> /etc/profile (at the bottom):
>>>
>>> PATH=$PATH:./
>>> export PATH
>> 
>> Out of interest, can you think of a corresponding way that a mere
>> user can remove the dot from their $PATH after some presumably
>> well-meaning system administrator has put it there?
>> 
>> Is there any simple shell command for it? One that works whether the
>> dot is at the start, in the middle, or at the end, and with or
>> without the slash, and whether it's there more than once or not at
>> all.
>> 
>> And I'd like it to be as short and simple as PATH="$PATH:.", please.
>
> No, I do not know.  You might try your question in
> a linux specific group.  Personally I don't understand
> the danger in having the dot in the path.  The './'
> only means the current directory.  DOS and Windows
> has searched the current directory since their
> beginning.  Is that also dangerous?

I'd just like to be able to decide for myself.

(Which I am, of course. In shell it's just more annoying to remove than
it is to add, as far as I know.)
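
This being a Python list, here is a sketch of the removal in Python
rather than in shell. The function name without_dot is made up, and
the result would still have to be assigned back to PATH somehow.

```python
def without_dot(path, sep=':'):
    # Drop every entry that is exactly "." or "./", wherever it occurs.
    # Empty entries, which also mean the current directory, are left alone.
    return sep.join(p for p in path.split(sep) if p not in ('.', './'))

print(without_dot('.:/usr/bin:./:/bin:.'))
```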
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: best way to ensure './' is at beginning of sys.path?

2017-02-04 Thread Jussi Piitulainen
Wildman writes:

[snip]

> If anyone is interested the correct way is to add this to
> /etc/profile (at the bottom):
>
> PATH=$PATH:./
> export PATH

Out of interest, can you think of a corresponding way that a mere user
can remove the dot from their $PATH after some presumably well-meaning
system administrator has put it there?

Is there any simple shell command for it? One that works whether the dot
is at the start, in the middle, or at the end, and with or without the
slash, and whether it's there more than once or not at all.

And I'd like it to be as short and simple as PATH="$PATH:.", please.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: update a list element using an element in another list

2017-01-31 Thread Jussi Piitulainen
Daiyue Weng writes:

> Hi, I am trying to update a list of dictionaries using another list of
> dictionaries correspondingly. In that, for example,
>
> #  the list of dicts that need to be updated
> dicts_1 = [{'dict_1': '1'}, {'dict_2': '2'}, {'dict_3': '3'}]
>
> # dict used to update dicts_1
> update_dicts = [{'dict_1': '1_1'}, {'dict_2': '1_2'}, {'dict_3': '1_3'}]
>
> so that after updating,
>
> dicts_1 = [{'dict_1': '1_1'}, {'dict_2': '1_2'}, {'dict_3': '1_3'}]
>
> what's the best way to do the updates?
>
> This is actually coming from when I tried to create a list of entities
> (dictionaries), then updating the entities using another list of
> dictionaries using google.cloud.datastore.
>
> entities = [Entity(self.client.key(kind, entity_id)) for entity_id in
> entity_ids]
>
> # update entities using update_dicts
> for j in range(len(entities)):
>     for i in range(len(update_dicts)):
>         if j == i:
>             entities[j].update(update_dicts[i])

[I restored the indentation.]

> I am wondering is there a brief way to do this.

A straightforward algorithmic improvement:

for j in range(len(entities)):
    entities[j].update(update_dicts[j])

The real thing:

for e, u in zip(entities, update_dicts):
    e.update(u)

A thing between those:

for j, e in enumerate(entities):
    e.update(update_dicts[j])

(By symmetry, you could enumerate update_dicts instead.)

It pays to learn zip and enumerate.
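
A runnable sketch of the zip version, with the plain dictionaries from
the question standing in for the datastore entities:

```python
dicts_1 = [{'dict_1': '1'}, {'dict_2': '2'}, {'dict_3': '3'}]
update_dicts = [{'dict_1': '1_1'}, {'dict_2': '1_2'}, {'dict_3': '1_3'}]

# zip pairs the elements up by position and stops at the shorter list.
for d, u in zip(dicts_1, update_dicts):
    d.update(u)

print(dicts_1)
```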
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Rename file without overwriting existing files

2017-01-30 Thread Jussi Piitulainen
Peter Otten writes:

> Jussi Piitulainen wrote:
>
>> Peter Otten writes:
>> 
>>> Steve D'Aprano wrote:
>
>>>> The wider context is that I'm taking from 1 to 
>>>> path names to existing files as arguments, and for each path name I
>>>> transfer the file name part (but not the directory part) and then rename
>>>> the file. For example:
>>>> 
>>>> foo/bar/baz/spam.txt
>>>>
>>>> may be renamed to:
>>>>
>>>> foo/bar/baz/ham.txt
>>>> 
>>>> but only provided ham.txt doesn't already exist.
>>>
>>> Google finds
>>>
>>> http://stackoverflow.com/questions/3222341/how-to-rename-without-race-conditions
>>>
>>> and from a quick test it appears to work on Linux:
>> 
>> It doesn't seem to be documented. 
>
> For functions with a C equivalent a look into the man page is usually
> helpful.

Followed by a few test cases to see what Python actually does, at least
in those particular test cases, I suppose. Yes.

But is it a bug in Python if a Python function *doesn't* do what the
relevant man page in the user's operating system says? Or whatever the
user's documentation entry is called. For me, yes, it's a man page.

>> I looked at help(os.link) on Python
>> 3.4 and the corresponding current library documentation on the web. I
>> saw no mention of what happens when dst exists already.
>> 
>> Also, creating a hard link doesn't seem to work between different file
>> systems, which may well be relevant to Steve's case.
>
> In his example above he operates inside a single directory. Can one
> directory spread across multiple file systems?

Hm, you are right, he does say he's working in a single directory.

But *I'm* currently working on processes where results from a batch
system are eventually moved to another directory, and I have no control
over the file systems. So while it was interesting to learn about
os.link, I cannot use os.link here; on the other hand, I can use
shutil.move, and in my present case it will only accidentally overwrite
a file if I've made a programming mistake myself, or if the underlying
platform is not working as advertised, so I'm in a different situation.

[- -]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Rename file without overwriting existing files

2017-01-30 Thread Jussi Piitulainen
Peter Otten writes:

> Steve D'Aprano wrote:
>
>> On Mon, 30 Jan 2017 03:33 pm, Cameron Simpson wrote:
>> 
>>> On 30Jan2017 13:49, Steve D'Aprano  wrote:
>>>> This code contains a Time Of Check to Time Of Use bug:
>>>>
>>>> if os.path.exists(destination):
>>>>     raise ValueError('destination already exists')
>>>> os.rename(oldname, destination)
>>>>
>>>>
>>>> In the microsecond between checking for the existence of the destination
>>>> and actually doing the rename, it is possible that another process may
>>>> create the destination, resulting in data loss.
>>>>
>>>> Apart from keeping my fingers crossed, how should I fix this TOCTOU bug?
>>> 
>>> For files this is a problem at the Python level. At the UNIX level you
>>> can play neat games with open(2) and the various O_* modes.
>>> 
>>> however, with directories things are more cut and dry. Do you have much
>>> freedom here? What's the wider context of the question?
>> 
>> The wider context is that I'm taking from 1 to 
>> path names to existing files as arguments, and for each path name I
>> transfer the file name part (but not the directory part) and then rename
>> the file. For example:
>> 
>> foo/bar/baz/spam.txt
>> 
>> may be renamed to:
>> 
>> foo/bar/baz/ham.txt
>> 
>> but only provided ham.txt doesn't already exist.
>
> Google finds
>
> http://stackoverflow.com/questions/3222341/how-to-rename-without-race-conditions
>
> and from a quick test it appears to work on Linux:

It doesn't seem to be documented. I looked at help(os.link) on Python
3.4 and the corresponding current library documentation on the web. I
saw no mention of what happens when dst exists already.

Also, creating a hard link doesn't seem to work between different file
systems, which may well be relevant to Steve's case. I get:

OSError: [Errno 18] Invalid cross-device link: [snip]

And that also is not mentioned in the docs.
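
For reference, a minimal sketch of the link-then-unlink idea under
discussion. The helper name rename_no_overwrite is made up for this
example, and the behaviour shown is what I would expect on Linux with
source and destination on the same file system; it is not something the
os.link documentation promises, which is the point above.

```python
import os
import tempfile

def rename_no_overwrite(src, dst):
    """Hypothetical helper: give src the name dst unless dst exists."""
    os.link(src, dst)    # raises FileExistsError if dst is already taken
    os.unlink(src)       # only reached when the new link was created

with tempfile.TemporaryDirectory() as d:
    spam = os.path.join(d, 'spam.txt')
    ham = os.path.join(d, 'ham.txt')
    eggs = os.path.join(d, 'eggs.txt')
    for name in (spam, ham):
        with open(name, 'w') as f:
            f.write(os.path.basename(name))

    rename_no_overwrite(spam, eggs)          # eggs.txt did not exist
    renamed = os.path.exists(eggs) and not os.path.exists(spam)

    try:
        rename_no_overwrite(eggs, ham)       # ham.txt exists already
        refused = False
    except FileExistsError:
        refused = True

    with open(ham) as f:
        ham_content = f.read()               # the old ham.txt is intact
```

Across file systems the os.link call fails with the cross-device error
quoted above, so this cannot replace shutil.move in general.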
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Jussi Piitulainen
Chris Angelico writes:

> On Sun, Jan 22, 2017 at 2:56 AM, Jussi Piitulainen wrote:
>> Steve D'Aprano writes:
>>
>> [snip]
>>
>>> You could avoid that error by increasing the offset by the right
>>> amount:
>>>
>>> stuff = text[offset + len("ф".encode('utf-8')):]
>>>
>>> which is awful. I believe that's what Go and Julia expect you to do.
>>
>> Julia provides a method to get the next index.
>>
>> let text = "ἐπὶ οἴνοπα πόντον", offset = 1
>>     while offset <= endof(text)
>>         print(text[offset], ".")
>>         offset = nextind(text, offset)
>>     end
>>     println()
>> end # prints: ἐ.π.ὶ. .ο.ἴ.ν.ο.π.α. .π.ό.ν.τ.ο.ν.
>
> This implies that regular iteration isn't good enough, though.

It doesn't. Here's the straightforward iteration over the whole string:

let text = "ἐπὶ οἴνοπα πόντον"
    for c in text
        print(c, ".")
    end
    println()
end # prints: ἐ.π.ὶ. .ο.ἴ.ν.ο.π.α. .π.ό.ν.τ.ο.ν.

One can also join any iterable whose elements can be converted to
strings, and characters can:

let text = "ἐπὶ οἴνοπα πόντον"
    println(join(text, "."), ".")
end # prints: ἐ.π.ὶ. .ο.ἴ.ν.ο.π.α. .π.ό.ν.τ.ο.ν.

And strings, trivially, can:

let text = "ἐπὶ οἴνοπα πόντον"
    println(join(split(text), "."), ".")
end # prints: ἐπὶ.οἴνοπα.πόντον.

> Here's a function that creates a numbered list:
>
> def print_list(items):
>     width = len(str(len(items)))
>     for idx, item in enumerate(items, 1):
>         print("%*d: %s" % (width, idx, item))
>
> In Python, this will happily accept anything that is iterable and has
> a known length. Could be a list or tuple, obviously, but can also just
> as easily be a dict view (keys or items), a range object, or a
> string. It's perfectly acceptable to enumerate the characters of a
> string. And enumerate() itself is implemented entirely generically.

I'll skip the formatting - I don't know off-hand how to do it - but keep
the width calculation, and I cut the character iterator short at 10
items to save some space. There, it's much the same in Julia:

let text = "ἐπὶ οἴνοπα πόντον"

    function print_list(items)
        width = endof(string(length(items)))
        println("width = ", width)
        for (idx, item) in enumerate(items)
            println(idx, '\t', item)
        end
    end

    print_list(take(text, 10))
    print_list([text, text, text])
    print_list(split(text))
end

That prints this:

width = 2
1   ἐ
2   π
3   ὶ
4
5   ο
6   ἴ
7   ν
8   ο
9   π
10  α
width = 1
1   ἐπὶ οἴνοπα πόντον
2   ἐπὶ οἴνοπα πόντον
3   ἐπὶ οἴνοπα πόντον
width = 1
1   ἐπὶ
2   οἴνοπα
3   πόντον

> If you have to call nextind() to get the next character, you've made
> it impossible to do any kind of generic operation on the text. You
> can't do a windowed view by slicing while iterating, you can't have a
> "lag" or "lead" value, you can't do any of those kinds of simple and
> obvious index-based operations.

Yet Julia does with ease many things that you seem to think it cannot
possibly do at all. The iteration system works on types that have
methods for certain generic functions. For strings, the default is to
iterate over something like its characters; I think another iterator
over valid indexes is available, or wouldn't be hard to write; it could
be forward or backward, and in Julia many of these things are often
peekable by default (because the iteration protocol itself does not have
state - see below at "more magic").

The usual things work fine:

let text = "ἐπὶ οἴνοπα πόντον"
    foreach(print, enumerate(zip(text, split(text))))
end # prints: (1,('ἐ',"ἐπὶ"))(2,('π',"οἴνοπα"))(3,('ὶ',"πόντον"))

How is that bad?

More magic:

let text = "ἐπὶ οἴνοπα πόντον"
    let ever = cycle(split(text))
        println(first(ever))
        println(first(ever))
        for n in 2:6
            println(join(take(ever, n), " "))
        end end end

This prints the following. The cycle iterator, ever, produces an
endless repetition of the three words, but it doesn't have state like
Python iterators do, so it's possible to look at the first word twice
(and then five more times).

ἐπὶ
ἐπὶ
ἐπὶ οἴνοπα
ἐπὶ οἴνοπα πόντον
ἐπὶ οἴνοπα πόντον ἐπὶ
ἐπὶ οἴνοπα πόντον ἐπὶ οἴνοπα
ἐπὶ οἴνοπα πόντον ἐπὶ οἴνοπα πόντον
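
For comparison, a rough Python counterpart of the same experiment, to
show what "state" means here. It is only a sketch: itertools.cycle
returns an iterator that remembers its position, so every look at the
next word also advances it.

```python
from itertools import cycle, islice

text = "ἐπὶ οἴνοπα πόντον"
ever = cycle(text.split())

first_look = next(ever)          # consumes the first word
second_look = next(ever)         # so this is already the second word
then = list(islice(ever, 2))     # and taking two more continues from there
```

To look at the first word twice in Python one has to start over with a
fresh cycle(text.split()), or keep the word in a variable.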

> Oh, and Python 3.3 wasn't the first programming language to use this
> flexible string representation. Pike introduced an extremely similar
> string representation back in 1998:
>
> https://github.com/pikelang/Pike/commit/db4a4

Ok. Is GitHub that old?

> So yes, UTF-8 has its advantages. But it also has its costs, and for a
> text processing language like Pike or Python, they significantly
> outweigh the benefits.

I process text in my work but I really don't use character indexes much
at all. Rather split, join, startswith, endswith, that kind of thing,
and whether a string contains some character or substring anywhere.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Jussi Piitulainen
Steve D'Aprano writes:

[snip]

> You could avoid that error by increasing the offset by the right
> amount:
>
> stuff = text[offset + len("ф".encode('utf-8')):]
>
> which is awful. I believe that's what Go and Julia expect you to do.

Julia provides a method to get the next index.

let text = "ἐπὶ οἴνοπα πόντον", offset = 1
    while offset <= endof(text)
        print(text[offset], ".")
        offset = nextind(text, offset)
    end
    println()
end # prints: ἐ.π.ὶ. .ο.ἴ.ν.ο.π.α. .π.ό.ν.τ.ο.ν.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PEP 393 vs UTF-8 Everywhere

2017-01-20 Thread Jussi Piitulainen
Chris Angelico writes:

> On Sat, Jan 21, 2017 at 11:30 AM, Pete Forman wrote:

>> I was asserting that most useful operations on strings start from
>> index 0. The r* operations would not be slowed down that much as
>> UTF-8 has the useful property that attempting to interpret from a
>> byte that is not at the start of a sequence (in the sense of a code
>> point rather than Python) is invalid and so quick to move over while
>> working backwards from the end.
>
> Let's take one very common example: decoding JSON. A ton of web
> servers out there will call json.loads() on user-supplied data. The
> bulk of the work is in the scanner, which steps through the string and
> does the actual parsing. That function is implemented in Python, so
> it's a good example. (There is a C accelerator, but we can ignore that
> and look at the pure Python one.)
>
> So, how could you implement this function? The current implementation
> maintains an index - an integer position through the string. It
> repeatedly requests the next character as string[idx], and can also
> slice the string (to check for keywords like "true") or use a regex
> (to check for numbers). Everything's clean, but it's lots of indexing.
> Alternatively, it could remove and discard characters as they're
> consumed. It would maintain a string that consists of all the unparsed
> characters. All indexing would be at or near zero, but after every
> tiny piece of parsing, the string would get sliced.
>
> With immutable UTF-8 strings, both of these would be O(n^2). Either
> indexing is linear, so parsing the tail of the string means scanning
> repeatedly; or slicing is linear, so parsing the head of the string
> means slicing all the rest away.
>
> The only way for it to be fast enough would be to have some sort of
> retainable string iterator, which means exposing an opaque "position
> marker" that serves no purpose other than parsing. Every string parse
> operation would have to be reimplemented this way, lest it perform
> abysmally on large strings. It'd mean some sort of magic "thing" that
> probably has a reference to the original string, so you don't get the
> progressive RAM refunds that slicing gives, and you'd still have to
> deal with lots of the other consequences. It's probably doable, but it
> would be a lot of pain.

Julia does this. It has immutable UTF-8 strings, and there is a JSON
parser. The "opaque position marker" is just the byte index. An attempt
to use an invalid index throws an error. A substring type points to an
underlying string. An iterator, called graphemes, even returns
substrings that correspond to what people might consider a character.

I offer Julia as evidence.

My impression is that Julia's UTF-8-based system works and is not a
pain. I wrote a toy function once to access the last line of a large
memory-mapped text file, so I have just this little bit of personal
experience of it, so far. Incidentally, can Python memory-map a UTF-8
file as a string?

http://docs.julialang.org/en/stable/manual/strings/
https://github.com/JuliaIO/JSON.jl
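
On the memory-mapping question, as far as I know the Python mmap module
gives a bytes-like object rather than a str, so only the slice of
interest gets decoded. Below is a sketch of the last-line toy in those
terms. The function name last_line and the sample file are made up for
the example, and it assumes the file is non-empty, valid UTF-8, with
"\n" line ends.

```python
import mmap
import os
import tempfile

def last_line(path, encoding='utf-8'):
    with open(path, 'rb') as f:
        # length 0 maps the whole file; an empty file raises ValueError
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            end = len(m)
            if m[end - 1:end] == b'\n':
                end -= 1                      # ignore the final newline
            # a newline byte never occurs inside a multi-byte UTF-8
            # sequence, so searching the raw bytes is safe
            start = m.rfind(b'\n', 0, end) + 1
            return m[start:end].decode(encoding)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, 'words.txt')
    with open(path, 'w', encoding='utf-8', newline='\n') as f:
        f.write('ἐπὶ\nοἴνοπα\nπόντον\n')
    result = last_line(path)
```

Only the bytes of the last line are decoded, so the cost does not depend
on how much text comes before it.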
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Enum with only a single member

2017-01-09 Thread Jussi Piitulainen
Steven D'Aprano writes:

> Is it silly to create an enumeration with only a single member? That
> is, a singleton enum?
>
> from enum import Enum, auto
>
> class Unique(Enum):
>     FOO = auto()
>
>
> The reason I ask is that I have two functions that take an enum
> argument. The first takes one of three enums:
>
> class MarxBros(Enum):
>     GROUCHO = 1
>     CHICO = 2
>     HARPO = 3
>
> and the second takes *either* one of those three, or a fourth distinct
> value. So in my two functions, I have something like this:
>
>
> def spam(arg):
>     if isinstance(arg, MarxBros):
>         ...
>
>
> def ham(arg):
>     if isinstance(arg, MarxBros) or arg is Unique.FOO:
>         ...
>
>
> Good, bad or indifferent?

With a class of its own, the single value can be identified by its class
uniformly with the other relevant values.

def ham(arg):
    if isinstance(arg, (MarxBros, Unique)):
        ...

Seems good to me.
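
A runnable version of the whole thing, for what it's worth. The two
functions just return the result of the test here, which is an
assumption made for the demonstration; the real ones would do their
work behind the check.

```python
from enum import Enum, auto

class MarxBros(Enum):
    GROUCHO = 1
    CHICO = 2
    HARPO = 3

class Unique(Enum):
    FOO = auto()

def spam(arg):
    # accepts only the three brothers
    return isinstance(arg, MarxBros)

def ham(arg):
    # accepts the brothers and the singleton, identified by class alike
    return isinstance(arg, (MarxBros, Unique))
```

If Unique ever gets a second member, ham accepts it without any change,
which may or may not be what is wanted.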
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-08 Thread Jussi Piitulainen
Paul Rubin writes:

> I think Python's version of iterators is actually buggy and at least
> the first element of the rest of the sequence should be preserved.
> There are ways to fake it but they're too messy for something like
> this.  It should be the default and might have been a good change for
> Python 3.

It could still be added as an option, to both takewhile and iter(_, _).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-08 Thread Jussi Piitulainen
Paul Rubin writes:

> Jussi Piitulainen writes:
>> That would return 0 even when there is no 0 in xs at all.
>
> Doesn't look that way to me:
>
> >>> minabs([5,3,1,2,4])
> 1

Sorry about that. I honestly meant to say it would return 1 even when
there was a single 0 at the very end. Somehow I got seriously confused.

You noticed the actual problem yourself.
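
To make the corrected claim concrete, here is the function as it was
posted earlier in the thread, run on three inputs. The variable names
are mine, chosen for the example.

```python
from itertools import takewhile, islice

def minabs(xs):
    a = iter(xs)
    m = min(map(abs, takewhile(lambda x: x != 0, a)))
    z = list(islice(a, 1))
    if z: return 0
    return m

no_zero = minabs([5, 3, 1, 2, 4])          # 1, as shown above
zero_in_middle = minabs([5, 3, 0, 2, 4])   # 0, something is left after the 0
zero_at_end = minabs([5, 3, 1, 2, 0])      # 1, the final 0 vanishes in takewhile
```

When the 0 is the last element, takewhile consumes it and nothing is
left for islice to find, so the two cases "no 0" and "0 at the very end"
look the same.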
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-07 Thread Jussi Piitulainen
Pablo Lucena writes:

> How about using the second usage of builtin iter()?
>
> In [92]: iter?
> Docstring:
> iter(iterable) -> iterator
> iter(callable, sentinel) -> iterator

Nice to learn about that. But it has the same problem as
itertools.takewhile:

> In [88]: numbers
> Out[88]: [1, 9, 8, 11, 22, 4, 0, 3, 5, 6]
>
> # create iterator over the numbers to make callable simple
> # you may pre-sort or do w/e as needed of course
> In [89]: numbers_it = iter(numbers)
>
> # callable passed into iter - you may customize this
> # using functools.partial if need to add function arguments
> In [90]: def grab_until():
>    ...:     return next(numbers_it)
>    ...:
>
> # here 0 is the 'sentinel' ('int()' would work as well as you have
> # the iterator produced by iter() here stops as soon as sentinel value
> # is encountered
> In [91]: list(iter(grab_until, 0))
> Out[91]: [1, 9, 8, 11, 22, 4]

You get the same with numbers = [1, 9, 8, 11, 22, 4], where 0 does not
occur. How do you then tell which it was?

I think both itertools.takewhile(-, -) and iter(-, -) should have an
option to include the sentinel in their output. Then they could be used
in situations like this.
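
The ambiguity can be seen directly in a small sketch. The helper name
upto_sentinel is made up, and it wraps the same two-argument iter call
as above.

```python
def upto_sentinel(numbers, sentinel=0):
    it = iter(numbers)
    # iter(callable, sentinel) stops at the sentinel, and also when the
    # callable raises StopIteration because the data ran out
    return list(iter(lambda: next(it), sentinel))

with_zero = upto_sentinel([1, 9, 8, 11, 22, 4, 0, 3, 5, 6])
without_zero = upto_sentinel([1, 9, 8, 11, 22, 4])
```

Both calls return [1, 9, 8, 11, 22, 4], so the result alone does not say
whether a 0 was ever seen.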

[The discussion with Rustom about Haskell's lazy evaluation is not
related to this, as far as I can see, so I just snipped it from here.]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-07 Thread Jussi Piitulainen
Paul Rubin writes:

> Jussi Piitulainen writes:
>>> Use itertools.takewhile
>> How? It consumes the crucial stop element:
>
> Oh yucch, you're right, it takes it from both sides.  How about this:
>
> from itertools import takewhile, islice
> def minabs(xs):
>   a = iter(xs)
>   m = min(map(abs,takewhile(lambda x: x!=0, a)))
>   z = list(islice(a,1))
>   if z: return 0
>   return m

That would return 0 even when there is no 0 in xs at all.

(It would also return the absolute value, not a value whose absolute
value is minimal, but that is easy to fix.)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-07 Thread Jussi Piitulainen
Rustom Mody writes:
> On a Saturday, Jussi Piitulainen wrote:

[snip]

>> You switched to a simpler operator. Would Haskell notice that
>> 
>>def minabs(x, y): return min(x, y, key = abs)
>> 
>> has a meaningful zero? Surely it has its limits somewhere and then
>> the programmer needs to supply the information.
>
> Over ℕ multiply has 1 identity and 0 absorbent
> min has ∞ as identity and 0 as absorbent
> If you allow for ∞ they are quite the same

There is nothing like ∞ in Python ints. Floats would have one, but we
can leave empty minimum undefined instead. No worries.

> Below I am pretending that 100 = ∞

Quite silly but fortunately not really relevant.

> Here are two lazy functions:
> mul.0.y = 0  -- Lazy in y ie y not evaluated
> mul.x.y = x*y
>
> minm.0.y = 0  -- likewise lazy in y
> minm.x.y = min.x.y

Now I don't see any reason to avoid the actual function that's been the
example in this thread:

minabs.0.y = 0
minabs.x.y = x if abs.x <= abs.y else y 

And now I see where the desired behaviour comes from in Haskell. The
absorbing clause is redundant, apart from providing the specific
stopping condition explicitly.

> Now at the interpreter:
> ? foldr.minm . 100.[1,2,3,4]
> 1 : Int
> ? foldr.minm . 100.[1,2,3,4,0]
> 0 : Int
> ? foldr.minm . 100.([1,2,3,4,0]++[1...])
> 0 : Int
>
> The last expression appended [1,2,3,4,0] to the infinite list of numbers.
>
> More succinctly:
> ? foldr.minm . 100.([1,2,3,4,0]++undefined)
> 0 : Int
>
> Both these are extremal examples of what Peter is asking for — avoiding an 
> expensive computation

Ok. Thanks.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-07 Thread Jussi Piitulainen
Chris Angelico writes:

> On Sat, Jan 7, 2017 at 7:12 PM, Jussi Piitulainen wrote:

>> You switched to a simpler operator. Would Haskell notice that
>>
>>def minabs(x, y): return min(x, y, key = abs)
>>
>> has a meaningful zero? Surely it has its limits somewhere and then
>> the programmer needs to supply the information.
>
> If the return value of abs is int(0..) then yeah, it could. (Or
> whatever the notation is. That's Pike's limited-range-int type
> syntax.)

Maybe so. If Haskell abs has such types. (For integers, rationals,
whatever numeric types Haskell has, which I've quite forgotten, or it
may have even changed since I knew some Haskell. It's been a while.)

I rewrite the question so that the answer cannot be deduced from just
the types of the functions:

def minabs(x, y): return min(x, y, key = lambda w: max(w, -w))

Surely max of two ints is an int. Maybe the Haskell compiler could
specialize the type, but my question is, is it _guaranteed_ to do so,
and how should the programmer know to rely on that?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-07 Thread Jussi Piitulainen
Rustom Mody writes:

> On Saturday, Jussi Piitulainen wrote:
>> Paul Rubin writes:
>> 
>> > Peter Otten writes:
>> >> How would you implement stopmin()?
>> >
>> > Use itertools.takewhile
>> 
>> How? It consumes the crucial stop element:
>> 
>>it = iter('what?')
>>list(takewhile(str.isalpha, it)) # ==> ['w', 'h', 'a', 't']
>>next(it, 42) # ==> 42
>
> I was also wondering how…
> In a lazy language (eg haskell) with non-strict foldr (reduce but
> rightwards) supplied non-strict operator this is trivial.
> ie in python idiom with reduce being right_reduce
> reduce(operator.mul, [1,2,0,4,...], 1)
> the reduction would stop at the 0
> Not sure how to simulate this in a strict language like python
> Making fold(r) non-strict by using generators is ok
> How to pass a non-strict operator?

I think it would have to be some really awkward pseudo-operator that
throws an exception when it encounters its zero, and then reduce (or
something outside reduce) would catch that exception. Something like
that could be done but it would still be awkward. Don't wanna :)
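
For the record, a sketch of what such a thing might look like. All the
names (Absorbed, stopping, reduce_until) are invented for this example.
Note also that functools.reduce folds from the left, unlike the foldr
in the Haskell discussion, and that the wrapper tests the result of the
operator rather than its argument, which comes to the same thing for an
absorbing zero.

```python
import operator
from functools import reduce
from itertools import chain, count

class Absorbed(Exception):
    """Raised to abandon a reduction once the result is settled."""
    def __init__(self, value):
        super().__init__(value)
        self.value = value

def stopping(op, zero):
    # wrap op so that reaching the absorbing element aborts the reduce
    def wrapped(acc, x):
        result = op(acc, x)
        if result == zero:
            raise Absorbed(result)
        return result
    return wrapped

def reduce_until(op, xs, initial, zero):
    try:
        return reduce(stopping(op, zero), xs, initial)
    except Absorbed as stop:
        return stop.value

# the data is endless, but the reduction stops at the 0
product = reduce_until(operator.mul, chain([1, 2, 0, 4], count(5)), 1, 0)
plain = reduce_until(operator.mul, [1, 2, 3, 4], 1, 0)
```

It works, and it is about as awkward as expected.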

You switched to a simpler operator. Would Haskell notice that

   def minabs(x, y): return min(x, y, key = abs)

has a meaningful zero? Surely it has its limits somewhere and then the
programmer needs to supply the information.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-06 Thread Jussi Piitulainen
Paul Rubin writes:

> Peter Otten writes:
>> How would you implement stopmin()?
>
> Use itertools.takewhile

How? It consumes the crucial stop element:

   it = iter('what?')
   list(takewhile(str.isalpha, it)) # ==> ['w', 'h', 'a', 't']
   next(it, 42) # ==> 42
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Search a sequence for its minimum and stop as soon as the lowest possible value is found

2017-01-06 Thread Jussi Piitulainen
Peter Otten writes:

> Example: you are looking for the minimum absolute value in a series of 
> integers. As soon as you encounter the first 0 it's unnecessary extra work 
> to check the remaining values, but the builtin min() will continue.
>
> The solution is a minimum function that allows the user to specify a stop 
> value:
>
> >>> from itertools import count, chain
> >>> stopmin(chain(reversed(range(10)), count()), key=abs, stop=0)
> 0
>
> How would you implement stopmin()?

Only let min see the data up to, but including, the stop value:

from itertools import groupby

def takeuntil(data, pred):
    '''Take values from data until and including the first that
    satisfies pred (until data is exhausted if none does).'''
    for kind, group in groupby(data, pred):
        if kind:
            yield next(group)
            break
        else:
            yield from group

def stopmin(data, key, stop):
    return min(takeuntil(data, lambda o : key(o) == stop),
               key = key)

data = '31415926'
for stop in range(5):
    print(stop,
          '=>', repr(''.join(takeuntil(data, lambda o : int(o) == stop))),
          '=>', repr(stopmin(data, int, stop)))

# 0 => '31415926' => '1'
# 1 => '31' => '1'
# 2 => '3141592' => '1'
# 3 => '3' => '3'
# 4 => '314' => '1'

from itertools import count, chain
print(stopmin(chain(reversed(range(10)), count()), key=abs, stop=0))
print(stopmin(chain(reversed(range(10)), count()), key=abs, stop=3))

# 0
# 3
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cleaning up conditionals

2017-01-02 Thread Jussi Piitulainen
Deborah Swanson writes:
> Jussi Piitulainen wrote:

[snip]

>> With your particular conditions of non-emptiness, which is taken to
>> be truth, you can achieve variations of this result with any of the
>> following statements:
>> 
>> w = ( l1[v] if len(l1[v]) > 0 else
>>   l2[v] if len(l2[v]) > 0 else
>>   l1[v] )
>> 
>> x = l1[v] if l1[v] else l2[v] if l2[v] else l1[v]
>> 
>> y = l1[v] or l2[v] or l1[v]
>> 
>> z = l1[v] or l2[v]
>> 
>> The last one, which I originally suggested (and still prefer 
>> when otherwise appropriate), is subtly different from the 
>> others. That difference should be irrelevant.
>
> I agree, if the goal was to capture one of the field values in a
> scalar value.

To store into a list, specify a position in the list as the target.

My idea here has been to simply do this to all the relevant positions in
both lists, even when it means storing the old value back.

See below, concretely, with your two examples and the mixed one from
Dennis Lee Bieber, where I introduced a small difference of my own so
that corresponding non-empty fields differ. I have made it output Python
comments and inserted them at appropriate places.

The same function, merge, fills the empty fields from the other list in
all three cases using the method z from above. It does no harm when a
field is already non-empty.

def merge(l1, l2):
    fields = range(5)
    for v in fields:
        l1[v] = l1[v] or l2[v]
        l2[v] = l2[v] or l1[v]

l1 = [ '2 br, Elk Plains', '12-26', 'WA/pi', 'house', 'garage, w/d' ]
l2 = [ '2 br, Elk Plains', '12-29', '',  '',  '']

print('# Before:', l1, l2, sep = '\n# ', end = '\n# ')
merge(l1, l2)
print('After:', l1, l2, sep = '\n# ', end = '\n\n')

# Before:
# ['2 br, Elk Plains', '12-26', 'WA/pi', 'house', 'garage, w/d']
# ['2 br, Elk Plains', '12-29', '', '', '']
# After:
# ['2 br, Elk Plains', '12-26', 'WA/pi', 'house', 'garage, w/d']
# ['2 br, Elk Plains', '12-29', 'WA/pi', 'house', 'garage, w/d']

l1 = [ '2 br, Elk Plains', '12-26', '',  '',  '']
l2 = [ '2 br, Elk Plains', '12-29', 'WA/pi', 'house', 'garage, w/d' ]

print('# Before:', l1, l2, sep = '\n# ', end = '\n# ')
merge(l1, l2)
print('After:', l1, l2, sep = '\n# ', end = '\n\n')

# Before:
# ['2 br, Elk Plains', '12-26', '', '', '']
# ['2 br, Elk Plains', '12-29', 'WA/pi', 'house', 'garage, w/d']
# After:
# ['2 br, Elk Plains', '12-26', 'WA/pi', 'house', 'garage, w/d']
# ['2 br, Elk Plains', '12-29', 'WA/pi', 'house', 'garage, w/d']

l1 = [ '2 br, Elk Plains', '12-26', 'WA/pi', '',  '']
l2 = [ '2 br, Elf Plains', '12-29', '',  'house', 'garage, w/d' ]

print('# Before:', l1, l2, sep = '\n# ', end = '\n# ')
merge(l1, l2)
print('After:', l1, l2, sep = '\n# ', end = '\n\n')

# Before:
# ['2 br, Elk Plains', '12-26', 'WA/pi', '', '']
# ['2 br, Elf Plains', '12-29', '', 'house', 'garage, w/d']
# After:
# ['2 br, Elk Plains', '12-26', 'WA/pi', 'house', 'garage, w/d']
# ['2 br, Elf Plains', '12-29', 'WA/pi', 'house', 'garage, w/d']
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cleaning up conditionals

2017-01-01 Thread Jussi Piitulainen
Steve D'Aprano writes:

> On Sun, 1 Jan 2017 02:58 pm, Deborah Swanson wrote:
>
>>> It's possible to select either l1 or l2 using an expression,
>>> and then subscript that with [v]. However, this does not
>>> usually make for readable code, so I don't recommend it.
>>> 
>>> (l1 if whatever else l2)[v] = new_value
>>> 
>>> ChrisA
>> 
>> I'm not sure I understand what you did here, at least not well enough
>> to try it.
>
>
> The evolution of a Python programmer :-)
>
>
> (1) Step One: the naive code.
>
> if condition:
>     l1[v] = new_value
> else:
>     l2[v] = new_value
>
>
> (2) Step Two: add a temporary variable to avoid repeating the
> assignment
>
> if condition:
>     temp = l1
> else:
>     temp = l2
> temp[v] = new_value
>
>
> (3) Step Three: change the if...else statement to an expression
>
> temp = l1 if condition else l2
> temp[v] = new_value
>
>
> (4) Step Four: no need for the temporary variable
>
> (l1 if condition else l2)[v] = new_value

(l1 if bool(l1[v]) < bool(l2[v]) else
 l2 if bool(l1[v]) > bool(l2[v]) else
 l1)[v] = (l2 if bool(l1[v]) < bool(l2[v]) else
   l1 if bool(l1[v]) > bool(l2[v]) else
   l1)[v]

Merry new year :)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cleaning up conditionals

2016-12-31 Thread Jussi Piitulainen
Deborah Swanson writes:

> Jussi Piitulainen wrote:
>> Sent: Saturday, December 31, 2016 8:30 AM
>> Deborah Swanson writes:
>> 
>> > Is it possible to use some version of the "a = expression1 if
>> > condition else expression2" syntax with an elif? And for
>> > expression1 and expression2 to be single statements?  That's the
>> > kind of shortcutting I'd like to do, and it seems like python might
>> > be able to do something like this.
>> 
>> I missed this question when I read the thread earlier. The 
>> answer is simply to make expression2 be another conditional 
>> expression. I tend to write the whole chain in parentheses. 
>> This allows multi-line layouts like the following alternatives:
>> 
>> a = ( first if len(first) > 0
>>   else second if len(second) > 0
>>   else make_stuff_up() )
>> 
>> a = ( first if len(first) > 0 else
>>   second if len(second) > 0 else
>>   make_stuff_up() )
>> 
>> Expression1 and expression2 cannot be statements. Python 
>> makes a formal distinction between statements that have an 
>> effect and expressions that have a value. All components of a 
>> conditional expression must be expressions. A function call 
>> can behave either way but I think it good style that the 
>> calls in expressions return values.
>
> While I'm sure these ternaries will be useful for future problems, I
> couldn't make the second one work for my current problem.

(Note that those two things are just different layouts for the exact
same conditional expression.)

> I got as far as:
>
> a = l1[v] if len(l1[v] > 0 else 
> l2[v] if len(l2[v] > 0 else

(Parentheses needed, otherwise the first line is expected to be a whole
statement and then the unfinished expression in it is considered
malformed.)

> And didn't finish it because I couldn't see what a should be. I want
> it to be l2[v] if the first clause is true, and l1[v] if the second.
> If I was computing a value, this would work beautifully, but I don't
> see how it can if I'm choosing a list element to assign to. Maybe I
> just can't see it.

Do you here mean condition when you say clause? Then, if the first
condition is true, any other condition is not considered at all. When
you come to the final else-branch, you know that all conditions in the
chain were false.

I thought you originally wanted to keep l1[v] if it was non-empty, which
is what the code here says, but this time your prose seems different.
Anyhow, since you want a value when all conditions in the chain of
conditions are false, you want a value to use when the field is empty in
both records. To change nothing, store the old empty value back; or you
can supply your own default here.

With your particular conditions of non-emptiness, which is taken to be
truth, you can achieve variations of this result with any of the
following statements:

w = ( l1[v] if len(l1[v]) > 0 else
  l2[v] if len(l2[v]) > 0 else
  l1[v] )

x = l1[v] if l1[v] else l2[v] if l2[v] else l1[v]

y = l1[v] or l2[v] or l1[v]

z = l1[v] or l2[v]

The last one, which I originally suggested (and still prefer when
otherwise appropriate), is subtly different from the others. That
difference should be irrelevant.
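
Here is one way to observe a difference between the last statement and
the others. Whether it is exactly the difference meant above is an
assumption on the part of this example, and the helper name variants is
made up. The fields are passed as plain values a and b instead of l1[v]
and l2[v].

```python
def variants(a, b):
    x = a if a else b if b else a
    y = a or b or a
    z = a or b
    return x, y, z

filled = variants('house', '')      # first field non-empty: all keep it
borrowed = variants('', 'house')    # first empty: all take the second
both_empty = variants('', [])       # two different empty values
```

With two empty fields x and y fall back to the first value, '', while z
ends up with the second one, []. When the empty fields are both '', as
in the records discussed here, the results are indistinguishable.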
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cleaning up conditionals

2016-12-31 Thread Jussi Piitulainen
Deborah Swanson writes:

> Is it possible to use some version of the "a = expression1 if
> condition else expression2" syntax with an elif? And for expression1
> and expression2 to be single statements?  That's the kind of
> shortcutting I'd like to do, and it seems like python might be able to
> do something like this.

I missed this question when I read the thread earlier. The answer is
simply to make expression2 be another conditional expression. I tend to
write the whole chain in parentheses. This allows multi-line layouts
like the following alternatives:

a = ( first if len(first) > 0
  else second if len(second) > 0
  else make_stuff_up() )

a = ( first if len(first) > 0 else
  second if len(second) > 0 else
  make_stuff_up() )

Expression1 and expression2 cannot be statements. Python makes a formal
distinction between statements that have an effect and expressions that
have a value. All components of a conditional expression must be
expressions. A function call can behave either way but I think it good
style that the calls in expressions return values.
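
A small runnable illustration of the chain, with a made-up fallback
function and a wrapper, choose, added only so that the expression can
be tried on different inputs:

```python
def make_stuff_up():
    # hypothetical fallback, only called when both inputs are empty
    return 'made up'

def choose(first, second):
    return ( first if len(first) > 0 else
             second if len(second) > 0 else
             make_stuff_up() )

from_first = choose('a', 'b')
from_second = choose('', 'b')
from_fallback = choose('', '')
```

The conditions are tried in order, and the branches that are not chosen
are not evaluated at all, so make_stuff_up() runs only in the last case.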
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cleaning up conditionals

2016-12-30 Thread Jussi Piitulainen
"Deborah Swanson"  writes:

> Michael Torrie wrote:
>> On 12/30/2016 05:26 PM, Deborah Swanson wrote:
>> > I'm still wondering if these 4 lines can be collapsed to one or two 
>> > lines.
>> 
>> If the logic is clearly expressed in the if blocks that you 
>> have, I don't see why collapsing an if block into one or two 
>> lines would even be desirable.  Making a clever one-liner out 
>> of something isn't always a good thing.  In fact some 
>> programmers don't like to use the ternary operator or 
>> conditional expressions, preferring to use explicit if block logic.
>> 
>
> Maybe it isn't always a good thing, but learning the capabilities of
> python is. Besides, if the concern is future maintenance, a lot would
> depend on the proficiency of those expected to maintain the code.

One line:

l1[st], l2[st] = (l1[st] or l2[st]), (l2[st] or l1[st])

(The parentheses are redundant.)

Two lines:

l1[st] = l1[st] or l2[st]
l2[st] = l2[st] or l1[st]

Both of these store the same value back in the given field if it's
considered true (non-empty), else they store the corresponding value
from the other record in the hope that it would be considered true
(non-empty).
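
A small sketch with made-up records, assuming that l1 and l2 are lists
and st is an index into both. The helper name fill_both is mine.

```python
def fill_both(l1, l2, st):
    # The right-hand side is evaluated first, then both fields are stored.
    l1[st], l2[st] = l1[st] or l2[st], l2[st] or l1[st]

# One field is empty: it gets the value from the other record.
a, b = ['', 'x'], ['CA', 'x']
fill_both(a, b, 0)

# Both fields are empty: each gets the other's empty value, so nothing
# useful changes.
c, d = ['', 'x'], ['', 'x']
fill_both(c, d, 0)

print(a, b, c, d)
```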


Re: python list index - an easy question

2016-12-19 Thread Jussi Piitulainen
Ben Bacarisse writes:

> BartC writes:
>
>> You need to take your C hat off, I think.
>
> It's a computing hat.  Indexes are best seen as offsets (i.e. as
> measured distances from some origin or base).  It's a model that grew
> out of machine addressing and assembler address modes many, many
> decades ago -- long before C.  C, being a low-level language,
> obviously borrowed it, but pretty much all the well-thought out
> high-level languages have seen the value in it too, though I'd be
> interested in hearing about counter examples.

Julia, at version 0.5 of the language, is a major counter-example:
1-based, closed ranges. I think they have been much influenced by the
mathematical practice in linear algebra, possibly through another
computing language.

I think there's some work going on to allow other starting points, or at
least 0. Not sure about half-open ranges.

> The main issue -- of using a half open interval for a range -- is
> probably less widely agreed upon, though I think it should be.  EWD is
> correct about this (as about so many things).

Agreed.
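
A few properties of 0-based, half-open ranges as Python slices have
them. This is only my own illustration with made-up data, not anything
from EWD's note.

```python
data = list('abcdefgh')
lo, mid, hi = 2, 5, 7

# The length of a half-open range is just the difference of its bounds.
length_ok = len(data[lo:hi]) == hi - lo

# Adjacent ranges share a bound and join without overlap or gap.
joins_ok = data[lo:mid] + data[mid:hi] == data[lo:hi]

# The empty range needs no special notation.
empty_ok = data[lo:lo] == []

# An index is an offset from a base: the element at base + k is
# element k of the part that starts at base.
base, k = 3, 2
offset_ok = data[base + k] == data[base:][k]

print(length_ok, joins_ok, empty_ok, offset_ok)
```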

(I even use pen and paper, though I don't always remember what I wrote.)


Re: Simple code and suggestion

2016-11-30 Thread Jussi Piitulainen
g thakuri writes:

> I would want to avoid using multiple split in the below code , what
> options do we have before tokenising the line?, may be validate the
> first line any other ideas
>
>  cmd = 'utility   %s' % (file)
>  out, err, exitcode = command_runner(cmd)
>  data = stdout.strip().split('\n')[0].split()[5][:-2]

That .strip() looks suspicious to me, but perhaps you know better.

Also, stdout should be out, right?

You can use io.StringIO to turn a string into an object that you can
read line by line just like a file object. This reads just the first
line and picks the part that you want:

data = next(io.StringIO(out)).split()[5][:-2]

I don't know how much this affects performance, but it's kind of neat.

A thing I like to do is name all fields even if I don't use them all. The
assignment will fail with an exception if there's an unexpected number
of fields, and that's usually what I want when input is bad:

line = next(io.StringIO(out))
ID, FORM, LEMMA, POS, TAGS, WEV, ETC = line.split()
data = WEV[:-2]

(Those are probably not appropriate names for your fields :)
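
Put together as something you can run. The contents of out are made up
to stand in for whatever your utility prints; I only assume that the
first line has seven whitespace-separated fields and that the sixth
ends in a two-character unit.

```python
import io

# Made-up stand-in for the output of the command.
out = "1 dogs dog NOUN Plur 0.25kB _\nthe second line is never split\n"

# Read only the first line, then pick the sixth field without its unit.
line = next(io.StringIO(out))
data = line.split()[5][:-2]

# Same thing with every field named.
ID, FORM, LEMMA, POS, TAGS, WEV, ETC = line.split()
same = WEV[:-2]

# A line with the wrong number of fields fails loudly.
bad = "too few fields\n"
try:
    ID, FORM, LEMMA, POS, TAGS, WEV, ETC = next(io.StringIO(bad)).split()
except ValueError as exn:
    problem = str(exn)

print(data, same, problem)
```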

Just a couple of ideas that you may like to consider.


Re: best way to read a huge ascii file.

2016-11-29 Thread Jussi Piitulainen
Heli writes:

> Hi all, 
>
> Let me update my question, I have an ascii file(7G) which has around
> 100M lines.  I read this file using :
>
> f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) 
>
> x=f[:,1] 
> y=f[:,2] 
> z=f[:,3] 
> id=f[:,0] 
>
> I will need the x,y,z and id arrays later for interpolations. The
> problem is reading the file takes around 80 min while the
> interpolation only takes 15 mins.

(Are there only those four columns in the file? I guess yes.)

> The following line which reads the entire 7.4 GB file increments the
> memory usage by 3206.898 MiB (3.36 GB). First question is Why it does
> not increment the memory usage by 7.4 GB?
>
> f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) 

In general, doubles take more space as text than as, well, doubles,
which (in those arrays) take eight bytes (64 bits) each:

>>> len("0.1411200080598672 -0.9899924966004454 -0.1425465430742778 "
...     "20.085536923187668 ")
78
>>> 4*8
32

> Finally I still would appreciate if you could recommend me what is the
> most optimized way to read/write to files in python? are numpy
> np.loadtxt and np.savetxt the best?

A document I found says "This function aims to be a fast reader for
simply formatted files" so as long as you want to save the numbers as
text, this is probably meant to be the best way.

https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html

Perhaps there are binary load and save functions? They could be faster.
The binary data file would be opaque, but probably you are not editing
it by hand anyway.
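
NumPy does have np.save and np.load for its own binary format. To show
the size difference without depending on NumPy, here is a sketch with
only the standard library: the array module writes the same doubles
both as text and as raw binary. The sample values and file names are
made up.

```python
import array
import os
import tempfile

# Hypothetical sample: 1000 rows of the four values from the example above.
row = [0.1411200080598672, -0.9899924966004454,
       -0.1425465430742778, 20.085536923187668]
values = array.array('d', row * 1000)

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, 'data.txt')
    bin_path = os.path.join(tmp, 'data.bin')

    # Text: one row of four numbers per line.
    with open(text_path, 'w') as f:
        for i in range(0, len(values), 4):
            f.write(' '.join(repr(x) for x in values[i:i + 4]) + '\n')

    # Binary: eight bytes per double, nothing else.
    with open(bin_path, 'wb') as f:
        values.tofile(f)

    text_size = os.path.getsize(text_path)
    bin_size = os.path.getsize(bin_path)

    # Reading back needs no parsing of digits.
    restored = array.array('d')
    with open(bin_path, 'rb') as f:
        restored.fromfile(f, len(values))

print(text_size, bin_size, restored == values)
```

By the same arithmetic, 100M rows of four doubles come to about 3.2e9
bytes, which is in the neighbourhood of the increase you measured.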


Re: dictionary mutability, hashability, __eq__, __hash__

2016-11-27 Thread Jussi Piitulainen
Veek M writes:

> Jussi Piitulainen wrote:
>
>> Veek M writes:
>> 
>> [snip]
>> 
>>> Also if one can do x.a = 10 or 20 or whatever, and the class instance
>>> is mutable, then why do books keep stating that keys need to be
>>> immutable?  After all, __hash__ is the guy doing all the work and
>>> maintaining consistency for us. One could do:
>>>
>>> class Fruit:
>>>   editable_value = ''
>>> def __hash__(self):
>>>  if 'apple' in self.value:
>>>return 10
>>>  elif 'banana' in self.value:
>>>return 20
>>>
>>>
>>>  and use 'apple' 'bannana' as keys for whatever mutable data..
>>> Are the books wrong?
>> 
>> The hash does not do all the work, and the underlying implementation
>> of a dictionary does not react appropriately to a key changing its
>> hash value. You could experiment further to see for yourself.
>> 
>> Here's a demonstration that Python's dictionary retains both keys
>> after they are mutated so that they become equal, yet finds neither
>> key (because they are not physically where their new hash value
>> indicates).
>> 
>> I edited your class so that its methods manipulate an attribute that
>> it actually has, all hash values are integers, constructor takes an
>> initial value, objects are equal if their values are equal, and the
>> written representation of an object shows the value (I forgot quotes).
>> 
>> test = { Fruit('apple') : 'one', Fruit('orange') : 'two' }
>> 
>> print(test)
>> print(test[Fruit('orange')])
>> # prints:
>> # {Fruit(apple): 'one', Fruit(orange): 'two'}
>> # two
>> 
>> for key in test: key.value = 'banana'
>> 
>> print(test)
>> print(test[Fruit('banana')])
>> 
>> # prints:
>> # {Fruit(banana): 'one', Fruit(banana): 'two'}
>> # Traceback (most recent call last):
>> #   File "hash.py", line 25, in 
>> # print(test[Fruit('banana')])
>> # KeyError: Fruit(banana)
>
> ah! not so: that's because you are messing/changing the integer value 
> for the key. If apple-object was returning 10, you can't then return 20 
> (the text mangling seems to be completely irrelevant except you need it 
> to figure out which integer to return but barring that..).

It was my best guess to what you intended __hash__ to be. You took that
risk when you posted obviously broken code.

Your new __hash__ function below behaves the same way.

> Here's an example of what you're doing (note 'fly' is returning 20 BUT 
> the object-instance is 'apple' - that obviously won't work and has 
> nothing to do with my Q, err.. (don't mean to be rude):
> class Fruit(object):
>     def __init__(self, text):
>         self.text = text
> 
>     def mangle(self,text):
>         self.text = text
> 
>     def __hash__(self):
>         if 'apple' in self.text:
>             return 10
>         elif 'orange' in self.text:
>             return 20
>         elif 'fly' in self.text:
>             return 20
>         else:
>             pass
> 
> apple = Fruit('apple')
> orange = Fruit('orange')
>
> d = { apple : 'APPLE_VALUE', orange : 'ORANGE_VALUE' }
> print d
>
> apple.mangle('fly')
> print d[apple]

Did you bother to try that? I get a KeyError (because the hash value of
the key object has changed).

> The Question is specific.. what I'm saying is that you can change
> attributes and the contents and totally mash the object up, so long as
> __hash__ returns the same integer for the same object. Correct?

Your __hash__ doesn't.

In your own example just above, you get 10 before mangling, and 20
after.

> Where does __eq__ fit in all this?

Make two different objects hash the same. Make them be __eq__ by
mangling them when they are already keys. See if you can still use both
as an index to get at the associated value. (You can't.)

But yes, you should be free to mutate fields that do not affect hashing
and equality. Object identity should work, if you are otherwise happy to
use object identity.
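
A sketch of both safe cases. The class names and attributes are
hypothetical: Record hashes and compares on a name that is left alone
once the object is a key, while Plain relies on the default
identity-based hashing and equality.

```python
class Record:
    """Hypothetical key type: 'name' identifies it, 'note' is free to change."""
    def __init__(self, name, note=''):
        self._name = name   # treated as fixed once the object is a key
        self.note = note

    def __hash__(self):
        return hash(self._name)

    def __eq__(self, other):
        return isinstance(other, Record) and self._name == other._name

key = Record('apple')
d = {key: 'APPLE_VALUE'}

# Mutating a field that affects neither hash nor equality is harmless.
key.note = 'mangled beyond recognition'
found_same = d[key]
found_equal = d[Record('apple')]

class Plain:
    """No __hash__ or __eq__ defined: object identity is used."""

p = Plain()
e = {p: 'PLAIN_VALUE'}
p.anything = 'changed at will'
found_by_identity = e[p]

print(found_same, found_equal, found_by_identity)
```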


Re: dictionary mutability, hashability, __eq__, __hash__

2016-11-27 Thread Jussi Piitulainen
Veek M writes:

[snip]

> Also if one can do x.a = 10 or 20 or whatever, and the class instance
> is mutable, then why do books keep stating that keys need to be
> immutable?  After all, __hash__ is the guy doing all the work and
> maintaining consistency for us. One could do:
>
> class Fruit:
>   editable_value = ''
> def __hash__(self):
>  if 'apple' in self.value: 
>return 10
>  elif 'banana' in self.value:
>return 20
>
>
>  and use 'apple' 'bannana' as keys for whatever mutable data..
> Are the books wrong?

The hash does not do all the work, and the underlying implementation of
a dictionary does not react appropriately to a key changing its hash
value. You could experiment further to see for yourself.

Here's a demonstration that Python's dictionary retains both keys after
they are mutated so that they become equal, yet finds neither key
(because they are not physically where their new hash value indicates).

I edited your class so that its methods manipulate an attribute that it
actually has, all hash values are integers, constructor takes an initial
value, objects are equal if their values are equal, and the written
representation of an object shows the value (I forgot quotes).

test = { Fruit('apple') : 'one', Fruit('orange') : 'two' }

print(test)
print(test[Fruit('orange')])
# prints:
# {Fruit(apple): 'one', Fruit(orange): 'two'}
# two

for key in test: key.value = 'banana'

print(test)
print(test[Fruit('banana')])

# prints:
# {Fruit(banana): 'one', Fruit(banana): 'two'}
# Traceback (most recent call last):
#   File "hash.py", line 25, in 
# print(test[Fruit('banana')])
# KeyError: Fruit(banana)
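
For anyone who wants to run the demonstration, the class can be
reconstructed along these lines. This is a sketch that matches the
description above, not necessarily the exact code I ran; in particular
the hash value 30 for everything that is neither apple nor banana is an
assumption made here.

```python
class Fruit:
    def __init__(self, value):
        self.value = value

    def __hash__(self):
        if 'apple' in self.value:
            return 10
        elif 'banana' in self.value:
            return 20
        else:
            return 30   # assumption: some other integer for the rest

    def __eq__(self, other):
        return isinstance(other, Fruit) and self.value == other.value

    def __repr__(self):
        return 'Fruit({})'.format(self.value)   # no quotes, as noted

test = { Fruit('apple') : 'one', Fruit('orange') : 'two' }
before = test[Fruit('orange')]

# Mutate both keys while they are in the dictionary.
for key in test: key.value = 'banana'

# Both keys are still there, but a lookup by the new value fails.
size = len(test)
try:
    after = test[Fruit('banana')]
except KeyError:
    after = None

print(before, size, after)
```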


Re: Clean way to return error codes

2016-11-20 Thread Jussi Piitulainen
Steven D'Aprano writes:

> I have a script that can be broken up into four subtasks. If any of
> those subtasks fail, I wish to exit with a different exit code and
> error.
>
> Assume that the script is going to be run by system administrators who
> know no Python and are terrified of tracebacks, and that I'm logging
> the full traceback elsewhere (not shown).
>
> I have something like this:
>
>
> try:
>     begin()
> except BeginError:
>     print("error in begin")
>     sys.exit(3)
>
> try:
>     cur = get_cur()
> except FooError:
>     print("failed to get cur")
>     sys.exit(17)
>
> try:
>     result = process(cur)
>     print(result)
> except (FooError, BarError):
>     print("error in processing")
>     sys.exit(12)
>
> try:
>     cleanup()
> except BazError:
>     print("cleanup failed")
>     sys.exit(8)
>
>
>
> It's not awful, but I don't really like the look of all those
> try...except blocks. Is there something cleaner I can do, or do I just
> have to suck it up?

Have the exception objects carry the message and the exit code?

try:
    begin()
    cur = get_cur()
    result = process(cur)
    print(result)
    cleanup()
except (BeginError, FooError, BarError, BazError) as exn:
    print("Steven's script:", message(exn))
    sys.exit(code(exn))
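
One way this could look, as a sketch. The base class ScriptError, its
attributes, and the helpers message and code are all my inventions; the
failing get_cur is a stand-in so that there is something to catch. The
message and exit code are given where the exception is raised, so the
same exception class can mean different exit codes in different
subtasks.

```python
class ScriptError(Exception):
    """Hypothetical base class: carries a message and an exit code."""
    def __init__(self, message, code):
        super().__init__(message)
        self.message = message
        self.code = code

class BeginError(ScriptError): pass
class FooError(ScriptError): pass

def message(exn):
    return exn.message

def code(exn):
    return exn.code

def begin():
    pass

def get_cur():
    # Stand-in subtask that always fails, for the sake of the example.
    raise FooError("failed to get cur", 17)

def run():
    try:
        begin()
        cur = get_cur()
    except (BeginError, FooError) as exn:
        print("Steven's script:", message(exn))
        return code(exn)
    return 0

status = run()
print(status)
```

The real script would end with sys.exit(run()), or call sys.exit in the
handler as above.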


Re: Why does this list swap fail?

2016-11-14 Thread Jussi Piitulainen
38016226...@gmail.com writes:

> L=[2,1]
> L[0],L[L[0]-1]=L[L[0]-1],L[0]
>
> The L doesn't change. Can someone provide me the detail procedure of
> this expression?

The right-hand side evaluates to (1,2), but then the assignments to the
targets on the left-hand side are processed in order from left to right.
This happens:

L[0] = 1
L[L[0] - 1] = 2

https://docs.python.org/3/reference/simple_stmts.html#assignment-statements
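
The same steps spelled out one at a time, followed by one way to get
the swap you presumably wanted: compute both indexes before assigning
anything. The names rhs, M, i and j are mine.

```python
L = [2, 1]

# Step 1: the whole right-hand side is evaluated first.
rhs = (L[L[0] - 1], L[0])       # (1, 2)

# Step 2: the first target is assigned.
L[0] = rhs[0]                   # L is now [1, 1]

# Step 3: the second target's index is computed only now, from the
# already changed L[0], so it is 0 and not 1.
L[L[0] - 1] = rhs[1]            # L[0] = 2, so L is [2, 1] again

# Fixing the indexes up front avoids the problem.
M = [2, 1]
i, j = 0, M[0] - 1
M[i], M[j] = M[j], M[i]

print(L, M)
```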


Re: how to print variable few time?

2016-11-13 Thread Jussi Piitulainen
andy writes:

> Sat, 12 Nov 2016 04:58:20 -0800 wrote guy asor:
>
>> hello!
>> 
>> this is my code:
>> 
>> word=raw_input()
>> print word*3
>> 
>> 
>> with this code im getting - wordwordword.
>> what changes i need to make to get - word word word - instead?
>> 
>> thanks
>
> using python3.x:
>
> word=input()
> print((word+' ')*2, end='')
> print(word)

print(*[word]*3)
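
That works because the star passes the three list elements as separate
arguments, and print puts its default separator, a space, between
them. If the string itself is needed, join does the same job. A sketch
with a fixed word instead of input():

```python
word = 'word'

# Three separate arguments, separated by print's default sep=' '.
print(*[word] * 3)

# The same text as a string.
line = ' '.join([word] * 3)
print(line)
```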


Re: Appending to a list, which is value of a dictionary

2016-10-15 Thread Jussi Piitulainen
Chris Angelico writes:

> On Sat, Oct 15, 2016 at 11:35 PM, Uday J wrote:
> bm=dict.fromkeys(l,['-1','-1'])
>
> When you call dict.fromkeys, it uses the same object as the key every
> time. If you don't want that, try a dict comprehension instead:

s/key/value/


Re: Appending to a list, which is value of a dictionary

2016-10-15 Thread Jussi Piitulainen
Uday J writes:

> Hi,
>
> Here is the code, which I would like to understand.
>
> >>> l=['a','b','c']
> >>> bm=dict.fromkeys(l,['-1','-1'])
> >>> u={'a':['Q','P']}
> >>> bm.update(u)
> >>> bm
> {'a': ['Q', 'P'], 'c': ['-1', '-1'], 'b': ['-1', '-1']}
> >>> for k in bm.keys():
> ...     bm[k].append('DDD')
>
> >>> bm
> {'a': ['Q', 'P', 'DDD'], 'c': ['-1', '-1', 'DDD', 'DDD'], 'b': ['-1', '-1',
> 'DDD', 'DDD']}
>
> I was expecting appending DDD to happen once for 'c' and 'b'.
> {'a': ['Q', 'P', 'DDD'], 'c': ['-1', '-1', 'DDD'], 'b': ['-1', '-1', 'DDD']}

It happened once for 'c' and once for 'b' but bm['c'] and bm['b'] are
the same list so it happened twice for that list.
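
A sketch that shows the sharing and one way around it: a dict
comprehension evaluates the list display once per key, so every key
gets a list of its own. The names shared and same_list are mine.

```python
keys = ['a', 'b', 'c']

# dict.fromkeys stores the very same list object under every key.
shared = dict.fromkeys(keys, ['-1', '-1'])
same_list = shared['b'] is shared['c']

# A dict comprehension builds a fresh list for each key.
bm = {k: ['-1', '-1'] for k in keys}
bm.update({'a': ['Q', 'P']})
for k in bm:
    bm[k].append('DDD')

print(same_list)
print(bm)
```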


Re: Python-based monads essay part 2

2016-10-13 Thread Jussi Piitulainen
Gregory Ewing writes:

> A bit more on SMFs, and then some I/O.
>
> http://www.cosc.canterbury.ac.nz/greg.ewing/essays/monads/DemystifyingMonads2.html

Thanks.

It would be good to spell out SMF at the start of the page.

"The definition of / above" (__truediv__ method) was not given "above"
(in the definition of SMF).

