RE: "Invalid literal for int() with base 10": is it really a literal?

2023-05-26 Thread avi.e.gross
Roel,

In order for the code to provide different error messages, it needs a way to 
differentiate between circumstances. 

As far as the int() function is concerned, it sees a string of characters and 
has no clue where they came from. In Python, int(input()) just runs input() 
first and creates a string and then passes it along to int().

You can of course argue there are ways to phrase an error message that may be 
less technicalese.
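
To make the point concrete, here is a minimal sketch showing that int() raises the same ValueError wherever the string came from, and how a caller might rephrase the message. The helper name parse_int and its message wording are my own invention for illustration, not anything Python provides.

```python
def parse_int(text):
    """Hypothetical helper: convert text to int or raise a reworded ValueError."""
    try:
        return int(text)
    except ValueError:
        # int() only ever sees a str object; it cannot know whether the
        # characters came from a literal, a file, or input().
        raise ValueError(f"invalid value for conversion to int: {text!r}") from None


from_literal = "hello"
from_elsewhere = "".join(["hel", "lo"])  # assembled at run time

for candidate in (from_literal, from_elsewhere, "42"):
    try:
        print(parse_int(candidate))
    except ValueError as err:
        print(err)
```

Both failing strings produce the same message, because by the time int() runs they are indistinguishable.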

-Original Message-
From: Python-list  On 
Behalf Of Roel Schroeven
Sent: Friday, May 26, 2023 3:55 AM
To: python-list@python.org
Subject: "Invalid literal for int() with base 10": is it really a literal?

Kevin M. Wilson's post "Invalid literal for int() with base 10?" got me 
thinking about the use of the word "literal" in that message. Is it 
correct to use "literal" in that context? It's correct in something like 
this:

 >>> int('invalid')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'invalid'

But something like this generates the same message:

 >>> int(input())
hello
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'hello'

In cases like this there is no literal in sight.

I'm thinking it would be more correct to use the term 'value' here: 
ValueError: invalid value for int() with base 10: 'hello'
Does my reasoning make sense?

-- 
"I love science, and it pains me to think that so many are terrified
of the subject or feel that choosing science means you cannot also
choose compassion, or the arts, or be awed by nature. Science is not
meant to cure us of mystery, but to reinvent and reinvigorate it."
 -- Robert Sapolsky

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: OT: Addition of a .= operator

2023-05-24 Thread avi.e.gross
It may be a matter of taste and policies, Dave.

I am talking about whether to write your code so it looks good to you, and
dealing with issues like error messages only when needed, or whether to
first do all kinds of things to catch errors or make it easier if they pop
up.

Python can be written fairly compactly and elegantly when trying out an
algorithm. But if you pepper it with print statements all over the place
showing the current values of variables and return codes (perhaps commented
out or hiding in an IF statement set to False) then you have a version
harder to read even if it can potentially be very useful. If your code is
constantly specifying what types variables must be or testing constraints,
it may well compile but for some users all that gets in the way of seeing
the big picture. 

In this case, we are discussing issues like how to spread code onto multiple
lines and opinions differ. In languages that do not use indentation as
having special meaning, I often like to stretch out and use lots of lines
for something like a function call with umpteen arguments and especially one
containing nested similar dense function calls. A good text editor can be
helpful in lining up the code so things at the same indentation level have
meaning as do things at other levels.

I will say that the try/catch type idioms that surround every piece of code,
often nested, can make code unreadable.

Similarly, some languages make it easy to do chaining in ways that use
multiple lines.

Since python is (justifiably) picky about indentation, I use such features
less and more cautiously and sometimes need to carefully do things like add
parentheses around a region to avoid inadvertent misunderstandings. 
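
As a small illustration of that last point (the variable names and the sample string are made up), wrapping a chain in parentheses lets it span several lines without Python objecting to the layout:

```python
raw_name = "  Ada Lovelace  "

# Without the parentheses, a line starting with "." would be a syntax error.
user = (
    raw_name
    .strip()
    .lower()
    .replace(" ", "_")
)
print(user)
```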

When I do my work for myself and am not expecting serious errors I tend to
write the main program first and only then enhance it as needed. If working
with a group and established standards, of course, we follow whatever
methods are needed, and especially if a large part of the effort is to test
thoroughly against requirements.



-Original Message-
From: Python-list  On
Behalf Of dn via Python-list
Sent: Wednesday, May 24, 2023 1:19 AM
To: python-list@python.org
Subject: Re: OT: Addition of a .= operator

On 24/05/2023 12.27, Chris Angelico wrote:
> On Wed, 24 May 2023 at 10:12, dn via Python-list wrote:
>> However, (continuing @Peter's theme) such confuses things when something
>> goes wrong - was the error in the input() or in the float()?
>> - particularly for 'beginners'
>> - and yes, we can expand the above discussion to talk about
>> error-handling, and repetition until satisfactory data is input by the
>> user or (?frustration leads to) EOD...
> 
> A fair consideration! Fortunately, Python has you covered.
> 
> $ cat asin.py
> import math
> 
> print(
>     math.asin(
>         float(
>             input("Enter a small number: ")
>         )
>     )
> )
> $ python3 asin.py
> Enter a small number: 1
> 1.5707963267948966
> $ python3 asin.py
> Enter a small number: 4
> Traceback (most recent call last):
>   File "/home/rosuav/tmp/asin.py", line 4, in <module>
>     math.asin(
> ValueError: math domain error
> $ python3 asin.py
> Enter a small number: spam
> Traceback (most recent call last):
>   File "/home/rosuav/tmp/asin.py", line 5, in <module>
>     float(
> ValueError: could not convert string to float: 'spam'
> 
> Note that the line numbers correctly show the true cause of the
> problem, despite both of them being ValueErrors. So if you have to
> debug this sort of thing, make sure the key parts are on separate
> lines (even if they're all one expression, as in this example), and
> then the tracebacks should tell you what you need to know.


Yes, an excellent example to show newcomers to make use of 'the 
information *provided*' - take a deep breath and read through it all, 
picking-out the important information...


However, returning to "condense this into a single line", the 
frequently-seen coding is (in my experience, at least):

 quantity = float( input( "How many would you like? " ) )

which would not produce the helpful distinction between 
line-numbers/function-calls which the above (better-formatted) code does!


Summarising (albeit IMHO):

- if relatively trivial/likely to be well-known: collect the calls into 
a chain, eg

 user = user.strip().lower()

- otherwise use separate lines in order to benefit from the stack-trace

- (still saying) use separate assignments, rather than chaining more 
complex/lesser-known combinations

- ensure that the 'final' identifier is meaningful (and perhaps the 
first, and/or even an 'intermediate' if pertinent or kept for re-use later)

- perhaps re-use a single identifier-name as a temp-variable, if can 
reasonably 'get away with it'
(and not confuse simple minds, like yours-truly)
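
A minimal sketch of the "separate assignments" advice above, assuming a made-up function name (arcsine_from_text). Each step sits on its own line, so a traceback would name the step that failed even though both failures are ValueErrors:

```python
import math


def arcsine_from_text(text):
    number = float(text)       # a ValueError here: the text was not a number
    angle = math.asin(number)  # a ValueError here: the number was outside [-1, 1]
    return angle


for sample in ("1", "4", "spam"):
    try:
        print(sample, "->", arcsine_from_text(sample))
    except ValueError as err:
        print(sample, "-> ValueError:", err)
```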


-- 
Regards,
=dn
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Addition of a .= operator

2023-05-20 Thread avi.e.gross
I would suggest thinking carefully about ramifications as well as any benefits 
of adding some sort of .= operator.

It sounds substantially different from the whole slew of +=, *= and similar 
operators. The goal of those, some would say, is either to let the interpreter 
optimize by not evaluating the target twice, as in x = x + 1, or to support 
Python dunder methods such as __iadd__() that let you control exactly what is 
done, and which sometimes make a += b do things a bit differently than 
a = a + b.
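
For anyone who has not run into that difference, here is a short sketch using plain lists (the variable names are arbitrary). The augmented form goes through list.__iadd__ and changes the existing object, while the spelled-out form builds a new list and rebinds the name:

```python
# In-place: list.__iadd__ extends the existing list object.
a = [1, 2]
alias_of_a = a
a += [3]

# Rebinding: list.__add__ builds a new list; the old object is untouched.
b = [1, 2]
alias_of_b = b
b = b + [3]

print(alias_of_a)  # the alias sees the change
print(alias_of_b)  # the alias still refers to the old list
```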

So what would a __i_invoke_method__() function look like? It seems you want the 
command to sort of replace
Object = Object.method(args)

But for any method whatsoever.

But Python objects can have methods all over the place as they may be 
subclasses that inherit methods, or may implement an abstract method or use 
multiple inheritance. All that searching happens in the current way, so if 

Object.method(args)

works as expected, would your new method be just syntactic sugar, or would it 
look for a dunder method that may have no idea initially what you want?

Just curious.

Is there an alternative way you could get the functionality without using the 
same way that is used for a very different circumstance?


-Original Message-
From: Python-list  On 
Behalf Of 2qdxy4rzwzuui...@potatochowder.com
Sent: Saturday, May 20, 2023 2:49 PM
To: python-list@python.org
Subject: Re: Addition of a .= operator

On 2023-05-21 at 06:11:02 +1200,
dn via Python-list  wrote:

> On 21/05/2023 05.54, Alex Jando wrote:
> > I have many times had situations where I had a variable of a certain type 
> > and all I cared about was one of its methods.
> > 
> > For example:
> > 
> > 
> > import hashlib
> > hash = hashlib.sha256(b'word')
> > hash = hash.hexdigest()
> > 
> > import enum
> > class Number(enum.Enum):
> >  One: int = 1
> >  Two: int = 2
> >  Three: int = 3
> > num = Number.One
> > num = num.value
> > 
> > 
> > Now to be fair, in the two situations above, I could just access the method 
> > right as I declare the object, however, sometimes when passing values into 
> > functions, it's a lot messier to do that.

Can you give an example, preferably one from an actual program, that
shows the mess?  Is it More Messier™ than the difference between the
following examples?

# example 1
hash = hashlib.sha256(b'word')
f(hash.hexdigest()) # call f with hash's hexdigest

# example 2
hash = hashlib.sha256(b'word')
hash = hash.hexdigest() # extract hash's hexdigest
f(hash) # call f with hash's hexdigest

Can you also show what your code would look like with a .= operator?

> > So what I'm suggesting is something like this:
> > 
> > 
> > import hashlib
> > hash = hashlib.sha256(b'word')
> > hash.=hexdigest()
> > 
> > import enum
> > class Number(enum.Enum):
> >  One: int = 1
> >  Two: int = 2
> >  Three: int = 3
> > num = Number.One
> > num.=value
> > 
> 
> A custom-class wrapper?
> Even, a decorator-able function?
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: An "adapter", superset of an iterator

2023-05-03 Thread avi.e.gross
As others have mentioned, features added like this need careful examination
not only of their effects but also of their costs.

As I see it, several somewhat different ideas were raised and one of them
strikes me oddly. The whole point of an iterable is to AVOID calculating the
next item till needed. Otherwise, you can just make something like a list.

To talk about random access to an iterable is a tad weird as it would mean
you need to get the first N items and store them and return the Nth item
value as well as maintain the remainder of the unused part of the iterable.
Further requests already in the cache would be gotten from there and any
beyond it would require iterating more and adding more to the cache.
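
A rough sketch of what such a caching wrapper might look like. The class name CachedIterator is hypothetical, and the choice to raise IndexError past the end or for negative indexes is just one possible answer to the design questions raised here:

```python
class CachedIterator:
    """Hypothetical wrapper giving index access to an iterator by caching items."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._cache = []

    def __getitem__(self, index):
        if index < 0:
            raise IndexError("negative indexes would mean exhausting the source")
        # Pull items from the source only until the requested one is cached.
        while len(self._cache) <= index:
            try:
                self._cache.append(next(self._source))
            except StopIteration:
                raise IndexError("index is beyond the end of the iterator") from None
        return self._cache[index]


squares = CachedIterator(n * n for n in range(5))
print(squares[3])  # iterates as far as item 3 and caches items 0..3
print(squares[1])  # served from the cache, no further iteration
```

Note that the memory cost grows with the largest index requested, which is the price being discussed.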

So say my iterator returns the first N primes or just those below 100. What
should be the functionality if you request item 1,000? 

As for reversing it, that requires you to basically do list(iterable) and
use it up. What if the iterable is infinite as in all the odd numbers? 

If you really want an iterable that returns something like prime numbers
below some level in reverse order, that can be done by changing the iterable
to create them going downward and that would be a different iterator. But
how easily could you make some iterators go backward? Fibonacci, maybe so.
Other things perhaps not.

But again, as noted, anything already in a list can be set up as an iterator
that returns one item at a time from that list, including in reverse. There
won't be much savings as the data structure inside would likely be spread
out to take all the memory needed, albeit it may simplify the code to look
like it was being delivered just in time.

As with many things in python, rather than asking for a global solution that
affects many others, sometimes in unexpected ways, it may be more reasonable
to make your own patches to your code and use them in ways you can control.
In the case being discussed, you simply need to create a generator function
that accepts an iterator, converts it to a list in entirety, reverses the
list (or deals with it from the end) and enters a loop where it yields one
value at a time till done. This should work with all kinds of iterators and
return what looks like an iterator without any changes to the language.
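
The generator just described is only a few lines. This is a sketch under the obvious assumption that the input is finite; the name reversed_iter is made up:

```python
def reversed_iter(iterable):
    """Hypothetical helper: consume a finite iterable, then yield it backwards."""
    items = list(iterable)  # uses up the whole iterator; never returns if infinite
    for item in reversed(items):
        yield item


for index, letter in reversed_iter(enumerate("abc")):
    print(index, letter)
```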

Of course, I am likely to be missing something. And, certainly, there may
already be modules doing things like the above or the opportunity for
someone to create a module such as the itertools module with nifty little
functions including factory functions.

-Original Message-
From: Python-list  On
Behalf Of Oscar Benjamin
Sent: Wednesday, May 3, 2023 3:47 PM
To: python-list@python.org
Subject: Re: An "adapter", superset of an iterator

On Wed, 3 May 2023 at 18:52, Thomas Passin  wrote:
>
> On 5/3/2023 5:45 AM, fedor tryfanau wrote:
> > I've been using python as a tool to solve competitive programming problems
> > for a while now and I've noticed a feature, python would benefit from
> > having.
> > Consider "reversed(enumerate(a))". This is a perfectly readable code,
> > except it's wrong in the current version of python. That's because
> > enumerate returns an iterator, but reversed can take only a sequence type.
>
> Depending on what you want to give and receive, enumerate(reversed(a))
> will do the job here.  Otherwise list() or tuple() can achieve some of
> the same things.

I don't think that is equivalent to the intended behaviour:

reversed(enumerate(a)) # zip(reversed(range(len(a))), reversed(a))
enumerate(reversed(a)) # zip(range(len(a)), reversed(a))

In principle for a sequence input enumerate(a) could be something that
behaves like a sequence and therefore could be reiterated or reversed
etc. The enumerate(a).__reversed__ method could then delegate to
a.__reversed__ and a.__len__ if they exist. This could be confusing
though because the possible behaviour of enumerate(a) would be
different depending on the type of a.
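
To see the difference between the two spellings on a concrete sequence (the sample list is arbitrary):

```python
a = ["x", "y", "z"]

# What reversed(enumerate(a)) is presumably meant to produce:
intended = list(zip(reversed(range(len(a))), reversed(a)))

# What enumerate(reversed(a)) actually produces:
suggested = list(enumerate(reversed(a)))

print(intended)   # indexes count down along with the items
print(suggested)  # indexes count up while the items run backwards
```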

--
Oscar
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: How to 'ignore' an error in Python?

2023-04-29 Thread avi.e.gross
I get a tad suspicious when someone keeps telling us every offered solution
does not feel right. Perhaps they are not using the right programming
language, as clearly they are not willing to work with it as it is rather
than as they think it should be.

After all the back and forth, there are several choices, including accepting
whatever method is least annoying to them, or rolling their own.

Create a function with a name like do_it_my_way_or_the_highway() that uses
any acceptable method but completely hides the implementation details from
your code. One way might be to get the source code and copy it under your
own name and modify it so the default is to do what you want. It can even be
as simple as a small wrapper that forwards to the original function with a
keyword set by default.
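
The "small wrapper" option might look like the sketch below. The name ensure_dir is invented for illustration; it simply forwards to os.makedirs() with exist_ok=True so the calling code never sees the detail:

```python
import os
import tempfile


def ensure_dir(path):
    """Hypothetical wrapper: create path if needed; an existing directory is fine."""
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "reports", "2023")
    ensure_dir(target)
    ensure_dir(target)  # the second call quietly does nothing
    created = os.path.isdir(target)

print(created)
```

One caveat: os.makedirs() still raises FileExistsError if the path exists but is a regular file, which is arguably what you want.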

After all, this does seem to be a bit like what you are asking for. A way to
call your functionality that does it the way you insist it should have been
designed, never mind that many others are happy with it as it is and use the
techniques mentioned at other times.

But I do have sympathy. I have seen lots of simple-minded code that seems to
cleanly and elegantly solve a problem as long as all the ducks are just-so.
Then someone points out that the code may break if it is called with some
other type than expected or if it tries to divide by zero or if something
else changes a variable between the time you looked at it and the time you
update it and so on. Next thing you know, your code grows (even
exponentially) to try to handle all these conditions and includes lots of
nested IF statements and all kinds of TRY statements and slows down and is
hard to read or even think about. And to make it worse, people ask for your
formerly simple function to become a Swiss army knife that accepts oodles of
keyword arguments that alter various aspects of the behavior!

So, yes, it can feel wrong. But so what? Sometimes you can find ways to
reduce the complexity and sometimes you simply create a few accessory
functions you can use that tame the complexity a bit. But almost any complex
program in any language can require a loss of simplicity.


-Original Message-
From: Python-list  On
Behalf Of Kushal Kumaran
Sent: Saturday, April 29, 2023 12:19 AM
To: python-list@python.org
Subject: Re: How to 'ignore' an error in Python?

On Fri, Apr 28 2023 at 04:55:41 PM, Chris Green  wrote:
> I'm sure I'm missing something obvious here but I can't see an elegant
> way to do this.  I want to create a directory, but if it exists it's
> not an error and the code should just continue.
>
> So, I have:-
>
> for dirname in listofdirs:
>     try:
>         os.mkdir(dirname)
>     except FileExistsError:
>         # so what can I do here that says 'carry on regardless'
>     except:
>         # handle any other error, which is really an error
>
>     # I want code here to execute whether or not dirname exists
>
>
> Do I really have to use a finally: block?  It feels rather clumsy.
>
> I suppose I could test if the directory exists before the os.mkdir()
> but again that feels a bit clumsy somehow.
>
> I suppose also I could use os.makedirs() with exist_ok=True but again
> that feels vaguely wrong somehow.
>

Why does exist_ok=True feel wrong to you?  This is exactly what it is
there for.
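
For completeness, a sketch of the try/except form the question was reaching for, using a temporary directory so it can be run safely. The directory names are made up. No finally: block is needed; "carry on regardless" is just pass, and contextlib.suppress is an equivalent shorthand:

```python
import os
import tempfile
from contextlib import suppress

with tempfile.TemporaryDirectory() as base:
    listofdirs = [os.path.join(base, name) for name in ("a", "b", "a")]
    visited = []
    for dirname in listofdirs:
        try:
            os.mkdir(dirname)
        except FileExistsError:
            pass  # the directory is already there: carry on regardless
        visited.append(os.path.basename(dirname))  # runs in either case

    # The same idea without an explicit try block.
    with suppress(FileExistsError):
        os.mkdir(listofdirs[0])

    all_exist = all(os.path.isdir(d) for d in listofdirs)

print(visited, all_exist)
```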

-- 
regards,
kushal
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Question regarding unexpected behavior in using __enter__ method

2023-04-25 Thread avi.e.gross


I think you got that right, Rob. A method created in a class is normally 
expected to care about the class in the sense that it often wants to access 
internal aspects and is given a "this" or "self" or whatever name you choose as 
a first argument. As noted, it is sometimes possible to create a function 
attached not to an instance but to the class itself. The effect is a bit like 
the math module, which is not a class and is never instantiated as an object 
but lets you use things like math.pi and math.cos() and so on.

A comment on dunder methods in python is that they have a sort of purpose, 
albeit you can hijack some to do other things. The protocol for WITH is a bit 
slippery, as __enter__() and __exit__() are expected to do some abstract things: 
loosely, to set something up at the start in a way that is guaranteed to be 
cleaned up by the exit routine when the block is done. This can be 
about opening a file, or network connection and later closing it, or setting up 
some data structure and freeing the memory at the end, but it could be ANYTHING 
you feel like. For example, it can turn logging of some kind on and off and 
also compress the log file at the end. Or it could set up changes to the object 
that are there for the duration of the WITH and then reset the changes back at 
the end. 

An imaginary example might be to start caching what some methods are doing or 
replace a method by another, then empty the cache at the end or put back the 
redirected one.
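
A sketch of that imaginary example, with every name (TemporaryAttribute, Greeter, greet) made up for the purpose. It swaps a method on one object for the duration of the WITH and puts things back at the end:

```python
_MISSING = object()  # sentinel: the instance had no attribute of its own


class TemporaryAttribute:
    """Hypothetical context manager: set an attribute on entry, restore on exit."""

    def __init__(self, target, name, replacement):
        self._target = target
        self._name = name
        self._replacement = replacement

    def __enter__(self):
        # Remember only what the instance itself held, so that a method
        # inherited from the class becomes visible again afterwards.
        self._saved = vars(self._target).get(self._name, _MISSING)
        setattr(self._target, self._name, self._replacement)
        return self._target

    def __exit__(self, exc_type, exc_value, traceback):
        if self._saved is _MISSING:
            delattr(self._target, self._name)
        else:
            setattr(self._target, self._name, self._saved)
        return False  # do not swallow exceptions raised inside the block


class Greeter:
    def greet(self):
        return "hello"


greeter = Greeter()
with TemporaryAttribute(greeter, "greet", lambda: "HELLO") as patched:
    inside = patched.greet()
after = greeter.greet()
print(inside, after)
```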

And if what you want done at the beginning or end is outside the object being 
worked on, fine. Consider wrapping your function call in a simple function that 
calls the one you want after ignoring or removing the first argument. There are 
decorators that can do things like that.

So if you want int() or some existing plain non-member function, define an f() 
whose body calls int() with all arguments passed along other than the first. 

I just wrote and tested a trivial example where for some reason you just want 
to call sum() either with an iterable argument or with a second unnamed or 
named argument that specified a start you can add to. If this is written as a 
class method, it would have a first argument of "self" to ignore so I simulate 
that here:

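def plusone(first, *rest, **named):
    # "first" stands in for self and is deliberately ignored;
    # everything else is forwarded to sum() unchanged.
    return sum(*rest, **named)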

If you call this as below with valid arguments, it sort of swallows the first 
argument and passes the rest along:

>>> plusone("ignore", [])
0
>>> plusone("ignore", [1,2,3])
6
>>> plusone("ignore", [1,2,3], 100)
106
>>> plusone("ignore", range(7), start=100)
121

Yes, anything like this adds overhead. It does add flexibility and allows you 
to hijack the WITH protocol to do other things perhaps never anticipated but 
that may make sense, such as changing a companion object rather than the 
current one. But you need to live within some rules to do things and that means 
knowing there will be a first argument.

Avi
-Original Message-
From: Python-list  On 
Behalf Of Rob Cliffe via Python-list
Sent: Saturday, April 22, 2023 9:56 AM
To: Lorenzo Catoni ; python-list@python.org
Subject: Re: Question regarding unexpected behavior in using __enter__ method

This puzzled me at first, but I think others have nailed it.  It is not 
to do with the 'with' statement, but with the way functions are defined.
When a class is instantiated, as in x=X():
 the instance object gets (at least in effect), as attributes, 
copies of functions defined *in the class* (using def or lambda) but 
they become "bound methods", i.e. bound to the instance.  Whenever they 
are called, they will be called with the instance as the first argument, 
aka self:
     class X(object):
         def func(*args, **kargs): pass
     x = X()
     y = X()
x.func and y.func are two *different* functions.  When x.func is called, 
x is added as the first argument.  When y.func is called, y is added as 
the first argument.
     boundFunc = y.func
     boundFunc() # Adds y as first argument.
Indeed, these functions have an attribute called __self__ whose value is 
... you guessed it ... the object they are bound to.
When a function is defined outside of a class, it remains a simple 
function, not bound to any object.  It does not have a __self__ 
attribute.  Neither does a built-in type such as 'int'.
Nor for that matter does the class function X.func:
 X.func() # Called with no arguments
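
The points above can be checked directly. This is only a sketch; the names plain and WithInt are made up, and the last part mirrors the __enter__ = int case from the original question:

```python
class X:
    def func(*args, **kwargs):
        return args


def plain(*args):
    return args


x = X()
bound = x.func

print(bound.__self__ is x)         # the bound method remembers its instance
print(bound() == (x,))             # so x arrives as the first argument
print(X.func() == ())              # looked up on the class: nothing is added
print(hasattr(plain, "__self__"))  # a plain function has no __self__


class WithInt:
    __enter__ = int             # a type, not a def/lambda function: never bound
    __exit__ = lambda *_: None  # a lambda is a function, so it does receive self


with WithInt() as value:
    pass
print(value)
```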

Best wishes
Rob Cliffe

On 20/04/2023 23:44, Lorenzo Catoni wrote:
> Dear Python Mailing List members,
>
> I am writing to seek your assistance in understanding an unexpected
> behavior that I encountered while using the __enter__ method. I have
> provided a code snippet below to illustrate the problem:
>
> ```
> >>> class X:
> ...     __enter__ = int
> ...     __exit__ = lambda *_: None
> ...
> >>> with X() as x:
> ...     pass
> ...
> >>> x
> 0
> ```
> As you can see, the __enter__ method does not throw any exceptions and
> returns the output of "int()" correctly. However, one would 

RE: Weak Type Ability for Python

2023-04-14 Thread avi.e.gross
Dennis,

Before I reply, let me reiterate I am NOT making a concrete suggestion, just 
having a somewhat abstract discussion.

The general topic is a sort of polymorphism I envisioned where a select group 
of classes/objects that can be seen as different aspects of an elephant can be 
handled to provide some functionality in a consistent way. We all agree much of 
the functionality can be done deliberately by individual programmers. The 
question was whether anyone had done a more general implementation or even saw 
any reason to do so.

Fair enough?

So let us assume I have an object, call it obj1, that encapsulates data the old 
fashioned way. Consider a classical case like an object holding information 
about a parallelepiped or something like a shoebox. How you store the info can 
vary, such as recording a height/width/depth, or a series of x,y,z coordinates 
representing some of the vertices. But no matter how you store the basic info, 
you can derive many things from them when asked to provide a volume or surface 
area or whether it will fit within another object of the same kind assuming the 
sides have no width. Or, you can ask it to return another instance object that 
has double the width or many other things.

There are several ways to provide the functionality, actually quite a few, but 
one is to make a method for each thing it does such as obj1.get_surface_area(), 
obj1.get_volume() and obj1.does_it_fit_in(cl2) and of course you can have 
methods that change the orientation or ask what angles it is oriented at now 
and whatever else you want.

Each such method will return something of a usually deterministic type. Volumes 
will be a double, for example. But what if you design a small language so you 
can type obj1.get_by_name("volume") and similar requests, or even a comma 
separated grouping of requests that returns a list of the answers? It now is 
not so deterministic-looking to a linter. But normal Python allows and often 
encourages such polymorphism so is this anything new?
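
A toy version of that idea, purely as a sketch. The Box class and the convention that get_by_name("volume") maps onto a method called get_volume() are assumptions of mine:

```python
class Box:
    """Hypothetical shoebox stored as width, height and depth."""

    def __init__(self, width, height, depth):
        self.width, self.height, self.depth = width, height, depth

    def get_volume(self):
        return self.width * self.height * self.depth

    def get_surface_area(self):
        w, h, d = self.width, self.height, self.depth
        return 2 * (w * h + w * d + h * d)

    def get_by_name(self, names):
        """Accept 'volume' or 'volume, surface_area'; return a value or a list."""
        requested = [name.strip() for name in names.split(",")]
        # An unknown name raises AttributeError, as a misspelled method would.
        results = [getattr(self, "get_" + name)() for name in requested]
        return results[0] if len(results) == 1 else results


obj1 = Box(2, 3, 4)
print(obj1.get_by_name("volume"))
print(obj1.get_by_name("volume, surface_area"))
```

The return type now depends on the string passed in, which is exactly what makes it look less deterministic to a linter.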

What I envisioned is a tad closer to the latter. Not this:

a = thisType(3)
b = thisType(7)
c = 9   #plain integer
print(a + b + c)

Note the above example is standard. My thoughts are a bit more arcane and 
focused on convertibility of a single value into multiple forms.

Say I have a data type that stores a number representing a temperature. It may 
have ways to initialize (or change) the temperature so it can be input as 
degrees centigrade or Fahrenheit or Kelvin or Rankine or even more indirect 
ways such as 10 degrees Fahrenheit above the freezing point of pure water at a 
particular atmospheric pressure and so on.

What I want to add is a bit like this. Internally many methods may get created 
that may not be expected to be used except through a designated interface. Call 
them f1() and f2() ... fn() for now.

Also in the class initialization or perhaps in the object dunder init, you 
create something like a dictionary consisting of key words matched by pointers 
to the aforementioned functions/methods. This structure will have some name 
designated by the protocol such as _VIEWS and may be visible to anyone looking 
at the object. The details can be worked out but this is a simplistic 
explanation.

In this dictionary we may have words like "Celsius", "Fahrenheit" and so on, 
perhaps even several variants that point to the same functions. If a user wants 
the temperature in absolute terms, they may call a standard function like 
"obj1.as_type('Kelvin')" and that function will search the dictionary and get 
you the results using the appropriate conversion method. You may also support 
other accessors like 'obj1.supports_type("Fahrenheit451")' that reports as 
True/False whether the object can handle that output. It merely checks the 
internal dictionary. You may have another that returns a list of the data types 
as keys and whatever else is part of the design.
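
Here is one way the description above could be sketched. Only the names _VIEWS, as_type and supports_type come from the text; the Temperature class, the private converter names, view_names() and the choice to store Kelvin internally are my own assumptions:

```python
class Temperature:
    """Hypothetical class storing Kelvin and exposing other scales by name."""

    def __init__(self, kelvin):
        self._kelvin = float(kelvin)

    # Internal converters, not meant to be called directly.
    def _as_kelvin(self):
        return self._kelvin

    def _as_celsius(self):
        return self._kelvin - 273.15

    def _as_fahrenheit(self):
        return self._kelvin * 9 / 5 - 459.67

    def _as_rankine(self):
        return self._kelvin * 9 / 5

    # Key words mapped to the functions above; variants share a function.
    _VIEWS = {
        "Kelvin": _as_kelvin,
        "Celsius": _as_celsius,
        "Centigrade": _as_celsius,
        "Fahrenheit": _as_fahrenheit,
        "Rankine": _as_rankine,
    }

    def supports_type(self, name):
        return name in self._VIEWS

    def as_type(self, name):
        # The dictionary holds plain functions, so self is passed explicitly.
        return self._VIEWS[name](self)

    def view_names(self):
        return list(self._VIEWS)


obj1 = Temperature(273.15)
print(obj1.as_type("Celsius"), obj1.as_type("Fahrenheit"))
print(obj1.supports_type("Fahrenheit451"))
print(obj1.view_names())
```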

You can, of course, have a second set of such directives that instead of 
returning a temperature as a double, will return a printable text version that 
includes ℃ or K or °R or °F.

A second example could be something holding a date with plenty of internal 
abilities to display it in a wide variety of formats or maybe just holds a day 
of the week that it will display as a string in any language it handles, such 
as Sunday being shown as:
יוֹם רִאשׁוֹן
Dimanĉo
Vasárnap
Sonntag
Dimanche
Zondag
日曜日
रविवार

And so on. Again, it need not store the text for every language but can call 
translation software as needed and it can be more than the name of a day of the 
week. It could have a dictionary containing all the languages it handles as 
described for another example and access methods. Of course, if called on 
repeatedly and often for the same languages, it could cache results.

My question, again, is not whether this can be done but whether some kind of 
protocol can be created that is published and suggests the names and so on to 
use in 

RE: RE: Weak Type Ability for Python

2023-04-13 Thread avi.e.gross
Yes, Dave, there are many data structures that can be used to maintain a
list of output types the class claims to support. Dictionaries have the
interesting property that you can presumably have a value that holds a
member function to access the way the key specifies.

Ideally, the order is not important for what I am looking for. Generally, I
would think that any class like the ones I have been discussing, would want
to broadcast a fairly short list of output types it would support.

Of course, if you look at my date example, the list could be quite big but
if you simply use something like strftime() for many of the formats, perhaps
you may not need to list all possible ones. Any valid format could be
accepted as an argument and passed to such a utility function. Your
dictionary might simply store some commonly used formats known to work.
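
As a sketch of that arrangement (the dictionary contents and the function name format_date are invented), a few named formats are stored and anything else is handed to strftime() as a raw format string:

```python
from datetime import date

# A short, made-up list of commonly used formats known to work.
COMMON_FORMATS = {
    "iso": "%Y-%m-%d",
    "us": "%m/%d/%Y",
}


def format_date(day, name_or_format):
    """Look up a named format; otherwise treat the argument as a strftime format."""
    fmt = COMMON_FORMATS.get(name_or_format, name_or_format)
    return day.strftime(fmt)


when = date(2023, 4, 13)
print(format_date(when, "iso"))
print(format_date(when, "%d.%m.%Y"))
```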

But I repeat. This is not a serious request. I know how to build limited
functionality like this if I ever want it, but wonder if anyone has ever
created a proposal for some protocols and perhaps helpers like say an
embedded object that handles aspects of it once you have initialized your
dictionary and also handles requests to show part of what is stored for any
shoppers wondering if you are compatible with their needs.

-Original Message-
From: Python-list  On
Behalf Of 2qdxy4rzwzuui...@potatochowder.com
Sent: Thursday, April 13, 2023 10:27 PM
To: python-list@python.org
Subject: Re: RE: Weak Type Ability for Python

On 2023-04-13 at 22:14:25 -0400,
avi.e.gr...@gmail.com wrote:

> I am looking at a data structure that is an object of some class and
> stores the data in any way that it feels like. But it may be a bit of
> a chameleon that shows one face or another as needed. I can write code
> now that simply adds various access methods to the class used and also
> provides a way to query if it supports some interfaces.

Python dicts act mostly like hash tables.  All by themselves, hash
tables are unordered (and in return for giving up that order, you get
O(1) access to an item if you know its key).

But when you ask a Python dict for the keys, you always get them in the
same order, skipping those that have been deleted since the last time
you asked, and appending the new keys to the end of the list in the
order in which you added them.

There's your chameleon.
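
The ordering behaviour described above is easy to confirm on any current Python, where dicts preserve insertion order (the keys here are arbitrary):

```python
views = {}
views["c"] = 1
views["a"] = 2
views["b"] = 3

del views["a"]  # removed from its original position
views["a"] = 4  # re-added, so it now goes to the end

print(list(views))
```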
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: RE: Weak Type Ability for Python

2023-04-13 Thread avi.e.gross
Alan,

Your guess is not quite what I intended.

Something like a C union is just a piece of memory large enough to hold one of 
several kinds of content and some way to figure out which is currently in place.

I am looking at a data structure that is an object of some class and stores the 
data in any way that it feels like. But it may be a bit of a chameleon that 
shows one face or another as needed. I can write code now that simply adds 
various access methods to the class used and also provides a way to query if it 
supports some interfaces.

Consider a dumb example. I have an object that holds a temperature and stores 
it in say degrees Celsius. There are simple formulas that can convert it to 
Fahrenheit or Kelvin or Rankine. So you can create access methods like 
get_as_Rankine() but this will only be useful for some programs that know about 
the interface.

So what if you had a variable in the class such as supported_formats that 
presented something like a list of scales supported using an official set of 
names? It may even be possible to get a reference to the function to call to 
get that functionality, or perhaps you have one access function that accepts 
any argument on the list and delivers what is wanted.

The temperature would only need to be stored in one format but be available in 
many. Of course, you could choose to precalculate and store others, or cache 
them when one request has come in and so forth.

Another example  would be dates stored in some format in a class that can 
deliver the result in all kinds of formats. Yes, we have functions that do 
things like that. But can you see advantages to the class hiding lots of 
details internally?

These are just examples but the point is motivated by some interfaces I have 
seen.

How do you know if something can be used as a context manager, such as in a 
"with" statement? There may be other ways, but it seems two dunder methods, if 
present, likely mean it can. They are __enter__() and __exit__().

There are other interfaces, like the one for iterators, that are sometimes more 
complex because when some methods are not present, Python falls back on others. Can you have a general 
function like is_iterator() or is_context_manager() that pretty much guarantees 
it is safe for the rest of the code to use the object in the way it wants?
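
For what it is worth, here is a rough sketch of such checks using what the
standard library already offers. The function names is_context_manager() and
is_iterable() and the OldStyle class are my own; the abstract base classes are
real, and they only tell you the methods exist, not that they behave sensibly.

```python
from collections.abc import Iterable, Iterator
from contextlib import AbstractContextManager
import io


def is_context_manager(obj):
    """True if obj's class provides both __enter__ and __exit__."""
    return isinstance(obj, AbstractContextManager)


def is_iterable(obj):
    """True if iter(obj) works, including the old __getitem__ fallback."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


class OldStyle:
    """Has no __iter__, yet a for loop accepts it via __getitem__."""

    def __getitem__(self, index):
        if index < 3:
            return index
        raise IndexError(index)


print(is_context_manager(io.StringIO()))   # True
print(isinstance(OldStyle(), Iterable))    # False: the ABC looks for __iter__
print(is_iterable(OldStyle()))             # True: iter() uses the fallback
print(isinstance(iter([]), Iterator))      # True
```

The OldStyle case is the "when some things are not present, it uses others"
complication: the abstract base class says no, but iter() still succeeds.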

My comments about overloading plus were a sort of extra idea. I think we have 
discussed the general algorithm for how Python tries to resolve something like 
"obj1 op obj2" and not just for the plus operator. There are quite a few dunder 
methods that cover many such operators.

What I was thinking about was a bit of a twist on that algorithm. I did 
something very vaguely like this years ago when I was working on how to 
translate documents from one format to another, such as WANG, Multimate, 
Wordperfect, plain text, etc. The goal was for a sender of an email to add an 
attachment and send it to many people at once. Each recipient would have a 
known preference for the type of document format they preferred. I wrote an 
algorithm in C++ which I got early access to as I was working at Bell Labs that 
effectively used a registered series of translator software along with info on 
how well or fast they worked, to do as few translations as possible and send 
each recipient the format they wanted.

Yes, there were major incompatibilities and you sometimes ended up with some 
features being dropped or changed. But that is not the point. If format A had 
no direct translator to format Z, I would find the intersection of the set of 
formats we had software to translate to from A, and another set of formats that 
could be translated to Z. Sometimes it needed multiple hops. It worked 
fine but never saw the light of day as, weirdly, the project had been canceled 
months earlier and they were stalling while planning the next project and thus 
let me do what I wanted even though I was integrating my C++ code into a 
project that was otherwise all in C. 
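
The multi-hop search can be sketched in a few lines of Python. This is not the
original C++ algorithm, just a plain breadth-first search over a made-up
registry of translators, and it ignores the information about how well or fast
each translator worked (that would need a weighted search instead).

```python
from collections import deque

# Hypothetical registry: source format -> formats it can be translated to directly.
translators = {
    "WANG": ["Multimate", "text"],
    "Multimate": ["WordPerfect"],
    "WordPerfect": ["text"],
    "text": [],
}


def shortest_chain(source, target, registry):
    """Breadth-first search for the fewest translation hops, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in registry.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_chain("WANG", "WordPerfect", translators))
# ['WANG', 'Multimate', 'WordPerfect']
print(shortest_chain("text", "WANG", translators))
# None
```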

Now back to Python in this regard. If I type alpha + beta then maybe after 
trying the methods we have described, if still failing, the algorithm could see 
if alpha and beta registered what types they could output and see if a match 
could be made. If a number object offered a string version, that would be a 
match. If the string offered a numeric version, again problem solved. And even 
if the match was not precise, sometimes the interpreter might know enough to do 
a bit more and say convert an integer into a double if the sizes of the 
contents allowed.

The problem with this, and there are many, is that there is a certain 
nondeterministic aspect that may cause surprises and plenty of cost. 

It was just an academic thought that probably is not needed in this context, 
albeit it may be implemented in some projects to bridge things as described or 
in other novel ways. 

-Original Message-
From: Alan Gauld  
Sent: Thursday, April 13, 2023 8:14 PM
To: avi.e.gr...@gmail.com; 

RE: Weak Type Ability for Python

2023-04-13 Thread avi.e.gross
Can I bring a part of this discussion a bit closer to Python?

I stipulate that quite a few languages, including fairly early ones, treated
text often as numbers. Ultimately, much of programming is about taking in
text and often segregating parts into various buckets either explicitly or
by checking if it has a decimal point or looks like scientific notation or
in how it seems to be used.

Is there any concept in Python of storing information in some way, such as
text, and implementing various ideas or interfaces so that you can query if
the contents are willing and able to be viewed in one of many other ways?

As an example, an object may store a fairly large number as a text string in
decimal format. The number is big enough that it cannot be fully represented
in an 8 bit storage as in a byte or in a signed or unsigned 16-bit integer
but can be stored in something larger. It may be possible to store it in a
double precision floating point but not smaller. Yes, I know Python by
default uses indefinite length integers, but you get the idea.

Or it may be storing text in some format but the object is willing to
transform the text into one of several other formats when needed. The text
may also have attributes such as whether it is in English or Hungarian or is
mixed-language.

So for some applications, perhaps leaving the object as a string all the
time may be reasonable. If some operation wishes to use the contents, the
interface can be queried to see what other formats it can be coerced into
and perhaps a specified set of methods need to be included that perform your
transition for you such as object.return_as_int64() 

This can have wider implications. Imagine an object holding text in French
that has been tested by humans using a program like Google Translate and
deemed reasonable for translating into a specific dozen languages such as
Esperanto  and not certified into others like Klingon or High Valyrian or
ASL. The object could contain interfaces for the languages it supports but
does not store the translations, especially when the content is dynamic,
such as a form letter that has been instantiated with a name and address and
perhaps a part detailing what is being billed or shipped. Instead, the
object can present an interface that lets a user determine if it supports
dynamic translation to one or more target, such as the Quebec version of
French or a British versus American version of English.

I am being quite general here and lots of programs out there already
probably have their own way of providing such facilities on a case-by-case
basis. But do some languages have some support within the language itself?

I do note some languages allow objects to oddly belong to multiple classes
at once and have used that feature as a way to check if an object has some
capabilities. Some languages have concepts like a mix-in and Python does
allow things like multiple inheritance albeit it often is best not to use it
much. It does have ideas about how to test if a class implements some things
by seeing if various dunder methods are in place.

My reason for asking, is based on the discussion. If I want to use plus with
an integer and a string, it may be reasonable for the interpreter to ask one
or the other operand if they are able to be seen another way. If an integer
indicates it can be seen as text, great. If a string indicates it believes
it can deliver a number, great.

Unfortunately, if they BOTH are flexible, how do you decide whether to add
them as numbers or concatenate them as strings?

Sigh!


-Original Message-
From: Python-list  On
Behalf Of Chris Angelico
Sent: Thursday, April 13, 2023 3:35 PM
To: python-list@python.org
Subject: Re: Weak Type Ability for Python

On Fri, 14 Apr 2023 at 03:29, Dennis Lee Bieber 
wrote:
>
> On Thu, 13 Apr 2023 12:21:58 +1000, Cameron Simpson 
> declaimed the following:
>
> >On 12Apr2023 22:12, avi.e.gr...@gmail.com  wrote:
> > >>I suspect the OP is thinking of languages like PERL or JAVA which guess
> > >>for you and make such conversions when it seems to make sense.
> >
> >JavaScript guesses. What a nightmare. Java acts like Python and will
> >forbid it on type grounds (at compile time with Java, being staticly
> >typed).
> >
>
> REXX -- where everything is considered a string until it needs to be
> something else.
>
> REXX-ooRexx_5.0.0(MT)_64-bit 6.05 23 Dec 2022
>   rexxtry.rex lets you interactively try REXX statements.
> Each string is executed when you hit Enter.
> Enter 'call tell' for a description of the features.
>   Go on - try a few...Enter 'exit' to end.
> x = 1;
>   ... rexxtry.rex on WindowsNT
> y = "a";
>   ... rexxtry.rex on WindowsNT
> say x||y;
> 1a
>   ... rexxtry.rex on WindowsNT

REXX - where everything is a string, arithmetic can be done on
strings, and data structures are done in the variable 

RE: Weak Type Ability for Python

2023-04-13 Thread avi.e.gross
This reminds me a bit of complaints that the parser does not do what you
want when you do not supply parentheses in an expression like:

5 * 4 + 3

In many and maybe most languages it is seen as (5*4)+3 UNLESS you tell it
you want 5*(4+3). There are precedence and associativity rules.

Of course the computer might guess you meant the latter or could refuse to
do it and offer you a choice before calculating it. Or the language may
insist on parentheses always so you would need to also say ((5*4)+3) with no
default behavior.

The golden rule remains. If there is more than one way something can be
done, then either the programmer must make the choice explicit OR the
documentation must very clearly warn which path was chosen and perhaps point
to ways to do other choices. 

Some people take more complex (but not Complex) arithmetic than the above
and break it up into quite a few simple parts like:

temp1 = 4 + 3
result = 5 * temp1

Of course, the latter can be hard to read and understand for some people,
and some (others?) find fully parenthesized versions hard. But having
precedence rules and also allowing the other methods, should work fine for a
good segment of people except perhaps the ones who like Reverse Polish
Notation and insist on 5 4 3 + * instead.
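
If you ever doubt how Python grouped an expression, you can ask the parser
rather than guess. A small sketch using the standard ast module:

```python
import ast

# Parse without evaluating, then look at which operator sits at the top.
tree = ast.parse("5 * 4 + 3", mode="eval").body
print(type(tree.op).__name__)        # Add: the outermost operation
print(type(tree.left.op).__name__)   # Mult: nested inside, so done first

print(5 * 4 + 3, 5 * (4 + 3))        # 23 35
```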


-Original Message-
From: Python-list  On
Behalf Of aapost
Sent: Thursday, April 13, 2023 12:28 PM
To: python-list@python.org
Subject: Re: Weak Type Ability for Python

On 4/12/23 04:03, Ali Mohseni Roodbari wrote:
 >
On 4/13/23 07:50, Stefan Ram wrote:
 >If tomorrow Python would allow "string+int" and "int+string"
 >in the sense of "string+str(int)" and "str(int)+string",
 >what harm would be there?
 >
 >But for now, I think a typical approach would be to just use "str",
 >i.e., "string+str(int)" and "str(int)+string".


I agree with Py Zen rule 2 in this case:
Explicit is better than implicit.

I hate when things try to guess what I am doing... It is why I can't use 
lxml.
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Weak Type Ability for Python

2023-04-13 Thread avi.e.gross
[THIS CLAIMER: a bit off a bit off a bit off topic, imagine that]

Chris,

You have a gift of taking things I think about but censor myself from
including in my post and then blurting it out! LOL!

The original question in this thread now seems a dim memory but we are now
discussing not how to add a number to a string but how to multiply a string
to make n combined copies and then what it means to have a fractional copy
and finally a way to specify a rotation to the result. Argh

But since you brought it up as a way of looking at what multiplying by an
imaginary number might mean, as in rotating text, I am now going to throw in
a May Tricks even if it is only April.

So should I now extend a language so a rotation matrix is allowed to
multiply text or even a nested list like:

[ [ cos(theta), -sin(theta) ],
  [ sin(theta),  cos(theta) ] ]

While we are at it, why stop with imaginary numbers when you can imagine
extensions thereof? Unfortunately, it has been proven there are and can only
be two additional such constructs. Quaternions have three distinct imaginary
axes called i,j,k and some see them as interesting to show multidimensional
objects in all kinds of places such as computer vision or orbital mechanics.
Octonions have seven such other imaginary axes and have uses in esoteric
places like String Theory or Quantum Logic.

And, yes, you can use these critters in python. You can add a quaternion
type to numpy for example. Yep, octonions too. See modules like pyoctonion
and pyquaternion and much more.

The immoral moral of this story is that once you start opening some doors,
you may find people clamoring to let in ever more things and features. You
can easily bog down your code to the point where finding the commonly used
parts becomes a chore as you trudge through lots of code that is rarely used
but there for completeness.

Oh, I want to make something clear before I get another message spelling out
what I was thinking but chose to omit.

I slightly misled you above. Yes, it has been proven that no normed division
algebra of dimension higher than 8 (meaning one real dimension and seven
distinct imaginary ones) can exist, so octonions are the final part of that
story. Well, not exactly. You lose commutativity when going from complex
numbers to quaternions, you lose associativity when going on to octonions,
and you lose even more (such as freedom from zero divisors) if you go
higher. But you can make all kinds of mathematical constructs like
sedenions with 16 dimensions.

I cannot imagine ever trying to multiply a string by these critters but who
knows? As I noted above, if you set some parts of each of the above to zero,
they all can look like something with a real part like 3, and no (meaning
zero point zero) imaginary parts. So you could argue you should support all
kinds of things that MAY on examination turn out to be convertible to an
integer or double.

-Original Message-
From: Python-list  On
Behalf Of Chris Angelico
Sent: Thursday, April 13, 2023 12:12 PM
To: python-list@python.org
Subject: Re: Weak Type Ability for Python

On Fri, 14 Apr 2023 at 02:05,  wrote:
> So why not extend it to allow complex numbers?
>
> >>> "Hello" * complex(5,0)
> TypeError: can't multiply sequence by non-int of type 'complex'
> >>> "Hello" * complex(0,5)
> TypeError: can't multiply sequence by non-int of type 'complex'
>

Clearly a missed opportunity to rotate the text through a specified angle.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Weak Type Ability for Python

2023-04-13 Thread avi.e.gross
Chris, I was not suggesting it for Python as one of many possible
implementations.

I do see perfectly valid uses in other contexts. For example, if I have a
program that displays my text as pixels in some font and size, I may indeed
want the text clipped at 2 1/2 repetitions. But as always, when there are
choices to be made, you have to very clearly document the choice or offer
ways to do it another way. In a non-fixed-width font, 2.5 may mean knowing
how many pixels and adjusting so that a narrow letter "i" may be shown in
one font and not another, for example. 

If you want completeness, sure, you can define fractional parts of a string
in this context by the percentage of CHARACTERS in it. But as we have often
seen, in other encodings you need to differentiate between varying numbers
of bytes versus the underlying symbols they represent. Your  example from
the Pike language not only supports the multiplication of a string but
division and mod. Python currently does not allow those.

So why not extend it to allow complex numbers? 

>>> "Hello" * complex(5,0)
TypeError: can't multiply sequence by non-int of type 'complex'
>>> "Hello" * complex(0,5)
TypeError: can't multiply sequence by non-int of type 'complex'

The first one above is actually perfectly valid in the sense that the real
part is 5 and there is no imaginary component. With a bit of effort, I can
use the complexity to work:

>>> "Hello" * int(complex(5,0).real)
'HelloHelloHelloHelloHello'

Let me reiterate. There are languages that do all kinds of interesting
things and some of what Python has done is seen by others as interesting.
They regularly borrow from each other or use parts and innovate further. I
have no serious objection to making well-thought-out changes if they are
determined to be not only useful, but of higher priority than a long
shopping list of other requests. I am wary of overly bloating a language by
placing too many things in the core.

It strikes me as doable to create a module that encapsulates a feature like
this in a limited way. What may be needed is just a carefully constructed
class that starts off as similar to str and adds some methods. Any user
wanting to use the new feature would either start using the new class
directly or cast their str to it when they want it to be useable.
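
A rough sketch of what such a class might look like. The name FracStr is
invented, and the choice that a fraction means a truncated share of the
characters is exactly the kind of decision that would need documenting.

```python
class FracStr(str):
    """Hypothetical str subclass that can be repeated a fractional number of times."""

    def __mul__(self, times):
        if isinstance(times, float):
            if times <= 0:
                return FracStr("")
            whole = int(times)
            # Design choice: the fraction means a share of CHARACTERS, truncated.
            extra = int(len(self) * (times - whole))
            return FracStr(str.__mul__(self, whole) + self[:extra])
        # Anything else is left to the ordinary str behavior (ints work,
        # complex still raises TypeError).
        return FracStr(str.__mul__(self, times))

    __rmul__ = __mul__


print(repr(FracStr("Hello, world! ") * 2.5))
# 'Hello, world! Hello, world! Hello, '
```

Ordinary str objects are untouched, so only code that opts in by wrapping a
string in FracStr sees the new behavior.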

But the good news is that I am nowhere in the python hierarchy and have no
ability to make any changes. This is purely academic for me. And, if I want
such features and see tons of existing ways to get what I want or can roll
it for myself, ...


-Original Message-
From: Python-list  On
Behalf Of Chris Angelico
Sent: Thursday, April 13, 2023 3:02 AM
To: python-list@python.org
Subject: Re: Weak Type Ability for Python

On Thu, 13 Apr 2023 at 15:40,  wrote:
> And, no, I do not suggest 2.5 be interpreted as putting in an
> approximate percentage so that .8 * "Hello" should result in "Hell" ...

$ pike
Pike v8.1 release 15 running Hilfe v3.5 (Incremental Pike Frontend)
Ok.
> "Hello, world! " * 2.5;
(1) Result: "Hello, world! Hello, world! Hello, "
> "Hello, world! Hello, world! Hello, " / 10;
(2) Result: ({ /* 3 elements */
"Hello, wor",
"ld! Hello,",
" world! He"
})
> "Hello, world! Hello, world! Hello, " % 10;
(3) Result: "llo, "
> "Hello, world! Hello, world! Hello, " / 10.0;
(4) Result: ({ /* 4 elements */
"Hello, wor",
"ld! Hello,",
" world! He",
"llo, "
})
>

Multiplying and dividing strings by floats makes perfect sense. (The
({ }) notation is Pike's array literal syntax; consider it equivalent
to Python's square brackets for a list.)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Weak Type Ability for Python

2023-04-12 Thread avi.e.gross
Given the significant number of languages other than Python that have some
version of a feature that allows implicit conversion of unlike operands to
concatenate something like a "number" and a string into a string, the
question may not be silly as to how or why Python chose as it chose.

As I see it, python did put in quite a bit of customizability and
flexibility including the ability to create your own object types that alter
the behavior of an operator like plus. I have seen plenty of code that takes
advantage and makes the user of the code assume that different types can
interact seamlessly.

But it is like many other things. Languages that support a wide variety of
integer types such as signed and unsigned versions of integers in 8 bits,
sixteen bits, 32 bits and even 64 bits or perhaps higher, will often look
like they allow you to mix them in various combinations for not just
addition. But underneath it all can be lots of hidden complexity. I have
seen languages that make lots of functions with the same names but different
signatures and then dispatch a call like add(int8, unsignedint32) to the
right function that best matches their signature. The functions internally
can do many things but often convert their arguments to a common format,
perform the operations, then perhaps convert back to whatever result output
was expected.

In the case being discussed we might have to create something that looks
like do_plus(int, int) and then do_plus(int, char) and so on.

The other alternatives can include tables of "generality" and
"convertibility" with rules that govern how to perform a calculation by
changing or upgrading or downgrading things to make a match that can then be
handled. 

The issue here is a sort of operator overloading. In Python, "+" does not
mean plus at all. It means whatever the programmer wanted it to mean. An
infix line of code that includes "obj1 + obj2" is supposed to investigate
how to do it. I am not sure if some built-in objects may be different, but
it does a sequence of operations till it finds what it needs and does it.

I believe it looks a bit like this. The LHS object, obj1, is examined to see
if it has defined a __add__() method. If found, it is called and either
produces a result or signals it cannot handle obj2 for some reason. If that
method is not found, or fails, then it looks to see if the RHS, obj2, has a
__radd__() method defined. That can be given obj1 and return an answer or
signal failure. 
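
Here is a small experiment that shows the sequence, with two made-up classes
Left and Right that record which method was tried. The way to "signal it cannot
handle" an operand is to return the special value NotImplemented.

```python
calls = []  # records which dunder methods Python actually tried


class Left:
    def __add__(self, other):
        calls.append("Left.__add__")
        return NotImplemented      # signal: I cannot handle this operand


class Right:
    def __radd__(self, other):
        calls.append("Right.__radd__")
        return "handled by Right"


print(Left() + Right())            # handled by Right
print(calls)                       # ['Left.__add__', 'Right.__radd__']

try:
    Left() + 5                     # int has no idea what a Left is either
except TypeError as err:
    print("TypeError:", err)
```

One wrinkle I left out above: if the right operand's class is a subclass of the
left operand's class and overrides the reflected method, Python tries the
right-hand method first.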

If obj1 is 5 and obj2 is "pm" then we are using the built-in predefined
classes that currently have no interest in concatenating or adding unlike
types like this. If you could add or change these dunder methods, it could be
made to work, but note you may also need to deal with __iadd__() to handle the
case where x=5 and you write x += "pm", albeit that could weirdly change the
type of x without a linter having a clue.

I suggest the latter argument may be a good enough reason that Python did
not implement this. There are many good reasons. Does anyone like a language
that lets you type 2 + "three" and quietly makes that be 5? Sure, it can be
made to work on a subset of numbers in say English, but will it work in other
languages? It can be hard enough now to write code in UNICODE (I have seen
some) that tries to determine if a code point represents a number in some
language or representation and treats it as a numeral. I have seen the
numeric bullets such as a dark circle containing a white nine be treated as
if the user had put in a nine, for example.
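
To show how murky this already is in Python itself, here is a small sketch
using the standard unicodedata module. The two characters are just ones I
picked for illustration.

```python
import unicodedata

arabic_three = "\u0663"      # ARABIC-INDIC DIGIT THREE
circled_nine = "\u277e"      # DINGBAT NEGATIVE CIRCLED DIGIT NINE

print(int(arabic_three))                   # 3: int() accepts Unicode decimal digits
print(unicodedata.numeric(circled_nine))   # 9.0: the bullet carries a numeric value
print(circled_nine.isdecimal())            # False: but it is not a decimal digit

try:
    int(circled_nine)
except ValueError as err:
    print("ValueError:", err)              # so int() refuses it
```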

As was mentioned, some languages have different operators for addition
versus concatenation and in that context, it may be intuitive that using the
wrong object type is an implicit call to conversion. Python uses "+" for
both purposes depending on context and potentially for many more purposes
the programmer can devise. 

Consider the asterisk operator as a companion concept. It gladly accepts a
string and a number in either order and does something somewhat intuitive
with them by treating multiplication as a sort of repeated addition:

>>> 5 * "6"
'66666'
>>> "5" * 6
'555555'
>>> 3 * "Hello? "
'Hello? Hello? Hello? '

But it will not handle float.

>>> "Hello" * 2.5
TypeError: can't multiply sequence by non-int of type 'float'

If you want to either truncate a float to an int or round it or take a
ceiling, though, it will not guess for you and you must do something
explicit like this:

>>> "Hello" * round(2.5)
'HelloHello'
>>> "Hello" * round(2.6)
'HelloHelloHello'
>>> "Hello" * int(2.6)
'HelloHello'

There is a parallel argument here that it should accept a float and
truncate it. But since you can easily cast a float to an int, in any of many
ways, why have the program choose when it quite likely reflects an error in
the code? And, no, I do not suggest 2.5 be interpreted as putting in an
approximate percentage so that .8 * "Hello" should result in "Hell" ...




-Original Message-
From: Python-list  On
Behalf Of Cameron Simpson
Sent: 

RE: Weak Type Ability for Python

2023-04-12 Thread avi.e.gross
On closer reading, the OP may be asking how to make a function doing what
they want, albeit without a plus.

Here is a python function as a one-liner that takes exactly two arguments of
any kind (including string and integer) and concatenates them into one
string without anything between and prints them:

def strprint(first, second): print(str(first) + str(second))

>>> strprint(5,1)
51
>>> strprint("a5",1)
a51
>>> strprint(12,"o'clock")
12o'clock
>>> strprint(3.1415926535,complex(3,4))
3.1415926535(3+4j)

Want something similar for any number of arguments? Here is a slightly
longer one-liner:

def strprintall(*many): print(''.join([str(each) for each in many]))

>>> strprintall(1,"=egy\n",2,"=kettő\n",3,"=három\n",4,"=négy\n",5,"=öt\n",
...             "in my childhood language.\n")
1=egy
2=kettő
3=három
4=négy
5=öt
in my childhood language.

Note my meager attempt does not overload the plus sign. I addressed elsewhere
how that could be done using a __radd__ method, or other ways like an f-string.

I can not repeat this often enough. The easiest way to do something you want
in a new language is to work within the existing language as-is and not to
ask the language to change to be the way you want. That can take years or
never happen, and especially if the designers did not want the feature you
ask for. 





-Original Message-
From: Python-list  On
Behalf Of 2qdxy4rzwzuui...@potatochowder.com
Sent: Wednesday, April 12, 2023 3:17 PM
To: python-list@python.org
Subject: Re: Weak Type Ability for Python

On 2023-04-12 at 14:51:44 -0400,
Thomas Passin  wrote:

> On 4/12/2023 1:11 PM, Chris Angelico wrote:
> > On Thu, 13 Apr 2023 at 03:05, Ali Mohseni Roodbari
> >  wrote:
> > > 
> > > Hi all,
> > > Please make this command for Python (if possible):
> > > 
> > > > > > x=1
> > > > > > y='a'
> > > > > > wprint (x+y)
> > > > > > 1a
> > > 
> > > In fact make a new type of print command which can print and show
> > > strings and integers together.
> > > 
> > 
> > Try:
> > 
> > print(x, y)
> > 
> > ChrisA
> 
> It puts a space between "1" and "a", whereas the question does not want
> the space.  print(f'{x}{y}') would do it, but only works for variables
> named "x" and "y".

Or possibly print(x, y, sep='').

> As happens so often, the OP has not specified what he actually wants to do
> so we can only answer the very specific question.

Agreed.
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Weak Type Ability for Python

2023-04-12 Thread avi.e.gross
As originally written, the question posed has way too many possible answers
but the subject line may give a hint. Forget printing.

The Python statement
1 + "a"

SHOULD fail. The first is an integer and the second is a string. These two
are native Python objects, neither of which defines what to do if it is paired
with an object of the other type on the left or the right.

In any case, what should the answer be? Since "a" has no integer value, it
presumably was intended to be the string "1a".

So why NOT use the built-in conversion and instead of:

print(x+y) # where x=1, y='a'

It should be:

print(str(x) + y)

Could this behavior be added to Python? Sure. I wonder how many would not
like it as it often will be an error not caught!

If you defined your own object derived from string and added a __radd__()
method then the method could be made to accept whatever types you wanted
(such as integer or double or probably anything) and simply have code that
converts it to the str() representation and then concatenates them with, or
if you prefer without, any padding between.
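
A minimal sketch of such a derived class. The name Gluey is made up, and this
version concatenates without any padding.

```python
class Gluey(str):
    """Hypothetical str subclass that concatenates with anything."""

    def __add__(self, other):
        # Handles Gluey + something
        return Gluey(str.__add__(self, str(other)))

    def __radd__(self, other):
        # Handles something + Gluey, after the left operand has given up
        return Gluey(str(other) + str(self))


y = Gluey("a")
print(1 + y)        # 1a
print(y + 1)        # a1
print(3.5 + y)      # 3.5a
```

Only objects you explicitly make Gluey behave this way, so 1 + "a" with a plain
str still fails as it should.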

I suspect the OP is thinking of languages like PERL or JAVA which guess for
you and make such conversions when it seems to make sense.

Python does not generally choose that as it is quite easy to use one of so
many methods, and lately an f-string is an easy way as others mentioned.


-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Wednesday, April 12, 2023 2:52 PM
To: python-list@python.org
Subject: Re: Weak Type Ability for Python

On 4/12/2023 1:11 PM, Chris Angelico wrote:
> On Thu, 13 Apr 2023 at 03:05, Ali Mohseni Roodbari
>  wrote:
>>
>> Hi all,
>> Please make this command for Python (if possible):
>>
> x=1
> y='a'
> wprint (x+y)
> 1a
>>
>> In fact make a new type of print command which can print and show strings
>> and integers together.
>>
> 
> Try:
> 
> print(x, y)
> 
> ChrisA

It puts a space between "1" and "a", whereas the question does not want 
the space.  print(f'{x}{y}') would do it, but only works for variables 
named "x" and "y".

As happens so often, the OP has not specified what he actually wants to 
do so we can only answer the very specific question.

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: [Python-Dev] Small lament...

2023-04-03 Thread avi.e.gross
Sadly, between Daylight Savings time and a  newer irrational PI π Day, I am 
afraid some April Foolers got thrown off albeit some may shower us with 
nonsense  in May I.

-Original Message-
From: Python-list  On 
Behalf Of Barry Warsaw
Sent: Monday, April 3, 2023 8:31 PM
To: Skip Montanaro 
Cc: Python ; Python Dev 
Subject: Re: [Python-Dev] Small lament...

I heard it on reasonably believable authority that the FLUFL took the year off. 
 Lamentable.

-Barry

> On Apr 1, 2023, at 11:19, Skip Montanaro  wrote:
> 
> Just wanted to throw this out there... I lament the loss of waking up on 
> April 1st to see a creative April Fool's Day joke on one or both of these 
> lists, often from our FLUFL... Maybe such frivolity still happens, just not 
> in the Python ecosystem? I know you can still import "this" or "antigravity", 
> but those are now old (both introduced before 2010). When was the last time a 
> clever easter egg was introduced or an April Fool's Day joke played?
> 
> ¯\_(ツ)_/¯
> 
> Skip
> 
> ___
> Python-Dev mailing list -- python-...@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-...@python.org/message/Q62W2Q6R6XMX57WK2CUGEENHMT3C3REF/
> Code of Conduct: http://python.org/psf/codeofconduct/


-- 
https://mail.python.org/mailman/listinfo/python-list


RE: built-in pow() vs. math.pow()

2023-03-30 Thread avi.e.gross
Some questions are more reasonable than others.

If the version of a function used in a package were IDENTICAL to the
built-in, then why have it?

There are many possible reasons a package may tune a function for their own
preferences or re-use a name that ends up blocking the view of another name.

The bottom line is if you do not want the other one, then don't ask for it
by not importing the entire module into your namespace or by explicitly
asking for the base function in the ways python provides.

Others have replied about differences in various implementations of pow()
and reiterated my point above that if you want a specific function instance,
it is your responsibility to make sure you get it.

One method I would mention that I have not seen is to copy pow() to your own
name before importing other things. Something like:

pow3 = pow
import ...

Then use the new name.

Or import all of math (despite advice not to) and then make pow3 a synonym
for the base version.
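
A sketch of that second approach. The name pow3 is just my placeholder from
above; the builtins module is the standard way to reach the original function
even after its name has been hidden.

```python
import builtins
import math
from math import pow        # hides the built-in name, as "from math import *" would

pow3 = builtins.pow         # a synonym for the base version

print(pow is math.pow)      # True: the bare name now means math.pow
print(pow3(2, 10, 1000))    # 24: the three-argument modular form still works
print(pow(2, 10))           # 1024.0: math.pow always returns a float
```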

Most people most of the time will want a small and fast function that does
what they asked for and does not waste time looking for an optional third
argument and doing something additional. Would you be satisfied if
math::pow() simply checked for a third argument and turned around and called
base::pow() to handle it?

A deeper question I can appreciate is wondering if it is a bug or feature
that python (and many other languages) allow results where you can hide a
variable or function name. I call it a feature. As with all such variables,
scope rules and other such things apply and make the language powerful and
sometimes a tad dangerous.



-Original Message-
From: Python-list  On
Behalf Of Andreas Eisele
Sent: Thursday, March 30, 2023 5:16 AM
To: python-list@python.org
Subject: built-in pow() vs. math.pow()

I sometimes make use of the fact that the built-in pow() function has an
optional third argument for modulo calculation, which is handy when dealing
with tasks from number theory, very large numbers, problems from Project
Euler, etc. I was unpleasantly surprised that math.pow() does not have this
feature, hence "from math import *" overwrites the built-in pow() function
with a function that lacks functionality. I am wondering for the rationale
of this. Does math.pow() do anything that the built-in version can not do,
and if not, why is it even there?
Thanks in advance for any enlightening comment on this.
Best regards, Andreas
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: =- and -= snag

2023-03-14 Thread avi.e.gross
There seems to be a fundamental disconnect here based on people not 
understanding what can happen when spaces are optional. Yes, I have had my 
share of times I found what I programmed was not quite right and been unhappy 
but the result was mostly learning how to not not not not do that next time and 
follow the rules.

When I write:

Var = -----1

The result of five minus signs in a row is not a new long operator called 
"-----". It is five instances of "-" and looks more like:

Var=-(-(-(-(-1))))

You can even throw in some plus signs and they are effectively ignored.

It is no different than some other operators like:

Var = not not not not True

So the "=-" case is not a single operator even as "-=" is a single operator.

If I add another negative symbol to the above, I have two operators with a 
binary operator of "-=" that may be implemented more efficiently or use a 
dunder method that handles it and then a unary "-" operator. 

>>> Var = 1
>>> Var -=-1
>>> Var
2

Yes, I can sympathize with humans who think the computer should do what they 
meant. They may also not like scenarios where they mess up the indentation and 
assume the computer should guess what they meant.

But life is generally not like that. Whenever you are in doubt as to how 
something will be parsed especially given how many precedence levels python has 
and so on, USE PARENTHESES and sometimes spaces.

If you type "Var - = 5" you get a syntax error because it sees two different 
operators that do not mesh well. If you type "Var = - 5" you get a result that 
should make sense as the minus binds to the 5 and negates it and then the 
assignment is done. If you leave the space between symbols out, then "-=" 
evaluates to a new operator and "=-" evaluates to two operators as described.
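
One way to see this for yourself, as a rough sketch, is to ask the standard tokenize module how it splits the text. The helper name operators is made up for this illustration.

```python
import io
import tokenize

def operators(source):
    """Return the operator tokens Python's tokenizer finds in a line of source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.OP]

print(operators("Var =- 5\n"))     # ['=', '-']   two separate operators
print(operators("Var -= 5\n"))     # ['-=']       one augmented assignment
print(operators("Var = -----1\n")) # ['=', '-', '-', '-', '-', '-']
```

The tokenizer only knows the operators the language defines, so "=-" can never come back as a single token.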

There are lots of trivial bugs people find including truly simple ones like 
subtracting instead of adding or using a floating point number like 5.0 when 
you meant to use an integer or forgetting that 5/3 and 5//3 do somewhat 
different things. People often substitute things like bitwise operators and the 
computer does what you tell it. Many languages have doubled operators like "&" 
versus "&&" that do different things. And if you want to make a list and 
instead of using square brackets use curly brackets, in python, you get a set! 
There are so many places you can mess up.
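
A few of those slips, written out as a minimal sketch. Every line is legal Python, which is the point: the interpreter has no reason to complain. The variable names are just for illustration.

```python
print(5 / 3)     # true division, gives a float
print(5 // 3)    # floor division, gives 1

print(6 & 3)     # bitwise AND of 0b110 and 0b011, gives 2
print(6 and 3)   # logical "and" returns the last operand when all are truthy, gives 3

as_list = [1, 2, 2, 3]   # square brackets: a list, duplicates kept
as_set = {1, 2, 2, 3}    # curly brackets: a set, duplicates dropped
print(type(as_list).__name__, len(as_list))  # list 4
print(type(as_set).__name__, len(as_set))    # set 3
```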

ALL languages have such features where people can and do make mistakes and 
sometimes cannot easily find them. Add in mistakes where different parts of a 
program use the same variable name and one changes it out and the other gets a 
confusing result. Simply put, lots of stuff is not only legal but often useful 
and even when a linter or other program sees what might be a mistake, it will 
often be wrong and something you wanted done.

Consider another such arena in the lowly spelling checker that keeps telling me 
things are spelled wrong because it does not know Schwarzenegger is a name or 
that Grosz and Groß are valid variations on the spelling of my name.  Imagine 
what it does when I write in any language other than English. One solution has 
been to have it add such words to a dictionary but that backfires when it 
allows words as valid even though in the current context, it is NOT VALID. So 
some such programs allow you to designate what dictionary/language to use to 
check a region such as a paragraph. Some may transition to being closer to 
grammar checkers that try to parse your sentences and make sure not only that a 
word is a valid spelling but valid given what role it plays in the sentence!

Computer languages are both far simpler and yet weirder. You need to use them 
in ways the documentation says. But when you consider various ideas about 
scope, you can end up with it needing to know which of perhaps many copies of 
the same variable name is being referenced. So if you wrote a program where you 
had a local variable inside a function that you changed and you ASS U ME d you 
could use that variable outside the scope later and another variable already 
exists there with the same name, how is it supposed to know you made a mistake? 
How does it know you wanted a deep copy rather than a reference or shallow copy?
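
Here is a short sketch of both situations, with made-up names. Nothing in it is an error as far as Python is concerned.

```python
import copy

total = 100

def update():
    total = 5        # creates a new local name; the outer total is untouched
    return total

update()
print(total)         # 100, not 5

original = [[1, 2], [3, 4]]
alias = original                  # just another reference to the same list
shallow = copy.copy(original)     # new outer list, same inner lists
deep = copy.deepcopy(original)    # fully independent copy

original[0].append(99)
print(alias[0])      # [1, 2, 99]
print(shallow[0])    # [1, 2, 99]
print(deep[0])       # [1, 2]
```

Which of the three you wanted is something only you know, so the language does what you wrote.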

There are so many other examples that the short answer to many questions is 
something like THAT IS THE WAY IT IS. Don't do that!

I doubt anyone would like it if computer programs were written with multiple 
layers of redundancy so that many errors could be detected when all the parts 
do not align along with a checksum. Human languages that do something similar 
are, frankly, a royal pain. I mean once you have a gender and a tense and a  
singular/plural, for example, everything else nearby, such as adjectives, must 
be adjusted to have the right form or endings to match it. That may have been 
great when it was hard to hear a speaker from the back of the theater so the 
redundancy helped offer possible corrections but these days just makes the 

RE: =- and -= snag

2023-03-13 Thread avi.e.gross
Morten,

Suggesting something is UNPYTHONIC is really not an argument I take
seriously.

You wrote VALID code by the rules of the game and it is not a requirement
that it guesses at what you are trying to do and calls you an idiot!

More seriously, python lets you do some completely obscure things such as
check whether some random object or expression is truthy or not. There is no
way in hell the language, as defined, can catch all kinds of mistakes.

Now some languages or their linters have chosen to provide warnings of code
that may be valid but is often an error.

Consider:

  x = 1
  y = 0
  x = y

Do I want to reset x to the value of y? Maybe. Or do I want the interpreter
to print out whether x == y perhaps?

Well what if the third line above was 

  x == y

Is that too a warning? 

To add to the confusion some languages have an ===, :=, +=, -=, /=, |= or
oddities like %=% and many of these are all variations on meanings vaguely
related to equality before or after ...

So, no, it is not only not unpythonic, in my opinion, but quite pythonic to
let the interpreter interpret what you wrote and not know what you meant.

Is there possibly a flag that would require your code to use spaces in many
places that might cut down on mistakes? There could be and so your example
of something like "new =- old" might be asked to be rewritten as "new = -
old" or even "new = (-old)" but for now, you may want to be more careful.
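
I am not aware of such a flag, but a rough sketch of a do-it-yourself check is easy enough with the standard tokenize module. The function name suspicious_assignments is hypothetical, and the rule it applies (an "=" immediately followed by "-" with no space between) is my own assumption about what you would want flagged.

```python
import io
import tokenize

def suspicious_assignments(source):
    """Return (line, column) of each '=' that is directly followed by '-'."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    hits = []
    for current, following in zip(tokens, tokens[1:]):
        if (current.type == tokenize.OP and current.string == "="
                and following.type == tokenize.OP and following.string == "-"
                and current.end == following.start):   # no space in between
            hits.append(current.start)
    return hits

print(suspicious_assignments("new =- old\n"))   # [(1, 4)]
print(suspicious_assignments("new = -old\n"))   # []
print(suspicious_assignments("new -= old\n"))   # []
```

Note that this would also complain about perfectly intentional code such as a keyword argument written as f(x=-1), which is exactly why such warnings tend to be optional.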

I do sympathize with the problem of a hard to find bug because it LOOKS
RIGHT to you. But it is what it is.

Avi

-Original Message-
From: Python-list  On
Behalf Of Morten W. Petersen
Sent: Monday, March 13, 2023 5:26 PM
To: python-list 
Subject: =- and -= snag

Hi.

I was working in Python today, and sat there scratching my head as the
numbers for calculations didn't add up.  It went into negative numbers,
when that shouldn't have been possible.

Turns out I had a very small typo, I had =- instead of -=.

Isn't it unpythonic to be able to make a mistake like that?

Regards,

Morten

-- 
I am https://leavingnorway.info
Videos at https://www.youtube.com/user/TheBlogologue
Twittering at http://twitter.com/blogologue
Blogging at http://blogologue.com
Playing music at https://soundcloud.com/morten-w-petersen
Also playing music and podcasting here:
http://www.mixcloud.com/morten-w-petersen/
On Google+ here https://plus.google.com/107781930037068750156
On Instagram at https://instagram.com/morphexx/
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Can you process seismographic signals in Python or should I switch to Matlab ?

2023-03-13 Thread avi.e.gross
Hi,

This seems again to be a topic wandering. Was the original question whether
Python could be used for dealing with Seismic data of some unspecified sort
as in PROCESSING it and now we are debating how to clean various aspects of
data and make things like data.frames and extract subsets for analysis?

Plenty of the above can be done in any number of places ranging from
languages like Python and R to databases and SQL. If the result you want to
analyze can then be written in a format with rows and columns containing the
usual suspects like numbers and text and dates and so on, then this part of
the job can be done anywhere you want.

And when you have assembled your data and now want to make a query to
generate a subset such as data in a date range that is from a set of
measuring stations and with other qualities, then you can simply save the
data to a file in whatever format, often something like a .CSV.
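
As a minimal sketch of that step using only the standard library: the station codes, dates and magnitudes below are invented for illustration, and the output goes to an in-memory buffer rather than a real .CSV file.

```python
import csv
import io
from datetime import date

# Hypothetical readings; real data would come from your own sources.
readings = [
    {"station": "STA1", "date": date(2023, 1, 5), "magnitude": 2.1},
    {"station": "STA2", "date": date(2023, 1, 10), "magnitude": 3.4},
    {"station": "STA1", "date": date(2023, 2, 2), "magnitude": 1.8},
    {"station": "STA3", "date": date(2023, 1, 20), "magnitude": 4.0},
]

def select(rows, stations, start, end):
    """Keep rows from the given stations whose date lies in [start, end]."""
    return [row for row in rows
            if row["station"] in stations and start <= row["date"] <= end]

subset = select(readings, {"STA1", "STA3"}, date(2023, 1, 1), date(2023, 1, 31))

# Write the subset as CSV text; swap the buffer for open("subset.csv", "w",
# newline="") to produce an actual file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["station", "date", "magnitude"])
writer.writeheader()
writer.writerows(subset)
print(buffer.getvalue())
```

Once the subset is in a file like that, it no longer matters much which tool produced it.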

It is the following steps where you want to choose your language based on
what is available. Are you using features like a time series, for example?
Are you looking for periodicity? Is graphing a major aspect and do you need
some obscure graph types not easily found but that are parts of
packages/modules in some language like R or Python? Do you need the analysis
to have interactive aspects such as from a GUI, or a web page? Does any
aspect of your work include things like statistical analyses or machine
learning? The list goes on.

As mentioned, people who do lots of stuff along these lines can share some
tools in python, or elsewhere, they find useful and that might help fit the
needs of the OP but they work best when they have a better idea of what
exactly you want to do. Part of what I gleaned was a wish to do a 3-D graph
that rotates. Python has multiple graphics packages and so on, as do
languages like R. The likelihood of finding something useful goes up if you
identify if there are communities of people doing similar work and can share
some of their tools. 

Hence the idea of focused searches. Asking here will largely get you people
who mainly use Python, and it may turn out that R or something else entirely
meets your needs better, perhaps Mathematica, even if you have to pay for it,
if that is what your peers expect.

My guess is that python would be a decent choice as it can do almost
anything, but for practical purposes, you do not want to stick with what is
in the base and probably want to use extensions like numpy/pandas and
perhaps others like scipy and if doing graphics, there are too many
including matplotlib and seaborn but you may need something specialized for
your needs.

I cannot stress enough the importance of making sure the people evaluating and
using your work can handle it. Python is fairly mainstream and free enough
that it can foot your bill. But it has various versions and clearly nobody
would advise you to use version 2. Some versions are packaged with many of
the tools you may want to use, such as Anaconda. It depends on your level of
expertise already and how much you want to learn to get this task done. You
make it sound like your kind of work must be done alone, and that can
simplify things but also mean more work for you.

-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Monday, March 13, 2023 2:10 PM
To: python-list@python.org
Subject: Re: Can you process seismographic signals in Python or should I
switch to Matlab ?

On 3/13/2023 11:54 AM, Rich Shepard wrote:
 > On Mon, 13 Mar 2023, Thomas Passin wrote:
 >
 >> No doubt, depending on the data formats used. But it's still going
 >> to be a big task.
 >
 > Thomas,
 >
 > True, but once you have a dataframe with all the information about
 > all the earthquakes you can extract data for every analysis you want
 > to do.
This message would better have gone to the list instead of just me.

I'm not saying that Pandas is a bad choice!  I'm saying that getting all
that data into shape so that it can be ingested into a usable dataframe
will be a lot of hard work.

 > If you've not read Wes McKinney's "Python for Data Analysis: Data
 > Wrangling with Pandas, NumPy, and IPython" I encourage you to do so.

I've been interested in that title, but since I don't currently have any
large, complex data wrangling problems I've put it off.




-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Can you process seismographic signals in Python or should I switch to Matlab ?

2023-03-11 Thread avi.e.gross
I have used GNU Octave as a sort of replacement for MATLAB as a free
resource. I have no idea if it might meet your needs. 

Although Python is a good environment for many things, if you have no
knowledge of it yet, it can take a while to know enough and if you just need
it for one project, ...

-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Sunday, March 12, 2023 12:02 AM
To: python-list@python.org
Subject: Re: Can you process seismographic signals in Python or should I
switch to Matlab ?

On 3/11/2023 6:54 PM, a a wrote:
> My project
>
> https://www.mathworks.com/help/matlab/matlab_prog/loma-prieta-earthquake.html

If your goal is to step through this Matlab example, then clearly you 
should use Matlab. If you do not have access to Matlab or cannot afford 
it, then you would have to use something else, and Python would be a 
prime candidate.  However, each of the techniques and graphs in the 
lesson have been pre-packaged for you in the Matlab case but not with 
Python (many other case studies on various topics that use Python 
can be found, though).

Everything in the Matlab analysis can be done with Python and associated 
libraries.  You would have to learn various processing and graphing 
techniques.  You would also have to get the data from somewhere.  It's 
prepackaged for this analysis and you would have to figure out where to 
get it.  There is at least one Python package that can read and convert 
Matlab files - I do not remember its name, though.

A more important question is whether doing the Matlab example prepares 
you to do any other analyses on your own. To shed some light on this, 
here is a post on some rather more advanced analysis using data on the 
same earthquake, done with Python tools -

https://towardsdatascience.com/earthquake-time-series-forecasts-using-a-hybrid-clustering-lstm-approach-part-i-eda-6797b22aed8c



-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Can you process seismographic signals in Python or should I switch to Matlab ?

2023-03-11 Thread avi.e.gross
A a,

Consider asking a more specific question. Many things can be done in many
different programming languages.

Are you asking if there are helpers you can use such as modules that
implement parts of the algorithms you need? Are you asking about speed or
efficiency?

Have you considered how few people here are likely to know much about a
specialized field and perhaps a search using a search engine might get you
something like this:

https://www.google.com/search?q=python+process+seimographic+signals=python+process+seimographic+signals=chrome..69i57j0i546l5.16718j0j7=chrome=UTF-8

For example:

https://www.geophysik.uni-muenchen.de/~megies/www_obsrise/

You can of course search for say signal processing or whatever makes sense
to you.

My answer, if not clear, is that your question may not be primarily about
Python and about finding whatever environment gives you both access to
software that helps you as well as a language that lets you wrap lots of it
together, make reports and so on. Python is likely a decent choice but
perhaps others are better for your own situation, such as having others
nearby you can learn from.

Good luck.

-Original Message-
From: Python-list  On
Behalf Of a a
Sent: Saturday, March 11, 2023 6:54 PM
To: python-list@python.org
Subject: Can you process seismographic signals in Python or should I switch
to Matlab ?

My project
https://www.mathworks.com/help/matlab/matlab_prog/loma-prieta-earthquake.html
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Feature migration

2023-03-08 Thread avi.e.gross
Greg,

Yes, it is very possible from other sources. I doubt it hurts if a popular
language, albeit not compiled the same way, uses a feature.

I see it a bit as more an impact on things like compiler/interpreter design
in that once you see it can reasonably be implemented, some features look
doable.

I will say the exact methods and rules are different enough and interact
with things differently. As an example, you can use an "end" statement at
the end of a block to signal what is ending.

As regularly repeated, there is no one right way but there are ways
supported by the language you are in and other ways that are NOT supported.

-Original Message-
From: Python-list  On
Behalf Of Greg Ewing via Python-list
Sent: Wednesday, March 8, 2023 5:47 PM
To: python-list@python.org
Subject: Re: Feature migration

On 9/03/23 8:29 am, avi.e.gr...@gmail.com wrote:
> They seem to be partially copying from python a
> feature that now appears everywhere but yet strive for some backwards
> compatibility. They simplified the heck out of all kinds of expressions by
> using INDENTATION.

It's possible this was at least partly inspired by functional languages
such as Haskell. Haskell has always allowed indentation as one way of
expressing structure. Python wasn't the first language to use
indentation semantically.

-- 
Greg
-- 
https://mail.python.org/mailman/listinfo/python-list



Feature migration

2023-03-08 Thread avi.e.gross
This may be of interest to a few and is only partially about Python.
 
In a recent discussion, I mentioned some new Python features (match) seemed
related to a very common feature that has been in a language like SCALA for
a long time. I suggested it might catch on and be used as widely as in SCALA
and become the pythonic way to do many things, whatever that means, even as
its origins lie elsewhere.
 
This motivated me to go take a new look at SCALA and I was a bit surprised.
I will only mention two aspects as they relate to python. One is that they
made a version 3 that has significant incompatibilities with version 2.
Sounds familiar?
 
The other fascinated me. They seem to be partially copying from python a
feature that now appears everywhere but yet strive for some backwards
compatibility. They simplified the heck out of all kinds of expressions by
using INDENTATION. Lots of curly braces are now gone or optional. You need
to indent carefully, and in places it is not quite the same as python. It is
way more readable.
 
Python always had indentation as a key feature. Since SCALA did not, it
allows you to set options to turn off the new feature, sort of.
 
As I have been saying, all kinds of ideas in computer science can migrate to
new and existing languages, often not quite the same way. I am not endorsing
SCALA, just noting that I suspect Python had some influence.
 
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Fast full-text searching in Python (job for Whoosh?)

2023-03-07 Thread avi.e.gross
Some of the discussions here leave me confused as the info we think we got
early does not last long intact and often morphs into something else and we
find much of the discussion is misdirected or wasted.

Wouldn't it have been nice if this discussion had not started with a mention
of a package/module few have heard of along with a vague request on how best
to search for lines that match something in a file?

I still do not know enough to feel comfortable even after all this time. It
now seems to be a web-based application in which a web page wants to use
autocompletion as the user types.

So was the web page a static file that the user runs, or is it dynamically
created by something like a python program? How is the fact that a user has
typed a letter in a textbox or drop down of sorts reflected in a request
being sent to a python program to return possible choices? Is the same
process called anew each time or is it, or perhaps a group of similar
processes or threads going to stick around and be called repeatedly?

Lots of details are missing and in particular, much of what is being
described sounds like it is happening in the browser, presumably in
JavaScript. Also noted is that the first keystroke or two may return too
much data.

So does the OP still think this is a python question? So much of the
discussion sounds like it is in the browser deciding whether to wait for the
user to type more before making a request, or throwing away results of an
older request.

So my guess is that a possible design for this amount of data may simply be
to read the file into the browser at startup, or when the first letter is
typed, and do all the searches internally, perhaps cascaded as long as
backspace or editing is not used.

If the data gets much larger, of course, then using a server makes sense
albeit it need not use python unless lots more in the project is also ...

-Original Message-
From: Python-list  On
Behalf Of David Lowry-Duda
Sent: Tuesday, March 7, 2023 1:29 PM
To: python-list@python.org
Subject: Re: Fast full-text searching in Python (job for Whoosh?)

On 22:43 Sat 04 Mar 2023, Dino wrote:
>How can I implement this? A library called Whoosh seems very promising 
>(albeit it's so feature-rich that it's almost like shooting a fly with 
>a bazooka in my case), but I see two problems:
>
> 1) Whoosh is either abandoned or the project is a mess in terms of 
>community and support 
>(https://groups.google.com/g/whoosh/c/QM_P8cGi4v4 ) and
>
> 2) Whoosh seems to be a Python only thing, which is great for now, 
>but I wouldn't want this to become an obstacle should I need to port it to 
>a different language at some point.

As others have noted, it sounds like relatively straightforward 
implementations will be sufficient.

But I'll note that I use whoosh from time to time and I find it stable 
and pleasant to work with. It's true that development stopped, but it 
stopped in a very stable place. I don't recommend using whoosh here, but 
I would recommend experimenting with it more generally.

- DLD
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Fast full-text searching in Python (job for Whoosh?)

2023-03-06 Thread avi.e.gross
Ah, thanks Dino. Autocomplete within a web page can be an interesting
scenario but also a daunting one.

Now, do you mean you have a web page with a text field, initially I suppose
empty, and the user types a single character and rapidly a drop-down list or
something is created and shown? And as they type, it may shrink? And as soon
as they select one, it is replaced in the text field and done?

If your form has an attached function written in JavaScript, some might load
your data into the browser and do all that work from within. No python
needed.

Now if your scenario is similar to the above, or perhaps the user needs to
ask for autocompletion by using tab or something, and you want to keep
sending requests to a server, you can of course use any language on the
server. BUT I would be cautious in such a design.

My guess is you autocomplete on every keystroke and the user may well type
multiple characters resulting in multiple requests for your program. Is a
new one called every time or is it a running service? If the latter, it pays
to read in the data once and then carefully serve it. But when you get just
the letter "h" you may not want to send and process a thousand results but
limit it to, say, the first N. If they then add an o to make "ho", you may not
need to do much if it is anchored to the start except to search in the
results of the previous search rather than the whole data.
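
A minimal sketch of that idea, assuming a start-anchored, case-insensitive match. The car names, the function name complete and the default limit of 3 are all made up for illustration.

```python
names = ["Honda", "Hyundai", "Hummer", "Holden", "Volvo", "Volkswagen"]

def complete(prefix, candidates, limit=3):
    """Return (matches to display, all matches to narrow on the next keystroke)."""
    folded = prefix.casefold()
    matches = [name for name in candidates
               if name.casefold().startswith(folded)]
    return matches[:limit], matches

# First keystroke: search everything, but only show the first few.
shown_h, all_h = complete("h", names)
print(shown_h)    # ['Honda', 'Hyundai', 'Hummer']

# Second keystroke: search only what survived the previous search.
shown_ho, all_ho = complete("ho", all_h)
print(shown_ho)   # ['Honda', 'Holden']
```

This shortcut only holds while the user keeps appending characters. A backspace or an edit in the middle means starting again from the full data.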

But have you done some searching on how autocomplete from a fixed corpus is
normally done? It is a quite common thing.


-Original Message-
From: Python-list  On
Behalf Of Dino
Sent: Monday, March 6, 2023 7:40 AM
To: python-list@python.org
Subject: Re: RE: Fast full-text searching in Python (job for Whoosh?)

Thank you for taking the time to write such a detailed answer, Avi. And 
apologies for not providing more info from the get go.

What I am trying to achieve here is supporting autocomplete (no pun 
intended) in a web form field, hence the -i case insensitive example in 
my initial question.

Your points are all good, and my original question was a bit rushed. I 
guess that the problem was that I saw this video:

https://www.youtube.com/watch?v=gRvZbYtwTeo_channel=NextDayVideo

The idea that someone types into an input field and matches start 
dancing in the browser made me think that this was exactly what I 
needed, and hence I figured that asking here about Whoosh would be a 
good idea. I now realize that Whoosh would be overkill for my use-case, 
as a simple (case insensitive) query substring would get me 90% of what 
I want. Speed is in the order of a few milliseconds out of the box, 
which is chump change in the context of a web UI.

Thank you again for taking the time to look at my question

Dino

On 3/5/2023 10:56 PM, avi.e.gr...@gmail.com wrote:
> Dino, Sending lots of data to an archived forum is not a great idea. I
> snipped most of it out below as not to replicate it.
> 
> Your question does not look difficult unless your real question is about
> speed. Realistically, much of the time spent generally is in reading in a
> file and the actual search can be quite rapid with a wide range of
> methods.
> 
> The data looks boring enough and seems to not have much structure other
> than one comma possibly separating two fields. Do you want the data as one
> wide field or perhaps in two parts, which a CSV file is normally used to
> represent? Do you ever have questions like tell me all cars whose name
> begins with the letter D and has a V6 engine? If so, you may want more
> than a vanilla search.
> 
> What exactly do you want to search for? Is it a set of built-in searches
> or something the user types in?
> 
> The data seems to be sorted by the first field and then by the second and
> I did not check if some searches might be ambiguous. Can there be many
> entries containing III? Yep. Can the same words like Cruiser or Hybrid
> appear?
> 
> So is this a one-time search or multiple searches once loaded as in a
> service that stays resident and fields requests. The latter may be worth
> speeding up.
> 
> I don't NEED to know any of this but want you to know that the answer may
> depend on this and similar factors. We had a long discussion lately on
> whether to search using regular expressions or string methods. If your
> data is meant to be used once, you may not even need to read the file into
> memory, but read something like a line at a time and test it. Or, if you
> end up with more data like how many cylinders a car has, it may be time to
> read it in not just to a list of lines or such data structures, but get
> numpy/pandas involved and use their many search methods in something like
> a data.frame.
> 
> Of course if you are worried about portability, keep using Get Regular
> Expression Print.
> 
> Your example was:
> 
>   $ grep -i v60 all_cars_unique.csv
>   Genesis,GV60
>   Volvo,V60
> 
> You seem to have wanted case folding and that is NOT a normal search. And
> your search is matching 

RE: Fast full-text searching in Python (job for Whoosh?)

2023-03-06 Thread avi.e.gross
Thomas,

I may have missed any discussion where the OP explained more about proposed 
usage. If the program is designed to load the full data once, never get updates 
except by re-reading some file, and then handles multiple requests, then some 
things may be worth doing.

It looked to me, and I may well be wrong, like he wanted to search for a string 
anywhere in the text so a grep-like solution is a reasonable start with the 
actual data being stored as something like a list of character strings you can 
search "one line" at a time. I suspect a numpy variant may work faster.

And of course any search function he builds can be made to remember some or all 
previous searches using a cache decorator. That generally uses a dictionary for 
the search keys internally.
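
A rough sketch of what I mean, using functools.lru_cache from the standard library. The three data lines and the function name search are stand-ins, not the OP's actual file or code.

```python
from functools import lru_cache

# Stand-in for the data read once at startup.
LINES = ("Genesis,GV60", "Volvo,V60", "Honda,Civic")

@lru_cache(maxsize=256)
def search(term):
    """Case-insensitive substring search; results are remembered per term."""
    folded = term.casefold()
    # Return a tuple so a cached result cannot be modified by a caller.
    return tuple(line for line in LINES if folded in line.casefold())

print(search("v60"))        # ('Genesis,GV60', 'Volvo,V60')
print(search("v60"))        # same answer, served from the cache
print(search.cache_info())  # shows one miss followed by one hit
```

Note the cache is keyed on the exact argument, so "V60" and "v60" are remembered separately unless you fold the case before calling.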

But using lots of dictionaries strikes me as only helping if you are searching 
for text anchored to the start of a line so if you ask for "Honda" you instead 
ask the dictionary called "h" and search perhaps just for "onda" then recombine 
the prefix in any results. But the example given wanted to match something like 
"V6" in middle of the text and I do not see how that would work as you would 
now need to search 26 dictionaries completely.
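
To make the distinction concrete, here is a small sketch with invented data. The names by_first_letter, starts_with and contains are hypothetical, and I use one dictionary keyed by first letter rather than 26 separate ones, which amounts to the same thing.

```python
from collections import defaultdict

CARS = ["Honda,Civic", "Honda,Accord V6", "Hyundai,Ioniq", "Toyota,Camry V6"]

# Build the index once: first letter -> lines starting with that letter.
by_first_letter = defaultdict(list)
for line in CARS:
    by_first_letter[line[0].casefold()].append(line)

def starts_with(prefix):
    """Anchored search: only one bucket needs to be examined."""
    folded = prefix.casefold()
    bucket = by_first_letter.get(folded[:1], [])
    return [line for line in bucket if line.casefold().startswith(folded)]

def contains(term):
    """Unanchored search: the index is no help, every line must be checked."""
    folded = term.casefold()
    return [line for line in CARS if folded in line.casefold()]

print(starts_with("honda"))  # ['Honda,Civic', 'Honda,Accord V6']
print(contains("v6"))        # ['Honda,Accord V6', 'Toyota,Camry V6']
```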



-Original Message-
From: Python-list  On 
Behalf Of Thomas Passin
Sent: Monday, March 6, 2023 11:03 AM
To: python-list@python.org
Subject: Re: Fast full-text searching in Python (job for Whoosh?)

On 3/6/2023 10:32 AM, Weatherby,Gerard wrote:
> Not sure if this is what Thomas meant, but I was also thinking dictionaries.
> 
> Dino could build a set of dictionaries with keys “a” through “z” that contain 
> data with those letters in them. (I’m assuming case insensitive search) and 
> then just search “v” if that’s what the user starts with.
> 
> Increased performance may be achieved by building dictionaries “aa”,”ab” ... 
> “zz”. And so on.
> 
> Of course, it’s trading CPU for memory usage, and there’s likely a point at 
> which the cost of building dictionaries exceeds the savings in searching.

Chances are it would only be seconds at most to build the data cache, 
and then subsequent queries would respond very quickly.

> 
> From: Python-list  on 
> behalf of Thomas Passin 
> Date: Sunday, March 5, 2023 at 9:07 PM
> To: python-list@python.org 
> Subject: Re: Fast full-text searching in Python (job for Whoosh?)
> 
> I would probably ingest the data at startup into a dictionary - or
> perhaps several depending on your access patterns - and then you will
> only need to do a fast lookup in one or more dictionaries.
> 
> If your access pattern would be easier with SQL queries, load the data
> into an SQLite database on startup.
> 
> IOW, do the bulk of the work once at startup.
> --
> https://urldefense.com/v3/__https://mail.python.org/mailman/listinfo/python-list__;!!Cn_UX_p3!lnP5Hxid5mAgwg8o141SvmHPgCBU8zEaHDgukrQm2igozg5H5XLoIkAmrsHtRbZHR68oYAQpRFPh-Z9telM$

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Fast full-text searching in Python (job for Whoosh?)

2023-03-06 Thread avi.e.gross
Gerard,

I was politely pointing out how it was more than the minimum necessary and
might get repeated multiple times as people replied. The storage space is a
resource someone else provides and I prefer not abusing it.

However, since the OP seems to be asking a question focused on how long it
takes to search using possible techniques, indeed some people would want the
entire data to test with.

In my personal view, a snippet of the data is what I need to see how it
is organized and then what I need way more is some idea for what kind of
searching is needed.

If I was told there would be a web page allowing users to search a web
service hosting the data on a server with one process called as much as
needed that spawned threads to handle the task, I might see it as very
worthwhile to read in the data once into some data structure that allows
rapid searches over and over.  If it is an app called ONCE as a whole for
each result, as in the grep example, why bother and just read a line at a
time and be done with it.
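
The two styles can be sketched as follows. The sample text stands in for the real file, and grep_lines is a name I made up. With a real file you would pass the open file object instead of the StringIO.

```python
import io

def grep_lines(lines, term):
    """Yield matching lines one at a time without keeping the rest in memory."""
    folded = term.casefold()
    for line in lines:
        if folded in line.casefold():
            yield line.rstrip("\n")

sample_text = "Genesis,GV60\nVolvo,V60\nHonda,Civic\n"

# Called ONCE: stream through the data and be done with it.
one_shot = list(grep_lines(io.StringIO(sample_text), "v60"))
print(one_shot)            # ['Genesis,GV60', 'Volvo,V60']

# Resident service: read the data once, then search the same list repeatedly.
loaded = sample_text.splitlines()
print(list(grep_lines(loaded, "civic")))   # ['Honda,Civic']
print(list(grep_lines(loaded, "volvo")))   # ['Volvo,V60']
```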

My suggestion remains my preference. The discussion is archived. Messages
can optimally be trimmed as needed and not allowed to contain the full
contents of the last twenty replies back and forth unless that is needed.
Larger amounts of data can be offered to share and, if wanted, can be posted
or sent to someone asking for it or placed in some publicly accessible place.

But my preference may not be relevant as the forum has hosts or owners and
it is what they want that counts.

The data this time was not really gigantic. But I often work with data from
a CSV that has hundreds of columns and hundreds of thousands or more rows,
with some of the columns containing large amounts of text. But I may be
interested in how to work with say just half a dozen columns and for the
purposes of my question here, perhaps a hundred representative rows. Should
I share everything, or maybe save the subset and only share that?

This is not about python as a language but about expressing ideas and
opinions on a public forum with limited resources. Yes, over the years, my
combined posts probably use far more archival space. We are not asked to be
sparse, just not be wasteful. 

The OP may consider what he is working with as a LOT of data but it really
isn't by modern standards. 

-Original Message-
From: Python-list  On
Behalf Of Weatherby,Gerard
Sent: Monday, March 6, 2023 10:35 AM
To: python-list@python.org
Subject: Re: Fast full-text searching in Python (job for Whoosh?)

"Dino, Sending lots of data to an archived forum is not a great idea. I
snipped most of it out below as not to replicate it."

Surely in 2023, storage is affordable enough there's no need to criticize
Dino for posting complete information. If mailing space is a consideration,
we could all help by keeping our replies short and to the point.

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Cutting slices

2023-03-05 Thread avi.e.gross
I am not commenting on the technique or why it is chosen just the part where
the last search looks for a non-existent period:

s = 'alpha.beta.gamma'
...
s[ 11: s.find( '.', 11 )]

What should "find" do if it hits the end of a string without finding the
period you claim is a divider?

Could that be why gamma got truncated?

Unless you can arrange for a terminal period, maybe you can reconsider the
approach.
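
A minimal sketch of what is going on, using the string from the example.
str.find() returns -1 when nothing is found, and -1 as a slice end means
"stop before the last character". The helper next_field is a made-up name
for illustration:

```python
s = 'alpha.beta.gamma'

end = s.find('.', 11)      # no period after position 11, so find() gives -1
truncated = s[11:end]      # same as s[11:-1], which drops the final letter


def next_field(text, start, sep='.'):
    """Return (field, start of the next field), treating end of text as a divider."""
    end = text.find(sep, start)
    if end == -1:          # separator not found: take everything that is left
        end = len(text)
    return text[start:end], end + 1


fields = []
position = 0
while position <= len(s):
    field, position = next_field(s, position)
    fields.append(field)
print(truncated, fields)
```

Using len(text) (or None) as the slice end is what gets you the whole
remainder.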


-Original Message-
From: Python-list  On
Behalf Of aapost
Sent: Sunday, March 5, 2023 6:00 PM
To: python-list@python.org
Subject: Re: Cutting slices

On 3/5/23 17:43, Stefan Ram wrote:
>The following behaviour of Python strikes me as being a bit
>"irregular". A user tries to chop of sections from a string,
>but does not use "split" because the separator might become
>more complicated so that a regular expression will be required
>to find it. But for now, let's use a simple "find":
>
> |>>> s = 'alpha.beta.gamma'
> |>>> s[ 0: s.find( '.', 0 )]
> |'alpha'
> |>>> s[ 6: s.find( '.', 6 )]
> |'beta'
> |>>> s[ 11: s.find( '.', 11 )]
> |'gamm'
> |>>>
> 
>. The user always inserted the position of the previous find plus
>one to start the next "find", so he uses "0", "6", and "11".
>But the "a" is missing from the final "gamma"!
>
>And it seems that there is no numerical value at all that
>one can use for "n" in "string[ 0: n ]" to get the whole
>string, isn't it?
> 
> 

I would agree with 1st part of the comment.

Just noting that string[11:], string[11:None], as well as string[11:16] 
work ... as well as string[11:324242]... lol..
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Fast full-text searching in Python (job for Whoosh?)

2023-03-05 Thread avi.e.gross
Dino, Sending lots of data to an archived forum is not a great idea. I
snipped most of it out below as not to replicate it.

Your question does not look difficult unless your real question is about
speed. Realistically, much of the time spent generally is in reading in a
file and the actual search can be quite rapid with a wide range of methods.

The data looks boring enough and seems to not have much structure other than
one comma possibly separating two fields. Do you want the data as one wide
field or perhaps in two parts, which a CSV file is normally used to
represent? Do you ever have questions like tell me all cars whose name
begins with the letter D and has a V6 engine? If so, you may want more than
a vanilla search.

What exactly do you want to search for? Is it a set of built-in searches or
something the user types in?

The data seems to be sorted by the first field and then by the second and I
did not check if some searches might be ambiguous. Can there be many entries
containing III? Yep. Can the same words like Cruiser or Hybrid appear? 

So is this a one-time search or multiple searches once loaded as in a
service that stays resident and fields requests? The latter may be worth
speeding up.

I don't NEED to know any of this but want you to know that the answer may
depend on this and similar factors. We had a long discussion lately on
whether to search using regular expressions or string methods. If your data
is meant to be used once, you may not even need to read the file into
memory, but read something like a line at a time and test it. Or, if you end
up with more data like how many cylinders a car has, it may be time to read
it in not just to a list of lines or such data structures, but get
numpy/pandas involved and use their many search methods in something like a
data.frame.

Of course if you are worried about portability, keep using Get Regular
Expression Print.

Your example was:

 $ grep -i v60 all_cars_unique.csv
 Genesis,GV60
 Volvo,V60

You seem to have wanted case folding and that is NOT a normal search. And
your search is matching anything on any line. If you wanted only a complete
field, such as all text after a comma to the end of the line, you could use
grep specifications to say that.
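
As a hedged sketch of the difference inside python, here is a
line-at-a-time search done two ways: a case-folded match anywhere on the
line (roughly what grep -i did) and a match on the complete second field
only. The function names are invented for illustration and the data is a
few lines from the example:

```python
import io


def search_anywhere(lines, needle):
    """Yield lines containing needle anywhere, ignoring case."""
    needle = needle.casefold()
    for line in lines:
        line = line.rstrip("\n")
        if needle in line.casefold():
            yield line


def search_model_field(lines, model):
    """Yield lines whose text after the first comma equals model, ignoring case."""
    model = model.casefold()
    for line in lines:
        line = line.rstrip("\n")
        _make, _sep, field = line.partition(",")
        if field.casefold() == model:
            yield line


data = "Acura,CL\nGenesis,GV60\nVolvo,V60\n"
anywhere = list(search_anywhere(io.StringIO(data), "v60"))
whole_field = list(search_model_field(io.StringIO(data), "v60"))
print(anywhere, whole_field)
```

Neither version keeps the file in memory, which is fine for a one-time
search.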

But once inside python, you would need to make choices depending on what
kind of searches you want to allow but also things like do you want all
matching lines shown if you search for say "a" ...




-Original Message-
From: Python-list  On
Behalf Of Dino
Sent: Saturday, March 4, 2023 10:47 PM
To: python-list@python.org
Subject: Re: Fast full-text searching in Python (job for Whoosh?)


Here's the complete data file should anyone care.

Acura,CL
Acura,ILX
Acura,Integra
Acura,Legend

smart,fortwo electric drive
smart,fortwo electric drive cabrio

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Which more Pythonic - self.__class__ or type(self)?

2023-03-04 Thread avi.e.gross
>>> I think you are over-thinking this, Avi :)

Is overthinking the pythonic way or did I develop such a habit from some
other language?

More seriously, I find in myself that I generally do not overthink. I
overtalk and sort of overwrite, so for now, I think I will drop out of this
possibly non-pythonic topic and go read another book or a few hundred so
when it comes up again ...

-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Saturday, March 4, 2023 5:04 PM
To: python-list@python.org
Subject: Re: Which more Pythonic - self.__class__ or type(self)?

On 3/4/2023 4:18 PM, avi.e.gr...@gmail.com wrote:
> I don't know, Thomas. For some simple programs, there is some evolutionary
> benefit by starting with what you know and gradually growing from there. The
> first time you need to do something that seems to need a loop in python,
> there are loops to choose from.
> 
> But as noted in a recent discussion, things are NOT NECESSARILY the same
> even with something that simple. Did your previous languages retain
> something like the loop variable outside the loop? What are your new scoping
> rules? Do you really want to keep using global variables, and so on.
> 
> And, another biggie is people who just don't seem aware of what comes easily
> in the new language. I have seen people from primitive environments set up
> programs with multiple arrays they process the hard way instead of using
> some forms of structure like a named tuple or class arranged in lists or use
> a multidimensional numpy/pandas kind of data structure.
> 
> So ignoring the word pythonic as too specific, is there a way to say that
> something is the way your current language supports more naturally?
> 
> Yes, there are sort of fingerprints in how people write. Take the python
> concept of truthy and how some people will still typically add a test for
> equality with True. That may not be pythonic to some but is there much harm
> in being explicit so anyone reading the code better understands what it does?
> 
> I have to wonder what others make of my code as my style is likely to be
> considered closer to "eclectic" as I came to python late and found an
> expanding language with way too many ways to do anything and can choose. But
> I claim that too is pythonic!

I think you are over-thinking this, Avi :)

> 
> -Original Message-
> From: Python-list  On
> Behalf Of Thomas Passin
> Sent: Saturday, March 4, 2023 1:09 PM
> To: python-list@python.org
> Subject: Re: Which more Pythonic - self.__class__ or type(self)?
> 
> On 3/4/2023 2:47 AM, Peter J. Holzer wrote:
>> Even before Python existed there was the adage "a real programmer
>> can write FORTRAN in any language", indicating that idiomatic usage of a
>> language is not governed by syntax and library alone, but there is a
>> cultural element: People writing code in a specific language also read
>> code by other people in that language, so they start imitating each
>> other, just like speakers of natural languages imitate each other.
>> Someone coming from another language will often write code which is
>> correct but un-idiomatic, and you can often guess which language they
>> come from (they are "writing FORTRAN in Python").
> 
> What Peter didn't say is that this statement is usually used in a
> disparaging sense.  It tends to imply that a person can write (or is
> writing) awkward or inappropriate code anywhere.
> 

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: RE: Which more Pythonic - self.__class__ or type(self)?

2023-03-04 Thread avi.e.gross
Alan,

I got divorced from the C++ crowd decades ago when I left Bell Labs. You are 
making me glad I did!

I do accept your suggestion that you can be idiomatic if you follow the common 
methods of whatever language you use. That will take you quite far as long as 
you are not a total slave to it.

But I note some idioms catch on and some are imposed and some become almost 
moot. I am not sure which aspects of C++ have changed drastically and may go 
re-study the modern version as I was a very early adopter within AT&T and saw 
changes even back then. 

But I consider something like the half dozen or so major print variants in 
python and wonder how much longer some of them will be seen as worth using, let 
alone idiomatic. Something like an f-string may dominate for many purposes.
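
For anyone who has not met all the variants, a small sketch of a few of
them producing the same text. The variable names are just for
illustration:

```python
from string import Template

name = "gamma"
count = 3

percent_style = "%s has %d items" % (name, count)            # oldest style
format_method = "{} has {} items".format(name, count)        # str.format()
template_style = Template("$name has $count items").substitute(
    name=name, count=count)                                  # string.Template
f_string = f"{name} has {count} items"                       # f-string

# An f-string can also embed small calculations directly.
computed = f"{name.upper()} has {count * 2} items"
print(f_string, computed)
```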

I know in R, that I used to use some convoluted methods to assemble output that 
I often now ignore once a "glue" package gave me something similar to f-string 
abilities where all kinds of variables and calculations can now be embedded 
within a string to be dynamically evaluated in your current environment. Some 
of the documents I write now similarly embed parts of programs and also have an 
inline ability to evaluate small amounts of code in one of many languages  that 
inserts directly into the text as it is being typeset.

So I see moving targets where what was formerly at or near the state of the 
art, becomes passé. So much of my early work rapidly became trivial or 
irrelevant or never caught on or became lost in an environment I no longer 
used. To keep going forward often involves leaving things behind.

Some new features in Python will be interesting to watch. I mentioned the match 
statement. I was using a similar construct in a JVM language called SCALA ages 
ago.  There it was a sort of core part of the language and often replaced 
constructs normally used by other languages such as many simple or nested IF 
statements. I am sure someone will point out where they borrowed parts from or 
who did it better, but what I am saying is that I want to see if it becomes an 
exotic addition to Python in a way that loosely melds, or if it becomes the 
PYTHONIC way ...



-Original Message-
From: Alan Gauld  
Sent: Saturday, March 4, 2023 1:38 PM
To: avi.e.gr...@gmail.com; python-list@python.org
Subject: Re: RE: Which more Pythonic - self.__class__ or type(self)?

On 04/03/2023 17:38, avi.e.gr...@gmail.com wrote:
> 
> Of course each language has commonly used idioms 
> 

That's the point, the correct term is probably "idiomatic"
rather than "pythonic" but it is a defacto standard that
idiomatic Python has become known as Pythonic. I don't
think that's a problem. And at least we aren't in the C++
situation where almost everything that was idiomatic up
until 1999 is now deemed an anti-pattern and they have
standard library modules to try and guide you to use the
"correct" idioms!

But being Pythonic is still a much more loose term and
the community less stressed about it than their C++
cousins where it has almost reached a religious fervour!

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Which more Pythonic - self.__class__ or type(self)?

2023-03-04 Thread avi.e.gross
Great idea, DN!

A whole series of books can be written such as:

- Python for virgin dummies who never programmed before.
- Python for former BASIC programmers
- Python for former LISP programmers with a forked tongue
- Python for former Ada Programmers
- Python for ...
- Python for those who find a dozen former languages are simply not enough.
- Python for people who really want to mainly  use the modules like pandas
or sklearn ...
- Pythonic upgrades to the methods used in former inferior languages ...
- How to speak with a Pythonese accent and lose your old accent based on your
former native language(s).

I am sure some books along these lines have already been written!

Who wants to collaborate?

-Original Message-
From: Python-list  On
Behalf Of dn via Python-list
Sent: Saturday, March 4, 2023 1:26 PM
To: python-list@python.org
Subject: Re: Which more Pythonic - self.__class__ or type(self)?

On 04/03/2023 20.47, Peter J. Holzer wrote:
> On 2023-03-03 13:51:11 -0500, avi.e.gr...@gmail.com wrote:

...
> No. Even before Python existed there was the adage "a real programmer
> can write FORTRAN in any language", indicating that idiomatic usage of a
> language is not governed by syntax and library alone, but there is a
> cultural element: People writing code in a specific language also read
> code by other people in that language, so they start imitating each
> other, just like speakers of natural languages imitate each other.
> Someone coming from another language will often write code which is
> correct but un-idiomatic, and you can often guess which language they
> come from (they are "writing FORTRAN in Python"). Also quite similar to
> natural languages where you can guess the native language of an L2
> speaker by their accent and phrasing.

With ph agree I do...

or do you want that in a DO-loop with a FORMAT?

-- 
Regards,
=dn
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Which more Pythonic - self.__class__ or type(self)?

2023-03-04 Thread avi.e.gross
I don't know, Thomas. For some simple programs, there is some evolutionary
benefit by starting with what you know and gradually growing from there. The
first time you need to do something that seems to need a loop in python,
there are loops to choose from. 

But as noted in a recent discussion, things are NOT NECESSARILY the same
even with something that simple. Did your previous languages retain
something like the loop variable outside the loop? What are your new scoping
rules? Do you really want to keep using global variables, and so on.
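
A small sketch of the loop variable question as python answers it. The
function name is made up, and the behavior shown is that of Python 3:

```python
def loop_scope_demo():
    # A for loop does not create a new scope: i survives the loop.
    for i in range(3):
        pass

    # A comprehension has its own scope: n is not visible afterwards.
    squares = [n * n for n in range(3)]
    try:
        n
        leaked = True
    except NameError:
        leaked = False

    return i, squares, leaked


print(loop_scope_demo())
```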

And, another biggie is people who just don't seem aware of what comes easily
in the new language. I have seen people from primitive environments set up
programs with multiple arrays they process the hard way instead of using
some forms of structure like a named tuple or class arranged in lists or use
a multidimensional numpy/pandas kind of data structure.
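
As an illustration only, here is the parallel-arrays habit next to a list
of named tuples. The Car type and the sample values are invented for the
example:

```python
from collections import namedtuple

# The hard way: separate arrays that have to be kept in step by index.
makes = ["Acura", "Volvo"]
models = ["ILX", "V60"]
parallel = [makes[i] + " " + models[i] for i in range(len(makes))]

# One structure per item, arranged in a list.
Car = namedtuple("Car", ["make", "model"])
cars = [Car(make, model) for make, model in zip(makes, models)]
structured = [f"{car.make} {car.model}" for car in cars]
print(structured)
```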

So ignoring the word pythonic as too specific, is there a way to say that
something is the way your current language supports more naturally? 

Yes, there are sort of fingerprints in how people write. Take the python
concept of truthy and how some people will still typically add a test for
equality with True. That may not be pythonic to some but is there much harm
in being explicit so anyone reading the code better understands what it does?
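
One caution, sketched below: a truthiness test and an explicit comparison
with True are not the same test, so the explicit form can change what the
code does. The sample values are arbitrary:

```python
values = [1, 2, "text", [0], 0, "", [], None]

# Truthiness: anything non-zero and non-empty passes.
truthy = [v for v in values if v]

# Equality with True: only values equal to True (such as 1) pass.
equal_to_true = [v for v in values if v == True]

print(truthy, equal_to_true)
```

Writing bool(v) is one way to be explicit without changing the meaning.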

I have to wonder what others make of my code as my style is likely to be
considered closer to "eclectic" as I came to python late and found an
expanding language with way too many ways to do anything and can choose. But
I claim that too is pythonic!

-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Saturday, March 4, 2023 1:09 PM
To: python-list@python.org
Subject: Re: Which more Pythonic - self.__class__ or type(self)?

On 3/4/2023 2:47 AM, Peter J. Holzer wrote:
> Even before Python existed there was the adage "a real programmer
> can write FORTRAN in any language", indicating that idiomatic usage of a
> language is not governed by syntax and library alone, but there is a
> cultural element: People writing code in a specific language also read
> code by other people in that language, so they start imitating each
> other, just like speakers of natural languages imitate each other.
> Someone coming from another language will often write code which is
> correct but un-idiomatic, and you can often guess which language they
> come from (they are "writing FORTRAN in Python").

What Peter didn't say is that this statement is usually used in a 
disparaging sense.  It tends to imply that a person can write (or is 
writing) awkward or inappropriate code anywhere.

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Which more Pythonic - self.__class__ or type(self)?

2023-03-04 Thread avi.e.gross
Peter,

Of course each language has commonly used idioms as C with pointer
arithmetic and code like *p++=*q++ but my point is that although I live near
a  seaway and from where C originated, I am not aware of words like "c-way"
or "scenic" as compared to the way people keep saying "pythonic".

Yes, languages develop idioms and frankly, many are replaced with time. And,
yes, I am sure I can write FORTRAN style  in any language as I used to teach
it, but WATFOR?

If the question is to show a dozen solutions for a problem written in VALID
python and ask a panel of seasoned python programmers which they would
prefer, then sometimes there is a more pythonic solution by that definition.
Give the same test to newbies who each came from a different language
background and are just getting started, and I am not sure I care how they
vote!

I suggest that given a dozen such choices, several may be reasonable choices
and in some cases, I suggest the non-pythonic choice is the right one such
as when you expect someone to port your code to other languages and you need
to keep it simple.

I am simply saying that for ME, some questions are not as simple as others.
I am more interested in whether others can read and understand my code, and
it runs without problems, and maybe even is slightly efficient, than whether
someone deems it pythonic.


-Original Message-
From: Python-list  On
Behalf Of Peter J. Holzer
Sent: Saturday, March 4, 2023 2:48 AM
To: python-list@python.org
Subject: Re: Which more Pythonic - self.__class__ or type(self)?

On 2023-03-03 13:51:11 -0500, avi.e.gr...@gmail.com wrote:
> I do not buy into any concept about something being pythonic or not.
> 
> Python has grown too vast and innovated quite a bit, but also borrowed from
> others and vice versa.
> 
> There generally is no universally pythonic way nor should there be. Is there
> a C way

Oh, yes. Definitely.

> and then a C++ way and an R way or JavaScript

JavaScript has a quite distinctive style. C++ is a big language (maybe
too big for a single person to grok completely) so there might be
several "dialects". I haven't seen enough R code to form an opinion.

> or does only python a language with a philosophy of what is the
> pythonic way?

No. Even before Python existed there was the adage "a real programmer
can write FORTRAN in any language", indicating that idiomatic usage of a
language is not governed by syntax and library alone, but there is a
cultural element: People writing code in a specific language also read
code by other people in that language, so they start imitating each
other, just like speakers of natural languages imitate each other.
Someone coming from another language will often write code which is
correct but un-idiomatic, and you can often guess which language they
come from (they are "writing FORTRAN in Python"). Also quite similar to
natural languages where you can guess the native language of an L2
speaker by their accent and phrasing.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Python list insert iterators

2023-03-03 Thread avi.e.gross
Thomas is correct that this is a bit of an odd request unless explained
better.

There are a number of implicit assumptions that need to be revisited here.
Python Lists are what they are. They are not in any way tagged. They are not
linked lists or binary trees or dictionaries or whatever you are looking
for.

They are a mutable object with an order at any given time and no memory or
history of an earlier status.

They support not just insertion but also deletion and replacement and other
things.

But generally, if your time span between deciding on additions and
implementing them will contain no deletions, then one simple solution is to
re-order your insertion to always do the last one first. The indices will
only change at and above an insertion point. Your remaining insertions will
always be at an untouched region where the indices remain the same, for now.
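
A minimal sketch of the last-one-first idea, using the list and positions
from the question. The variable names are just for illustration:

```python
items = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pending = [(2, "A"), (6, "B")]   # (index in the ORIGINAL list, payload)

# Insert at the highest index first so the lower indices stay valid.
for index, payload in sorted(pending, key=lambda pair: pair[0], reverse=True):
    items.insert(index, payload)

print(items)
```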

A second choice as Thomas points out is to adjust your indices. An example
might be if you have a collection of proposed insertions and each contains
an index number and payload. Each time you insert the next payload at the
insertion point, you invoke a function that goes through your remaining
collection and finds any with an index that is higher and increments it.
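
A sketch of that bookkeeping, with an invented function name. It follows
the description literally and only bumps indices that are higher, so two
insertions aimed at the same index still need a rule of their own:

```python
def insert_with_adjustment(items, pending):
    """Insert (index, payload) pairs in the order given, fixing up later indices."""
    result = list(items)
    queue = [[index, payload] for index, payload in pending]
    while queue:
        index, payload = queue.pop(0)
        result.insert(index, payload)
        for entry in queue:
            if entry[0] > index:   # anything above the insertion point moved up
                entry[0] += 1
    return result


original = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(insert_with_adjustment(original, [(2, "A"), (6, "B")]))
```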

Obviously there are issues if dealing with adding multiple times to the same
index or adding multiple items at once.

The above could be encapsulated in some kind of VIEW in some languages
including of course some that use pointers.

I will add by pointing out a way to do a multi-insertion at once if you know
all the insertions at the same time.

Take your list that you want to change by adding at say positions 9, 3 and
6.

Now DON'T insert anything. Forget the concept.

Instead, and this is drastic, make a NEW list.

The new list is loosely old[0:3] + [new_at_3] + old[3:6] + [new_at_6] +
old[6:9] + [new_at_9] + old[9:], remembering that a slice stops just before
its end index.

Something carefully written like that using concatenation means you do not
lose track of indices and end up with a new extended list you can feel free
to save under the old name and let the prior one be garbage collected.
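
The same idea as a small function, offered as a sketch. The name
build_with_insertions and the dictionary of index to payload are
assumptions made for the example:

```python
def build_with_insertions(old, insertions):
    """Return a new list with each payload placed before old[index]."""
    result = []
    previous = 0
    for index in sorted(insertions):
        result.extend(old[previous:index])   # untouched stretch of the old list
        result.append(insertions[index])     # the new item for this position
        previous = index
    result.extend(old[previous:])            # whatever is left at the end
    return result


old = list(range(12))
new = build_with_insertions(old, {9: "new_at_9", 3: "new_at_3", 6: "new_at_6"})
print(new)
```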

Maybe one of the above hints at what could work for you, or others may
supply a better answer, or maybe you reevaluate what you are doing or
explain it some more.

-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Friday, March 3, 2023 1:04 PM
To: python-list@python.org
Subject: Re: Python list insert iterators

On 3/3/2023 3:22 AM, Guenther Sohler wrote:
> Hi Python community,
> 
> I have a got an example list like
> 
> 1,  2,  3, 4, 5, 6, 7, 8, 9, 10
>  T   T
> 
> and i  eventually want to insert items in the given locations
> (A shall go between 2 and 3,  B shall go between 6 and 7)
> 
> Right now i just use index numbers to define the place:
> 
> A shall insert in position 2
> B shall insert in position 6
> 
> However when i insert A in position 2, the index for successful insertion
> of B gets wrong
> (should now be 7 instead of 6)
> 
> No, it's not an option to sort the indexes and start inserting from the
> back.
> The most elegant option is not to store indexes, but list iterators, which
> attach to the list element
> and would automatically move, especially if an element is inserted before.
> 
> I could not find such functionality in python lists of [ 1,2,3 ]
> 
> Does python have such functionality ?
> if yes, where can i find it ?

You should be more clear about how to establish the desired insertion 
point.  In your example, you first say that the insertion of B should be 
between 6 and 7. But after A gets inserted, you say that B's insertion 
point should change.  How is anyone able to know what the correct 
insertion point should be at any time?

If the rule is that B should get inserted after a particular known 
element, you can find out the index of that element with list.index() 
and insert just after that. If the rule is "There is an imaginary 
location that starts out after index 6 but moves depending on previous 
insertions", then you will probably need to capture a record of those 
insertions and use it to adjust the invisible insertion point.  But this 
synchronization could be tricky to keep correct depending on what you 
want to do to this list.

So you need to specify clearly what the rules are going to be.
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Which more Pythonic - self.__class__ or type(self)?

2023-03-03 Thread avi.e.gross


Alan,

I do not buy into any concept about something being pythonic or not.

Python has grown too vast and innovated quite a  bit, but also borrowed from
others and vice versa.

There generally is no universally pythonic way nor should there be. Is there
a C way and then a C++ way and an R way or JavaScript or is only python a
language with a philosophy of what is the pythonic way?

My vague impression was that the pythonic way was somewhat of a contrast to
the way a programmer did it before coming to python. So some would argue
that although python allows loops, that some things are more naturally done
in python using a list comprehension.

Really?

I suggest that NOW for some people, it is way more natural to import modules
like numpy and pandas and use their tools using a more vectorized approach.
Is that the new pythonic in some situations?

I can also argue that if you were a contestant on Jeopardy and were in a
category for computer languages and were shown some computer code  and asked
to name that language in 4 lines, then the most pythonic would not be one
saying type(var) but the one showing a dunder method! I mean what makes some
languages special is often the underlying details! On the surface, many look
fairly similar.

Some problems not only can be solved many ways in python, but by using
combinations of different programming paradigms. It can be argued by some
that the pythonic way is to use some forms of object-oriented programming
and by others pushing for a more functional approach. Some seem to continue
pushing for efficiency and others relish using up CPU cycles and prefer
other considerations such as what is easier for the programmer or what is
more self-documenting.

My answer remains, in this case, like yours. The dunder methods are
generally meant to be implementation details mostly visible when creating
new classes or perhaps adjusting an object. They largely implement otherwise
invisible protocols by providing the hooks the protocols invoke, and do it
in a somewhat reserved name space. If the user is writing code that just
uses existing classes, generally no dunderheads should be seen. I think
using them is not only not pythonic, but risks breaking code if some changes
to python are made.  As one example, the iteration protocol now has new
dunder methods added to be used for asynchronous iteration, and calling the
__iter__() type methods will not work well there; you would need to know to
call the new ones. Or, don't call them at all and use the regular functions
provided.
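
A short sketch of using the regular functions rather than the dunder
methods they invoke. The asynchronous counterparts, aiter() and anext(),
became built-ins in Python 3.10 and are not shown here:

```python
numbers = [10, 20, 30]

# Preferred: the built-in functions that drive the protocol.
iterator = iter(numbers)      # rather than numbers.__iter__()
first = next(iterator)        # rather than iterator.__next__()
second = next(iterator)
remaining = list(iterator)
size = len(numbers)           # rather than numbers.__len__()

print(first, second, remaining, size)
```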

Some things may be clearly more in the spirit of the language and sometimes
who cares. Consider the debate that since python allows you to fail and
catch an exception, why bother using if statements such as checking for
no-zero before dividing. I never understood that. Plan A works. Now you can
also chose plan B. They both work. But has anyone asked some dumb questions
about the data the code is working on? What if you have data full of zeroes
or NA or Inf or other things make a division problematic. What is the cost
of testing for something or a group of things every time versus setting up a
try/catch every time? What about lots of nesting of these things. What can
humans read better or make adjustments to?
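
The two plans side by side, as a sketch with invented function names.
Both give the same answers; which one costs less depends on how often the
divisor really is zero:

```python
def divide_with_test(pairs):
    """Plan A: check for a zero divisor before dividing."""
    results = []
    for numerator, divisor in pairs:
        if divisor != 0:
            results.append(numerator / divisor)
        else:
            results.append(None)
    return results


def divide_with_except(pairs):
    """Plan B: just divide and catch the failure."""
    results = []
    for numerator, divisor in pairs:
        try:
            results.append(numerator / divisor)
        except ZeroDivisionError:
            results.append(None)
    return results


pairs = [(1, 2), (3, 0), (9, 3)]
print(divide_with_test(pairs), divide_with_except(pairs))
```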

In my mind, if the bad thing you want to avoid is rare and the testing is
costly, perhaps the exception method is best. I mean if you are working with
large numbers where primes are not common, then having to test if it is a
prime can be costly while catching a failure may be less so.

But consider how some people act as if pythonic means you should not care
about efficiency! LOL!

I leave you with the question of the day. Was Voldemort pythonic?

Avi


-Original Message-
From: Python-list  On
Behalf Of Alan Gauld
Sent: Friday, March 3, 2023 4:43 AM
To: python-list@python.org
Subject: Re: Which more Pythonic - self.__class__ or type(self)?

On 02/03/2023 20:54, Ian Pilcher wrote:
> Seems like an FAQ, and I've found a few things on StackOverflow that
> discuss the technical differences in edge cases, but I haven't found
> anything that talks about which form is considered to be more Pythonic
> in those situations where there's no functional difference.

I think avoiding dunder methods is generally considered more Pythonic.

But in this specific case using isinstance() is almost always
the better option. Testing for a specific class is likely to break
down in the face of subclasses. And in Python testing for static types
should rarely be necessary since Python uses duck typing
and limiting things to a hard type seriously restricts your code.
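
A minimal sketch of the subclass point, with made-up class names:

```python
class Animal:
    pass


class Dog(Animal):
    pass


pet = Dog()

exact_by_type = type(pet) is Animal          # False: pet is a Dog, not exactly an Animal
exact_by_dunder = pet.__class__ is Animal    # False for the same reason
by_isinstance = isinstance(pet, Animal)      # True: subclasses are accepted

print(exact_by_type, exact_by_dunder, by_isinstance)
```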

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Regular Expression bug?

2023-03-02 Thread avi.e.gross
It is a well-known fact, Jose, that GIGO.

The letters "n" and "m" are not interchangeable. Your pattern fails because you 
have "pn" in one place and "pm" in the other.


>>> s = "pn=jose pn=2017"
...
>>> s0 = r0.match(s)
>>> s0
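
A sketch pulling the thread together, using the strings already posted.
Note that match() only looks at the start of the string, while search()
looks anywhere, and that ".+" is greedy:

```python
import re

pattern = re.compile(r"pn=(.+)")

# "pm=" at the start: match() fails, search() finds the later "pn=".
no_match = pattern.match("pm=jose pn=2017")
found_later = pattern.search("pm=jose pn=2017").group(1)

# With "pn=" at the start, match() works and the greedy group takes the rest.
greedy_rest = pattern.match("pn=jose pn=2017").group(1)

# Two ways to stop at the first space.
s = "pn=align upgrade sd=2023-02-"
non_greedy = re.match(r"pn=(.+?) ", s).group(1)
no_spaces = re.match(r"pn=([^ ]+)", s).group(1)

print(no_match, found_later, greedy_rest, non_greedy, no_spaces)
```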




-Original Message-
From: Python-list  On 
Behalf Of jose isaias cabrera
Sent: Thursday, March 2, 2023 8:07 PM
To: Mats Wichmann 
Cc: python-list@python.org
Subject: Re: Regular Expression bug?

On Thu, Mar 2, 2023 at 2:38 PM Mats Wichmann  wrote:
>
> On 3/2/23 12:28, Chris Angelico wrote:
> > On Fri, 3 Mar 2023 at 06:24, jose isaias cabrera wrote:
> >>
> >> Greetings.
> >>
> >> For the RegExp Gurus, consider the following python3 code:
> >> 
> >> import re
> >> s = "pn=align upgrade sd=2023-02-"
> >> ro = re.compile(r"pn=(.+) ")
> >> r0=ro.match(s)
> > print(r0.group(1))
> >> align upgrade
> >> 
> >>
> >> This is wrong. It should be 'align' because the group only goes up-to
> >> the space. Thoughts? Thanks.
> >>
> >
> > Not a bug. Find the longest possible match that fits this; as long as
> > you can find a space immediately after it, everything in between goes
> > into the .+ part.
> >
> > If you want to exclude spaces, either use [^ ]+ or .+?.
>
> https://docs.python.org/3/howto/regex.html#greedy-versus-non-greedy

This re is a bit different than the one I am used to. So, I am trying to match
everything after 'pn=':

import re
s = "pm=jose pn=2017"
m0 = r"pn=(.+)"
r0 = re.compile(m0)
s0 = r0.match(s)
>>> print(s0)
None

Any help is appreciated.
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Which more Pythonic - self.__class__ or type(self)?

2023-03-02 Thread avi.e.gross
My understanding is that python created functions like type() and len() as a
general purpose way to get information and ALSO set up a protocol that
classes can follow by creating dunder methods. I think the most pythonic
thing is to avoid directly calling the dunder methods, with a few exceptions
that mainly happen when you are building or extending classes. I mean some
dunder methods are then called directly to avoid getting into infinite loops
that would otherwise be triggered.

And note in many cases, the protocol is more complex than a single method.
len() itself needs a __len__ method and will not count by iterating, but
bool() falls back on __len__ when there is no __bool__, and iteration and
"in" fall back on __getitem__ when there is no __iter__. Calling the built-in
function lets you leverage such fallbacks. And it means you can sometimes
leave out some methods and your code still works.
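
As a rough illustration of how the built-ins lean on more than one dunder
method, here is a sketch with two made-up classes (Shelf and OnlyIter are
hypothetical names, not anything from the thread). It shows which
fallbacks exist and that len() itself has none.

```python
class Shelf:
    """Hypothetical container that defines only __len__ and __getitem__."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


class OnlyIter:
    """Hypothetical class that can be iterated but has no __len__."""

    def __iter__(self):
        return iter([1, 2, 3])


empty, full = Shelf([]), Shelf(["a", "b"])

# bool() falls back on __len__ because Shelf has no __bool__.
truthiness = (bool(empty), bool(full))
# Iteration and "in" fall back on __getitem__(0), (1), ... until IndexError.
as_list = list(full)
contains_b = "b" in full

# len() has no such fallback: without __len__ it raises TypeError.
try:
    len(OnlyIter())
    len_failed = False
except TypeError:
    len_failed = True

print(truthiness, as_list, contains_b, len_failed)
```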

Be warned that type() is a very special function in python and when called
with three arguments, does a relatively beautiful but unrelated thing: it
creates a new class. It has a special role in the class or type hierarchy.
But used with a single argument, it harmlessly returns the result you want.
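
A minimal sketch of both uses, with class names I invented for the
purpose (Base, Child and Dynamic are not from the thread). In the
ordinary case type(x) and x.__class__ give the same answer.

```python
class Base:
    def clone(self):
        # type(self) is the runtime class, so a subclass clones as itself.
        return type(self)()


class Child(Base):
    pass


child = Child()
same_answer = type(child) is child.__class__ is Child
clone_type = type(child.clone())

# Three-argument form: build a new class from a name, bases and namespace.
Dynamic = type("Dynamic", (Base,), {"greet": lambda self: "hi"})
greeting = Dynamic().greet()

print(same_answer, clone_type.__name__, greeting)
```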


-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Thursday, March 2, 2023 6:43 PM
To: python-list@python.org
Subject: Re: Which more Pythonic - self.__class__ or type(self)?

On 3/2/2023 5:53 PM, Greg Ewing via Python-list wrote:
> On 3/03/23 9:54 am, Ian Pilcher wrote:
>> I haven't found
>> anything that talks about which form is considered to be more Pythonic
>> in those situations where there's no functional difference.
> 
> In such cases I'd probably go for type(x), because it looks less
> ugly.
> 
> x.__class__ *might* be slightly more efficient, as it avoids a
> global lookup and a function call. But as always, measurement
> would be required to be sure.

Except that we don't know if "efficiency" - whatever that might mean 
here - matters at all.

-- 
https://mail.python.org/mailman/listinfo/python-list



RE: How to escape strings for re.finditer?

2023-03-02 Thread avi.e.gross
Thanks, Peter. Excellent advice, even if only for any of us using Microsoft
Outlook as our mailer. I made the changes and we will see but they should
mainly impact what I see. I did tweak another parameter.

The problem for me was finding where they hid the options menu I needed.
Then, I started translating the menus back into German until I realized I
was being silly! Good practice though. LOL!

The truth is I generally can handle receiving mangled code as most of the
time I can re-edit it into shape, or am just reading it and not
copying/pasting.

What concerns me is to be able to send out the pure text content many seem
to need in a way that does not introduce the anomalies people see. Something
like a lowest common denominator.

Or I could switch mailers. But my guess is reading/responding from the
native gmail editor may also need options changes and yet still impact some
readers.

-Original Message-
From: Python-list  On
Behalf Of Peter J. Holzer
Sent: Thursday, March 2, 2023 3:09 PM
To: python-list@python.org
Subject: Re: How to escape strings for re.finditer?

On 2023-03-01 01:01:42 +0100, Peter J. Holzer wrote:
> On 2023-02-28 15:25:05 -0500, avi.e.gr...@gmail.com wrote:
> > I had no doubt the code you ran was indented properly or it would not
> > work.
> > 
> > I am merely letting you know that somewhere in the process of 
> > copying the code or the transition between mailers, my version is messed
> > up.
> 
> The problem seems to be at your end. Jen's code looks ok here.
[...]
> I have no idea why it would join only some lines but not others.

Actually I do have an idea now, since I noticed something similar at work
today: Outlook has an option "remove additional line breaks from text-only
messages" (translated from German) in the "Email / Message Format"
section. You want to make sure this is off if you are reading mails where
line breaks might be important[1].

hp

[1] Personally I'd say you shouldn't use Outlook if you are reading mails
where line breaks (or other formatting) is important, but ...

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Regular Expression bug?

2023-03-02 Thread avi.e.gross
José,

Matching can be greedy. Did it match to the last space?

What you want is a pattern that matches anything except a space (or whitespace) 
followed by a space or something similar.

Or use a construct that makes matching non-greedy.
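
A short sketch of those options against the string from the question.
The variable names are mine; the patterns are the usual alternatives
rather than the one true fix.

```python
import re

s = "pn=align upgrade sd=2023-02-"

# Greedy: .+ runs to the LAST space that still lets the pattern match.
greedy = re.match(r"pn=(.+) ", s).group(1)
# Non-greedy: .+? stops at the FIRST space that lets the pattern match.
lazy = re.match(r"pn=(.+?) ", s).group(1)
# Negated class: match anything except a space, then the space.
no_space = re.match(r"pn=([^ ]+) ", s).group(1)
# \S+ does the same for any kind of whitespace.
no_whitespace = re.match(r"pn=(\S+)", s).group(1)

print(repr(greedy))         # 'align upgrade'
print(repr(lazy))           # 'align'
print(repr(no_space))       # 'align'
print(repr(no_whitespace))  # 'align'
```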

Avi

-Original Message-
From: Python-list  On 
Behalf Of jose isaias cabrera
Sent: Thursday, March 2, 2023 2:23 PM
To: python-list@python.org
Subject: Regular Expression bug?

Greetings.

For the RegExp Gurus, consider the following python3 code:

import re
s = "pn=align upgrade sd=2023-02-"
ro = re.compile(r"pn=(.+) ")
r0=ro.match(s)
>>> print(r0.group(1))
align upgrade


This is wrong. It should be 'align' because the group only goes up-to the 
space. Thoughts? Thanks.

josé

-- 

What if eternity is real?  Where will you spend it?  H...
--
https://mail.python.org/mailman/listinfo/python-list



RE: How to escape RE

2023-03-01 Thread avi.e.gross
Cameron,

The topic is now Regular Expressions and the sin tax. This is not
exclusively a Python issue as everybody and even their grandmother uses it
in various forms.

I remember early versions of RE were fairly simple and readable. It was a
terse minilanguage that allowed fairly complex things to be done but was
readable.

You now encounter versions that make people struggle as countless extensions
have been sloppily grafted on. Who ordered multiple uses where "?" is now
used? As an example. Many places have sort of expanded the terseness and
both made it more and also less legible. UNICODE made lots of older RE
features  not very useful as definitions of things like what whitespace can
be and what a word boundary or contents might be are made so different that
new constructs were added to hold them.

But, if you are operating mainly on ASCII text, the base functionality is
still in there and can be used fairly easily.

Consider it a bit like other mini languages such as the print() variants
that kept adding functionality by packing lots of info tersely so you
specify you want a floating point number with so many digits and so on, and
by the way, right justified in a wider field and if it is negative, do this.
Great if you can still remember how to read it. 

I was reading a python book recently which kept using a suffix of !r and I
finally looked it up. It is a conversion flag in an f string (or in
str.format) asking to use __repr__() to get the representation of the
object. Then I find out this is not strictly needed as an f string lets you
write something like {repr(val)} so {val!r} is not the only and confusing
way.
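
For anyone else who had to look it up, a small sketch of the formatting
mini-language in f strings. The value names are arbitrary, and the
{val=} form assumes Python 3.8 or later.

```python
val = "tab\there"

plain = f"{val}"              # uses str(): contains a real tab character
with_flag = f"{val!r}"        # uses repr(): quotes plus an escaped \t
with_call = f"{repr(val)}"    # same result, spelled as a function call
old_style = "{!r}".format(val)
debug_form = f"{val=}"        # Python 3.8+: name, equals sign, then repr

# Format spec: right-justified in a 10-wide field with 2 decimal places.
number = f"{3.14159:>10.2f}"

print(with_flag)
print(debug_form)
print(repr(number))
```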

These mini-languages each require you to learn their own rules and quirks
and when you do, they can be powerful and intuitive, at least for the
features you memorized and maybe use regularly. 

Now RE knowledge is the same and it ports moderately well between languages
except when it doesn't. As has been noted, the people at PERL relied on it a
lot and kept changing and extending it. Some Python functionality lets you
specify if you want PERL style or other styles.

But hiding your head in the sand is not always going to work for long. No,
you do not need to use RE for simple cases. Mind you, that is when it is
easiest to use it reliably. I read some books related to XML where much of
the work had been done in non-UNIX land years ago and they often had other
ways of doing things in their endless series of methods on validating a
schema or declaring it so data is forced to match the declared objectives
such as what type(s) each item can be or whether some fields must exist
inside others or in a particular order, or say you can have only three of
them and seeming endless other such things. And then, suddenly, someone has
the idea to introduce the ability for you to specify many things using
regular expressions and the oppressiveness (for me) lifts and many things
can now be done trivially or that were not doable before. I had a similar
experience in my SQL reading where adding the ability to do some pattern
matching using a form of RE made life simpler.

The fact is that the idea of complex pattern matching IS complex and any
tool that lets you express it so fluidly will itself be complex. So, as some
have mentioned, find a resource that helps you build a regular expression
perhaps through menus, or one that verifies if one you created makes any
sense or lets you enter test data and have it show you how it is matching or
what to change to make it match differently. The multi-line version of RE
may also be helpful as well as sometimes breaking up a bigger one into
several smaller ones that your program uses in multiple phases.
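
By the multi-line version I mean something like the re.VERBOSE flag. A
minimal sketch follows; the pattern and the group names key and value
are my own choices for illustration.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and # starts a
# comment, so each piece of the expression can be explained in place.
KEY_VALUE = re.compile(
    r"""
    (?P<key>[a-z]+)     # a lowercase key such as pn or sd
    =                   # a literal equals sign
    (?P<value>\S+)      # the value: everything up to the next whitespace
    """,
    re.VERBOSE,
)

text = "pn=align upgrade sd=2023-02-"
pairs = {m["key"]: m["value"] for m in KEY_VALUE.finditer(text)}
print(pairs)
```

The same job could be split into phases instead: first split the text on
whitespace, then apply a much smaller pattern to each piece.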

Python recently added new functionality called Structural Pattern Matching.
You use a match statement with various cases that match patterns and if
matched, execute some action. Here is one tutorial if needed:

https://peps.python.org/pep-0636/

The point is that although not at all the same as a RE, we again have a bit
of a mini-language that can be used fairly concisely to investigate a
problem domain fairly quickly and efficiently and do things. It is an
overlapping but different form of pattern matching. And, in languages that
have long had similar ideas and constructs, people often cut back on using
other constructs like an IF statement, and just used something like this!

And consider this example as being vaguely like a bit of regular expression:

match command.split():
    case ["go", ("north" | "south" | "east" | "west")]:
        current_room = current_room.neighbor(...)

Like it or not, our future in programming is likely to include more and more
such aids along with headaches.

Avi

-Original Message-
From: Python-list  On
Behalf Of Grant Edwards
Sent: Wednesday, March 1, 2023 12:04 PM
To: python-list@python.org
Subject: Re: How to escape strings for re.finditer?

On 2023-02-28, Cameron Simpson  wrote:

> Regexps are:
> - cryptic 

RE: Look free ID genertion (was: Is there a more efficient threading lock?)

2023-03-01 Thread avi.e.gross
If a workaround like itertools.count.__next__() is used because it will not
be interrupted as it is implemented in C, then I have to ask if it would
make sense for Python to supply something similar in the standard library
for the sole purpose of a use in locks.
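
For reference, the workaround looks roughly like the sketch below. The
names next_id and worker are mine. It relies on a CPython implementation
detail (the C-level __next__ is not interrupted by a thread switch), not
on anything the language promises, so treat it as an assumption.

```python
import itertools
import threading

_counter = itertools.count(1)


def next_id() -> int:
    # Assumption: in CPython this call completes without a thread switch.
    return next(_counter)


def worker(out: list, how_many: int) -> None:
    for _ in range(how_many):
        out.append(next_id())


# Each thread writes to its own list, so only the counter is shared.
buckets = [[] for _ in range(8)]
threads = [threading.Thread(target=worker, args=(b, 1000)) for b in buckets]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_ids = [i for bucket in buckets for i in bucket]
print(len(all_ids), len(set(all_ids)))
```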

But realistically, this is one place the concept of an abstract python
language intersects aspects of what is bundled into a sort of core at or
soon after startup, as well as the reality that python can be implemented in
many ways including some ways on some hardware that may not make guarantees
to behave this way.

Realistically, the history of computing is full of choices made that now
look less useful or obvious.

What would have happened if all processors had been required to have some
low level instruction that effectively did something in an atomic way that
allowed a way for anyone using any language running on that machine a way to
do a simple thing like set a lock or check it?

Of course life has also turned out to be more complex. Some architectures
can now support a small number of operations and implement others as sort of
streams of those operations linked together.  You would need to be sure your
program is using the atomic operation directly.

-Original Message-
From: Python-list  On
Behalf Of Dieter Maurer
Sent: Wednesday, March 1, 2023 1:43 PM
To: Chris Angelico 
Cc: python-list@python.org
Subject: Look free ID genertion (was: Is there a more efficient threading
lock?)

Chris Angelico wrote at 2023-3-1 12:58 +1100:
> ...
> The
>atomicity would be more useful in that context as it would give 
>lock-free ID generation, which doesn't work in Python.

I have seen `itertools.count` for that.
This works because its `__next__` is implemented in "C" and therefore will
not be interrupted by a thread switch.
--
https://mail.python.org/mailman/listinfo/python-list



RE: Python 3.10 Fizzbuzz

2023-03-01 Thread avi.e.gross
This discussion has veered a bit, as it often does, but does raise
interesting points about programming in general and also in python.

We seem to be able to easily cite examples where a group of things is lumped
for convenience and people end up using them but then tweaking them.

As an example, the ggplot2 package in R (a version is available in python)
does graphics with some defaults and you can add-in themes that set internal
aspects of the object including many that allow you to see what the
background of the graph looks like. One common need was to remove color for
printing in a black/white situation so someone created a theme_bw() you
could call that sets the parameters deemed appropriate. The background can
include many things including crosshatchings that can be tweaked
independently. Some people thought the theme_bw() was not pleasing and
figured out how to tweak a few things after calling it and next thing we
know, someone packages a new set and calls it theme_gray(), theme_minimal(),
theme_dark(), theme_void() and, of course, theme_classic().

But using more than one of these in a row can be both wasteful and puzzling.
The last one is selectively over-riding a data-structure in parts with some
changes to the same spot the previous one created or modified and some not.

So we have what I consider layers of bundling and abstraction, as is common
in many object-oriented programs and quite subtle bugs that can happen when
you cannot see inside a black box, or don't know you need to. I often
created a graph where I tweaked a few things by myself and got the nice
graph I wanted. Then I was asked to not make that nice colorful graph
because they could not see it as nicely when printed without color. 

Simple enough, I added theme_bw() or something at the end of a sort of
pipeline and indeed it drained all the color out but did more than I wanted.
It also happened to reset something I had crafted that did not really have
anything to do with color. To get it to work, I had to rearrange my pipeline
and place my changes after theme_bw().

This does not mean the people who created a sort of "standard" were wrong.
It means using their standard carelessly can have unanticipated results.
Examples like this are quite common but the key is that use of these things
is not mandated and you can roll your own if you wish.

When you look at our discussion of the computer program called "black" it
seems the fault, if any, is in the organization that makes use of it
mandatory and inflexible, and even has it change your code for you without
any easy way to ...

I am guessing that quite a few "black" options chosen are nearly universally
agreed upon. A few may be vehemently opposed. And some seem random like the
88 columns one. The standard computer terminals in days of yore (does anyone
still use them?) often had exactly 80 columns. But as the
years rolled on, we had windowed machines with no fixed sizes for windows
and even often support for different font sizes. We have scroll abilities
often built in so often long lines do not wrap but you can scroll to see the
rest. And much of our software no longer uses constant fixed length buffers
and can adapt the moment a window is resized and much more.

And dare I mention we are now getting programs written that no human is
expected to read and that are often written by a program and eventually, more
likely, by an AI. Have you tried to read the HTML with embedded compressed
JavaScript in a browser window that is not formatted for human consumption
but uses fewer resources to transmit?

I am trying to imagine the output from something like evaluating a complex
Regular Expression if written down as code in python rather than a sort of
compiled form mainly using C. Is there any reason it would not sometimes
dive deeply into something like many nested layers of IF statements? By
Python rules, every level has to be indented a minimal amount. If black
insisted on say 4 spaces, you could easily exceed the ability to write
anything on a line as you are already past the 88th column. I doubt black
can fix something like this. It is perfectly valid to have arbitrarily
nested concepts like this in many places, including something like a JSON
representation of a structure.

But it is no longer always reasonable to simply ask programmers to let you
select your own changes and options for something like black except in
limited ways. Allowing shortening line length to 80 may be harmless.
Allowing it to be set to unlimited, maybe not. Aspects may interact with
others in ways you are not aware of. 

As an experiment, this message has had nothing external copied/pasted into
it. I am wondering if people who see my text oddly but only sometimes, may
be seeing what happens when I copy a segment with a different hidden line
ending that is maintained by my editor. Truly this illustrates why having
two standards may not be optimal and result in chaos.


-Original Message-
From: 

RE: How to escape strings for re.finditer?

2023-02-28 Thread avi.e.gross
Peter,

Nobody here would appreciate it if I tested it by sending out multiple
copies of each email to see if the same message wraps differently.

I am using a fairly standard mailer in Outlook that interfaces with gmail
and I could try mailing directly from gmail but apparently there are
systemic problems and I experience other complaints when sending directly
from AOL mail too. 

So, if some people don't read me, I can live with that. I mean the right
people, LOL!

Or did I get that wrong?

I do appreciate the feedback. Ironically, when I politely shared how someone
else's email was displaying on my screen, it seems I am equally causing
similar issues for others.

An interesting question is whether any of us reading the archived copies see
different things including with various browsers:

https://mail.python.org/pipermail/python-list/

I am not sure which letters from me had the anomalies you mention but
spot-checking a few of them showed a normal display when I use Chrome.

But none of this is really a python issue except insofar as you never know
what functionality in the network was written in python.

-Original Message-
From: Python-list  On
Behalf Of Peter J. Holzer
Sent: Tuesday, February 28, 2023 7:26 PM
To: python-list@python.org
Subject: Re: How to escape strings for re.finditer?

On 2023-03-01 01:01:42 +0100, Peter J. Holzer wrote:
> On 2023-02-28 15:25:05 -0500, avi.e.gr...@gmail.com wrote:
> > It happens to be easy for me to fix but I sometimes see garbled code 
> > I then simply ignore.
> 
> Truth to be told, that's one reason why I rarely read your mails to 
> the end. The long lines and the triple-spaced paragraphs make it just 
> too uncomfortable.

Hmm, since I was now paying a bit more attention to formatting problems I
saw that only about half of your messages have those long lines although all
seem to be sent with the same mailer. Don't know what's going on there.

hp


-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Re: Python 3.10 Fizzbuzz

2023-02-28 Thread avi.e.gross
Karsten,

Would it be OK if we paused this discussion a day till February is History?

Sarcasm aside, I repeat, the word black has many unrelated meanings as
presumably this case includes. And for those who do not keep close track of
the local US nonsense, February has for some reason been dedicated to be a
National Black History Month.

Can software violate a code for human conduct? The recent AI news suggests
it does! LOL!

But you know, if you hire a program to tell you if your code passes a
designated series of tests and it just points out where it did not, and
suggests changes that may put you in alignment, that by itself is not
abusive. But if you did not ask for their opinion, yes, it can be annoying
as being unsolicited.

Humans can be touchy and lose context. I have people in my life who
magically ignore my carefully thought-out phrases like "If ..." by acting as
if I had said something rather than IF something. Worse, they hear
abstractions too concretely. I might be discussing COVID and saying "If
COVID was as lethal as it used to be ..." and they interject BUT IT ISN'T.
OK, listen again. I am abstract and trying to make a point. The fact that
you think it isn't is nice to note but hardly relevant to a WHAT IF
question.

So a program designed by programmers, a few of whom are not well known for
how they interact with humans but who nonetheless insist on designing user
interfaces by themselves, may well come across negatively. The reality is
humans vary tremendously and one may appreciate feedback as a way to improve
and get out of the red and the other will assume it is a put down that
leaves them black and blue, even when the words are the same.

-Original Message-
From: Python-list  On
Behalf Of Karsten Hilbert
Sent: Tuesday, February 28, 2023 2:44 PM
To: pythonl...@danceswithmice.info
Cc: python-list@python.org
Subject: Aw: Re: Python 3.10 Fizzbuzz

> > I've never tried Black or any other code formatter, but I'm sure we 
> > wouldn't get on.
>
> Does this suggest, that because Black doesn't respect other people's 
> opinions and feelings, that it wouldn't meet the PSF's Code of Conduct?

That much depends on The Measure Of A Man.

Karsten
--
https://mail.python.org/mailman/listinfo/python-list



RE: How to escape strings for re.finditer?

2023-02-28 Thread avi.e.gross
David,

Your results suggest we need to be reminded that lots depends on other
factors. There are multiple versions/implementations of python out there
including some written in C but also other underpinnings. Each can often
have sections of pure python code replaced carefully with libraries of
compiled code, or not. So your results will vary.

Just as an example, assume you derive a type of your own as a subclass of
str and you over-ride the find method by writing it in pure python using
loops and maybe add a few bells and whistles. If you used your improved
algorithm using this variant of str, might it not be quite a bit slower?
Imagine how much slower if your improvement also implemented caching and
logging and the option of ignoring case which are not really needed here.
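
To show the kind of thing I mean, here is a sketch with a made-up
subclass (SlowStr is hypothetical, and it ignores negative start/end
values for brevity). It gives the same answers as the built-in find but
does the scanning in interpreted code, so the timing gap on your machine
will vary.

```python
import timeit


class SlowStr(str):
    """Hypothetical str subclass whose find() is rewritten in pure Python."""

    def find(self, sub, start=0, end=None):
        end = len(self) if end is None else end
        width = len(sub)
        # Slide a window over the text one position at a time.
        for i in range(start, end - width + 1):
            if self[i:i + width] == sub:
                return i
        return -1


text = "X - abc_degree + 1 + qq + abc_degree + 1"
key = "abc_degree + 1"

fast_hits = (text.find(key), text.find(key, 18), text.find("zzz"))
slow_hits = (SlowStr(text).find(key), SlowStr(text).find(key, 18),
             SlowStr(text).find("zzz"))

# Timings are printed only for comparison; no particular ratio is claimed.
slow_text = SlowStr(text)
print("built-in:", timeit.timeit(lambda: text.find(key, 18), number=2000))
print("subclass:", timeit.timeit(lambda: slow_text.find(key, 18), number=2000))
```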

This type of thing can happen in many other scenarios and some module may be
shared that is slow and a while later is updated but not everyone installs
the update so performance stats can vary wildly. 

Some people advocate using some functional programming tactics, in various
languages, partially because the more general loops are SLOW. But that is
largely because some of the functional stuff is a compiled function that
hides the loops inside a faster environment than the interpreter.

-Original Message-
From: Python-list  On
Behalf Of David Raymond
Sent: Tuesday, February 28, 2023 2:40 PM
To: python-list@python.org
Subject: RE: How to escape strings for re.finditer?

> I wrote my previous message before reading this.  Thank you for the test
you ran -- it answers the question of performance.  You show that
re.finditer is 30x faster, so that certainly recommends that over a simple
loop, which introduces looping overhead.  

>>      def using_simple_loop(key, text):
>>          matches = []
>>          for i in range(len(text)):
>>              if text[i:].startswith(key):
>>                  matches.append((i, i + len(key)))
>>          return matches
>>
>>      using_simple_loop: [0.1395295020792, 0.1306313000456,
0.1280345001249, 0.1318618002423, 0.1308461032626]
>>      using_re_finditer: [0.00386140005233, 0.00406190124297,
0.00347899970256, 0.00341310216218, 0.003732001273]


With a slight tweak to the simple loop code using .find() it becomes a third
faster than the RE version though.


def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches


using_simple_loop: [0.1732664997689426, 0.1601669997908175,
0.15792609984055161, 0.157397349591, 0.15759290009737015]
using_re_finditer: [0.003412699792534113, 0.0032823001965880394,
0.0033694999292492867, 0.003354900050908327, 0.006998894810677]
using_simple_loop2: [0.00256159994751215, 0.0025471001863479614,
0.0025424999184906483, 0.0025831996463239193, 0.002999018251896]
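
Before trusting any timings it is worth checking that the variants return
the same thing. The sketch below does that. Note that using_re_finditer
was never posted, so the version here is my guess at it, and the test
strings are my own. The three agree on the thread's example but not when
matches can overlap.

```python
import re


def using_simple_loop(key, text):
    matches = []
    for i in range(len(text)):
        if text[i:].startswith(key):
            matches.append((i, i + len(key)))
    return matches


def using_re_finditer(key, text):
    # Assumed implementation: the original was not shown in the thread.
    return [(m.start(), m.end()) for m in re.finditer(re.escape(key), text)]


def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches


example = "X - abc_degree + 1 + qq + abc_degree + 1"
key = "abc_degree + 1"
results = [f(key, example) for f in
           (using_simple_loop, using_re_finditer, using_simple_loop2)]
print(results[0])

# Overlapping case: the naive loop reports every start position, while
# finditer and the find() loop both skip past each match they report.
overlap_naive = using_simple_loop("aa", "aaaa")
overlap_find = using_simple_loop2("aa", "aaaa")
print(overlap_naive, overlap_find)
```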
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Python 3.10 Fizzbuzz

2023-02-28 Thread avi.e.gross
Dave, 

Is it rude to name something "black" to make it hard for some of us to remind 
them of the rules or claim that our personal style is so often the opposite 
that it should be called "white" or at least a shade of gray?

The usual kidding aside, I have no idea why it was called black but in all 
seriousness this is not a black and white issue. Opinions may differ when a 
language provides many valid options on how to write code. If someone wants to 
standardize and impose some decisions, fine. But others may choose their own 
variant and take their chances.

I, for example, like certain features in many languages where if I am only 
doing one short line of code, I prefer to skip the fanfare. Consider a 
(non-python) example:

if (condition) {
    print(5)
}

Who needs that nonsense? If the language allows it:

if (condition) print(5)

Or in python:

if condition: print(5)

Rather than a multi-line version.

But will I always use the short version? Nope. If I expect to add code later, 
might as well start with the multi-line form. If the line gets too long, ditto. 
And, quite importantly, if editing other people's code, I look around and 
follow their lead.

There often is no (one) right way to do things but there often are many wrong 
ways. Tools like black (which I know nothing detailed about) can be helpful. 
But I have experienced times when I wrote carefully crafted code (as it happens 
in R inside the RStudio editor) and selected a region and asked it to reformat, 
and gasped at how it ruined my neatly arranged code. I just wanted the bit I 
had added to be formatted the same as the rest already was, not a complete 
re-make. Luckily, there is an undo. 

There must be some parameterized tools out there that let you set up a profile 
of your own personal preferences that help keep your code in your own preferred 
format, and re-arrange it after you have done some editing like copying from 
somewhere else so it fits together.

-Original Message-
From: Python-list  On 
Behalf Of dn via Python-list
Sent: Tuesday, February 28, 2023 2:22 PM
To: python-list@python.org
Subject: Re: Python 3.10 Fizzbuzz

On 28/02/2023 12.55, Rob Cliffe via Python-list wrote:
> 
> 
> On 27/02/2023 21:04, Ethan Furman wrote:
>> On 2/27/23 12:20, rbowman wrote:
>>
>> > "By using Black, you agree to cede control over minutiae of hand- 
>> > formatting. In return, Black gives you speed, determinism, and 
>> > freedom from pycodestyle nagging about formatting. You will save 
>> > time and
>> mental
>> > energy for more important matters."
>> >
>> > Somehow I don't think we would get along very well. I'm a little on 
>> > the opinionated side myself.
>>
>> I personally cannot stand Black.  It feels like every major choice it 
>> makes (and some minor ones) are exactly the opposite of the choice I 
>> make.
>>
>> --
>> ~Ethan~
> I've never tried Black or any other code formatter, but I'm sure we 
> wouldn't get on.

Does this suggest, that because Black doesn't respect other people's opinions 
and feelings, that it wouldn't meet the PSF's Code of Conduct?

--
Regards,
=dn
--
https://mail.python.org/mailman/listinfo/python-list



RE: How to escape strings for re.finditer?

2023-02-28 Thread avi.e.gross
This message is more for Thomas than Jen,

You made me think of what happens in fairly large cases. What happens if I ask 
you to search a thousand pages looking for your name? 

One solution might be to break the problem into parts that can be run in 
independent threads or processes and perhaps across different CPUs or on many 
machines at once. Think of it as a variant on a merge sort where each chunk 
returns where it found one or more items and then those are gathered together 
and merged upstream.

The problem is you cannot just randomly divide the text.  Any matches across a 
divide are lost. So if you know you are searching for "Thomas Passin" you need 
each chunk to overlap the next by at least one character less than the length 
of that string. It would not be made as something like a pure binary tree and 
if the choices made included variant sizes in what might match, you would get 
duplicates. So the merging part would obviously have to eventually remove those.
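
A rough sketch of the idea for a fixed, non-empty search string follows.
The function names, chunk size and worker count are all my own
assumptions. Threads are used only to show the shape of the thing; for
real speed you would likely want processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def search_chunk(text, key, start, size):
    """Return absolute offsets of key found in one chunk of the text."""
    # Extend the chunk by len(key) - 1 so a match that begins near the end
    # of this chunk and spills into the next one is still seen here.
    overlap = len(key) - 1
    chunk = text[start:start + size + overlap]
    hits = []
    pos = chunk.find(key)
    while pos != -1:
        hits.append(start + pos)
        pos = chunk.find(key, pos + 1)
    return hits


def parallel_find(text, key, chunk_size=1000, workers=4):
    starts = range(0, len(text), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(
            lambda s: search_chunk(text, key, s, chunk_size), starts)
        # Merge step: a set removes any duplicates from overlapping chunks.
        return sorted({hit for hits in partial for hit in hits})


text = "xxThomas PassinxxxxThomas Passin"
found = parallel_find(text, "Thomas Passin", chunk_size=5)
print(found)
```

With a chunk size of 5 every match here straddles several divides, which
is exactly the case that plain splitting would lose.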

I have often wondered how Google and other such services are able to find 
millions of things in hardly any time and arguably never show most of them as 
who looks past a few pages/screens?

I think much of that may involve other techniques including quite a bit of 
pre-indexing. But they also seem to enlist lots of processors that each do the 
search on a subset of the problem space and combine and prioritize.

-Original Message-
From: Python-list  On 
Behalf Of Thomas Passin
Sent: Tuesday, February 28, 2023 1:31 PM
To: python-list@python.org
Subject: Re: How to escape strings for re.finditer?

On 2/28/2023 1:07 PM, Jen Kris wrote:
> 
> Using str.startswith is a cool idea in this case.  But is it better 
> than regex for performance or reliability?  Regex syntax is not a 
> model of simplicity, but in my simple case it's not too difficult.

The trouble is that we don't know what your case really is.  If you are talking 
about a short pattern like your example and a small text to search, and you 
don't need to do it too often, then my little code example is probably ideal. 
Reliability wouldn't be an issue, and performance would not be relevant.  If 
your case is going to be much larger, called many times in a loop, or be much 
more complicated in some other way, then a regex or some other approach is 
likely to be much faster.


> Feb 27, 2023, 18:52 by li...@tompassin.net:
> 
> On 2/27/2023 9:16 PM, avi.e.gr...@gmail.com wrote:
> 
> And, just for fun, since there is nothing wrong with your code,
> this minor change is terser:
> 
> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
> for match in re.finditer(re.escape('abc_degree + 1')
> , example):
> 
> ... print(match.start(), match.end())
> ...
> ...
> 4 18
> 26 40
> 
> 
> Just for more fun :) -
> 
> Without knowing how general your expressions will be, I think the
> following version is very readable, certainly more readable than
> regexes:
> 
> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
> KEY = 'abc_degree + 1'
> 
> for i in range(len(example)):
>     if example[i:].startswith(KEY):
>         print(i, i + len(KEY))
> # prints:
> 4 18
> 26 40
> 
> If you may have variable numbers of spaces around the symbols, OTOH,
> the whole situation changes and then regexes would almost certainly
> be the best approach. But the regular expression strings would
> become harder to read.
> -- 
> https://mail.python.org/mailman/listinfo/python-list
> 
> 

--
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: How to escape strings for re.finditer?

2023-02-28 Thread avi.e.gross
escape('abc_degree + 1') 

for match in re.finditer(find_string, example):
    print(match.start(), match.end())

 

Of course I am sure you wrote and ran code more like the latter version but 
somewhere in your copy/paste process, 

 

And, just for fun, since there is nothing wrong with your code, this minor 
change is terser:

>>> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
>>> for match in re.finditer(re.escape('abc_degree + 1'), example):
...     print(match.start(), match.end())
...
...
4 18
26 40

 

But note once you use regular expressions, and not in your case, you might 
match multiple things that are far from the same such as matching two repeated 
words of any kind in any case including "and and" and "so so" or finding words 
that have multiple doubled letters as in the stereotypical bookkeeper. In those 
cases, you may want even more than offsets but also show the exact text that 
matched or even show some characters before and/or after for context.

 

 

-Original Message-
From: Python-list On Behalf Of Jen Kris via Python-list
Sent: Monday, February 27, 2023 8:36 PM
To: Cameron Simpson
Cc: Python List
Subject: Re: How to escape strings for re.finditer?

 

 

I haven't tested it either but it looks like it would work. But for this case I 
prefer the relative simplicity of:

 

example = 'X - abc_degree + 1 + qq + abc_degree + 1'

find_string = re.escape('abc_degree + 1') for match in re.finditer(find_string, 
example):

print(match.start(), match.end())

 

4 18

26 40

 

I don't insist on terseness for its own sake, but it's cleaner this way. 

 

Jen

 

 

Feb 27, 2023, 16:55 by c...@cskk.id.au:

On 28Feb2023 01:13, Jen Kris wrote:

I went to the re module because the specified string may appear more than once 
in the string (in the code I'm writing).

 

Sure, but writing a `finditer` for plain `str` is pretty easy (untested):

 

pos = 0
while True:
    found = s.find(substring, pos)
    if found < 0:
        break
    start = found
    end = found + len(substring)
    ... do whatever with start and end ...
    pos = end

 

Many people go straight to the `re` module whenever they're looking for 
strings. It is often cryptic error prone overkill. Just something to keep in 
mind.

 

Cheers,

Cameron Simpson

--

https://mail.python.org/mailman/listinfo/python-list

 

-- 

https://mail.python.org/mailman/listinfo/python-list

 

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: How to escape strings for re.finditer?

2023-02-28 Thread avi.e.gross
Roel,

You make some good points. One to consider is that when you ask a regular 
expression matcher to search using something that uses NO regular expression 
features, much of the complexity disappears and what it creates is probably 
similar enough to what you get with a string search except that loops and all 
are written as something using fast functions probably written in C. 

That is one reason the roll your own versions have a disadvantage unless you 
roll your own in a similar way by writing a similar C function.

Nobody has yet shown us a simple but fast plain-text search function that does 
a similar job. One may still be out there, but as you point out, perhaps it is 
not needed as long as people just use the re version.

Avi

-Original Message-
From: Python-list  On 
Behalf Of Roel Schroeven
Sent: Tuesday, February 28, 2023 4:33 AM
To: python-list@python.org
Subject: Re: How to escape strings for re.finditer?

Op 28/02/2023 om 3:44 schreef Thomas Passin:
> On 2/27/2023 9:16 PM, avi.e.gr...@gmail.com wrote:
>> And, just for fun, since there is nothing wrong with your code, this 
>> minor change is terser:
>>
>> >>> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
>> >>> for match in re.finditer(re.escape('abc_degree + 1'), example):
>> ...     print(match.start(), match.end())
>> ...
>> 4 18
>> 26 40
>
> Just for more fun :) -
>
> Without knowing how general your expressions will be, I think the 
> following version is very readable, certainly more readable than regexes:
>
> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
> KEY = 'abc_degree + 1'
>
> for i in range(len(example)):
>     if example[i:].startswith(KEY):
>         print(i, i + len(KEY))
> # prints:
> 4 18
> 26 40
I think it's often a good idea to use a standard library function instead of 
rolling your own. The issue becomes less clear-cut when the standard library 
doesn't do exactly what you need (as here, where
re.finditer() uses regular expressions while the use case only uses simple 
search strings). Ideally there would be a str.finditer() method we could use, 
but in the absence of that I think we still need to consider using the 
almost-but-not-quite fitting re.finditer().

Two reasons:

(1) I think it's clearer: the name tells us what it does (though of course we 
could solve this in a hand-written version by wrapping it in a suitably named 
function).

(2) Searching for a string in another string, in a performant way, is not as 
simple as it first appears. Your version works correctly, but slowly. In some 
situations it doesn't matter, but in other cases it will. For better 
performance, string searching algorithms jump ahead either when they found a 
match or when they know for sure there isn't a match for some time (see e.g. 
the Boyer–Moore string-search algorithm). 
You could write such a more efficient algorithm, but then it becomes more 
complex and more error-prone. Using a well-tested existing function becomes 
quite attractive.

To illustrate the difference in performance, I did a simple test (using the 
paragraph above as test text):

    import re
    import timeit

    def using_re_finditer(key, text):
        matches = []
        for match in re.finditer(re.escape(key), text):
            matches.append((match.start(), match.end()))
        return matches


    def using_simple_loop(key, text):
        matches = []
        for i in range(len(text)):
            if text[i:].startswith(key):
                matches.append((i, i + len(key)))
        return matches


    CORPUS = """Searching for a string in another string, in a performant way, is
    not as simple as it first appears. Your version works correctly, but slowly.
    In some situations it doesn't matter, but in other cases it will. For better
    performance, string searching algorithms jump ahead either when they found a
    match or when they know for sure there isn't a match for some time (see e.g.
    the Boyer–Moore string-search algorithm). You could write such a more
    efficient algorithm, but then it becomes more complex and more error-prone.
    Using a well-tested existing function becomes quite attractive."""
    KEY = 'in'
    print('using_simple_loop:',
          timeit.repeat(stmt='using_simple_loop(KEY, CORPUS)', globals=globals(),
                        number=1000))
    print('using_re_finditer:',
          timeit.repeat(stmt='using_re_finditer(KEY, CORPUS)', globals=globals(),
                        number=1000))

This does 5 runs of 1000 repetitions each, and reports the time in seconds for 
each of those runs.
Result on my machine:

 using_simple_loop: [0.1395295020792, 0.1306313000456, 
0.1280345001249, 0.1318618002423, 0.1308461032626]
 using_re_finditer: [0.00386140005233, 0.00406190124297, 
0.00347899970256, 0.00341310216218, 0.003732001273]

We find that in this test re.finditer() is more than 30 times faster (despite 
the overhead of regular expressions).

While 

RE: How to escape strings for re.finditer?

2023-02-27 Thread avi.e.gross
I think by now we have given all that is needed by the OP but Dave's answer
strikes me as something that could be a tad faster as a while loop if you are
searching a larger corpus such as an entire ebook or all books as you can do
on books.google.com.

I think I mentioned earlier that some assumptions need to apply. The text
needs to be something like an ASCII encoding or seen as code points rather
than bytes. We assume a match should move forward by the length of the
match. And, clearly, there cannot be a match too close to the end.

So a while loop would begin with a variable set to zero to mark the current
location of the search. The condition for repeating the loop is that this
variable is less than or equal to len(searched_text) - len(key)

In the loop, each comparison is done the same way as David uses, or anything
similar enough but the twist is a failure increments the variable by 1 while
success increments by len(key).

Will this make much difference? It might as the simpler algorithm counts
overlapping matches and wastes some time hunting where perhaps it shouldn't.

And, of course, if you made something like this into a search function, you
can easily add features such as asking that you only return the first N
matches or the next N, simply by making it a generator.
So tying this into an earlier discussion, do you want the LAST match info
visible when the While loop has completed? If it was available, it opens up
possibilities for running the loop again but starting from where you left
off.
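
The loop described above could be sketched roughly as follows. This is only an illustration, assuming plain str input; the names find_matches, key and text are made up here and are not from anyone's tested code in this thread. Making it a generator is what allows asking for the first N matches and then resuming later from where the search stopped.

```python
from itertools import islice


def find_matches(key, text):
    """Yield (start, end) for non-overlapping occurrences of key in text."""
    if not key:
        return  # an empty key would never advance; treat it as "no matches"
    pos = 0
    last_start = len(text) - len(key)  # no match can begin after this offset
    while pos <= last_start:
        if text.startswith(key, pos):
            yield pos, pos + len(key)
            pos += len(key)  # success: skip past the whole match
        else:
            pos += 1  # failure: slide forward one character


example = 'X - abc_degree + 1 + qq + abc_degree + 1'
matches = find_matches('abc_degree + 1', example)
first = list(islice(matches, 1))  # ask for only the first match
rest = list(matches)              # later, resume where the search left off
print(first, rest)
```

Because the generator object keeps its own position, nothing about the LAST match needs to leak out of the loop for the search to be continued.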



-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Monday, February 27, 2023 9:44 PM
To: python-list@python.org
Subject: Re: How to escape strings for re.finditer?

On 2/27/2023 9:16 PM, avi.e.gr...@gmail.com wrote:
> And, just for fun, since there is nothing wrong with your code, this minor
> change is terser:
>
> >>> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
> >>> for match in re.finditer(re.escape('abc_degree + 1'), example):
> ...     print(match.start(), match.end())
> ...
> ...
> 4 18
> 26 40

Just for more fun :) -

Without knowing how general your expressions will be, I think the following
version is very readable, certainly more readable than regexes:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
KEY = 'abc_degree + 1'

for i in range(len(example)):
    if example[i:].startswith(KEY):
        print(i, i + len(KEY))
# prints:
4 18
26 40

If you may have variable numbers of spaces around the symbols, OTOH, the
whole situation changes and then regexes would almost certainly be the best
approach.  But the regular expression strings would become harder to read.
--
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: XXX XXX should stop sending me rude email messages.

2023-02-27 Thread avi.e.gross
Michael,

Although I appreciate much of what you say, I ask humbly and politely that
we change the Subject line for messages like this one. HH is out of range
for now, albeit I think he can still read what we say.

But keeping the name Michael Torrie in the subject line, should be sort of
XXX rated.

And I mean especially in this case where we have no idea what "he" was
writing but it was a private message not intended for the group.

Just kidding BTW. But since YOU are Michael Torrie, I guess you can
broadcast a request to stop mailing yourself if that pleases. I will just no
longer bring your name forth in this context. Feel free to send me email, if
the need arises.

Avi

-Original Message-
From: Python-list  On
Behalf Of Michael Torrie
Sent: Monday, February 27, 2023 9:08 PM
To: python-list@python.org
Subject: Re: Rob Cliffe should stop sending me rude email messages.

On 2/27/23 09:17, Grant Edwards wrote:
> On 2023-02-27, Michael Torrie  wrote:
> 
>> I've been putting off sending this message for days, but the list 
>> noise level is now to the point that it has to be said.
> 
> Ah, I've finally realized why some of those threads have seemed so 
> disjointed to me. Years ago, I plonked all posts which are (like Hen
> Hanna's) submitted via Google Groups.
> 
> I highly recommend it.
> 
> FWIW, here's the "score" rule for doing that with slrn:
> 
> Score:: =-
>   Message-ID: .*googlegroups.com

Thanks for the tip and reminder.  I'll add that to my gmail filter.

--
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: How to escape strings for re.finditer?

2023-02-27 Thread avi.e.gross
Jen,

Can you see what SOME OF US see as ASCII text? We can help you better if we get 
code that can be copied and run as-is.

 What you sent is not terse. It is wrong. It will not run on any python 
interpreter because you somehow lost a carriage return and indent.

This is what you sent:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1') for match in re.finditer(find_string, 
example):
print(match.start(), match.end())

This is code indented properly:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1')
for match in re.finditer(find_string, example):
    print(match.start(), match.end())

Of course I am sure you wrote and ran code more like the latter version but 
somewhere in your copy/paste process, 

And, just for fun, since there is nothing wrong with your code, this minor 
change is terser:

>>> example = 'X - abc_degree + 1 + qq + abc_degree + 1'
>>> for match in re.finditer(re.escape('abc_degree + 1'), example):
...     print(match.start(), match.end())
... 
... 
4 18
26 40

But note once you use regular expressions, and not in your case, you might 
match multiple things that are far from the same such as matching two repeated 
words of any kind in any case including "and and" and "so so" or finding words 
that have multiple doubled letters as in the stereotypical bookkeeper. In those 
cases, you may want even more than offsets but also show the exact text that 
matched or even show some characters before and/or after for context.
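
As a rough illustration of that last point, here is a sketch that reports the matched text plus a little context on each side. The sample sentence, the two patterns and the helper name matches_with_context are all invented for this example, and \w also accepts digits and underscores, so "doubled letters" is only approximated.

```python
import re

text = "It was so so late and and the bookkeeper had left."

# A word, some whitespace, then the same word again (case-insensitive).
repeated_word = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

# A word containing at least two doubled characters, e.g. "bookkeeper".
double_doubled = re.compile(r"\b\w*(\w)\1\w*(\w)\2\w*\b")


def matches_with_context(pattern, text, width=6):
    """Return (span, matched text, text before, text after) per match."""
    found = []
    for m in pattern.finditer(text):
        before = text[max(0, m.start() - width):m.start()]
        after = text[m.end():m.end() + width]
        found.append((m.span(), m.group(0), before, after))
    return found


for entry in matches_with_context(repeated_word, text):
    print(entry)
for entry in matches_with_context(double_doubled, text):
    print(entry)
```

With a fixed search string the matched text is always the same, so this only becomes useful once the pattern can match different things.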


-Original Message-
From: Python-list  On 
Behalf Of Jen Kris via Python-list
Sent: Monday, February 27, 2023 8:36 PM
To: Cameron Simpson 
Cc: Python List 
Subject: Re: How to escape strings for re.finditer?


I haven't tested it either but it looks like it would work.  But for this case 
I prefer the relative simplicity of:

example = 'X - abc_degree + 1 + qq + abc_degree + 1'
find_string = re.escape('abc_degree + 1') for match in re.finditer(find_string, 
example):
print(match.start(), match.end())

4 18
26 40

I don't insist on terseness for its own sake, but it's cleaner this way.  

Jen


Feb 27, 2023, 16:55 by c...@cskk.id.au:

> On 28Feb2023 01:13, Jen Kris  wrote:
>
>> I went to the re module because the specified string may appear more than 
>> once in the string (in the code I'm writing).
>>
>
> Sure, but writing a `finditer` for plain `str` is pretty easy (untested):
>
>     pos = 0
>     while True:
>         found = s.find(substring, pos)
>         if found < 0:
>             break
>         start = found
>         end = found + len(substring)
>         ... do whatever with start and end ...
>         pos = end
>
> Many people go straight to the `re` module whenever they're looking for 
> strings. It is often cryptic error prone overkill. Just something to keep in 
> mind.
>
> Cheers,
> Cameron Simpson 
> --
> https://mail.python.org/mailman/listinfo/python-list
>

-- 
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: How to escape strings for re.finditer?

2023-02-27 Thread avi.e.gross
Jen,

What you just described is why that tool is not the right tool for the job, 
albeit it may help you confirm if whatever method you choose does work 
correctly and finds the same number of matches.

Sometimes you simply do some searching and roll your own.

Consider this code using a sort of list comprehension feature:

>>> short = "hello world"
>>> longer = "hello world is how many programs start for novices but some use hello world! to show how happy they are to say hello world"

>>> short in longer
True
>>> howLong = len(short)

>>> res = [(offset, offset + howLong) for offset in range(len(longer)) if longer.startswith(short, offset)]
>>> res
[(0, 11), (64, 75), (111, 122)]
>>> len(res)
3

I could do a bit more but it seems to work. Did I get the offsets right? 
Checking:

>>> print( [ longer[res[index][0]:res[index][1]] for index in range(len(res))])
['hello world', 'hello world', 'hello world']

Seems to work but thrown together quickly so can likely be done much nicer.

But as noted, the above has flaws such as matching overlaps like:

>>> short = "good good"
>>> longer = "A good good good but not douple plus good good good goody"
>>> howLong = len(short)
>>> res = [(offset, offset + howLong) for offset in range(len(longer)) if longer.startswith(short, offset)]
>>> res
[(2, 11), (7, 16), (37, 46), (42, 51), (47, 56)]

It matched five times as sometimes we had three or four good in a row. Some 
other method might match only three.

What some might do can get long and you clearly want one answer and not 
tutorials. For example, people can make a loop that finds a match and either 
sabotages the area by replacing or deleting it, or keeps track and searches 
again on a substring offset from the beginning. 

When you do not find a tool, consider making one. You can take (better) code 
than I show above and make it into a function and now you have a tool. Even 
better, you can make it return whatever you want.
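
One way such a tool might look, purely as a sketch: the function name find_spans and its allow_overlap flag are hypothetical, and it leans on str.find() to do the scanning rather than testing every offset.

```python
def find_spans(key, text, allow_overlap=False):
    """Return a list of (start, end) offsets where key occurs in text."""
    if not key:
        return []  # an empty key matches everywhere; treat it as no matches
    spans = []
    # After a hit, either retry one character on or skip the whole match.
    step = 1 if allow_overlap else len(key)
    pos = text.find(key)
    while pos >= 0:
        spans.append((pos, pos + len(key)))
        pos = text.find(key, pos + step)
    return spans


longer = "A good good good but not douple plus good good good goody"
print(find_spans("good good", longer, allow_overlap=True))
print(find_spans("good good", longer))
```

With allow_overlap=True this should give the same five spans as the comprehension above; with the default it gives the three spans a non-overlapping search would report.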

-Original Message-
From: Python-list  On 
Behalf Of Jen Kris via Python-list
Sent: Monday, February 27, 2023 7:40 PM
To: Bob van der Poel 
Cc: Python List 
Subject: Re: How to escape strings for re.finditer?


string.count() only tells me there are N instances of the string; it does not 
say where they begin and end, as does re.finditer.  

Feb 27, 2023, 16:20 by bobmellow...@gmail.com:

> Would string.count() work for you then?
>
> On Mon, Feb 27, 2023 at 5:16 PM Jen Kris via Python-list 
> <python-list@python.org> wrote:
>
>>
>> I went to the re module because the specified string may appear more 
>> than once in the string (in the code I'm writing).  For example:
>>  
>>  a = "X - abc_degree + 1 + qq + abc_degree + 1"
>>   b = "abc_degree + 1"
>>   q = a.find(b)
>>  
>>  print(q)
>>  4
>>  
>>  So it correctly finds the start of the first instance, but not the 
>> second one.  The re code finds both instances.  If I knew that the substring 
>> occurred only once then the str.find would be best.
>>  
>>  I changed my re code after MRAB's comment, it now works.
>>  
>>  Thanks much.
>>  
>>  Jen
>>  
>>  
>> Feb 27, 2023, 15:56 by c...@cskk.id.au:
>>
>> > On 28Feb2023 00:11, Jen Kris <jenk...@tutanota.com> wrote:
>> >
>> >> When matching a string against a longer string, where both
>> >> strings have spaces in them, we need to escape the spaces.
>> >>
>> >> This works (no spaces):
>> >>
>> >> import re
>> >> example = 'abcdefabcdefabcdefg'
>> >> find_string = "abc"
>> >> for match in re.finditer(find_string, example):
>> >>     print(match.start(), match.end())
>> >>
>> >> That gives me the start and end character positions, which is what I want.
>> >>
>> >> However, this does not work:
>> >>
>> >> import re
>> >> example = re.escape('X - cty_degrees + 1 + qq')
>> >> find_string = re.escape('cty_degrees + 1')
>> >> for match in re.finditer(find_string, example):
>> >>     print(match.start(), match.end())
>> >>
>> >> I’ve tried several other attempts based on my research, but still no match.
>> >>
>> >
>> > You need to print those strings out. You're escaping the _example_
>> > string, which would make it:
>> >
>> >  X - cty_degrees \+ 1 \+ qq
>> >
>> > because `+` is a special character in regexps and so `re.escape` escapes
>> > it. But you don't want to mangle the string you're searching! After all, the
>> > text above does not contain the string `cty_degrees + 1`.
>> >
>> > My secondary question is: if you're escaping the thing you're searching
>> > _for_, then you're effectively searching for a _fixed_ string, not a
>> > pattern/regexp. So why on earth are you using regexps to do your searching?
>> >
>> > The `str` type has a `find(substring)` function. Just use that! It'll be
>> > faster and the code simpler!
>> >
>> > Cheers,
>> > Cameron Simpson <c...@cskk.id.au>
>> > --
>> > https://mail.python.org/mailman/listinfo/python-list
>> >
>>
>> --
>> https://mail.python.org/mailman/listinfo/python-list

RE: How to escape strings for re.finditer?

2023-02-27 Thread avi.e.gross
Just FYI, Jen, there are times a sledgehammer works but perhaps is not the only 
way. These days people worry less about efficiency and more about programmer 
time and education and that can be fine.

But if you looked at methods available in strings or in some other modules, 
your situation is quite common. Some may use another RE front end called 
finditer().

I am NOT suggesting you do what I say next, but imagine writing a loop that 
takes a substring of what you are searching for of the same length as your 
search string. Near the end, it stops as there is too little left.

You can now simply test your searched for string against that substring for 
equality and it tends to return rapidly when they are not equal early on.

Your loop would return whatever data structure or results you want such as that 
it matched it three times at offsets a, b and c.

But do you allow overlaps? If not, your loop needs to skip len(search_str) 
after a match.

What you may want to consider is another form of pre-processing. Do you care if 
"abc_degree + 1" has missing or added spaces at the start or end or anywhere in 
the middle as in " abc_degree +1"?

Do you care if stuff is a different case like "Abc_Degree + 1"?

Some such searches can be done if both the pattern and searched string are 
first converted to a canonical format that maps to the same output. But that 
complicates things a bit and you may need to display what you match differently.

And are you also willing to match this: "myabc_degree + 1"?

When using a crafted RE there is a way to ask for a word boundary so abc will 
only be matched if before that is a space or the start of the string and not 
"my".
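
To make those three questions concrete, here is one possible sketch of such a crafted RE. The helper name flexible_pattern is made up, and it assumes the search string begins and ends with a word character, since that is the only case where the \b anchors mean anything.

```python
import re


def flexible_pattern(key):
    """Build a regex that finds key despite spacing and case differences."""
    # Escape each whitespace-separated piece so '+' and friends stay literal,
    # then allow any amount of whitespace (including none) between pieces.
    body = r"\s*".join(re.escape(piece) for piece in key.split())
    # \b asks for a word boundary, so "myabc_degree" is not accepted.
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


pattern = flexible_pattern("abc_degree + 1")
for candidate in [" abc_degree +1", "Abc_Degree + 1", "myabc_degree + 1"]:
    print(repr(candidate), bool(pattern.search(candidate)))
```

The offsets reported by match.start() and match.end() still refer to the original text, which is one advantage over converting both strings to a canonical form first.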

So this may be a case where you can solve an easy version with the chance it 
can be fooled or overengineer it. If you are allowing the user to type in what 
to search for, as many programs, including editors, do, you will often find such 
false positives unless the user knows RE syntax and applies it and you do not 
escape it. I have experienced havoc when doing a careless global replace that 
matched more than I expected, including making changes in comments or constant 
strings rather than just the name of a function. Adding a paren is helpful as 
is not replacing them all but one at a time and skipping any that are not 
wanted.

Good luck.

-Original Message-
From: Python-list  On 
Behalf Of Jen Kris via Python-list
Sent: Monday, February 27, 2023 7:14 PM
To: Cameron Simpson 
Cc: Python List 
Subject: Re: How to escape strings for re.finditer?


I went to the re module because the specified string may appear more than once 
in the string (in the code I'm writing).  For example:  

a = "X - abc_degree + 1 + qq + abc_degree + 1"
 b = "abc_degree + 1"
 q = a.find(b)

print(q)
4

So it correctly finds the start of the first instance, but not the second one.  
The re code finds both instances.  If I knew that the substring occurred only 
once then the str.find would be best.  

I changed my re code after MRAB's comment, it now works.  

Thanks much.  

Jen


Feb 27, 2023, 15:56 by c...@cskk.id.au:

> On 28Feb2023 00:11, Jen Kris  wrote:
>
>> When matching a string against a longer string, where both strings 
>> have spaces in them, we need to escape the spaces.
>>
>> This works (no spaces):
>>
>> import re
>> example = 'abcdefabcdefabcdefg'
>> find_string = "abc"
>> for match in re.finditer(find_string, example):
>>     print(match.start(), match.end())
>>
>> That gives me the start and end character positions, which is what I 
>> want.
>>
>> However, this does not work:
>>
>> import re
>> example = re.escape('X - cty_degrees + 1 + qq')
>> find_string = re.escape('cty_degrees + 1')
>> for match in re.finditer(find_string, example):
>>     print(match.start(), match.end())
>>
>> I’ve tried several other attempts based on my research, but still no 
>> match.
>>
>
> You need to print those strings out. You're escaping the _example_ string, 
> which would make it:
>
>  X - cty_degrees \+ 1 \+ qq
>
> because `+` is a special character in regexps and so `re.escape` escapes it. 
> But you don't want to mangle the string you're searching! After all, the text 
> above does not contain the string `cty_degrees + 1`.
>
> My secondary question is: if you're escaping the thing you're searching 
> _for_, then you're effectively searching for a _fixed_ string, not a 
> pattern/regexp. So why on earth are you using regexps to do your searching?
>
> The `str` type has a `find(substring)` function. Just use that! It'll be 
> faster and the code simpler!
>
> Cheers,
> Cameron Simpson 
> --
> https://mail.python.org/mailman/listinfo/python-list
>

-- 
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: it seems like a few weeks ago... but actually it was more like 30 years ago that i was programming in C, and

2023-02-27 Thread avi.e.gross
Yes, Greg, you are correct. After I posted, I encountered a later message
that suggested it was list comprehensions that had accidentally left a
variable behind in a context when theoretically you got ALL you asked for in
the resulting list, so it was fixed eventually.

You live and learn till you don't.


-Original Message-
From: Python-list  On
Behalf Of Greg Ewing via Python-list
Sent: Monday, February 27, 2023 6:49 PM
To: python-list@python.org
Subject: Re: it seems like a few weeks ago... but actually it was more like
30 years ago that i was programming in C, and

On 28/02/23 7:40 am, avi.e.gr...@gmail.com wrote:
> inhahe  made the point that this may not have been the
original intent for python and may be a sort of bug that it is too late to
fix.

Guido has publically stated that it was a deliberate design choice.
The merits of that design choice can be debated, but it wasn't a bug or an
accident.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: How to escape strings for re.finditer?

2023-02-27 Thread avi.e.gross
MRAB makes a valid point. The regular expression is compiled only from the 
pattern you are looking for, and if it contains anything that might be a 
command, such as a ^ at the start or [12] in the middle, you want that converted 
so NONE OF THAT is treated as one. It will then be compiled to something that 
looks for a literal ^, including later in the string, and for a real [ then a 
real 1 and a real 2 and a real ], not for one of the choices of 1 or 2. 

Your example was 'cty_degrees + 1' which can have a subtle bug introduced. The 
special character is "+" which means match greedily one or more copies of the 
previous entity. In this case, the previous entity was a single space. So the 
regular expression will match 'cty_degrees', then match the single space it 
sees because the pattern has a space followed by a plus, and then, no longer 
looking for a plus, hits a plus and fails. If your example is rewritten in 
whatever way re.escape uses, it might be 'cty_degrees \+ 1' and then it should 
work fine.

But escaping the text you are searching just breaks that, as the result will 
contain a '\+' which is viewed as two unrelated symbols, and the backslash 
stops the match from going further.
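
A small sketch of the three cases, using the strings from the original question; the variable names are chosen here just for illustration:

```python
import re

text = 'X - cty_degrees + 1 + qq'
key = 'cty_degrees + 1'

# 1. Nothing escaped: " +" means "one or more spaces", so the literal
#    plus sign in the text is never matched.
unescaped = [m.span() for m in re.finditer(key, text)]

# 2. Only the pattern escaped: the plus is now a literal character.
pattern_escaped = [m.span() for m in re.finditer(re.escape(key), text)]

# 3. Both escaped: the searched text now contains backslashes that the
#    pattern does not expect, so the match fails again.
both_escaped = [m.span() for m in re.finditer(re.escape(key), re.escape(text))]

print(unescaped, pattern_escaped, both_escaped)
```

Only the middle case should report a match.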



-Original Message-
From: Python-list  On 
Behalf Of MRAB
Sent: Monday, February 27, 2023 6:46 PM
To: python-list@python.org
Subject: Re: How to escape strings for re.finditer?

On 2023-02-27 23:11, Jen Kris via Python-list wrote:
> When matching a string against a longer string, where both strings have 
> spaces in them, we need to escape the spaces.
> 
> This works (no spaces):
> 
> import re
> example = 'abcdefabcdefabcdefg'
> find_string = "abc"
> for match in re.finditer(find_string, example):
>     print(match.start(), match.end())
> 
> That gives me the start and end character positions, which is what I want.
> 
> However, this does not work:
> 
> import re
> example = re.escape('X - cty_degrees + 1 + qq')
> find_string = re.escape('cty_degrees + 1')
> for match in re.finditer(find_string, example):
>     print(match.start(), match.end())
> 
> I’ve tried several other attempts based on my research, but still no match.
> 
> I don’t have much experience with regex, so I hoped a reg-expert might help.
> 
You need to escape only the pattern, not the string you're searching.
--
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: TypeError: can only concatenate str (not "int") to str

2023-02-27 Thread avi.e.gross
Karsten,

There are limits to the disruption a group should tolerate even from people
who may need some leeway.

I wonder if Hen Hanna has any idea that some of the people he is saying this
to lost most of their family in the Holocaust and had parents who barely
survived passing through multiple concentration camps. I doubt he would
change his words or attitude in the slightest as some of his other gems
indicate a paranoid view of the world at best.

It is disproportionate to call everyone a Nazi at the slightest imagined
slight. But we are not here in this forum to discuss world affairs or
politics or how to replace python with the same language they have been
using and likely abusing. Like every resource, it is best used as intended
and that tends to mean not treating all the recipients as being willing to
receive every thought you have had since breakfast followed by demanding
everyone stop responding to him privately or in public or disagreeing in any
way.

I apologize for my part in even bothering to try to help him as it clearly
is a thankless task and a huge waste of andwidth.

-Original Message-
From: Python-list  On
Behalf Of Karsten Hilbert
Sent: Monday, February 27, 2023 6:32 AM
To: python-list@python.org
Subject: Re: TypeError: can only concatenate str (not "int") to str

Am Sun, Feb 26, 2023 at 08:56:28AM -0800 schrieb Hen Hanna:

> so far,  i think  Paul Rubin's post (in another thread) was esp. 
> concise, informative, --- but he's also made a comment
> about   'shunting'  beginners  (questions) to a
> concentration camp, and sounded  a bit  like a cold-hearted (or 
> warm-hearted)  Nazi  officer / scientist.

Now, I have a lot of sympathy -- not least from a professional point of view
-- and see quite some leeway for people acting neuro-atypically, but the
last line of the above really isn't necessary to be read on this list.

Best,
Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
--
https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


RE: it seems like a few weeks ago... but actually it was more like 30 years ago that i was programming in C, and

2023-02-27 Thread avi.e.gross
I am not a big fan of religions or philosophies that say a road to salvation is 
for the "I" to disappear.

But on a more serious note, as Roel said, there is NO RULE being violated 
unless the documentation of the language says it is supposed to do something 
different.

There are many excellent reasons to keep the final value of a loop variable 
around. On the other hand, there are also many good reasons to make such 
variables be totally kept within the context of the loop so they can mask a 
variable with the same name only temporarily within the loop.

Neither choice is wrong as long as it is applied consistently.

Now, having said that, does python allow you to in some way over-ride the 
behavior?

Well, first, you can simply choose an odd name like local__loopy___variable 
that is not used elsewhere in your code, except perhaps in the next loop 
downstream where it is re-initialized.

You can also use "del Variable" or reset it to null or something in every way 
you can exit the loop such as before a break or in an "else" clause if it 
bothers you.
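A minimal sketch of the two points above, assuming nothing beyond standard Python: the loop variable is still bound after the loop, and "del" removes it if that bothers you. The names last_value and loop_variable_gone are invented for illustration.

```python
# The loop variable stays bound after the loop finishes.
for i in range(5):
    pass
last_value = i          # still available here

# Deleting the name by hand makes any later accidental use fail loudly.
del i
try:
    i
except NameError:
    loop_variable_gone = True
else:
    loop_variable_gone = False
```

Whether the extra "del" is worth a line is a matter of taste; it only guards against reusing a stale value by accident.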

inhahe  made the point that this may not have been the 
original intent for python and may be a sort of bug that it is too late to fix. 
Perhaps so, but insisting it be changed now is far from a simple request as I 
bet some people depend on the feature. True, it could be straightforward to 
recode any existing loops to update a secondary variable at the top of each 
loop that is declared before the loop and persists after the loop. 

Alas, that might force some to use the dreaded semicolon!

Final note is to look at something like the "with" statement in python as a 
context manager where it normally allows the resource to be closed or removed 
at the end. Of course you can set up an object that does not do the expected 
closure and preserves something, but generally what is wanted is to make sure 
the context exits gracefully.
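As a rough sketch of that comparison (the resource here is hypothetical and only records events), a context manager cleans up the resource on exit, while the name bound by "as" stays around afterwards much like a loop variable does:

```python
from contextlib import contextmanager

events = []

@contextmanager
def borrowed(name):
    # Hypothetical resource: we only record what happens to it.
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs on normal exit and on exceptions alike.
        events.append(f"close {name}")

with borrowed("resource") as r:
    events.append(f"use {r}")

# The name r is still bound after the block; what has ended is the
# managed resource, not the variable.
still_bound = r
```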

Avi

-Original Message-
From: Python-list  On 
Behalf Of Roel Schroeven
Sent: Monday, February 27, 2023 3:51 AM
To: python-list@python.org
Subject: Re: it seems like a few weeks ago... but actually it was more like 30 
years ago that i was programming in C, and

Op 26/02/2023 om 6:53 schreef Hen Hanna:
> > There are some similarities between Python and Lisp-family 
> > languages, but really Python is its own thing.
>
>
> Scope (and extent ?) of   variables is one reminder that  Python is not 
> Lisp
>
>  for i in  range(5):  print( i )
>   .
>  print( i )
>
> ideally, after the FOR loop is done,  the (local) var  i should also 
> disappear.
> (this almost caused a bug for me)
I wouldn't say "i *should* also disappear". There is no big book of programming 
language design with rules like that that all languages have to follow. 
Different languages have different behavior. In some languages, for/if/while 
statements introduce a new scope, in other languages they don't. In Python, 
they don't. I won't say one is better than the other; they're just different.

--
"Most of us, when all is said and done, like what we like and make up reasons 
for it afterwards."
 -- Soren F. Petersen



RE: Python 3.10 Fizzbuzz

2023-02-26 Thread avi.e.gross
Only sometimes.

Is it an insult to suggest the question about what quotes to use is quite 
basic? Python has a wide variety of ways to make a string and if you have text 
that contains one kind of quote, you can nest it in the other kind. Otherwise, 
it really does not matter.

And, yes, there are triply quoted strings as well as raw and formatted but to 
know about these things, you might have to read a manual.

Since you won't, I provided an answer. The answer is that for the meaningless 
Fizzbuzz homework type of problem which is just ASCII text, it does not matter 
at all which kind of quote you use as long as what you use matches itself at 
the end of the string and as long as you use the ASCII versions, not the ones 
you might make in programs like WORD that have a pair for each.
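For what it is worth, a small sketch of that answer, with variable names made up for the purpose: the quote style makes no difference to the resulting string, and nesting one kind inside the other just saves a backslash.

```python
single = 'Fizz'
double = "Fizz"
triple = """Fizz"""
same = single == double == triple

# Nesting one kind of quote inside the other avoids backslashes.
inner_double = 'He said "Buzz"'
inner_single = "It's FizzBuzz"
escaped = 'It\'s FizzBuzz'
```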

Oh, by the way, people here use lots of editors to deal with their code 
including versions derived from vi and emacs and MANY others, so many people 
here need to be told why you are asking some of your editing questions that do 
not at first seem to relate. We strive to focus here a bit more on using the 
language than on how to make your editor do tricks.

-Original Message-
From: Python-list  On 
Behalf Of Hen Hanna
Sent: Sunday, February 26, 2023 4:07 PM
To: python-list@python.org
Subject: Re: Python 3.10 Fizzbuzz

On Monday, August 29, 2022 at 7:18:22 PM UTC-7, Paul Rubin wrote:
> Just because. 
> 
> from math import gcd 

> def fizz(n: int) -> str: 
>     match gcd(n, 15): 
>         case 3: return "Fizz" 
>         case 5: return "Buzz" 
>         case 15: return "FizzBuzz" 
>         case _: return str(n) 
> 
> for i in range(1,101): 
>     print(fizz(i))


is there any reason to prefer   "   over   '   ?


RE: one Liner: Lisprint(x) --> (a, b, c) instead of ['a', 'b', 'c']

2023-02-26 Thread avi.e.gross
I so rarely need to save a list in python in a form acceptable to LISP but here 
is a go with no visible recursion needed.

>>> nested = [1, 2, [3, 4, [5, 6, 7], 8], 9]

>>> print(nested)
[1, 2, [3, 4, [5, 6, 7], 8], 9]

# Just converting to a tuple does not change nested lists
>>> print(tuple(nested))
(1, 2, [3, 4, [5, 6, 7], 8], 9)

# But a function that typographically replaces [] with () needs no recursion
>>> def p2b(nested_list): return repr(nested_list).replace('[','(').replace(']',')')

>>> print(p2b(nested))
(1, 2, (3, 4, (5, 6, 7), 8), 9)

People who speak python well do not necessarily lisp.
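One caveat worth sketching, since the replacement is purely typographic: brackets that appear inside string elements get swapped too. The sample data below is invented for illustration.

```python
def p2b(nested_list):
    return repr(nested_list).replace('[', '(').replace(']', ')')

plain = p2b([1, [2, 3]])

# Brackets inside string elements are replaced as well.
surprise = p2b(['a[0]', ['b']])
```

For lists of numbers, as in the example above, this does not matter; for arbitrary strings a recursive approach is probably safer.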

-Original Message-
From: Python-list  On 
Behalf Of Hen Hanna
Sent: Sunday, February 26, 2023 4:54 AM
To: python-list@python.org
Subject: Re: one Liner: Lisprint(x) --> (a, b, c) instead of ['a', 'b', 'c']

On Saturday, February 25, 2023 at 11:45:12 PM UTC-8, Hen Hanna wrote:
> def Lisprint(x): print( ' (' + ', '.join(x) + ')' , '\n') 
> 
> a= ' a b c ? def f x if zero? x 0 1 ' 
> a += ' A B C ! just an example ' 
> x= a.split() 
> 
> print(x) 
> Lisprint(x) 
> 
> ['a', 'b', 'c', '?', 'def', 'f', 'x', 'if', 'zero?', 'x', '0', '1', 'A', 'B', 
> 'C', '!', 'just', 'an', 'example'] 
> 
> (a, b, c, ?, def, f, x, if, zero?, x, 0, 1, A, B, C, !, just, an, example)


For nested lists  impossible to improve   upon  P.Norvig's  code


def Lisprint(x):   print(lispstr(x))

def lispstr(exp):
    "Convert a Python object back into a Lisp-readable string."
    if isinstance(exp, list):
        return '(' + ' '.join(map(lispstr, exp)) + ')' 
    else:
        return str(exp)

a=' a b c '
x= a.split()
x +=  [['a', 'b', 'c']]
x +=  x

print(x) 
Lisprint(x) 

['a', 'b', 'c', ['a', 'b', 'c'], 'a', 'b', 'c', ['a', 'b', 'c']]

(a b c (a b c) a b c (a b c))


  --   Without the commas,   the visual difference 
(concision)  is  striking !


RE: TypeError: can only concatenate str (not "int") to str

2023-02-26 Thread avi.e.gross
Alan,

Good tack. By not welcoming someone who is paranoid about being welcomed you
are clearly the right kind of welcoming!

Kidding aside, you have a point about one of the barrage of messages
probably not getting a great answer on your tutor forum. It is the MANY
messages often about fairly simple aspects of python, taken together, that
lead to the conclusion that this person is fairly new to python and still
thinking about things from a lifetime of experience using other languages.

I will say that at this point, it does not matter where they post as I
cannot imagine anyone having to pay them $1,000/hour for the privilege of
trying to tutor them.

There are topics raised that can be informative and lead to good discussions
amicably and as far as I can tell, many agree it would be nice if some
"error" messages provided more detail and perhaps some eventually will. But
as has been pointed out, these messages are only a small part of the python
environment and lots of other tools are typically used to debug that do
allow access to all kinds of details at breakpoints. 

I think many would be satisfied with some of the answers provided here and
note, FEW OR NONE OF US here (or am I wrong) are necessarily in a position
to make changes like this to the current or next versions of python. We are
all users who take what we get and work with it or look for a way around
things. The example used did not strike me as hard to figure out which of
X/Y was an int/str and what their values were. More time is wasted demanding
and debating a feature that is not there rather than solving the problem in
other ways.

In the interest of civility, I find removing myself sometimes works well. We
are volunteers and I don't need to volunteer to help any particular person
who does not seem to appreciate it. And if a forum fills up with nonsense so
the signal is hard to find amid the noise, why bother contributing?

Avi

-Original Message-
From: Python-list  On
Behalf Of Alan Gauld
Sent: Sunday, February 26, 2023 4:15 AM
To: python-list@python.org
Subject: Re: TypeError: can only concatenate str (not "int") to str

On 26/02/2023 00:54, Greg Ewing via Python-list wrote:
> On 26/02/23 10:53 am, Paul Rubin wrote:
>> I'm not on either list but the purpose of the tutor list is to shunt 
>> beginner questions away from the main list.

I'm not sure that's why we set it up but it is certainly a large part of our
remit. But protecting newbies from overly complex responses and covering
wider topics (beyond pure Python) is also a large part of our purpose.

> There's a fundamental problem with tutor lists. They rely on 
> experienced people, the ones capable of answering the questions, to go 
> out of their way to read the tutor list -- something that is not of 
> personal benefit to them.

In practice, the "tutors" tend to be split between folks who inhabit both
lists and those who only interact on the tutor list. eg. I lurk here and
only occasionally partake.

But the problem with this particular thread is that, if directed to the
tutor list, the OP would simply be told that "that's the way Python works".
The tutor list is not for discussing language enhancements etc. It is purely
about answering questions on how to use the language (and standard library)
as it exists.
(We also cover beginner questions about programming in general.)

So this thread is most definitely in the right place IMHO.

--
Alan G
Tutor list moderator




RE: TypeError: can only concatenate str (not "int") to str

2023-02-25 Thread avi.e.gross
Greg,

Yes, the forum should be open. The first requests from the person were
replied to politely.

At some point a pattern was emerging of lots of fairly irreverent posts by
someone who is having trouble shifting programming paradigms. The suggestion
was then made as a SUGGESTION by several people that "some" of their
questions might be better asked on the tutor list where others new to python
may have similar questions and can learn.

This forum has all kinds of people and of course many topics are of more
interest to some than others. Programming styles differ too and I note some
here reacted to a suggestion that maybe constants could be initialized more
efficiently in ways that use fewer resources. Some insisted it makes more
sense to be able to type what you want more compactly. Yes, of course,
multiple ways are equally valid especially as now, efficiency is not seen as
a major goal.

The reality is that several mailing lists are intended to be used for
occasional questions and people who have more serious needs should be using
local resources or taking courses and reading books as their main learning
method. An occasional question is welcomed. A barrage is an imposition and a
barrage where most of the answers are ignored or claimed to be wrong, can
generate an "attitude" some of us find less than appealing.

I continue to believe that a programmer's job is to learn how to use a
language well, or switch languages, and not to keep moaning why it does not
do what you want or expect. Many answers have suggested how the OP can solve
some issues and apparently that is not of interest to them and they just
keep complaining.

I speak for nobody except myself. As I have said, I have chosen to not
respond and become frustrated.


-Original Message-
From: Python-list  On
Behalf Of Greg Ewing via Python-list
Sent: Saturday, February 25, 2023 7:54 PM
To: python-list@python.org
Subject: Re: TypeError: can only concatenate str (not "int") to str

On 26/02/23 10:53 am, Paul Rubin wrote:
> I'm not on either list but the purpose of the tutor list is to shunt 
> beginner questions away from the main list.

There's a fundamental problem with tutor lists. They rely on experienced
people, the ones capable of answering the questions, to go out of their way
to read the tutor list -- something that is not of personal benefit to them.

Also, pointing people towards tutor lists, if not done carefully, can give
the impression of saying "newcomers are not welcome here".
That's not a message we want to send to Python newcomers at all.

--
Greg


Evaluate once or every time

2023-02-24 Thread avi.e.gross
Mark,

I was very interested in the point you made and have never thought much about 
string concatenation this way but adjacency is an operator worth using.

This message has a new subject line as it is not about line continuation or 
comments.

From what you say, concatenation between visibly adjacent strings is done once 
when generating bytecode. Using a plus is supposed to be about the same, but 
adjacency works only between string literals. Anything else is either an error:

bad = "hello " str(12)

or you must use something like a "+" to do the concatenation at each run time. 
Or, weirder, do it manually as : 

good = "hello ".__add__(str(12))

This may be no big deal in terms of efficiency but something to consider.
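A rough way to check this for yourself, assuming CPython and using only the built-in compile(); the "<sketch>" filename is arbitrary:

```python
# Adjacent literals are joined before the code ever runs.
code = compile('greeting = "hello " "world"', "<sketch>", "exec")
joined_at_compile_time = "hello world" in code.co_consts

# Adjacency with a non-literal is rejected by the compiler itself.
try:
    compile('bad = "hello " str(12)', "<sketch>", "exec")
except SyntaxError:
    rejected = True
else:
    rejected = False

# With "+", the call to str() has to happen at run time.
good = "hello " + str(12)
```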

I have often stared in amazement at code like:

>>> mylist = "The quick brown fox jumps over the lazy dog".split()

>>> mylist
['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']

Or perhaps to make a list of vowels:

import string

vowels = list("aeiouAEIOU")
consonants = sorted(list(set(string.ascii_letters) - set(vowels)))

I mean couldn't you do this work in advance and do something like:

vowels = ['A', 'E', 'I', 'O', 'U', 'a', 'e', 'i', 'o', 'u']
consonants = ['B', 'C', 'D', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 
'R', 'S', 'T', 'V', 'W', 'X', 'Y', 'Z', 'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 
'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'w', 'x', 'y', 'z']

I assume this latter version would be set once no matter how often you run the 
unchanged program. YES, I am aware this may be bad practice for code you want 
to adapt for international use. 
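If you do type the lists in by hand, a quick sanity check against the computed version costs little. This is only a sketch; note that the hand-written vowel list is in sorted order, so the comparison sorts the computed one as well.

```python
import string

vowels_computed = sorted("aeiouAEIOU")
consonants_computed = sorted(set(string.ascii_letters) - set(vowels_computed))

# The hand-written literal from the text above.
vowels_literal = ['A', 'E', 'I', 'O', 'U', 'a', 'e', 'i', 'o', 'u']
literal_matches = vowels_computed == vowels_literal
```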

But why be wasteful? I am currently reading a book on refactoring and will not 
share if it is illustrated, or if the above is a decent example as the book 
uses examples in JavaScript. 



-Original Message-
From: Python-list  On 
Behalf Of Mark Bourne
Sent: Friday, February 24, 2023 4:04 PM
To: python-list@python.org
Subject: Re: Line continuation and comments

Personally, I don't particularly like the way you have to put multiline strings 
on the far left (rather than aligned with the rest of the scope) to avoid 
getting spaces at the beginning of each line.  I find it makes it more 
difficult to see where the scope of the class/method/etc. 
actually ends, especially if there are multiple such strings.  It's not too bad 
for strings defined at the module level (outer scope) though, and of course for 
docstrings the extra spaces at the beginning of each line don't matter.
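One alternative not mentioned above, offered only as a sketch: the standard library's textwrap.dedent() strips the common leading whitespace, so the string can stay indented with the surrounding code at the cost of a function call at run time. The function name some_func is made up.

```python
import textwrap

def some_func():
    # Indented with the code, then dedented when the function runs.
    help_text = textwrap.dedent("""\
        Left click:          Open spam
        Shift + Left click:  Cook spam
        """)
    return help_text
```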

However, rather than using "+" to join strings as in your examples (which, as 
you suggest, is probably less efficient), I tend to use string literal 
concatenation which I gather is more efficient (treated as a single string at 
compile-time rather than joining separate strings at run-time).  See 
.

For example:
   HelpText = ("Left click: Open spam\n"
   "Shift + Left click: Cook spam\n"
   "Right click:Crack egg\n"
   "Shift + Right click:Fry egg\n")

The downside is having to put an explicit "\n" at the end of each line, but to 
me that's not as bad as having to align the content to the far left.

Getting a bit more on topic, use of backslashes in strings is a bit different 
to backslashes for line continuation anyway.  You could almost think of "\ 
(newline)" in a multiline string as being like an escape sequence meaning 
"don't actually put a newline character in the string here", in a similar way 
to "\n" meaning "put a newline character here" and "\t" 
meaning "put a tab character here".
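A small sketch of that "escape sequence" view, with shortened text: the only difference between the two strings below is the newline that the backslash suppresses.

```python
with_backslash = """\
Left click: Open spam
Right click: Crack egg
"""

without_backslash = """
Left click: Open spam
Right click: Crack egg
"""
```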

Mark.


avi.e.gr...@gmail.com wrote:
> Good example, Rob, of how some people make what I consider RELIGIOUS edicts 
> that one can easily violate if one wishes and it makes lots of sense in your 
> example.
> 
> Let me extend that. The goal was to store a character string consisting of 
> multiple lines when printed that are all left-aligned. Had you written:
> 
>   HelpText = """
> Left click: Open spam
> ...
> Shift + Right click:Fry egg
> """
> Then it would begin with an extra carriage return you did not want. Your 
> example also ends with a carriage return because you closed the quotes on 
> another line, so a \ on the last line of text (or moving the quotes to the 
> end of the line) would be a way of avoiding that.
> 
> Consider some alternatives I have seen that are in a sense ugly and may 
> involve extra work for the interpreter unless it is byte compiled once.
> 
> def someFunc():
>   HelpText =
>   "Left click: Open spam" + "\n" +
>   "Shift + Left click: Cook spam" + "\n" +
>   ...
> 
> Or the variant of:
> HelpText =  "Left click: Open spam\n"
> HelpText +=  " Shift + Left click: Cook spam\n"
> ...
> 
> Or perhaps just dumping the multi-line text into a file beforehand and 
> 

terse

2023-02-24 Thread avi.e.gross
Greg,

I do not advocate for writing extremely concise python as mentioned in that
book although I was quite interested and do use some of the methods. 

But I worry about what you focused in on. Everyone says a picture is worth a
thousand words. So when writing about python one-liners, you might shorten
some programs even more with a nice illustration!

But if that is an obstacle, perhaps the edition below is less illustrative.

https://www.amazon.com/gp/product/B07ZY7XMX8

Just a reminder. My point was not about a book or set of techniques. It was
that lots of Python features can be used quite effectively to reduce the
need for more lines of cluttered code or places you may be tempted to use
semicolons. Having said that, guess what some one-liner techniques use?

In my view, terseness is not a goal in and of itself. However, it has often
been said that the number of bugs in code often seems correlated with the
number of lines and code written using higher levels of coordination and
abstraction in a language that supports that, can often be written with
fewer lines and apparently even with fewer bugs.

There is also a well-known set of phenomena about us humans in that many of
us are wired to handle fairly small amounts in memory at a time, such as
perhaps 7.  People who can do substantially better are often using an
assortment of tricks like clumping. I mean most people can remember a
10-digit phone number for a while because they may chunk it into
xxx-yyy-abcd or something like that where xxx is just remembered as one unit
as a complete area code while abcd is remembered as 4 individual digits.

So languages that allow and encourage not so much terseness as variations
like chunking, meaning using lots of smaller well-named functions, objects
that encapsulate what your program logic is, and yes, terse but easy to
understand ways of doing things like loops and functional programming
methods, and that are extended by modules or packages with a well-defined
interface, can let you focus on the program at higher levels that do fit in
your memory and can be reasoned more easily and accurately.

So some one-liners are great. Others not so much. Can you imagine a list
comprehension with a dozen or so nested loops on one long line, including
regions grouped in parentheses and lots of "if" clauses? It may be one line
that wraps on your screen into many or has to be scrolled sideways. Yes, you
can rewrite it split across many lines. But at some point, it may be better
to refactor it using a functional style or writing the loops out explicitly.
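A toy illustration of that trade-off, with an invented problem (pairs of even numbers whose sum is divisible by 3); the real offenders are far longer than this:

```python
# Dense one-liner with several "for" and "if" clauses.
pairs_terse = [(x, y) for x in range(10) if x % 2 == 0 for y in range(10) if y % 2 == 0 if x < y if (x + y) % 3 == 0]

def even_pairs(limit):
    """Same result, written out with explicit loops."""
    result = []
    for x in range(0, limit, 2):
        for y in range(x + 2, limit, 2):
            if (x + y) % 3 == 0:
                result.append((x, y))
    return result
```

Both give the same list; which one is easier to reason about a month later is the real question.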

In my experience, some one-liners are accompanied by paragraphs of comments
explaining them. And often they use tricks that are less efficient.

Avi


-Original Message-
From: Python-list  On
Behalf Of Greg Ewing via Python-list
Sent: Friday, February 24, 2023 1:31 AM
To: python-list@python.org
Subject: Re: semi colonic

On 24/02/23 9:26 am, avi.e.gr...@gmail.com wrote:
> Python One-Liners: Write Concise, Eloquent Python Like a Professional 
> Illustrated Edition by Christian Mayer (Author)

I didn't know there were any Professional Illustrated Editions writing
Python. You learn something every day! :-)

--
Greg


RE: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-23 Thread avi.e.gross
We have been supplying many possible reasons or consequences for why the
implementation of python does not do what the OP wants and even DEMANDS.

I am satisfied with knowing it was because they CHOSE NOT TO in some places
and maybe not in others. It is nice to see some possible reasons, but
something as simple as efficiency or needing to complicate the code in
something used regularly, might be enough for now.

But to comment on what Michael T. and Dave N. have been saying, newcomers
often have no clue of what can happen so their questions may sound quite
reasonable.

So what happens if you create a large data structure, do some operation that
fails, catch the error and save the variables involved in an exception and
throw that onward and perhaps the program keeps running? There is now a
pointer to the large data structure in the exception object, or even a copy.
If that exception is not discarded or garbage collected, it can remain in
memory indefinitely even if the original string was expected to be removed,
replaced, or garbage collected. Some modern features in R such as generators
will stay alive infinitely and retain their state in between calls for a
next item.

You can end up with memory leaks that are not trivial to solve or that may
mysteriously disappear when an iterable has finally been consumed and all
the storage it used or pointed at can be retrieved, as one example.
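A minimal sketch of the retention problem, assuming CPython's reference counting; BigStructure and saved_errors are hypothetical stand-ins, and a weak reference is used only to observe whether the object is still alive.

```python
import gc
import weakref

class BigStructure:
    """Stand-in for a large data structure (hypothetical)."""

saved_errors = []

def risky():
    big = BigStructure()
    tracker = weakref.ref(big)      # lets us observe whether big is alive
    try:
        raise ValueError("operation failed", big)   # big rides along
    except ValueError as exc:
        saved_errors.append(exc)    # the exception outlives this function
    return tracker

tracker = risky()
alive_while_saved = tracker() is not None

# Only once the saved exception is dropped can the object be reclaimed.
saved_errors.clear()
gc.collect()
alive_after_discard = tracker() is not None
```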

A more rational approach is to realize that python has multiple levels of
debugging and exceptions are one among many. They are not meant to solve the
entire problem but just enough to be helpful or point you in some direction.
Yes, they can do more.

And, FYI, I too pointed this person at the Tutor list and I see no sign they
care how many people they make waste their time with so many mainly gripes.
I personally now ignore any post by them.
-Original Message-
From: Python-list  On
Behalf Of Michael Torrie
Sent: Thursday, February 23, 2023 10:32 PM
To: python-list@python.org
Subject: Re: Why doesn't Python (error msg) tell me WHAT the actual (arg)
values are ?

On 2/23/23 01:08, Hen Hanna wrote:
>  Python VM  is seeing an "int" object (123)   (and telling me that)   ...
> so it should be easy to print that "int" object 
> What does  Python VM  know ?   and when does it know it ?
It knows there is an object and its name and type.  It knows this from the
first moment you create the object and bind a name to it.
> it seems like  it is being playful, teasing (or mean),and   hiding
> the ball from me

Sorry you aren't understanding.  Whenever you print() out an object, python
calls the object's __repr__() method to generate the string to display.  For
built-in objects this is obviously trivial. But if you were dealing with an
object of some arbitrary class, there may not be a
__repr__() method which would cause an exception, or if the __repr__()
method itself raised an exception, you'd lose the original error message and
the stack trace would be all messed up and of no value to you.  Does that
make sense?  Remember that Python is a very dynamic language and what might
be common sense for a built-in type makes no sense at all for a custom type.
Thus there's no consistent way for Python to print out the information you
think is so simple.
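To make the failure mode concrete, here is a sketch with a hypothetical class whose __repr__() raises, plus the kind of defensive wrapper an error message would need in order to show values safely:

```python
class Grumpy:
    """Hypothetical class whose __repr__ itself fails."""
    def __repr__(self):
        raise RuntimeError("cannot describe myself")

def describe(obj):
    # What an error message would have to do to show a value safely.
    try:
        return repr(obj)
    except Exception as exc:
        return f"<unprintable {type(obj).__name__}: {exc}>"
```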


RE: Line continuation and comments

2023-02-23 Thread avi.e.gross
Many "warnings" can safely be ignored.

The function as shown does not look right. I assume it is just an example, but 
a function that ignores the argument supplied is already a tad suspect.

Since it is SUGGESTED that the variable name "self" normally is used in a 
method for a class/instance, it is of course possible for it to set a variable 
called LEGAL_AGE_US to 21 and for no special reason, returns the age.

But my imagination is that a function called is_adult() should perhaps receive 
an age either as an argument, or an attribute of the current object and return 
True only if that age is greater than or equal to the legal age. Of course 
LEGAL_AGE_US may suggest a family of such functions specifying a legal age 
threshold for various countries or regions and all you need is the age between 
non-adult and adult. 

So one GUESS I have is that if this is a method, then you are seen not as 
setting a constant inside the function, where all-caps might be sensible but as 
setting an instance variable or changing it. A true constant might have been 
set when the class was designed or perhaps in __init__() or similar. 
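What I have in mind is something like the sketch below. The Person class and its age attribute are my own invention, not the original poster's code.

```python
class Person:
    # Class-level constant, set once when the class is defined.
    LEGAL_AGE_US = 21

    def __init__(self, age: int):
        self.age = age

    def is_adult(self) -> bool:
        return self.age >= self.LEGAL_AGE_US
```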

I wonder how PyCharm would react if you used:

self.LEGAL_AGE_US = 21



-Original Message-
From: Python-list  On 
Behalf Of dn via Python-list
Sent: Thursday, February 23, 2023 9:01 PM
To: python-list@python.org
Subject: Re: Line continuation and comments

On 24/02/2023 12.45, Weatherby,Gerard wrote:
> “
> NB my PyCharm-settings grumble whenever I create an identifier which 
> is only used once (and perhaps, soon after it was established). I 
> understand the (space) optimisation, but prefer to trade that for 
> 'readability'.
> “
> 
> I haven’t seen that one. What I get is warnings about:
> 
> def is_adult( self )->bool:
>  LEGAL_AGE_US = 21
>  return LEGAL_AGE_US
> 
> It doesn’t like LEGAL_AGE_US being all caps if declared in a function.

Yes, I suffered this one too.

The rationale comes from PEP-008 (Constants):

Constants are usually defined on a module level and written in all capital 
letters with underscores separating words.


Today, I wasn't criticised for:
> NB my PyCharm-settings grumble whenever I create an identifier which is
> only used once (and perhaps, soon after it was established). I
> understand the (space) optimisation, but prefer to trade that for
> 'readability'.

Perhaps that came from AWS CodeWhisperer which I have since abandoned, 
or maybe from SonarLint (which I've just checked to discover it is not 
working properly...)

-- 
Regards,
=dn



RE: Line continuation and comments

2023-02-23 Thread avi.e.gross
Good example, Rob, of how some people make what I consider RELIGIOUS edicts 
that one can easily violate if one wishes and it makes lots of sense in your 
example.

Let me extend that. The goal was to store a character string consisting of 
multiple lines when printed that are all left-aligned. Had you written:

 HelpText = """
Left click: Open spam
...
Shift + Right click:Fry egg
"""
Then it would begin with an extra carriage return you did not want. Your 
example also ends with a carriage return because you closed the quotes on 
another line, so a \ on the last line of text (or moving the quotes to the end 
of the line) would be a way of avoiding that.

Consider some alternatives I have seen that are in a sense ugly and may involve 
extra work for the interpreter unless it is byte compiled once.

def someFunc():
 HelpText =
 "Left click: Open spam" + "\n" +
 "Shift + Left click: Cook spam" + "\n" +
 ...

Or the variant of:
HelpText =  "Left click: Open spam\n"
HelpText +=  " Shift + Left click: Cook spam\n"
...

Or perhaps just dumping the multi-line text into a file beforehand and reading 
that into a string!

def someFunc():

The backslash is not looking like such a bad idea! LOL!

-Original Message-
From: Python-list  On 
Behalf Of Rob Cliffe via Python-list
Sent: Wednesday, February 22, 2023 2:08 PM
To: python-list@python.org
Subject: Re: Line continuation and comments



On 22/02/2023 15:23, Paul Bryan wrote:
> Adding to this, there should be no reason now in recent versions of 
> Python to ever use line continuation. Black goes so far as to state 
> "backslashes are bad and should never be used":
>
> https://black.readthedocs.io/en/stable/the_black_code_style/future_sty
> le.html#using-backslashes-for-with-statements

def someFunc():
 HelpText = """\
Left click: Open spam
Shift + Left click: Cook spam
Right click:Crack egg
Shift + Right click:Fry egg
"""

The initial backslash aligns the first line with the others (in a fixed font of 
course).
Best wishes
Rob Cliffe


RE: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-23 Thread avi.e.gross
Rob,

There are lots of nifty features each of us might like and insist make much
more sense than what others say they want.

Sometimes the answer is to not satisfy most of those demands but provide
TOOLS they can use to do things for themselves.

As you agree, many of us have found all kinds of tools that help with
debugging and frankly, some of them use it to the point of annoying others
who would rather avoid them. An example is type hints that can get quite
detailed and obscure the outline of your program and are ignored by the
parser and only relevant for some linter or other such program.

I am thinking of what would happen if I created several fairly long or
complex data structures and tried to add them. Say I have a dictionary
containing millions of entries including every conceivable UNICODE character
as well as very complex values such as other dictionaries or strings
containing entire books. My other data structure might be a forest
containing many smaller trees, such as the output of some machine learning
models or perhaps showing every possible game of chess up to 50 moves deep
along with an evaluation of the relative strength of the board to one
player.

I then accidentally write code that contains:

   big_dic + forest_trees

Would I like my error message to consume all the paper in my city (or scroll my
screen for a week) as it tells me a dict cannot be added to a forest and by
the way, here is a repr of each of them showing the current (highly
recursive) contents?

Now people have written functions that take something long and truncate it
so a list containing [1, 2, 3, ... 1_000_000] is shown in this condensed
form with the rest missing, but then someone will complain they are not
seeing all of it!
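The standard library already has one such tool in the reprlib module. A brief sketch, with the list size and the limit of three items chosen arbitrarily:

```python
import reprlib

huge = list(range(1_000_000))
short = reprlib.repr(huge)       # abbreviated, ends with "...]"

limited = reprlib.Repr()
limited.maxlist = 3              # show at most three list items
shorter = limited.repr(huge)
```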

So the deal is to use your TOOLS. You can run a debugger or add print
statements or enclose it in a try/except to keep it from stopping the program
and other techniques. You can examine the objects carefully just before, or
even after and do cautious things like ask for the length and then maybe ask
for the first few and last few items, or whatever makes sense.

In the original example, we were first asked about print(a + b) and later
given a somewhat weirder func(x, y, x +y) as examples. Now ask what order
things are evaluated and where the error happens. What is known by the party
handling the error?

If you put the offending statement in a try/except scenario, then when the
error is triggered, YOU wrote the code that catches the exception and you
can often examine the payload of the exception, or know that the arguments
of a or b or x or y were involved and you can craft your own output to be
more clear. Or, you can even sometimes fix the problem and redo the code
using something like float(x) or str(y).
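
A minimal sketch of that approach, assuming we are willing to fall back to
string concatenation when the addition fails. The function name
add_or_concatenate is made up for this example.

```python
def add_or_concatenate(x, y):
    """Try x + y; on TypeError report the operands and retry as strings."""
    try:
        return x + y
    except TypeError as err:
        # The exception carries the message; we know the operands because
        # we wrote the code that caught it.
        print(f"{err} (x={x!r} of type {type(x).__name__}, "
              f"y={y!r} of type {type(y).__name__})")
        return str(x) + str(y)


result = add_or_concatenate("12", 13)
print(result)
```

Whether silently converting is wise depends on the program; the point is only
that the code catching the exception has the values at hand.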

My impression here is that the error is not on the surface but caught
deeper. The symbols used may be x and y but what if we work with "12" + 13
and follow what happens?

Since the interpreter in python evaluates x+y before calling the function
with the result as an argument, the function is never even called. What
should happen is that the interpreter sees a "12" which is normally an
object of a class of str and then it sees a "+" and then it sees anything
that follows as a second object it ignores for now. Python does not have a
fully defined operator that it invokes when it sees a "+" as the meaning
depends on what object is being asked to do whatever plus means to it. For a
string argument, it means concatenate to your current content and return a
new str object. The way that happens is that the class (or a relative) has
defined a method called __add__() or it hasn't. If it has, it takes an
argument of the second object and in this case it gets the integer object
containing 13. 

So it runs the function and it has not been programmed on how to append an
integer to a string of characters, so it comes back without an answer,
signalling failure instead. The interpreter's evaluator does not admit defeat
yet and reasonably tries to see if the integer 13 has an __radd__(), which is
similar but reflected, and again, an integer has not been programmed to append
itself to an object of type str. Could it have been? Sure. If you make your own
(sub)class you can create a kind of integer that will make a str version of
itself and append it to the "12" to make "1213" BUT in this case, that is not
an option. So the integer method fails too.
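
Here is a small sketch of that "kind of integer" idea. The class name
StrFriendlyInt is invented for illustration and nothing like it ships with
Python; it only shows that the reflected method gets its chance when a str is
on the left.

```python
class StrFriendlyInt(int):
    """Hypothetical int subclass that knows how to be appended to a str."""

    def __radd__(self, other):
        # Invoked for `other + self`, e.g. when a str sits on the left.
        if isinstance(other, str):
            return other + str(int(self))
        # Anything else is left to the ordinary int behavior.
        return super().__radd__(other)


print("12" + StrFriendlyInt(13))
print(5 + StrFriendlyInt(13))

try:
    "12" + 13          # a plain int has no such special case
except TypeError as err:
    print(err)
```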

Now the interpreter knows it has failed. "12" and 13 have both
refused to implement the plus sign, so an exception is raised, and either the
surrounding code catches it OR it does not and lets it flow upstream till
some other function in the call chain catches it. Any one can then generate
some error message, or it can reach the top level of the interpreter, which
has to decide what to do.

But some errors are not fatal. If str had no __add__() that is not an error.
If it returns that it cannot do it, that is not a fatal error but a
temporary drawback. Only when int fails too is there a likely 

RE: semi colonic

2023-02-23 Thread avi.e.gross
Rob,

It depends. Some purists say python abhors one liners. Well, I politely 
disagree and I enjoyed this book which shows how to write some quite compressed 
one-liners or nearly so.

Python One-Liners: Write Concise, Eloquent Python Like a Professional 
Illustrated Edition
by Christian Mayer (Author)

https://www.amazon.com/Python-One-Liners-Concise-Eloquent-Professional/dp/1718500505

The reality is that python is chock full of constructs that make one-liners 
easy and perhaps make a need for semi-colons less crucial.

An example is a comprehension like:

[x*y for x in range(10) for y in range(10) if x != y ]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 4, 5, 6, 7, 8, 9, 0, 2, 6, 8, 10, 12, 14, 
16, 18, 0, 3, 6, 12, 15, 18, 21, 24, 27, 0, 4, 8, 12, 20, 24, 28, 32, 36, 0, 5, 
10, 15, 20, 30, 35, 40, 45, 0, 6, 12, 18, 24, 30, 42, 48, 54, 0, 7, 14, 21, 28, 
35, 42, 56, 63, 0, 8, 16, 24, 32, 40, 48, 56, 72, 0, 9, 18, 27, 36, 45, 54, 63, 
72]

How many lines of code would it take to make that nonsense using an initializer 
for an empty list and nested loops and an "if" statement?
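
For comparison, here is one way the long form might look. The variable names
products and same are just for this sketch.

```python
# The spelled-out version: initializer, two nested loops and an "if".
products = []
for x in range(10):
    for y in range(10):
        if x != y:
            products.append(x * y)

# The one-liner builds exactly the same list.
same = [x * y for x in range(10) for y in range(10) if x != y]
print(products == same)
```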

A barely longer one-liner adds more functionality with no added lines or 
semicolons:

[(x*y, x+y, x>=y) for x in range(10) for y in range(10) if x != y  ]
[(0, 1, False), (0, 2, False), ..., (72, 17, True)]

Examples of all kinds of such things abound, including seemingly trivial things 
like how a "with" statement lets you hide lots of code to do when entering and 
exiting use of an object. I have earlier mentioned the way packing and 
unpacking can effectively replace many lines of code with one.
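
A small sketch of both points. The context manager announced and its log
argument are invented here purely to make the hidden entry/exit code visible;
contextlib.contextmanager itself is standard library.

```python
from contextlib import contextmanager


@contextmanager
def announced(name, log):
    """Hypothetical context manager: records entry and exit in `log`."""
    log.append(f"enter {name}")
    try:
        yield name
    finally:
        log.append(f"exit {name}")


log = []
with announced("job", log) as job:
    log.append(f"working on {job}")

# Unpacking: one line replaces several index-and-assign statements.
first, *middle, last = [1, 2, 3, 4, 5]
a, b = 10, 20
a, b = b, a   # swap without a temporary variable
```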

So the pythonic way often is not so much to do things on many lines but 
to do things in a way that a unit of logic can fit on one screen by using 
what the language offers judiciously, even if you do not put multiple 
statements with semicolons on one line.

-Original Message-
From: Python-list  On 
Behalf Of Rob Cliffe via Python-list
Sent: Thursday, February 23, 2023 6:08 AM
To: python-list@python.org
Subject: Re: semi colonic



On 23/02/2023 02:25, Hen Hanna wrote:
>
> i sometimes  put  extra  commas...  as:
>
> [  1, 2,  3,  4, ]
That is a good idea.
Even more so when the items are on separate lines:
 [
 "spam",
 "eggs",
 "cheese",
 ]
and you may want to change the order.
>
> so it is (or may be)  easier  to add things   later.
>
> ---  i can think of putting extra final  ;   for the same 
> reason.
That may not be such a good idea.  Writing multiple statements on one line is 
generally discouraged (notwithstanding that IMO it is occasionally appropriate).

Rob Cliffe
--
https://mail.python.org/mailman/listinfo/python-list



RE: semi colonic

2023-02-23 Thread avi.e.gross
Grant,

I am not sure it is fair to blame JSON for a design choice.

Use of commas can be done many ways in many contexts. 

One context is a sort of placeholder. Can you have a language where a
function has multiple arguments and you can skip some as in:

Func(a,b,c)
Func(a, b,)
Func(a,,)

Or even
Func(a,,c)

The missing arguments in such a language may be viewed as some sort of NULL
or take a default value as a possibility.

So what if you have a language where a list or tuple or other data structure
incorporates a similar idea when created or used. If I have a matrix and I
want every entry in row 4, meaning all columns, can I ask for mat[4,] and it
means something else than mat[4] which may return a vector instead of a
matrix?

There are tons of such ideas that are choices. Python allows a SINGLE comma
here but multiple are an error:

>>> a=1
>>> a
1
>>> a=1,
>>> a
(1,)
>>> a=1,,
SyntaxError: incomplete input

So why not allow MULTIPLE commas and ignore them? It is a choice!

Here is a scenario where a run of trailing commas is an error:

>>> a,b,,, = range(5)
SyntaxError: invalid syntax
>>> a,b,_,_,_ = range(5)
>>> a,b,*_ = range(5)

The way to deal here with more items is to use * in front of the last one to
gather any strays.
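
To make visible what the starred name actually collects, a tiny sketch (the
names rest, head and tail are arbitrary):

```python
a, b, *rest = range(5)
print(a, b, rest)        # rest is a list holding the leftovers

# The star need not be last; it can soak up the middle as well.
head, *_, tail = "abcde"
print(head, tail)
```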

But as _ is simply reused in the middle example and meant to be ignored, why
do you need it if you would simply allow multiple commas? Short answer is
they did not choose to design it that way. The places in python that do
allow a trailing "," will allow only one. Beyond that, they assume you are
making an error. So if someone wants to make a list of 5 things in
alphabetical order but forgets a few, they cannot write:

mylist = [first, , third,  , , ]

and then let the code run to be enhanced later with their reminder. What
they can do is write this:

mylist = [first,
  #,
  third,
  #,
  #,
  ]

The reminders are now simply well-placed comments.

Now we could debate the design of JSON and some enhancements people have
made for other more portable data structures. I think it reasonable that
they decided to stick to working with fully-formatted data structures and
guess what? If I make a list or tuple or other data structures in python
with a trailing comma, it is NOT stored that way and if you display it,
there is no trailing comma shown. It is fully JSON compatible in some sense:

>>> import json
>>> mynest = [1,2, [3, 4,], 5,]
>>> mynest
[1, 2, [3, 4], 5]
>>> json.dumps(mynest)
'[1, 2, [3, 4], 5]'
>>> json.dumps([1,2, [3, 4,], 5,])
'[1, 2, [3, 4], 5]'
>>> json.loads(json.dumps(mynest))
[1, 2, [3, 4], 5]

So when are you running into problems? Is it when reading something from a
file using a function expecting properly formatted JSON?
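
If that is the situation, a quick sketch of the difference: json.loads()
rejects text with trailing commas, while ast.literal_eval() reads the same
text as a Python literal. Using literal_eval as a workaround here is my
assumption about what might help, not something from the thread.

```python
import ast
import json

text = "[1, 2, [3, 4,], 5,]"

try:
    json.loads(text)
    json_ok = True
except json.JSONDecodeError as err:
    json_ok = False
    print("json.loads refused it:", err)

# Python's own literal syntax tolerates a single trailing comma.
as_python = ast.literal_eval(text)
print(as_python)
print(json.dumps(as_python))
```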





-Original Message-
From: Python-list  On
Behalf Of Grant Edwards
Sent: Thursday, February 23, 2023 2:28 PM
To: python-list@python.org
Subject: Re: semi colonic

On 2023-02-23, rbowman  wrote:
> On Wed, 22 Feb 2023 18:25:00 -0800 (PST), Hen Hanna wrote:
>
>> i sometimes  put  extra  commas...  as:
>> 
>>[  1, 2,  3,  4, ]
>> 
>> so it is (or may be)  easier  to add things   later.
>
> That can bite you with things like JSON that aren't very forgiving.

Oh, how I hate that about JSON...


-- 
https://mail.python.org/mailman/listinfo/python-list



RE: semi colonic

2023-02-23 Thread avi.e.gross


That is a reasonable use, Rob, albeit I would refactor that example in quite a 
few ways so the need for a semicolon disappears even for lining things up.

So to extrapolate, perhaps a related example might be as simple as wanting to 
initialize multiple variables together, as in:

if dow == 0: hours_worked = 8; overtime = False

Of course some monstrosities are now possible for such a scenario such as


if dow == 0:  hours_worked, overtime  =  8,  False

Not that readable. 
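
As for refactoring the day-of-week example so that no semicolon is needed,
one possible sketch is a lookup table. The table DAYS, the function calc_pay,
its hours and base parameters and the pay formula are all stand-ins I made
up; only the day names and the 1.5 and 2 rates come from the quoted toy code.

```python
# (day name, pay rate) indexed by day of week, 0 = Monday.
DAYS = [
    ("Mon", 1.0),
    ("Tue", 1.0),
    ("Wed", 1.0),
    ("Thu", 1.0),
    ("Fri", 1.0),
    ("Sat", 1.5),
    ("Sun", 2.0),
]


def calc_pay(hours, rate=1.0, base=10.0):
    """Stand-in for calcPay() in the quoted example; the formula is invented."""
    return hours * rate * base


dow = 5
day, rate = DAYS[dow]
pay = calc_pay(8, rate=rate)
print(day, pay)
```

The vertical alignment survives in the table, and the call is written once.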

I repeat, there is nothing wrong with a language having a feature like a 
semi-colon even if it is mainly syntactic sugar. Just wondering if it was 
widely used or even essential. My thought was that python evolved when some 
languages really needed a terminator but as it went another way, using 
indentation and sometimes blank lines, ...

I am not sure what Dieter meant about seeing semicolons in .pth files. I expect 
to see them in all kinds of files containing python code or anything created 
with a structure that chooses to include it.

-Original Message-
From: Python-list  On 
Behalf Of Rob Cliffe via Python-list
Sent: Wednesday, February 22, 2023 9:11 PM
To: python-list@python.org
Subject: Re: semi colonic



On 23/02/2023 00:58, avi.e.gr...@gmail.com wrote:
> So can anyone point to places in Python where a semicolon is part of a 
> best or even good way to do anything?
>
>
Yes.  Take this bit of toy code which I just dreamed up.  (Of course it is toy 
code; don't bother telling me how it could be written better.) If it looks a 
bit ragged, pretend it is in a fixed font.

if dow==0: day="Mon"; calcPay()
if dow==1: day="Tue"; calcPay()
if dow==2: day="Wed"; calcPay()
if dow==3: day="Thu"; calcPay()
if dow==4: day="Fri"; calcpay()
if dow==5: day="Sat"; calcPay(rate=1.5)
if dow==6: day="Sun"; calcPay(rate=2)

The point is: when you have several short bits of code with an identical or 
similar pattern, *vertically aligning* the corresponding parts can IMO make it 
much easier to read the code and easier to spot errors.
Compare this:

if dow==0:
 day="Mon"
 calcPay()
if dow==1:
 day="Tue"
 calcPay()
if dow==2:
 day="Wed"
 calcPay()
if dow==3:
 day="Thu"
 calcPay()
if dow==4:
 day="Fri"
 calcpay()
if dow==5:
 day="Sat"
 calcPay(rate=1.5)
if dow==6:
 day="Sun"
 calcPay(rate=2)

Not so easy to spot the mistake now, is it?
Not to mention the saving of vertical space.

Best wishes
Rob Cliffe
PS If you really care, I can send you a more complicated example of real code 
from one of my programs which is HUGELY more readable when laid out in this way.

--
https://mail.python.org/mailman/listinfo/python-list



RE: semi colonic

2023-02-23 Thread avi.e.gross
Greg,

How did you know that was the method I used to indicate I had properly
debugged and tested a line of code?

a = 5; pass
b = 7; pass
c = a * b; pass

Then I switched to using comments:

a = 5 #  pass
b = 7 # pass
c = a * b # fail

And would you believe it still worked!

OK, I am just kidding if anyone is taking this seriously.


-Original Message-
From: Python-list  On
Behalf Of Greg Ewing via Python-list
Sent: Thursday, February 23, 2023 1:28 AM
To: python-list@python.org
Subject: Re: semi colonic

On 23/02/23 1:58 pm, avi.e.gr...@gmail.com wrote:

> Would anything serious break if it was deprecated for use as a 
> statement terminator?

Well, it would break all the code of people who like to write code that way.
They might get a bit miffed if we decide that their code is not serious. :-)

On the other hand, if they really want to, they will still be able to abuse
semicolons by doing this sort of thing:

a = 5; pass
b = 7; pass
c = a * b; pass

Then everyone will know it's some really serious code!

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list



RE: semi colonic

2023-02-22 Thread avi.e.gross
That seems like a reasonable if limited use of a semi-colon, Thomas.

Of course, most shells will allow a multi-line argument too like some AWK
scripts I have written with a quote on the first line followed by multiple
lines of properly formatted code  and a closing quote.

Python though can get touchy about getting just the right amount of
indentation and simple attempts to break your program up into two lines 

python -c "import sys 
print('\n'.join(sys.path))"


do not work so well on some shells.

So, yes, I agree. But I tried this on bash under Cygwin on windows using a
"here" document and it worked fine with multiple lines so something to
consider with no semicolons:

$ python <<!
> import sys
> print('\n'.join(sys.path))
> !

/usr/lib/python2.7/site-packages/pylint-1.3.1-py2.7.egg
/usr/lib/python2.7/site-packages/astroid-1.3.4-py2.7.egg
/usr/lib/python2.7/site-packages/six-1.9.0-py2.7.egg
/usr/lib/python27.zip
/usr/lib/python2.7
/usr/lib/python2.7/plat-cygwin
/usr/lib/python2.7/lib-tk
/usr/lib/python2.7/lib-old
/usr/lib/python2.7/lib-dynload
/usr/lib/python2.7/site-packages
/usr/lib/python2.7/site-packages/gtk-2.0

-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Wednesday, February 22, 2023 9:05 PM
To: python-list@python.org
Subject: Re: semi colonic

On 2/22/2023 7:58 PM, avi.e.gr...@gmail.com wrote:
> Thomas,
> 
> This is one of many little twists I see between languages where one 
> feature impacts use or even the need for another feature.
> 
> So can anyone point to places in Python where a semicolon is part of a 
> best or even good way to do anything?

Mostly I use it to run small commands on the command line with python -c.
e.g.

python -c "import sys;print('\n'.join(sys.path))"

This is handy enough that I wouldn't like to do without.

Another place I use the semicolon (once in a while) is for quick debugging.
I might add as line like, perhaps,

import os; print(os.path.exists(filename))

This way I can get rid of the debugging statement by deleting that single
line.  This is non only quicker but I'm less likely to delete too much by
mistake.

> Some older languages had simple parsers/compilers that needed some way 
> to know when a conceptual line of code was DONE and the semi-colon was 
> a choice for making that clear. But some languages seem to only 
> continue looking past an end-of-line if they detect some serious 
> reason to assume you are in middle of something. An unmatched open 
> parenthesis or square bracket might be enough, and in some languages a
> curly brace.
> 
> Python mainly has a concept of indentation and blank lines as one part 
> of the guidance. Continuing lines is possible, if done carefully.
> 
> But consider the lowly comma. Some languages may assume more is to 
> come if it is dangled at the end of a line. But in a language that 
> supports a dangling comma such as in making a tuple, how is the 
> interpreter to know more is to come?
> 
> >>> a = 5,
> >>> a
> (5,)
> 
> >>> a = 5, \
> ... 6
> >>> a
> (5, 6)
> 
> Well, one possible use of a semi-colon is to make short one-liner 
> functions like this:
> 
>  def twoByFour(a): sq = a*a; forth = sq*sq; return((sq, forth))
> 
> There is no reason, of course, that could not be done in multiple 
> indented lines or other ways.
> 
> So if it was allowed in something like a lambda creation, it could be 
> useful but it isn't!
> 
> About the only thing that I can think of is if someone wishes to 
> compress a file of python code a bit. The indentation can add up but a 
> semi-colon does not solve all such problems.
> 
> Would anything serious break if it was deprecated for use as a 
> statement terminator? Then again, is it hurting anything? If it 
> stopped being used this way, could it later be introduced as some new 
> language feature or operator such as we now have a := b as a reuse of 
> the colon, maybe a semicolon could be useful at least until someone 
> decides to allow additional Unicode characters!
> 
> Now if there are serious reasons to use semi-colon in python, great. 
> If not, it is a historical artifact.
> 
> -Original Message-
> From: Python-list 
>  On Behalf Of 
> Thomas Passin
> Sent: Wednesday, February 22, 2023 7:24 PM
> To: python-list@python.org
> Subject: Re: Introspecting the variable bound to a function argument
> 
> On 2/22/2023 3:12 PM, Hen Hanna wrote:
>> On Wednesday, February 22, 2023 at 2:32:57 AM UTC-8, Anton Shepelev wrote:
>>> Hello, all.
>>>
>>> Does Python have an instrospection facility that can determine to 
>>> which outer variable a function argument is bound, e.g.:
>>>
>>> v1 = 5;
>>> v2 = 5;
>>
>>
>> do some Python coders like to end lines with   ;   ?
> 
> Very few, probably.  It's not harmful but adds unnecessary visual clutter.
> 
>>>
>>>   def f(a):
>>>       print(black_magic(a))  # or black_magic('a')
>>>
>>>   f(v1)# prints: v1
>>>   f(v2)# prints: v2
>>>
>>
>> 

semi colonic

2023-02-22 Thread avi.e.gross
Thomas,

This is one of many little twists I see between languages where one feature
impacts use or even the need for another feature.

So can anyone point to places in Python where a semicolon is part of a best
or even good way to do anything?

Some older languages had simple parsers/compilers that needed some way to
know when a conceptual line of code was DONE and the semi-colon was a choice
for making that clear. But some languages seem to only continue looking past
an end-of-line if they detect some serious reason to assume you are in
middle of something. An unmatched open parenthesis or square bracket might
be enough, and in some languages a curly brace.

Python mainly has a concept of indentation and blank lines as one part of
the guidance. Continuing lines is possible, if done carefully.

But consider the lowly comma. Some languages may assume more is to come if
it is dangled at the end of a line. But in a language that supports a
dangling comma such as in making a tuple, how is the interpreter to know
more is to come?

>>> a = 5,
>>> a
(5,)

>>> a = 5, \
... 6
>>> a
(5, 6)

Well, one possible use of a semi-colon is to make short one-liner functions
like this:

def twoByFour(a): sq = a*a; forth = sq*sq; return((sq, forth))

There is no reason, of course, that could not be done in multiple indented
lines or other ways. 

So if it was allowed in something like a lambda creation, it could be useful
but it isn't!
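
For what it is worth, a lambda can hold only a single expression, so no
semicolons, but on Python 3.8 or later the := operator gets fairly close.
This is a sketch of a workaround, not a recommendation; the names
two_by_four and two_by_four_def are mine.

```python
# Requires Python 3.8+ for the := operator.
two_by_four = lambda a: ((sq := a * a), sq * sq)

print(two_by_four(3))


# The ordinary multi-line version, for comparison.
def two_by_four_def(a):
    sq = a * a
    return sq, sq * sq
```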

About the only thing that I can think of is if someone wishes to compress a
file of python code a bit. The indentation can add up but a semi-colon does
not solve all such problems.

Would anything serious break if it was deprecated for use as a statement
terminator? Then again, is it hurting anything? If it stopped being used
this way, could it later be introduced as some new language feature or
operator such as we now have a := b as a reuse of the colon, maybe a
semicolon could be useful at least until someone decides to allow additional
Unicode characters!

Now if there are serious reasons to use semi-colon in python, great. If not,
it is a historical artifact.

-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Wednesday, February 22, 2023 7:24 PM
To: python-list@python.org
Subject: Re: Introspecting the variable bound to a function argument

On 2/22/2023 3:12 PM, Hen Hanna wrote:
> On Wednesday, February 22, 2023 at 2:32:57 AM UTC-8, Anton Shepelev wrote:
>> Hello, all.
>>
>> Does Python have an instrospection facility that can determine to 
>> which outer variable a function argument is bound, e.g.:
>>
>> v1 = 5;
>> v2 = 5;
> 
> 
> do some Python coders like to end lines with   ;   ?

Very few, probably.  It's not harmful but adds unnecessary visual clutter.

>>
>>  def f(a):
>>      print(black_magic(a))  # or black_magic('a')
>>
>>  f(v1)# prints: v1
>>  f(v2)# prints: v2
>>
> 
> the term  [call by name]  suggests  this should be possible.
> 
> 
> 30 years ago...  i used to think about this type of thing A LOT ---
>   ---  CBR, CBV, CBN,   (call by value), (call by name) etc.
> 

--
https://mail.python.org/mailman/listinfo/python-list



RE: Why doesn't Python (error msg) tell me WHAT the actual (arg) values are ?

2023-02-22 Thread avi.e.gross
Hen or Hanna,

You keep asking WHY which may be reasonable but hard or irrelevant in many
cases.

I find the traceback perfectly informative.

It says you asked it to print NOT just "a" but "a + 12" and the error is
coming not from PRINT but from trying to invoke addition between two objects
that have not provided instructions on how to do so. Specifically, an object
of type str has not specified anything to do if asked to concatenate an
object of type int to it. And, an object of type int has not specified what
to do if asked to add itself to an object of type str to the left of it.
Deeper in python, the objects have dunder methods like __add__() and
__radd__() to invoke for those situations that do some logic, decide
they cannot handle it, and signal failure in a way that ends up
generating your message.

If you want to know what "a" has at the moment, ask for just it, not adding
twelve to it. Perhaps you should add a line above your print asking to just
print(a).
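
A short sketch of what goes on underneath, calling the dunder methods by hand
on a made-up value "abc" for a. The exact wording of the error message is
CPython's and may differ in other implementations.

```python
a = "abc"

# int does not know how to add itself to a str on its left:
print((12).__radd__(a))

# str's own __add__ refuses an int outright:
try:
    a.__add__(12)
except TypeError as err:
    print(err)

# The same failure via the operator, with the value reported by hand:
try:
    print(a + 12)
except TypeError as err:
    print(f"{err}; a was {a!r}")
```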

Before you suggest what might be helpful, consider what it might mean in a
complex case with lots of variables and what work the interpreter might have
to do to dump the current values of anything relevant or just ANYTHING.

The way forward is less about asking why but asking what to do to get what
you want, or realize it is not attained the way you thought.

Avi

-Original Message-
From: Python-list  On
Behalf Of Hen Hanna
Sent: Wednesday, February 22, 2023 3:05 PM
To: python-list@python.org
Subject: Why doesn't Python (error msg) tell me WHAT the actual (arg) values
are ?


  >  py   bug.py
   Traceback (most recent call last):
 File "C:\Usenet\bug.py", line 5, in 
 print( a + 12 )
  TypeError: can only concatenate str (not "int") to str


Why doesn't  Python (error msg) do the obvious thing and tell me
WHAT   the actual   (offending,  arg)  values are ?

In many cases, it'd help to know what string the var  A  had  ,   when the
error occurred.
   i wouldn't have to put  print(a) just
above,  to see.




( pypydoesn't do that either,   but Python makes programming (debugging)
so easy that i hardly feel any inconvenience.)
-- 
https://mail.python.org/mailman/listinfo/python-list



RE: Tuple Comprehension ???

2023-02-21 Thread avi.e.gross
HH,

Just FYI, as a seeming newcomer to Python, there is a forum that may fit
some of your questions better as it is for sort of tutoring and related
purposes:

https://mail.python.org/mailman/listinfo/tutor

I am not discouraging you from posting here, just maybe not to overwhelm
this group with many questions and especially very basic ones.

Your choice.

But what I read below seems like an attempt by you to look for a cognate to
a python feature in a LISP dialect. There may be one but in many ways the
languages differ quite a bit.

As I see it, using the asterisk the way you tried is not all that common and
you are sometimes using it where it is not needed. Python is NOT a language
that is statically typed so there is no need to take a list or other iterator
or container of numbers and fool the interpreter into making it look like
you placed them as multiple arguments. In many places, just handing over a
list is fine and it is iterated over and used as needed.

When a function like sum() or max() is happy to take a single list argument,
then feed it the list, not *list. 

Where it can be helpful is a function like range() where range() takes up to
three arguments as in:

 >>> list(range(7))
[0, 1, 2, 3, 4, 5, 6]
>>> list(range(11, 20))
[11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> list(range(31, 45, 3))
[31, 34, 37, 40, 43]

In the above, you are really asking for stop=num, or start/stop or
start/stop/step. 

But what if you have some data structure like a list that contains [31, 45,
3] or just a two number version or a single number and it is sitting in a
variable. You can ask Python to unpack all kinds of things in various ways
and the asterisk is one simple one. Unpacking is itself a huge topic I will
not discuss.

>>> mylist = [31, 45, 3]
>>> list(range(mylist))
Traceback (most recent call last):
  File "", line 1, in 
list(range(mylist))
TypeError: 'list' object cannot be interpreted as an integer
>>> list(range(*mylist))
[31, 34, 37, 40, 43]

Range() takes only something like integers.  So you use the * in this
context to give it three integers individually.

Range does not take named arguments but many other functions do. So what if
I have an arbitrary function that accepts arguments like
myfunc(alpha=default, beta=default, gamma=default) where the defaults are
not the issue and may be anything such as "" or 0 or an empty set.

I could write code like this that creates a toy version of the function and
create a dictionary that supplies any combination of the required arguments
and use not the * operator that expands something like a list, but the
doubled ** operator that expands all entries of a dictionary into individual
items:

>>> def myfunc(alpha=1, beta="", gamma=""):
...     print(f"alpha={alpha}, beta={beta}, gamma={gamma}")
... 
>>> myfunc()
alpha=1, beta=, gamma=
>>> myfunc(1, 2, 3)
alpha=1, beta=2, gamma=3

>>> mydict = { "alpha" : 101, "beta" : "hello", "gamma" : "buy bye" }
>>> mydict
{'alpha': 101, 'beta': 'hello', 'gamma': 'buy bye'}

>>> myfunc( **mydict )
alpha=101, beta=hello, gamma=buy bye

I have no idea if any of this is really like your macroexpand. It does not
need to be. It is what it is. If you went back to a language like C, its
macros might be used to make a "constant" with "#define" but that would not
be a constant in the same way as in a language that uses a keyword to make a
variable name into a constant that cannot be changed without causing an
error. Similar but also not the same.

This is only the surface of some things commonly done in python when it
makes sense. But often there is no need and your examples are a good case
where the function happily takes a list in the first place. So why fight it
especially when your expanded version is not happily received?

The takeaway is that you need to read a bit more of a textbook approach that
explains things and not use slightly more advanced features blindly. It is
NOT that sum() takes a single argument that matters. It fails on something
like sum(1) which is a single argument as well as sum("nonsense") and so on.
What sum takes is a wide variety of things in python which implement what it
takes to be considered an iterable. And it takes exactly one of them, plus an
optional start value, under the current design.

>>> sum((1,))
1
>>> sum([1])
1
>>> sum(n - 5 for n in range(10, 15))
35

All kinds of things work. Tuples and lists are merely the easiest to see.
The latter example is a generator expression that yields 5 less than whatever
range() produces, as another kind of iterable. The sum() function will not
add up two or more iterables handed to it as separate arguments, because it
treats a second argument as the start value. So the first below fails and the
second does not:

>>> sum([1, 2], [3,4])
Traceback (most recent call last):
  File "", line 1, in 
sum([1, 2], [3,4])
TypeError: can only concatenate list (not "int") to list

>>> sum([1, 2] + [3,4])
10

Why? Because the plus sign asked the lists to combine into one larger list.
The sum function is only called after python has combined the lists into one
with no name.
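
The odd wording of that error message makes sense once you know that the
second positional argument of sum() is an optional start value, not a second
batch of items. A small sketch:

```python
print(sum([1, 2], 10))            # start value 10, then add 1 and 2

# With a list as the start value, sum() concatenates the inner lists.
print(sum([[1, 2], [3, 4]], []))

# sum([1, 2], [3, 4]) therefore means [3, 4] + 1 + 2, which fails on the
# first step: a list cannot be concatenated with an int.
try:
    sum([1, 2], [3, 4])
except TypeError as err:
    print(err)
```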

Now you can 

RE: Tuple Comprehension ???

2023-02-21 Thread avi.e.gross
Axy,

Nobody denies some of the many ways you can make a good design. But people
have different priorities that include not just conflicts between elements
of a design but also equally important factors like efficiency and deadlines
and not breaking too badly with the past.

You can easily enough design your own sub-language of sorts within the
python universe. Simply write your own module(s) and import them. Choose
brand new names for many of your functions or replace existing functions
carefully by figuring out which namespace a function is defined in, then
creating a new function with the same name that may call the old function
within it by explicitly referring to it.

So if you want a new max() then either create an axy_max() or perhaps bind
the original max to original_max and make your own max() that, after playing
around internally, might call original_max.

Here is an example. Suppose you want the maximum value in a nested structure
like:

nested = [ 1, [2, 3, [4, 5], 6], 7]

This contains parts at several levels including an inner list containing yet
another inner list. Using max(nested) will not work even if some intuition
wants it to work.

If you want your own version of max to be flexible enough to deal with this
case too, then you might find a flattener function or  a function that
checks the depth of a structure such as a list or tuple, and apply it as
needed until you have arguments suitable to hand to original_max. Your max()
may end up being recursive as it keeps peeling back one level and calling
itself on the results which may then peel back another level.
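
A minimal sketch of such a recursive version, assuming only lists and tuples
count as nesting and that no inner container is empty. The names axy_max and
original_max follow the suggestion above; the implementation is just one way
to do it.

```python
import builtins

original_max = builtins.max   # keep a handle on the built-in


def axy_max(items):
    """Hypothetical max() that descends into nested lists and tuples."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.append(axy_max(item))   # peel back one level
        else:
            flat.append(item)
    return original_max(flat)


nested = [1, [2, 3, [4, 5], 6], 7]
print(axy_max(nested))
```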

But the people who built aspects of python chose not to solve every single
case, however elegant that might be, and chose to solve several common cases
fairly efficiently.

What we often end up doing here is pseudo-religious discussions that rarely
get anywhere. It really is of no importance to discuss what SHOULD HAVE BEEN
for many scenarios albeit with some exceptions. Some errors must be slated
to be fixed or at least have more warnings in the documentation. And,
proposals for future changes to Python can be submitted and mostly will be
ignored.

What people often do not do is to ask a question that is easier to deal
with. Asking WHY a feature is like it is can be a decent question. Asking
how to get around a feature such as whether there is some module out there
that implements it another way using some other function call, is another
good question. COMPLAINING about what has been done and used for a long time
is sometimes viewed differently and especially if it suggests people doing
this were stupid or even inconsistent. 

Appealing to make-believe rules you choose to live your life by also tends
not to work. As you note, overengineering can cause even more problems than
a simple consistent design, albeit it can also create a rather boring and
useless product.

Too much of what happens under the table is hidden in python and if you
really study those details, you might see how a seemingly trivial task like
asking to create a new object of some class, can result in a cascade of code
being run that does things that themselves result in cascades of more code
as the object is assembled and modified using a weird number of searches and
executions for dunder methods in classes and metaclasses it is based on as
well as dealing with other objects/functions like descriptors and
decorators. Since our processors are faster, we might be able to afford a
design that does so much for you and we have garbage collection to deal with
the many bits and pieces created and abandoned in many processes. So our
higher level designs can often look fairly simple and even elegant but the
complexity now must be there somewhere.

I hate to bring up  an analogy as my experience suggests people will take it
as meaning way more (or less) than I intend. Many languages, especially
early on, hated to fail. Heck, machines crashed. So an elegant design was
required to be overlaid with endless testing to avoid the darn errors.
Compilers had to try to catch them even earlier so you did not provide any
argument to a function that was not the same type. You had to explicitly
test for other things at run time to avoid dividing by zero or take the
square root of a negative number or see if a list was empty ...

Python allows or even often encourages a different paradigm where you throw
errors when needed but mainly just TRY something and be prepared to deal
with failure. It too is an elegant design but a very different one. And, you
can do BOTH. Heck, you can do many styles of programming as the language
keeps being extended. There is no one right way most of the time even if
someone once said there is.
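
As a small illustration of the two styles, here is a minimal sketch. The
function names are made up for this example; the first version tests before
acting and the second just tries and deals with the failure:

```python
import math

def safe_sqrt_checked(x):
    # "Test first" style: check the precondition explicitly.
    if x < 0:
        return None
    return math.sqrt(x)

def safe_sqrt_tried(x):
    # "Just try it" style: let math.sqrt raise and handle the failure.
    try:
        return math.sqrt(x)
    except ValueError:
        return None

print(safe_sqrt_checked(-4), safe_sqrt_tried(-4))
print(safe_sqrt_checked(9), safe_sqrt_tried(9))
```

Both give the same answers here; which one reads better depends on how often
you expect the failure case.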

So if the standard library provides one way to do something, it may not be
the only way and may not match what you want. Sometimes the fix for a
request is made by adding options like the default=value for max, and
sometimes by allowing the user to specify the 

RE: Tuple Comprehension ???

2023-02-21 Thread avi.e.gross
There are limits to anyone arguing for designs to be the way they want or 
expect and Roel has explained this one below.

When it comes to designing a function, lots of rules people expect are beyond 
irrelevant. Many functions can be implemented truly hundreds of ways with 
varying numbers of arguments and defaults and other effects. I can make a 
function that raises the first argument to a power specified in the second 
argument with no defaults and you get a TypeError for calling it with one 
argument or more than two arguments. Or I can make the second argument use a 
keyword with a default of 1, or 2 or whatever I wish and it can now be called 
with one argument to get the default or two but not more. Or, I can have the 
function absorb all additional arguments and ignore them or even use them as 
additional powers to be raised to so pow(2, 3, 4, 5) returns a tuple or list of 
8, 16, 32.  Or maybe not and it would return ((2^3)^4)^5 or any other nonsense 
you design.
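
A rough sketch of those alternative designs, with names invented purely for 
illustration (none of these is the built-in pow), might look like this:

```python
def power_strict(base, exponent):
    # Exactly two arguments; any other count raises TypeError when called.
    return base ** exponent

def power_default(base, exponent=2):
    # The second argument is optional and defaults to squaring.
    return base ** exponent

def power_many(base, *exponents):
    # Every extra argument is another power; the results come back as a tuple.
    return tuple(base ** e for e in exponents)

def power_chained(base, *exponents):
    # Apply the powers one after another: ((base ** e1) ** e2) ** e3 ...
    result = base
    for e in exponents:
        result = result ** e
    return result

print(power_default(7), power_many(2, 3, 4, 5))
print(power_chained(2, 3, 4, 5) == 2 ** 60)
```

All four are defensible designs, which is rather the point: nothing about the 
name tells you which one you got.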

There IS NO CONSISTENCY possible in many cases unless you make a family of 
similarly named functions and add some thing to each name to make it clear.

Python arguably is harder than some languages in this regard as it allows way 
more flexibility. If a function accepts an iterator, and another does not, the 
call may superficially look the same but is not. 

So, yes, max() could have been designed differently and you can even design 
your own mymax() and mysum() to check the arguments they receive and re-arrange 
them in a way that lets you call the original max/sum functions potentially in 
the same ways. 

But as a general rule, when using a function, don't GUESS what it does or infer 
what it does and then complain when someone says you should have read the 
manual. There are too many design choices, often done by different programmers 
and often motivated by ideas like efficiency. You likely can not guess many of 
them.

And lots of python functions you write can make use of all kinds of features 
such as caching results of previous computations or holding on to variables 
such as what you asked for last time so it can be used as a default. If I write 
a function like go(direction=something, distance=something) then perhaps my 
design will remember the last time it was invoked and if you call it again with 
no arguments, it may repeat the same action, or if only one is given, the other 
is repeated. But on a first call, it may fail as it has no memory yet of what 
you did. That may be intuitive to some and not others, but would it make as 
much sense for another function to be designed the same way so it tolerates 
being called with no arguments when this makes less sense? Do I often want to 
call for sin(x) and later for just sin() and expect it to mean that it be 
repeated?
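
One way such a remembering go() could be sketched is with a closure. This is 
only an assumption about how it might be built; make_go and the stored 
dictionary are invented for the example:

```python
def make_go():
    last = {}  # arguments remembered from the previous successful call

    def go(direction=None, distance=None):
        # Fall back to whatever was used last time for anything omitted.
        if direction is None:
            direction = last.get("direction")
        if distance is None:
            distance = last.get("distance")
        if direction is None or distance is None:
            raise ValueError("nothing remembered yet; supply both arguments")
        last["direction"] = direction
        last["distance"] = distance
        return (direction, distance)

    return go

go = make_go()
print(go("north", 5))
print(go())             # repeats the previous call
print(go(distance=2))   # keeps the direction, changes the distance
```

A freshly made go() called with no arguments raises ValueError, which is the 
"first call may fail" behavior described above.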

But back to the original question about max/sum it gets weirder. Although max() 
takes any number of arguments, it really doesn't. There is no way to get the 
maximum of a single argument as in max(5) because it is designed to EITHER take 
one iterable OR more than one regular argument. 

So one case that normally fails is max([]) or any empty iterable and you can 
keep it from failing with something like max([], default=0) .
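
A short sketch of the calling patterns just described, using only the built-in 
max():

```python
print(max(3, 7, 5))        # several plain arguments
print(max([3, 7, 5]))      # one iterable
print(max([], default=0))  # empty iterable, but with a fallback value

# The two calls that fail: a lone non-iterable, and an empty iterable
# with no default supplied.
for bad_call in (lambda: max(5), lambda: max([])):
    try:
        bad_call()
    except (TypeError, ValueError) as exc:
        print(type(exc).__name__, exc)
```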

In your own code, you may want to either design your own functions, or use them 
as documented or perhaps create your own wrapper functions that carefully 
examine what you ask them to do and re-arrange as needed to call the 
function(s) you want as needed or return their own values or better error 
messages.  As a silly example, this fails:

max(1, "hello")

Max expects all arguments to be of compatible types. You could write your own 
function called charMax() that converts all arguments to be of type str before 
calling max() or maybe call max(... , key=mycompare) where mycompare as a 
function handles this case well.
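
As a minimal sketch of both ideas (the names char_max and max_by_text are 
made up here, and str is just one possible choice of key function):

```python
def char_max(*args):
    # Convert everything to str first, then compare the strings.
    return max(str(a) for a in args)

def max_by_text(*args):
    # Keep the original objects but compare them by their str() form.
    return max(args, key=str)

print(char_max(1, "hello"))
print(max_by_text(1, "hello"))
print(max_by_text(10, 9))
```

Note that comparing as text means 9 beats 10, since "9" sorts after "10". 
Whether that counts as handling the case well depends on what you wanted.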

The key point is that you need to adapt yourself to what some function you want 
to use offers, not expect the language to flip around at this point and start 
doing it your way and probably breaking many existing programs.

Yes, consistency is a good goal. Reality is a better goal.




-Original Message-
From: Python-list  On 
Behalf Of Roel Schroeven
Sent: Tuesday, February 21, 2023 1:11 PM
To: python-list@python.org
Subject: Re: Tuple Comprehension ???

Hen Hanna schreef op 21/02/2023 om 5:13:
>  (A)   print( max( * LisX ))
>  (B)   print( sum( * LisX ))<--- Bad syntax !!!
>
> What's most surprising is (A)  is ok, and  (B) is not.
>
> even tho'   max() and sum()  have   (basically)  the same 
> syntax...  ( takes one arg ,  whch is a list )
>
There's an important difference in syntax.

sum() takes an iterable:

sum(iterable, /, start=0)
 Return the sum of a 'start' value (default: 0) plus an iterable of numbers

 When the iterable is empty, return the start value.
 This function is intended specifically for use with numeric 

RE: Tuple Comprehension ???

2023-02-21 Thread avi.e.gross
There is a very common misunderstanding by people learning python that a
tuple has something to do with parentheses. It confused me too at first.

A tuple is made by the use of one or more commas and no parentheses are
needed except when, like everything else, they are used for grouping as in
the arithmetic for 

  (5 + 4) * 3

So (6) is not a tuple while a trailing comma makes (6,) to be a tuple with
one entry.

A tad confusing is that () by itself is a tuple, containing nothing. While
(,) is a syntax error!
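
A few assignments make the rule easier to see. The variable names are only
for illustration:

```python
a = (6)       # just the integer 6 inside grouping parentheses
b = (6,)      # one-element tuple
c = 6,        # also a one-element tuple; the comma does the work
d = 1, 2, 3   # three-element tuple, no parentheses needed
e = ()        # the empty tuple is the one special case

print(type(a).__name__, type(b).__name__, type(c).__name__, len(d), len(e))
```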

A serious design issue in most computer languages is that there are too few
unique symbols to go around and some get re-used in multiple ways that
usually are not ambiguous when viewed in context. As an example, sets and
dictionaries both use curly braces but {} by itself is considered ambiguous
and they chose to make it be an empty dictionary. To get an empty set, use
set() instead. Parentheses are way overused and thus it gets murky at times
as when they are used to sort of make it clear you are using a generator.

Consider how this fails without parentheses:

result = x*2 for x in [1,2,3]
SyntaxError: invalid syntax

But with parentheses works fine:

result = (x*2 for x in [1,2,3])
result
<generator object <genexpr> at 0x029A3CFCF030>

However if you want a generator that is expanded into a list, you do not
need the parentheses duplicated like this:

result = list( (x*2 for x in [1,2,3]) )

and can just use this without nested parentheses:

result = list(  x*2 for x in [1,2,3]  )

For completeness, you arguably should have a concept of a comprehension for
every possible case but the people at python chose not to for reasons like
the above and especially as it is fairly simple to use this version:

result = tuple(  x*2 for x in [1,2,3]  )

Yes, it is a tad indirect and requires making a generator first.






-Original Message-
From: Python-list  On
Behalf Of Hen Hanna
Sent: Monday, February 20, 2023 11:14 PM
To: python-list@python.org
Subject: Re: Tuple Comprehension ???

On Monday, February 20, 2023 at 7:57:14 PM UTC-8, Michael Torrie wrote:
> On 2/20/23 20:36, Hen Hanna wrote: 
> > For a while, i've been curious about a [Tuple Comprehension]
> I've never heard of a "Tuple comprehension." No such thing exists as 
> far as I know.
> > So finally i tried it, and the result was a bit surprising... 
> > 
> > 
> > X= [ x for x in range(10) ]
> > X= ( x for x in range(10) )
> > print(X)
> > a= list(X)
> > print(a)


> What was surprising? Don't keep us in suspense! 
> 
> Using square brackets is a list comprehension. Using parenthesis 
> creates a generator expression. It is not a tuple.

ok!



LisX= [x for x in range(10) ]

print( sum( LisX ))
print( max( LisX ))

print( sum( x for x in range(10) ) )
print( max( x for x in range(10) ) )

print( * LisX )

print( max( * LisX ))
print( sum( LisX ))# same as before
# print( sum( * LisX )) <--- Bad syntax !!!

   TypeError: sum() takes at most 2 arguments (10 given)


_

(A)   print( max( * LisX ))
(B)   print( sum( * LisX ))<--- Bad syntax !!!

What's most surprising is (A)  is ok, and  (B) is not.

   even tho'   max() and sum()  have   (basically)  the same
syntax...  ( takes one arg ,  whch is a list )



i've been programming for many years...( just knew to Python )
--
https://mail.python.org/mailman/listinfo/python-list



RE: Tuple Comprehension ???

2023-02-20 Thread avi.e.gross
Tuples are immutable and sort of have to be created all at once. This does
not jibe well with being made incrementally in a comprehension. And, as
noted, the use of parentheses in too many contexts means that what looks like
a comprehension in parentheses is used instead as a generator.

If you really want a tuple made using a comprehension, you have some options
that are indirect.

One is to create a list using the comprehension and copy/convert that into a
tuple as in:

mytuple = tuple( [x for x in range(10) ] )

I think an alternative is to use a generator in a similar way that keeps
being iterated till done.

mytuple = tuple( (x for x in range(10) ) )

And similarly, you can use a set comprehension and convert that to a tuple
but only if nothing is repeated and order does not matter. Unlike
dictionaries in recent python versions, sets do not promise insertion order!

mytuple = tuple( {x for x in range(10) } )

There are other more obscure and weird ways, of course but generally no
need.
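
For comparison, here is a small sketch of these routes side by side. The
star-unpacking form is one of the more obscure ways; the variable names are
invented for the example:

```python
from_list = tuple([x for x in range(10)])
from_gen = tuple(x for x in range(10))
from_star = (*(x for x in range(10)),)        # unpack a generator into a tuple
from_set = tuple({x % 3 for x in range(10)})  # duplicates collapse to 3 items

print(from_list == from_gen == from_star)
print(len(from_set), sorted(from_set))
```

The set version is sorted before printing because I would not rely on the
order a set hands its items back in.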

Realistically, in many contexts, you do not have to store or use things in
tuples, albeit some sticklers think it is a good idea to use a tuple when
you want to make clear the data is to be immutable. There can be other
benefits such as storage space used. And in many ways, tuples are supposed
to be faster than lists.

-Original Message-
From: Python-list  On
Behalf Of Michael Torrie
Sent: Monday, February 20, 2023 10:57 PM
To: python-list@python.org
Subject: Re: Tuple Comprehension ???

On 2/20/23 20:36, Hen Hanna wrote:
> For a while,  i've been curious about a  [Tuple   Comprehension] 

I've never heard of a "Tuple comprehension."  No such thing exists as far as
I know.

> So  finally   i tried it, and the result was a bit surprising...
> 
> 
> X= [ x for x in range(10) ]
> X= ( x for x in range(10) )
> print(X)
> a= list(X)
> print(a)

What was surprising? Don't keep us in suspense!

Using square brackets is a list comprehension. Using parenthesis creates a
generator expression. It is not a tuple. A generator expression can be
perhaps thought of as a lazy list.  Instead of computing each member ahead
of time, it returns a generator object which, when iterated over, produces
the members one at a time.  This can be a tremendous optimization in terms
of resource usage.  See
https://docs.python.org/3/reference/expressions.html#generator-expressions.
 Also you can search google for "generator expression" for other examples.

--
https://mail.python.org/mailman/listinfo/python-list



RE: Comparing caching strategies

2023-02-18 Thread avi.e.gross
MRAB,

I made it very clear I was using the translation provided by Google Translate. 
I copied exactly what it said and as I speak the languages involved, they 
seemed reasonable.  I often find it provides somewhat different translations 
than I expect and sometimes I need to supply a longer sentence to get it on 
track and sometimes it is just plain wrong. Getting it to provide 
female-specific sentences can be an exercise in frustration, for example.

What it produces is not important to the main point. 

It is hard to have conversations when people keep niggling on details and 
especially details that make absolutely no difference. 

My point was you can have functions that cannot be cached for results just any 
trivial way. The EXAMPLE I gave suggests if you had a Python program that did 
something along these lines between multiple languages, the RESULTS will depend 
on not just the phrase used. But they can be cached well in several ways if you 
want.
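
One way to cache such a thing, sketched under the assumption that the 
languages are passed in as explicit arguments so they become part of the cache 
key. The translate function and its tiny lookup table are stand-ins, not any 
real translation API:

```python
from functools import lru_cache

# Hypothetical stand-in data for a real translation service.
_TABLE = {
    ("en", "de", "Valentines Day"): "Valentinstag",
    ("en", "nl", "Valentines Day"): "Valentijnsdag",
}
calls = 0

@lru_cache(maxsize=128)
def translate(phrase, source="en", target="de"):
    global calls
    calls += 1  # counts real lookups, not cache hits
    return _TABLE[(source, target, phrase)]

print(translate("Valentines Day", "en", "de"))
print(translate("Valentines Day", "en", "nl"))
print(translate("Valentines Day", "en", "de"), calls)
```

The same phrase with a different target language is a different key, so a 
stale answer from the other language is never returned. Anything the function 
reads from the environment instead of its arguments would not get this 
protection.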

Let me point out another related use. When you type a message on your phone, 
you may have a sort of prediction algorithm running that keeps offering to 
auto-complete your words or fix spelling errors. It does this dynamically and 
learns new words and eventually may start remembering patterns. I have totally 
mucked up my ENGLISH keyboard because it now remembers snippets of my typing in 
languages like Spanish and German and lately Esperanto and it makes suggestions 
from a very confused composite state including lots of funny accented 
characters. I now switch keyboards when I remember but much of the damage is 
done! LOL!

That too is an example of some hidden data the program uses which is loaded 
with past data of what I have typed and of short phrases so it can make a guess 
that word A is often followed by word B and after seeing both A and B is 
extremely likely to be followed by C. Now obviously this would not be a trivial 
use of caching as part of guessing but could be part of a method that 
increments some keys like "A B" or "A B C" with a count of how often they have 
seen that combo. Yes, this is not direct caching, but also a side effect you 
might want to add to a function.
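
A toy version of that counting idea, offered only as a sketch of the principle 
and not of how any real keyboard works. The function names and the sample 
history are invented:

```python
from collections import Counter

def count_ngrams(words, max_n=3):
    # Count every run of 2..max_n consecutive words, keyed like "A B" or "A B C".
    counts = Counter()
    for n in range(2, max_n + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts

def suggest_next(counts, *previous):
    # Pick the most frequent key that extends the given words by exactly one.
    prefix = " ".join(previous) + " "
    candidates = {k: v for k, v in counts.items()
                  if k.startswith(prefix) and k.count(" ") == len(previous)}
    if not candidates:
        return None
    best = max(candidates, key=candidates.get)
    return best.split(" ")[-1]

history = "see you later see you soon see you later".split()
counts = count_ngrams(history)
print(counts["see you"], suggest_next(counts, "see", "you"))
```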

As for Esperanto, I am still learning it and as a created language, the snippet 
I used can likely be translated multiple ways and as someone who could not care 
less about Valentines Day, I don't ever intend on using any of those ways in 
conversation. Most languages have alternate ways of saying things and it would 
not shock me if there are sentences like "The Holiday used in Capitalist 
Western Countries to commemorate a day of love based on a person some religions 
consider of merit and have given a religious title" or whatever.

So back on topic, an original question here was how to cache things, perhaps 
using a LRU algorithm with a data structure using some maximum.

My comment was that using a function decorator that caches any result may not 
be adequate in many cases. I presented several examples and my point is that in 
the above example, it may make more sense to have multiple caches that exist 
perhaps outside any one function, or a more complex cache that stores using a 
more complex key

-Original Message-
From: Python-list  On 
Behalf Of MRAB
Sent: Saturday, February 18, 2023 7:04 PM
To: python-list@python.org
Subject: Re: Comparing caching strategies

On 2023-02-18 23:04, avi.e.gr...@gmail.com wrote:
[snip]
> 
> Note how this can cause problems with the original idea here of caching 
> strategies. Imagine a function that checks the environment as to what 
> encoding or human language and so on to produce text in. If you cache it so 
> it produces results that are stored in something like a dictionary with a 
> key, and later the user changes the environment as it continues running, the 
> cache may now contain invalid results. You might need to keep track of the 
> environment and empty the cache if things change, or start a new cache and 
> switch to it.  An example would be the way I use Google Translate. I 
> sometimes am studying or using a language and want to look up a word or 
> phrase or entire sentence. If Google Translate keeps track, it may notice 
> repeated requests like "Valentines Day" and cache it for re-use. But I often 
> click to switch languages and see if the other one uses a similar or 
> different way to describe what it means or something similar but spelled 
> another way. German does the latter as in Valentinstag which is a fairly 
> literal translation as does Dutch (Valentijnsdag ) and  Hungarian (Valentin 
> nap) .
> 
> But Hebrew calls it the holiday of love, sort of (חג האהבה). 
> Portuguese is similar but includes day as well as love (Dia dos 
> Namorados)
> 
> Esperanto tosses in more about sainthood (Sankta Valentín) and in a sense 
> Spanish does both ways with day and saint (Día de San Valentín).
> 
The Esperanto didn't look right to me; it's "Valentena tago" or 
"Sankt-Valentena tago".

[snip]

--

RE: Comparing caching strategies

2023-02-18 Thread avi.e.gross
It is not an unusual pattern, Thomas, to do something selective to some object 
rather than do all parts just one way.

The history of computing has often been one where you had to deal with scarcity 
of expensive resources.

Consider the Python "list" as a rather wasteful container that is best used for 
diverse contents. As soon as you make all the contents of the same basic type, 
you start wondering if you are better off with a restricted type that holds 
only one kind of item, often as something like a numpy array. Or, you start 
wondering if maybe a good way to store the contents is in a dictionary instead, 
as searching it is often way faster.

But fundamentally, there is nothing WRONG with uses of a Python list as it 
handles almost anything and for small sample sizes, it is often not worth 
spending time redoing it. Still, if you implement a 3-D matrix as a list of 
lists of lists, and use slicing and other techniques to do things like matrix 
multiplication, it gets unwieldy enough so arguably it is the wrong tool.

If you look at the history of Python, they deliberately left out many of the 
containers other programming languages used and stressed lists and tuples. No 
vectors or arrays were the focus. Later stuff had to be added as people noted 
that generality has costs. If you look at a language like JavaScript, it is 
beyond weird how they decided to use attributes of an object to make an array. 
So an object might have attributes like "name" and "length" alongside some like 
"1" and "2" and "666" and when you wanted to treat the object like an array, it 
might look at the attributes and ignore the non-numerical ones and look like it 
has a sparsely populated array/vector of items indexed from 1 to 666. You can 
do all kinds of arithmetic operations and add missing indices or remove some 
and it still works like a typical array but with weird costs such as sometimes 
having to reindex lots of items if you want to insert a new item as the 3rd or 
1st element so any above need renumbering the hard way! It works but strikes me 
as a kludge.

If you look within Python  at numpy and pandas and similar utilities, they are 
well aware of the idea of abstracting out concepts with different 
implementations and use lots of Python tools that access different storage 
methods as needed often by using objects that implement various "protocols" to 
the point where manipulating them similarly seems straightforward. In 
principle, you could create something like a pandas data.frame and then call a 
function that examines the columns, and perhaps rows, and returns a modified 
(or new) data.frame where the contents have been adjusted so the storage is 
smaller or can be accessed more efficiently, based on any other preferences and 
hints you supply, such as if it can be immutable. A column which is all 1's or 
Trues can obviously be done other ways, and one that has all values under 256, 
again, can use less storage. 

Of course this would need to be done very carefully so that changes that 
violate assumptions will result in a refactoring to a version that handles the 
changes, or results in an error. And a long-running system could be set to keep 
track of how an object is used and perhaps make adjustments. As one example, 
after some number of accesses, you might change a function "live" to begin 
caching, or have an object reconfigure to be faster to search even if occupying 
more room.

Back to Python lists, a typical change is simply to convert them to a tuple 
which makes them easier to store and often faster to search. And, if the keys 
are unique, now that dictionaries maintain order of insertion, sometimes you 
may want to convert the list to a dict. 
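
A minimal sketch of both conversions, with a made-up list of names. 
dict.fromkeys() is one simple way to get a dict whose keys keep the list's 
order:

```python
names = ["ann", "bob", "cy", "dee"]

as_tuple = tuple(names)          # immutable copy of the same items
as_dict = dict.fromkeys(names)   # keys keep insertion order; values are None

print("cy" in as_tuple, "cy" in as_dict)
print(list(as_dict) == names)
```

The membership test on the dict is a hash lookup rather than a scan, which is 
where the search speedup would come from on larger collections.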

But I hasten to note some "improvements" may not really improve. In a language 
like R, many operations such as changing one vector in a data.frame are done 
lazily and the new copy is still sharing mostly the same storage as the old. 
Making some changes can result in negative effects. A common problem people 
have is that trying to store some objects in an existing vector can work except 
when done, the entire vector has been transformed into one that can carry any 
of the contents. A vector of integers may become a vector of doubles or even a 
vector of characters that now have entries like "777" as a character string. 
The flexibility comes with lots of programming errors!

Note how this can cause problems with the original idea here of caching 
strategies. Imagine a function that checks the environment as to what encoding 
or human language and so on to produce text in. If you cache it so it produces 
results that are stored in something like a dictionary with a key, and later 
the user changes the environment as it continues running, the cache may now 
contain invalid results. You might need to keep track of the environment and 
empty the cache if things change, or start a new cache and switch to it.  An 
example would be the way I use Google 

RE: Comparing caching strategies

2023-02-18 Thread avi.e.gross
David,

This conversation strikes me as getting antagonistic and as such, I will not
continue it here after this message.

I can nitpick at least as well as you but have no interest. It is also
wandering away from the original point.

The analogy I gave remains VALID no matter if you do not accept it as being
precise. Analogies are not equalities. I did not say this does at all what
programs like pkzip do in entirety or even anything similarly. I simply said
that pkzip (as it no doubt evolves) has a series of compression methods to
choose from and each has a corresponding method to undo and get back the
original. As it happens, you can write any number of algorithms that
determine which method to use and get back the original. It depends on many
relative costs including not just the size of the compressed object but how
often it will be accessed and how much effort each takes. In some cases you
will compress everything once and extract it many times and in many places,
so it may be worth the effort to try various compression techniques and
measure size and even the effort to extract from that and decide which to
use.

I do not know the internals of any Roaring Bitmap implementation so all I
did gather was that once the problem is broken into accessing individual
things I chose to call zones for want of a more specific name, then each
zone is stored in one of an unknown number of ways depending on some logic.
You say an implementation chose two ways and that is fine. But in theory, it
may make sense to choose other ways as well and the basic outline of the
algorithm remains similar. I can imagine if a region/zone is all the even
numbers, then a function that checks if you are searching for an odd number
may be cheaper. That is an example, not something I expect to see or that is
useful enough. But the concept of storing a function as the determiner for a
region is general enough that it can support many more realistic ideas.

From what you wrote, the algorithm chosen is fairly simple BUT I have to ask
if these bitmaps are static or can be changed at run time? I mean if you
have a region that is sparse and then you keep adding, does the software
pause and rewrite that region as a bitmap if it is a list of offsets? Or, if
a bitmap loses enough, ...

On to your other points. Again, understand I am talking way more abstractly
than you and thus it really does not matter what the length of a particular
ID in some country is for the main discussion. The assumption is that if you
are using something with limits, like a Roaring Bitmap, that you do things
within the limits. When I lived in Austria, I did not bother getting an
Austrian Sozialversicherungsnummer so I have no idea it is ten digits long.
In any case, many things grow over time such as the length of telephone
numbers. 

The same applies to things like airport codes. They can get longer for many
reasons and may well exceed 4 characters, and certainly UNICODE or other
such representations may exceed four bytes now if you allow non-ASCII
characters that map into multiple bytes. My point was to think about how
useful a Roaring bitmap is if it takes only 32 bit integers and one trivial
mapping was to use any four bytes to represent a unique integer. But clearly
you could map longer strings easily enough if you restrict yourself to 26
upper case letters and perhaps a few other symbols that can be encoded in 5
bits. I am not saying it is worth the effort, but that means 6 characters
can fit in 32 bits.
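
The arithmetic can be sketched as follows. This is just one possible encoding
that I am assuming for illustration (letters map to 1..26 and 0 is padding);
pack and unpack are invented names:

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # codes 1..26; 0 is reserved for padding

def pack(code):
    # Six 5-bit fields use 30 bits, so the result fits in a 32-bit integer.
    if len(code) > 6:
        raise ValueError("at most 6 characters fit in 30 bits")
    value = 0
    for ch in code.ljust(6, "\0"):
        field = 0 if ch == "\0" else ALPHABET.index(ch) + 1
        value = (value << 5) | field
    return value

def unpack(value):
    # Read the six fields back from the most significant one down.
    chars = []
    for shift in range(25, -1, -5):
        field = (value >> shift) & 0b11111
        if field:
            chars.append(ALPHABET[field - 1])
    return "".join(chars)

print(pack("ZSPD") < 2 ** 32, unpack(pack("ZSPD")))
```

Five bits give 32 codes, so this scheme leaves five spare symbols beyond the
letters and the padding value.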

I do wonder if the basic idea has to be limited to 32 bits or if it can
expand to say 64 or use additional but fast methods of storing the data
beyond the two mentioned.

There are variants of other ideas I can think of like sparse arrays or
matrices such as you find in the scipy module in Python. If they hold a
Boolean value, they sound like they are a similar idea where you simply keep
track of the ones marked True, or if it makes sense, the ones considered
False.

-Original Message-
From: Python-list  On
Behalf Of Peter J. Holzer
Sent: Saturday, February 18, 2023 5:40 AM
To: python-list@python.org
Subject: Re: Comparing caching strategies

On 2023-02-17 18:08:09 -0500, avi.e.gr...@gmail.com wrote:
> Analogies I am sharing are mainly for me to wrap my head around an 
> idea by seeing if it matches any existing ideas or templates and is 
> not meant to be exact. Fair enough?

Yeah. But if you are venting your musings into a public space you shouldn't
be surprised if people react to them. And we can only react to what you
write, not what you think.

> But in this case, from my reading, the analogy is rather reasonable.

Although that confused me a bit. You are clearly responding to something I
thought about but which you didn't quote below. Did I just think about it
and not write it, but you responded anyway because you're a mind reader?
Nope, it just turns out you (accidentally) deleted that sentence.


> The implementation of Roaring Bitmaps seems to 

RE: Comparing caching strategies

2023-02-17 Thread avi.e.gross
Peter,

Analogies I am sharing are mainly for me to wrap my head around an idea by
seeing if it matches any existing ideas or templates and is not meant to be
exact. Fair enough?

But in this case, from my reading, the analogy is rather reasonable. The
implementation of Roaring Bitmaps seems to logically break up the space of
all possible values it looks up into multiple "zones" that are somewhat
analogous to individual files, albeit probably in memory as the program
runs. In the pkzip analogy, each file is processed and stored independently
alongside a header that provides enough detail to uniquely open up the
contents when needed. The collection of such compressed files is then one
bigger file that is saved that uses up less space. Roaring bitmaps seems to
determine how best to store each zone and only processes that zone when
requested, hence some of the speedup as each zone is stored in a manner that
generally allows fast access.

I did not raise the issue and thus have no interest in promoting this
technology nor knocking it down. Just wondering what it was under the hood
and whether I might even have a need for it. I am not saying Social Security
numbers are a fit, simply that some form of ID number might fit. If a
company has a unique ID number for each client and uses it consistently,
then an implementation that holds a set stored this way of people using
product A, such as house insurance, and those using product B, such as car
insurance, and perhaps product C is an umbrella policy, might easily handle
some queries such as who uses two or all three (intersections of sets) or
who might merit a letter telling them how much they can save if they
subscribed to two or all three as a way to get more business. Again, just a
made-up example I can think about. A company which has a million customers
over the years will have fairly large sets as described. 
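
The queries in that made-up example look like this with ordinary Python sets
standing in for the compressed bitmaps. The customer IDs are invented, and I
am only showing the set logic, not any Roaring bitmap library:

```python
house = {101, 102, 103, 250}
car = {102, 103, 400}
umbrella = {103, 400, 999}

both_house_and_car = house & car
all_three = house & car & umbrella
# Symmetric difference keeps IDs found in an odd number of sets (one or
# three), so removing the all-three group leaves exactly-one-product clients.
only_one_product = (house ^ car ^ umbrella) - all_three

print(sorted(both_house_and_car), sorted(all_three), sorted(only_one_product))
```

The last group is the one that might merit the letter about saving money.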

What is helpful to me in thinking about something will naturally often not
be helpful to you or others but nothing you wrote makes me feel my first
take was in any serious way wrong. It still makes sense to me.

And FYI, the largest integer in signed 32 bits is 2_147_483_647 which is 10
digits. A Social Security number looks like xxx-xx-xxxx at this time which is
only 9 digits.  Not that it matters, but it seems it still qualifies to be
used as I describe, as long as Roaring bitmaps allows it, minus any spaces
or hyphens and converted to an integer.
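
A quick sketch of that conversion, using a made-up number and a helper name
(id_to_int) invented for the example:

```python
INT32_MAX = 2 ** 31 - 1   # largest signed 32-bit integer

def id_to_int(text):
    # Drop hyphens and spaces, then read what is left as an integer.
    return int(text.replace("-", "").replace(" ", ""))

sample = id_to_int("123-45-6789")
largest = id_to_int("999-99-9999")
print(sample, largest <= INT32_MAX, len(str(INT32_MAX)))
```

One caveat with this mapping: leading zeros are lost in the integer, so they
have to be padded back when converting the other way.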

As for my other EXAMPLE, I fail to see why I need to provide a specific need
for an application. I don't care what they need it for. The thought was
about whether something that does not start as an integer can be uniquely
mapped into and out of integers of size 32 bits. So I considered a few
examples of short textual items such as three letter airport abbreviations.
But if you cannot imagine an application, consider one similar enough to the
above.  I think there are currently over 47,000 such airports in the world
and apparently about 20,000 in the US. That seems to exceed the possible
combinations of 26 letters (cubed) so it seems there are 4-letter codes too
such as ZSPD. It should still fit into 4 bytes, for now.
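
The counts, and one trivial way to treat four ASCII bytes as a single integer,
can be checked in a few lines. The byte order chosen here is arbitrary:

```python
three_letter = 26 ** 3   # number of possible three-letter codes
four_letter = 26 ** 4    # number of possible four-letter codes

as_int = int.from_bytes(b"ZSPD", "big")            # four bytes as one integer
back = as_int.to_bytes(4, "big").decode("ascii")   # and back to text

print(three_letter, four_letter, as_int < 2 ** 32, back)
```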

So assume you have a variety of facts such as which airports have
handicapped accessible bathrooms, or have an extra long/strong runway that
can handle huge planes or anything else that is worth knowing. You might
have bitmaps (as is being discussed) that may be sparse for some such info
and fairly dense for other info like having jet fuel available. As above,
finding an airport that has various mixtures may be doable with these sets
and perhaps faster than say queries on a relational database storing the
same info.

I will end by saying this is a hypothetical discussion for me. I am not
doing any projects now where I expect to use Roaring bitmaps but am now
aware of them should any need or opportunity arise. My mind is very full
with such trivia and very little is needed albeit I never know what may come
in as useful. 

Respectfully,

Avi

-Original Message-
From: Python-list  On
Behalf Of Peter J. Holzer
Sent: Friday, February 17, 2023 1:47 PM
To: python-list@python.org
Subject: Re: Comparing caching strategies

On 2023-02-17 00:07:12 -0500, avi.e.gr...@gmail.com wrote:
> Roaring bitmaps claim to be an improvement not only over uncompressed 
> structures but some other compressed versions but my reading shows it 
> may be limited to some uses. Bitsets in general seem to be useful only 
> for a largely contiguous set of integers where each sequential bit 
> represents whether the nth integer above the lowest is in the set or 
> not.

They don't really have to be that contiguous. As long as your integers fit
into 32 bits you're fine.

> Of course, properly set up, this makes Unions and Intersections and 
> some other operations fairly efficient. But sets are not the same as 
> dictionaries and often you are storing other data types than smaller 
> integers.

Of course. Different 

RE: Precision Tail-off?

2023-02-17 Thread avi.e.gross
Stephen,

What response do you expect from whatever people in the IEEE you want?

The specific IEEE standards were designed and agreed upon by groups working
in caveman times when the memory and CPU time were not so plentiful. The
design of many types, including floating point, had to work decently if not
perfectly so you stored data in ways the hardware of the time could process
it efficiently.

Note all kinds of related issues about what happens if you want an integer
larger than fits into 16 bits or 32 bits or even 64 bits. A Python integer
was designed to be effectively unlimited and uses as much storage as needed.
It can also get ever slower when doing things like looking for gigantic
primes. But it does not have overflow problems.

So could you design a floating point data type with similar features? It
would be some complex data structure that keeps track of the number of
bits/bytes/megabytes currently being used to store the mantissa or exponent
parts and then have some data structure that holds all the bits needed. When
doing any arithmetic like addition or division or more complex things, it
would need to compare the two objects being combined and calculate how to
perhaps expand/convert one to match the other and then do umpteen steps to
generate the result in as many pieces/steps as needed and create a data
structure that holds the result, optionally trimming off terminal parts not
needed or wanted. Then you would need all relevant functions that accept
regular floating point to handle these numbers and generate these numbers.

Can that be done well? Well, sure, but not necessarily WELL. Some would
point you to the Decimal type. It might take a somewhat different tack on
how to do this. But everything comes with a cost.
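As a small illustration of that trade-off, here is a minimal sketch using the standard library decimal module. The precision of 50 digits is an arbitrary choice for the example, not anything from the thread.

```python
from decimal import Decimal, localcontext

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
float_matches = (0.1 + 0.2 == 0.3)

# Decimal stores decimal digits, so this particular sum is exact.
decimal_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Precision is adjustable, at a cost in memory and time.
with localcontext() as ctx:
    ctx.prec = 50  # arbitrary number of significant digits for this sketch
    third = Decimal(1) / Decimal(3)

print(float_matches, decimal_matches)
print(third)
```

Even here the result of 1/3 is still rounded, only later, so more precision moves the problem rather than removing it.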

Perhaps the response from the IEEE would be that what they published was
meant for some purposes but not yours. It may be that a group needs to
formulate a new standard but leave the old ones in place for people willing
to use them as their needs are more modest. 

As an analogy, consider the lowly char that stored a single character in a
byte. I mean good old ASCII but also EBCDIC and the ISO family like ISO
8859-1 and so on. Those standards focused in on the needs of just a few
languages and if you wanted to write something in a mix of languages, it
could be a headache, as there were times I had to shift within one document to
say ISO 8859-8 to include some Hebrew, and ISO 8859-3 for Esperanto and so
on while ISO 8859-1 was fine for English, French, German, Spanish and many
others. For some purposes, I had to use encodings like Shift JIS to do
Japanese as many Asian languages were outside what ISO was doing.

The solutions since then vary but tend to allow or require multiple bytes
per character. But they retain limits and if we ever enter a Star Trek
Universe with infinite diversity and more languages and encodings, we might
need to again enlarge our viewpoint and perhaps be even more wasteful of our
computing resources to accommodate them all!

Standards are often not made to solve ALL possible problems but to make
clear what is supported and what is not required. Mathematical arguments can
be helpful but practical considerations and the limited time available (as
these darn things can take YEARS to be agreed on) are often dominant.
Frankly, by the time many standards, such as for a programming language, are
finalized, the reality in the field has often changed. The language may
already have been supplanted largely by others for new work, or souped up
with not-yet-standard features.

I am not against striving for ever better standards and realities. But I do
think a better way to approach this is not to reproach what was done but ask
if we can focus on the near-future and make it better.

Arguably, there are now multiple features out there such as Decimal and they
may be quite different. That often happens without a standard.  But if you
now want everyone to get together and define a new standard that may break
some implementations, ...

As I see it, many computer courses teach the realities as well as the
mathematical fantasies that break down in the real world. One of those that
tends to be stressed is that floating point is not exact and that comparison
operators need to be used with caution. Often the suggestion is to subtract
one number from another and check if the result is fairly close to zero, as
in the absolute value is less than some tiny tolerance such as a number
where only the last few bits are ones. For more complex calculations where
the errors can accumulate, you may need to choose a larger tolerance with
more such bits near the end.
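The comparison advice above can be sketched as follows. The tolerance values are arbitrary choices for illustration, and nearly_equal is a hypothetical helper name.

```python
import math

# Ten additions of 0.1 accumulate a small rounding error.
total = sum([0.1] * 10)
exactly_equal = (total == 1.0)

def nearly_equal(a, b, tolerance=1e-9):
    """Subtract and check that the difference is close to zero."""
    return abs(a - b) < tolerance

# The standard library offers the same idea with relative tolerance.
close_enough = math.isclose(total, 1.0, rel_tol=1e-9)

print(total, exactly_equal, nearly_equal(total, 1.0), close_enough)
```

A fixed absolute tolerance works poorly for very large or very small values, which is why math.isclose defaults to a relative one.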

Extended precision arithmetic is perhaps cheaper now and can be done for a
reasonable number of digits. It probably is not realistic to do most such
calculations for billions of digits, albeit some of the calculations for the
first googolplex digits of pi might indeed need such methods, as soon as we
find a way to keep that many digits in memory given the 

RE: Comparing caching strategies

2023-02-16 Thread avi.e.gross
I am less interested in the choice of names than the pros and cons of when these 
Roaring bitmaps are worth using and when they are not.

It is a bit like discussing whether various compression techniques are worth 
using as the storage or memory costs can be weighed against the CPU or 
transient memory costs of compressing and uncompressing.  The value tends to 
depend on many factors and there may even be times you want to store the data 
in multiple data structures with each optimized to store that kind or amount of 
data.

Roaring bitmaps claim to be an improvement not only over uncompressed 
structures but some other compressed versions but my reading shows it may be 
limited to some uses. Bitsets in general seem to be useful only for a largely 
contiguous set of integers where each sequential bit represents whether the nth 
integer above the lowest is in the set or not. Of course, properly set up, this 
makes Unions and Intersections and some other operations fairly efficient. But 
sets are not the same as dictionaries and often you are storing other data 
types than smaller integers.
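To make the bitset idea concrete, here is a minimal sketch that uses a plain Python integer as an uncompressed bitset. The helper names are hypothetical, and this is not how any Roaring library is implemented.

```python
def to_bitset(values):
    """Set bit n for every non-negative integer n in values."""
    bits = 0
    for value in values:
        bits |= 1 << value
    return bits

def from_bitset(bits):
    """List the positions of the bits that are set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

evens = to_bitset([0, 2, 4, 6, 8])
small = to_bitset([0, 1, 2, 3])

union = evens | small          # one operation over all the bits
intersection = evens & small

print(from_bitset(union))
print(from_bitset(intersection))
```

The cost is that a single large member, say ten million, forces the integer to carry ten million bits, which is the sparse case compressed bitmaps try to address.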

Many normal compression techniques can require lots of time to uncompress to 
find anything. My impression is that Roaring Bitmaps is a tad like the pkzip 
software that tries various compression techniques on each file and chooses 
whatever seems to work better on each one. That takes extra time when zipping, 
but when unzipping a file, it goes directly to the method used to compress it 
as the type is in a header and just expands it one way.

My impression is that Roaring bitmaps breaks up the range of integers into 
smaller chunks and depending on what is being stored in that chunk, may leave 
it as an uncompressed bitmap, or a list of the sparse contents, or other 
storage methods and can search each version fairly quickly. 
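A toy sketch of that chunking idea follows. It assumes 32-bit values split into a high 16-bit chunk key and a low 16-bit offset, with a switch from a sorted list to a bitmap above 4096 entries. Those numbers and the class name ToyRoaring are assumptions for illustration, not a description of any real library.

```python
from bisect import bisect_left

ARRAY_LIMIT = 4096  # assumed threshold for switching a chunk to a bitmap

class ToyRoaring:
    def __init__(self):
        # chunk key (high 16 bits) -> sorted list of lows, or an int bitmap
        self.chunks = {}

    def add(self, value):
        high, low = value >> 16, value & 0xFFFF
        chunk = self.chunks.get(high, [])
        if isinstance(chunk, int):
            chunk |= 1 << low                      # dense chunk: set one bit
        else:
            i = bisect_left(chunk, low)
            if i == len(chunk) or chunk[i] != low:
                chunk.insert(i, low)               # sparse chunk: keep sorted
            if len(chunk) > ARRAY_LIMIT:           # too dense, convert
                bitmap = 0
                for item in chunk:
                    bitmap |= 1 << item
                chunk = bitmap
        self.chunks[high] = chunk

    def __contains__(self, value):
        high, low = value >> 16, value & 0xFFFF
        chunk = self.chunks.get(high)
        if chunk is None:
            return False
        if isinstance(chunk, int):
            return bool((chunk >> low) & 1)
        i = bisect_left(chunk, low)
        return i < len(chunk) and chunk[i] == low
```

Each lookup first picks the chunk, then uses whichever search suits that chunk's storage, so sparse and dense regions can coexist in one set.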

So, I have no doubt it works great for some applications such as treating 
social security numbers as integers. It likely would be overkill to store 
something like the components of an IP address between 0 and 255 inclusive.

But having said that, there may well be non-integer data that can be mapped 
into and out of integers. As an example, airports or radio stations have names 
like LAX or WPIX. If you limit yourself to ASCII letters then every one of them 
can be stored as a 32-bit integer, perhaps with some padding. Of course for 
such fairly simple data, some might choose to place the data in a balanced tree 
structure and get reasonable search speed.
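A minimal sketch of such a mapping, assuming codes of at most four ASCII characters padded with spaces. The function names are hypothetical.

```python
def code_to_int(code):
    """Pack an ASCII code of up to 4 characters into a 32-bit integer."""
    raw = code.encode("ascii")
    if len(raw) > 4:
        raise ValueError("code does not fit into 4 bytes")
    return int.from_bytes(raw.ljust(4, b" "), "big")

def int_to_code(number):
    """Reverse the mapping and strip the padding."""
    return number.to_bytes(4, "big").decode("ascii").rstrip(" ")

for name in ("LAX", "WPIX", "ZSPD"):
    number = code_to_int(name)
    print(name, number, int_to_code(number))
```

The resulting integers are scattered widely over the 32-bit range rather than contiguous, so how well a compressed bitmap handles them depends on the data.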

I am curious about the size of some of these structures but obviously it 
depends. Are they stored on disk in this form too?


-Original Message-
From: Python-list  On 
Behalf Of MRAB
Sent: Thursday, February 16, 2023 11:24 PM
To: python-list@python.org
Subject: Re: Comparing caching strategies

On 2023-02-14 22:20, Rob Cliffe via Python-list wrote:
> On 11/02/2023 00:39, Dino wrote:
>> First off, a big shout out to Peter J. Holzer, who mentioned roaring 
>> bitmaps a few days ago and led me to quite a discovery.
>>
> I was intrigued to hear about roaring bitmaps and discover they really 
> were a thing (not a typo as I suspected at first).
> What next, I wonder?
>   argumentative arrays
>   chattering classes (on second thoughts, we have those already)
>   dancing dictionaries
>   garrulous generators
>   laughing lists
>   piping pipelines
>   singing strings
>   speaking sets
>   stuttering sorts
>   talking tuples
>   whistling walruses?

babbling bytestrings?

> The future awaits [pun not intended] ...

--
https://mail.python.org/mailman/listinfo/python-list



RE: LRU cache

2023-02-14 Thread avi.e.gross
Chris,

That is a nice decorator solution with some extra features.

We don't know if the OP needed a cache that was more general purpose and
could be accessed from multiple points, and shared across multiple
functions.
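For reference, a minimal sketch of the decorator approach with functools.lru_cache. The function slow_lookup and the calls list are made up for the example. Note that the cache belongs to the one decorated function, which is the limitation raised above.

```python
from functools import lru_cache

calls = []  # records real invocations, only to make the caching visible

@lru_cache(maxsize=1000)
def slow_lookup(key):
    calls.append(key)
    return key.upper()

slow_lookup("abc")
slow_lookup("abc")   # served from the cache, the body does not run again

info = slow_lookup.cache_info()
print(calls, info.hits, info.misses)
```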



-Original Message-
From: Python-list  On
Behalf Of Chris Angelico
Sent: Tuesday, February 14, 2023 5:46 PM
To: python-list@python.org
Subject: Re: LRU cache

On Wed, 15 Feb 2023 at 09:37, Dino  wrote:
>
>
> Here's my problem today. I am using a dict() to implement a quick and 
> dirty in-memory cache.
>
> I am stopping adding elements when I am reaching 1000 elements 
> (totally arbitrary number), but I would like to have something 
> slightly more sophisticated to free up space for newer and potentially 
> more relevant entries.
>
> I am thinking of the Least Recently Used principle, but how to 
> implement that is not immediate. Before I embark on reinventing the 
> wheel, is there a tool, library or smart trick that will allow me to 
> remove elements with LRU logic?
>

Check out functools.lru_cache :)

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list



RE: LRU cache

2023-02-14 Thread avi.e.gross
Dino,

If your question is understood, you want to treat a dictionary as a sort of
queue with a maximum number of entries. And, you want to remove some kind of
least useful item to make room for any new one.

Since Python 3.7, dictionaries keep entries in the order they were entered. There may
already be some variant out there that implements this but it should not be
hard to create.

So you could simply change the method that adds an item to the dictionary.
If the new item is not already in the dictionary and the dictionary is full,
simply remove the first item using whatever method you wish. 

Getting all the keys may be avoided by using an iterator once as in:
next(iter( my_dict.keys() )) or something like that. 

You can then remove that item using the key and insert your new item at the
end.

Of course, that is not least recently used but least recently entered. 
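A minimal sketch of that least-recently-entered scheme, relying on dictionaries preserving insertion order. The limit of 3 and the name add_entry are chosen only to keep the example short.

```python
MAX_ENTRIES = 3  # tiny limit for illustration; the thread mentions 1000

def add_entry(cache, key, value):
    """Insert into a plain dict, evicting the oldest entry when full."""
    if key not in cache and len(cache) >= MAX_ENTRIES:
        oldest = next(iter(cache))   # first key, without listing all keys
        del cache[oldest]
    cache[key] = value

cache = {}
for letter in "abcd":
    add_entry(cache, letter, letter.upper())

print(list(cache))
```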

To deal with keeping track of what was accessed last or never, is a problem
not trivially solved with just a dictionary. I mean you could store a tuple
for each item that included a date and a payload as the value, and you could
then search all the values and find the one with the oldest date. That seems
a bit much so another architecture could be to maintain another data
structure holding keys and dates that perhaps you keep sorted by date and
every time the cache accesses a value, it finds the item in the second
storage and updates the date and moves the item to the end of the
structure. 

But note that some may want to keep an access count so you always know how
many times an item has been re-used and thus not remove them as easily.

The main goal of a dictionary is to speed up access and make it almost
constant-time. Your algorithm can add so much overhead, depending on how it is
done, that it can defeat the purpose. 

What some might do is skip the dictionary and use some kind of queue like a
deque that handles your thousand entries and new items are added at the
end, items accessed moved to the front, and a brute force search is used to
find an entry. But note some algorithms like that may end up removing the
newest item immediately as it is least recently used if placed at the end. 

It may be an Ordered Dict is one solution as shown here:

https://www.geeksforgeeks.org/lru-cache-in-python-using-ordereddict/
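A minimal sketch of the OrderedDict approach, written independently of the linked article. The class name and method names are hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)        # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # drop the least recently used

    def __len__(self):
        return len(self._data)

cache = LRUCache(max_entries=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" is now the most recently used
cache.put("c", 3)     # evicts "b"
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

Both move_to_end and popitem are constant-time, so the bookkeeping does not defeat the purpose of using a dictionary.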



-Original Message-
From: Python-list  On
Behalf Of Dino
Sent: Tuesday, February 14, 2023 5:07 PM
To: python-list@python.org
Subject: LRU cache


Here's my problem today. I am using a dict() to implement a quick and dirty
in-memory cache.

I am stopping adding elements when I am reaching 1000 elements (totally
arbitrary number), but I would like to have something slightly more
sophisticated to free up space for newer and potentially more relevant
entries.

I am thinking of the Least Recently Used principle, but how to implement
that is not immediate. Before I embark on reinventing the wheel, is there a
tool, library or smart trick that will allow me to remove elements with LRU
logic?

thanks

Dino
--
https://mail.python.org/mailman/listinfo/python-list



RE: evaluation question

2023-02-13 Thread avi.e.gross
Weatherby,

Of course you are right and people can, and do, discuss whatever they feel like.

My question is a bit more about asking if I am missing something here as my 
personal view is that we are not really exploring in more depth or breadth and 
are getting fairly repetitive as if in a typical SF time loop. How does one 
break out?

Languages often change and people do come and go so some topics can often be 
re-opened. This though is a somewhat focused forum and it is legitimate to ask 
if a conversation might best be taken elsewhere for now. The main focus is, at 
least in theory, aspects of python and mostly about the core as compared to 
very specific modules, let alone those nobody here has ever even used. Within 
that, it is fair at times to compare something in python to other languages or 
ask about perceived bugs or about possible future enhancements. We could 
discuss if "YIELD FROM" is just syntactic sugar as it often can be done with 
just the naked YIELD statement, or whether it really allows you to do 
innovative things, as an example.
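To make that example concrete, a short sketch with made-up generator names. The first two generators behave the same, while the last pair shows one thing plain yield does not give you directly, namely the return value of the inner generator.

```python
def chain_with_loop(*iterables):
    for iterable in iterables:
        for item in iterable:
            yield item

def chain_with_yield_from(*iterables):
    for iterable in iterables:
        yield from iterable

def inner():
    yield 1
    return "done"     # becomes the value of the yield from expression

def outer():
    result = yield from inner()
    yield result

print(list(chain_with_loop("ab", [1, 2])))
print(list(chain_with_yield_from("ab", [1, 2])))
print(list(outer()))
```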

But I think where this conversation has gone is fairly simple. The question was 
why print() does not return the number of characters printed. The answers 
boiled down to that this was not the design chosen and is mostly consistent 
with how python handles similar functions that return nothing when the change 
is largely "internal" in a sense. In addition, plenty of us have suggested 
alternate ways to get what the OP asked for, and also pointed out there are 
other things that someone may have wanted instead or in addition, including the 
actual number of bytes generated for encodings other than vanilla ASCII, or 
pixels if the text was rendered on a screen using variable width fonts and so 
on.

Some of the above already counted, in my mind, as adding depth or breadth to 
the original question. But if the conversation degrades to two or more sides 
largely talking past each other and not accepting what the other says, then 
perhaps a natural ending point has been reached. Call it a draw, albeit maybe a 
forfeit.

So, as part of my process, I am now stuck on hearing many questions as valid 
and others as not productive. I don't mean just here, but in many areas of my 
life. The answer is that historically, and in other ways, python took a 
specific path. Others took other paths. But once locked into this path, you run 
into goals of trying to remain consistent and not have new releases break older 
software or at least take time to deprecate it and give people time to adjust.

I have seen huge growing pains due to growth. An example is languages that have 
added features, such as promises and their variants and used them for many 
purposes such as allowing asynchronous execution using multiple methods or 
evaluating things in a more lazy way so they are done just in time. Some end up 
with some simple function call being augmented with quite a few additional 
functions with slightly different names and often different arguments and ways 
they are called that mostly should no longer be mixed with other variants of 
the function. You need compatibility with the old while allowing the new and 
then the newer and newest.

Had the language been built anew from scratch, it might be simpler and also 
more complex, as they would skip the older versions and pretty much use the 
shinier new version all the time, even as it often introduces many costs where 
they are not needed. 

So it can be very valid to ask questions as long as you also LISTEN to the 
answers and try to accept them as aspects of reality. Yes, python could have 
had a different design and perhaps someday may have a different design. But 
that is not happening TODAY so for today, accept what is and accept advice on 
how you might get something like what you want when, and if, you need it. The 
goal often is to get the job done, not to do it the way you insist it has to be 
done.

At some point, unless someone has more to say with some new twist, it does 
become a bit annoying.

So let me say something slightly new now. I have been reading about interesting 
uses of the python WITH statement and how it works. Some of the examples are 
creating an object with dunder methods that get invoked on entry and exit that 
can do all kinds of things. One is the ability to modify a list in a way that 
can be rolled back if there is an error and it is quite simple. You make a copy 
of the original list on entry. Things are then done to the copy. And if you 
exit properly, you copy the modified list back on top of the original list. 
Errors that result simply unwind the series of transactions by leaving the 
original list untouched. 
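A minimal sketch of that rollback idea. The class name Transaction is hypothetical, and only a shallow copy is made, which is an assumption that suits lists of immutable items.

```python
class Transaction:
    """Work on a copy of a list; commit only if the block succeeds."""

    def __init__(self, target):
        self.target = target

    def __enter__(self):
        self.working = list(self.target)   # shallow copy made on entry
        return self.working

    def __exit__(self, exc_type, exc, traceback):
        if exc_type is None:
            self.target[:] = self.working  # copy back over the original
        return False                       # do not swallow the exception

items = [1, 2]
with Transaction(items) as work:
    work.append(3)

try:
    with Transaction(items) as work:
        work.append(4)
        raise RuntimeError("something went wrong")
except RuntimeError:
    pass

print(items)
```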

Another example has your output stream redirected within the WITH and then put 
back in place after. What this allows, among many things, is for everything 
printed to be desrever. Now clearly, such a technique could also be used to 
capture what is going to be printed, and 

RE: evaluation question

2023-02-10 Thread avi.e.gross
There are no doubt many situations someone wants to know how long something
will be when printed but often at lower levels.

In variable-width fonts, for example, the number of characters does not
really line up precisely with how wide the output is. Some encodings use a
varying number of bytes per character and, again, the width of the output varies.
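A quick sketch of how the counts diverge for the same text. The sample string is arbitrary.

```python
text = "na\u00efve caf\u00e9"   # "naïve café", written with escapes

characters = len(text)
utf8_bytes = len(text.encode("utf-8"))
latin1_bytes = len(text.encode("latin-1"))
utf16_bytes = len(text.encode("utf-16-le"))

print(characters, utf8_bytes, latin1_bytes, utf16_bytes)
```

None of these numbers says anything about the width on screen, which depends on the font and the rendering software.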

So for people who want to make 2-D formatted output like tables, or who want
to wrap lines longer than N characters, you more often let some deeper
software accept your data and decide on formatting it internally and either
print it at once, when done calculating, or in the case of some old-style
terminals, use something like the curses package that may use escape
sequences to update the screen more efficiently in various ways.

If someone wants more control over what they print, rather than asking the
print() function to return something that is mostly going to be ignored,
they can do the things others have already suggested here. You can make your
message parts in advance and measure their length or anything else before
you print. Or make a wrapper that does something for you before calling
print, perhaps only for common cases and then returns the length to you
after printing.

I wonder if the next request will be for print() to know what your output
device is and other current settings so it can return the width your text takes
up in pixels in the current font/size ...

I add a tidbit that many ways of printing allow you to specify the width you
want something printed in such as you want a floating point value with so
many digits after the decimal point in a zero or space padded field on the
left. So there are ways to calculate in advance for many common cases as to
how long each part will be if you specify it. Besides, I am not really sure
if "print" even knows easily how many characters it is putting out as it
chews away on the many things in your request and calls dunder methods in
objects so they display themselves and so on. I assume it can be made to
keep track, albeit I can imagine printing out an APL program with lots of
overwritten characters where the number of bytes sent is way more than the
number of spaces in the output.
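For the fixed-width case, a minimal sketch using format specifications. The values and widths are arbitrary.

```python
value = 3.14159

zero_padded = f"{value:010.3f}"    # width 10, 3 decimals, zero padded
space_padded = f"{value:>10.3f}"   # same width, space padded on the left
label = f"{'abc':<6}|"             # text left-aligned in a 6-wide field

print(zero_padded)
print(space_padded)
print(label)
print(len(zero_padded), len(space_padded))
```

Since the width is part of the specification, the length is known before anything is printed, at least when the value fits in the field.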

Why are we even still talking about this? The answer to the question of why
print() does not return anything, let alone the number of characters
printed, is BECAUSE.


-Original Message-
From: Python-list  On
Behalf Of Python
Sent: Friday, February 10, 2023 4:56 PM
To: python-list@python.org
Subject: Re: evaluation question

On Sat, Feb 11, 2023 at 08:30:22AM +1100, Chris Angelico wrote:
> On Sat, 11 Feb 2023 at 07:36, Python  wrote:
> > You would do this instead:
> >
> > message = f"{username} has the occupation {job}."
> > message_length = len(message)
> > print(message)
> > print(message_length)
> > ...
> >
> 
> It's worth noting WHY output functions often return a byte count. It's 
> primarily for use with nonblocking I/O, with something like this:
> 
> buffer = b".."
> buffer = buffer[os.write(fd, buffer):]
> 
> It's extremely important to be able to do this sort of thing, but not 
> with the print function, which has a quite different job.

I would agree with this only partially.  Your case applies to os.write(),
which is essentially just a wrapper around the write() system call, which
has that sort of property... though it applies also to I/O in blocking mode,
particularly on network sockets, where the number of bytes you asked to
write (or read) may not all have been transferred, necessitating trying in a
loop.

However, Python's print() function is more analogous to C's printf(), which
returns the number of characters converted for an entirely different
reason... It's precisely so that you'll know what the length of the string
that was converted is.  This is most useful with the
*snprintf() variants where you're actually concerned about overrunning the
buffer you've provided for the output string, so you can realloc() the
buffer if it was indeed too small, but it is also useful in the context of,
say, a routine to format text according to the size of your terminal.  In
that context it really has nothing to do with blocking I/O or socket
behavior.

--
https://mail.python.org/mailman/listinfo/python-list



RE: RE: How to read file content and send email on Debian Bullseye

2023-02-05 Thread avi.e.gross
Bart,

Some really decent cron jobs can be written without using anything complex.

I get it now that perhaps your motivation is more about finding an excuse
to learn python better. The reality is there is not much that python cannot
do if other programming languages and environments can do them so asking if
it can do it feels a tad naïve. Like many languages, python is very
extendable with all kinds of modules so often instead of doing something
totally on your own, you can find something that does much of the hard work
for you, too. 

Yes, python can nicely read in lines from a log and compare them to a fixed
string or pattern and either find or not find what you ask for.

But a shell script can be written in a few minutes that simply gets the
string for the current date in the format you want, interpolates it in a
regular expression, calls grep or a dozen other programs that handle a
command line argument, and if it returns a single line, you send one email
and if none you send another and if more than one, you may have done it
wrong. Some such programs, like AWK are designed mainly to do exactly
something like you ask and examine each input line against a series of
patterns. Sending an email though is not always something easy to do from
within a program like that but a shell script that checks how it ends may do
that part.

If you are on a specific machine and only need to solve the problem on that
machine or something similar, this seems straightforward. 

My impression is you want to match a line in the log file that may look like
it should match "anything", then some keyword or so that specifies all
lines about this particular upload on every day, then matches another
"anything" up to something that exactly matches the date for today, and
maybe match another "anything" to the end of the line. It can be a fairly
straightforward regular expression if the data has a regular component in
the formatting. Grep, sed, awk, perl and others can do this and more. 
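In Python, that pattern might be sketched like this. The keyword, the day-month-year date format and the sample line are assumptions based on the file names quoted elsewhere in this thread, and find_upload_lines is a hypothetical name.

```python
import re
from datetime import date

def find_upload_lines(lines, keyword, day):
    """Return lines matching: anything, keyword, anything, date, anything."""
    stamp = day.strftime("%d-%m-%Y")   # assumed format, e.g. 30-01-2023
    pattern = re.compile(
        rf".*{re.escape(keyword)}.*{re.escape(stamp)}.*"
    )
    return [line for line in lines if pattern.fullmatch(line.rstrip("\n"))]

sample = [
    "/home/my_user/local_folder/upload/my_file_30-01-2023_18-30.txt ->\n",
    "0-1660576 4.92 MiB/s\n",
]
matches = find_upload_lines(sample, "upload/my_file", date(2023, 1, 30))
print(len(matches))
```

In a real script the day would come from date.today() and the lines from the open log file.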

Could you do this faster in python? Maybe. Arguably if speed is a major
issue, write it in some compiled language like C++. 

But if your data is very regular such as the entry will have some key phrase
between the 12th and 19th characters and the date will be exactly in another
exact region, then you certainly can skip regular expressions and read each
line and examine the substrings for equality. You may also speed it a bit if
you exit any such loop as soon as you find what you are looking for. 
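A sketch of the fixed-position variant. The line layout here is invented for the example, with the date in the first 10 characters and an 8-character key phrase in the 12th through 19th.

```python
def first_match(lines, day_text, phrase="UPLOADED"):
    """Compare fixed slices and stop at the first line that matches."""
    for line in lines:
        if line[0:10] == day_text and line[11:19] == phrase:
            return line      # returning here exits the loop early
    return None

log = [
    "2023-01-30 UPLOADED my_file_a.txt",
    "2023-02-02 UPLOADED my_file_b.txt",
    "2023-02-02 UPLOADED my_file_c.txt",
]
print(first_match(log, "2023-02-02"))
```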

I note if your log file is big but not very busy, and you are pretty sure
the entry will be in the last few (maybe hundred) lines, some may use the
tail command and pipe the text it returns to whatever is processing the
data. There are many ways to do what you want.
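Something similar to tail can be done inside Python. This is a minimal sketch with a hypothetical function name. It still reads the whole file once but only keeps the last lines in memory.

```python
from collections import deque

def last_lines(path, count=100):
    """Return the final count lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        # A deque with maxlen discards older lines as newer ones arrive.
        return list(deque(handle, maxlen=count))
```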

But you improve your chances of getting an answer if you ask it more
clearly. There have been people (maybe just one) who have been posing
questions of a rather vague nature and then not responding as others debate
it in seemingly random directions. You are interacting nicely but some of us
have become hesitant to jump in until they see if the request is
well-intended. You do sound like you know quite a bit and your question
could have been closer to saying that you know several ways to do it
(include examples or at least outlines) and wonder if some ways are better
or more pythonic or ...

So if people keep wondering what you want, it is because the people here are
not generally interested in doing homework or complete programs for people.
If you ask us how to generate a string with the current date, and cannot
just find it on your own, we might answer. If you want to know how to store
a date as an object including the current time, and also convert the text on
a line in the log file to make another such date object and then be able to
compare them and be able to include in your email how LONG AGO the upload
was done, that would be another more specific request. If you are not sure
how python does regular expressions, ...
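As an example of the more specific kind of question, here is a sketch of the date comparison. The timestamp format is assumed from the log lines quoted in this thread, and age_of_upload is a hypothetical name.

```python
from datetime import datetime, timedelta

def age_of_upload(stamp_text, now):
    """Parse a log timestamp and return how long ago it was."""
    uploaded = datetime.strptime(stamp_text, "%Y-%m-%d %H:%M:%S")
    return now - uploaded

# A fixed "now" keeps the example repeatable; a script would use
# datetime.now() instead.
age = age_of_upload("2023-01-30 18:30:02", datetime(2023, 1, 31, 20, 30, 2))
print(age)
print(age > timedelta(hours=24))
```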

Otherwise, what you are asking for may not be BASIC for some but seems
relatively straightforward to some of us and we may be wondering
if we are missing anything.

Good Luck,

^Avi

-Original Message-
From: Python-list  On
Behalf Of ^Bart
Sent: Sunday, February 5, 2023 8:58 AM
To: python-list@python.org
Subject: Re: RE: How to read file content and send email on Debian Bullseye

> For example, try to do whatever parts you know how to do and when some 
> part fails or is missing, ask.

You're right but first of all I wrote what I'd like to do and if Python
could be the best choice about it! :)

> I might have replied to you directly if your email email address did 
> not look like you want no SPAM, LOL!

Ahaha! I think you know what is spam and what is a reply\answer to a post
request so you can feel free to use also my email! :)

> The cron stuff is not really relevant and it seems your idea is to 
> read a part or all of a log file, parse the 

RE: How to read file content and send email on Debian Bullseye

2023-02-04 Thread avi.e.gross
Bart, you may want to narrow down your request to something quite specific.
For example, try to do whatever parts you know how to do and when some part
fails or is missing, ask.

I might have replied to you directly if your email address did not
look like you want no SPAM, LOL!

The cron stuff is not really relevant and it seems your idea is to read a
part or all of a log file, parse the lines in some way and find a line that
either matches what you need or fail to find it. Either way you want to send
an email out with an appropriate content.
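The email part could be sketched as below. The addresses, subject lines and the idea of a mail server listening on localhost are all assumptions, and the actual sending is left as a comment for that reason.

```python
from email.message import EmailMessage

def build_report(found, sender, recipient):
    """Create the message for either outcome of the log check."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    if found:
        message["Subject"] = "lftp upload OK"
        message.set_content("Today's upload was found in the log.")
    else:
        message["Subject"] = "lftp: no upload"
        message.set_content("No upload for today was found in the log.")
    return message

report = build_report(False, "cron@example.com", "admin@example.com")
print(report["Subject"])

# Sending, assuming an SMTP server accepts mail on localhost:
# import smtplib
# with smtplib.SMTP("localhost") as server:
#     server.send_message(report)
```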

Which part of that do you not know how to do in python? Have you done some
reading or looking?



-Original Message-
From: Python-list  On
Behalf Of ^Bart
Sent: Saturday, February 4, 2023 10:05 AM
To: python-list@python.org
Subject: How to read file content and send email on Debian Bullseye

Hi guys,

On a Debian Bullseye server I have a lftp upload and after it I should send
an email.

I thought to read the lftp log file where I have these lines:

2023-01-30 18:30:02
/home/my_user/local_folder/upload/my_file_30-01-2023_18-30.txt ->
sftp://ftp_user@ftpserver_ip:2201/remote_ftp_folder/my_file_30-01-2023_18-30
.txt
0-1660576 4.92 MiB/s

2023-02-02 18:30:02
/home/my_user/local_folder/upload/my_file_02-02-2023_18-30.txt ->
sftp://ftp_user@ftpserver_ip:2201/remote_ftp_folder/my_file_02-02-2023_18-30
.txt
0-603093 3.39 MiB/s

I'd like to use Python to check, from monday to friday (the lftp script runs
in crontab from monday to friday) when the upload works is finished and I
should send an email.

I could read by Python lftp.log and after it if there's a line with the same
day of the machine I could send an email with ok otherwise the email will
send a message with "no upload".

How could I do by Python?

Regards.
^Bart
--
https://mail.python.org/mailman/listinfo/python-list



RE: evaluation question

2023-01-31 Thread avi.e.gross
I think it has been discussed here that many functions are DELIBERATELY
designed to return without returning anything. Earlier languages like Pascal
had explicit ideas that a function that did not return a value was declared
as a "procedure" but many other languages like python make no real
differentiation.

Some functions are designed for a sort of side-effect and often there is
nothing much that needs to be returned or even can be. If a function prints
a dozen items one at a time, should it return nothing, or a copy of the last
item or somehow of all items? Generally nothing looks right. If you want to
return something, fine. Do it explicitly.

Similar arguments have been made about methods that do things like sort the
contents of an object internally and then return nothing. Some would like
the return to be the (now altered) object itself. You can emulate that by
not sorting internally but instead calling sorted(object), which returns a
new object that has been sorted from the old one.
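The difference in a few lines, using an arbitrary list:

```python
numbers = [3, 1, 2]

result = numbers.sort()         # sorts in place and returns None
original = [3, 1, 2]
ordered = sorted(original)      # returns a new list, original untouched

print(result, numbers)
print(ordered, original)
```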

So should or could print return anything? Other languages exist, like R,
that do return (and often ignore) whatever print displayed elsewhere. This
can be of use in many ways such as making it easier to print or store
additional copies without recalculating. 

My preference might be to simply allow a local option at the end of a print
statement such as print(..., return=True) or even a way to set a global
option so all print statements can be turned on when you want. But is this
pythonic? In particular, people who want to give type hints now can safely
claim it returns None and would have to modify that so it can optionally
return something like str or None. And, of course, once you change print()
this way, someone else will want the number of characters (or perhaps bytes)
returned instead.

Much of this can be worked around by simply making your own customized print
function which evaluates the arguments to make a string and then calls
print, perhaps with the results pre-calculated, and returns what you wanted.
That is not as easy as it sounds, though, as print supports various
arguments like sep= and end= and file= and flush= so a weird but doable idea
is simply to substitute a temporary file for any file= argument and write
the results to a temporary file or something in memory that emulates a file.
You can then read that back in and return what you want after handling the
original print statement with the original arguments, or perhaps just use
your result to any actually specified file or the default.
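
A rough sketch of that idea follows. The name print_and_return is hypothetical and this is only one way to do it: the real print() formats the text into an in-memory io.StringIO, and the same text is then written to whatever file the caller asked for.

```python
import io
import sys

def print_and_return(*args, sep=" ", end="\n", file=None, flush=False):
    """Hypothetical helper: behave like print() but also return the text."""
    buffer = io.StringIO()
    # Let the real print() do the formatting into an in-memory file.
    print(*args, sep=sep, end=end, file=buffer)
    text = buffer.getvalue()
    # Send the same text to the requested file, or stdout by default.
    target = sys.stdout if file is None else file
    target.write(text)
    if flush:
        target.flush()
    return text

shown = print_and_return("a", "b", 3, sep="-")
# shown now holds the string "a-b-3\n"
```

The original print() is left alone, and anyone who wants a character count instead can take len() of the returned string.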

You can thus create something like what you want and leave the original
print() command alone to do what it was designed to do.

And, in general, people who want a copy of what they print, often use other
python functionality to craft some or all parts of the text they want
printed and only then call print() and thus already may have the ability to
use the text afterwards.

For many purposes, including efficiency, returning nothing makes good sense.
But it is not really the only choice or the right choice and yet, if you
want to use THIS language, it has to be accepted as the documented choice.


-Original Message-
From: Python-list  On
Behalf Of Thomas Passin
Sent: Tuesday, January 31, 2023 1:16 PM
To: python-list@python.org
Subject: Re: evaluation question

On 1/31/2023 4:24 AM, mutt...@dastardlyhq.com wrote:
> On Tue, 31 Jan 2023 12:57:33 +1300
> Greg Ewing  wrote:
>> On 30/01/23 10:41 pm, mutt...@dastardlyhq.com wrote:
>>> What was the point of the upheaval of converting the print command 
>>> in python 2 into a function in python 3 if as a function
>>> print() doesn't return anything useful?
>>
>> It was made a function because there's no good reason for it to have 
>> special syntax in the language.
> 
> All languages have their ugly corners due to initial design mistakes 
> and/or constraints. Eg: java with the special behaviour of its string 
> class, C++ with "=0" pure virtual declaration. But they don't dump 
> them and make all old code suddenly cease to execute.
> 
> Pragmatism should always come before language purity.
> 

It was more fundamental than that, and not mainly about print():

https://snarky.ca/why-python-3-exists/
--
https://mail.python.org/mailman/listinfo/python-list



RE: How to make argparse accept "-4^2+5.3*abs(-2-1)/2" string argument?

2023-01-29 Thread avi.e.gross
Cameron,

You are technically correct but perhaps off the mark.

Yes, a python program only sees what is handed to it by some shell if invoked a 
certain way.

The issue here is what you tell people using your program about what they need 
to type to get it to work. That means if their shell is going to make changes 
in what they typed, they need to know how to avoid unintended changes. As one 
example not mentioned, whitespace disappears if not somehow protected as in 
quotes.

What the OP is being told is that their Python program only controls what is 
fed to it. A user needs to know enough to avoid doing silly things like provide 
an unquoted string containing reserved symbols like a pipe symbol or odd things 
may happen and their program may not even be called. 

So the documentation of how to use the program may need to spell some things 
out alongside suggesting use of "--" ...


-Original Message-
From: Python-list  On 
Behalf Of Cameron Simpson
Sent: Sunday, January 29, 2023 12:51 AM
To: python-list@python.org
Subject: Re: How to make argparse accept "-4^2+5.3*abs(-2-1)/2" string argument?

On 28Jan2023 18:55, Jach Feng  wrote:
>Mark Bourne 在 2023年1月28日 星期六晚上10:00:01 [UTC+8] 的信中寫道:
>> I notice you explain the need to enclose the equation in quotes if it 
>> contains spaces. That's not even a feature of your application, but 
>> of the shell used to call it. So why so much objection to explaining 
>> the need for "--"?
>>
>> Depending on the shell, there are other cases where quoting might be 
>> needed, e.g. if the equation includes a "*" or "?" and happens to 
>> look like a pattern matching files in the current directory (most 
>> shells I've used pass the pattern unchanged if it doesn't match any 
>> files). In bash, if a "$" is used I'd need to enclose that in 'single quotes'
>> (can't even use "double quotes" for that one). You can't really 
>> expect to document all that sort of thing, because it depends on 
>> which shell the user happens to run your application from - you just 
>> have to trust the user to know or learn how to use their shell.
>
>Thank you for detail explanation of the role the shell is involved in this 
>problem. I'm very appreciated!

The shell has basically _nothing_ to do with your problems. By the time you've 
got sys.argv in your Python programme you will have no idea whether quotes were 
used with an argument. (UNIX/POSIX, not Windows, where things are ... more 
complex.) This means you don't know if the user
typed:

 -4.5

or

 "-4.5"

You'll just get a string '-4.5' in your Python programme both ways.

All the quotes in the shell do is delimit what things should be kept together 
as a single argument versus several, or where variables should be interpolated 
when computing arguments etc. It's just _shell_ punctuation and the invoked 
programme doesn't see it.
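
One way to see this without leaving Python is the standard shlex module, which splits a command line using POSIX shell quoting rules. This is only a stand-in for a real shell, and the command name prog is made up:

```python
import shlex

# shlex.split() applies POSIX-style quoting rules, roughly what a UNIX
# shell does before handing the argument list to a program.
unquoted = shlex.split('prog -4.5')
quoted = shlex.split('prog "-4.5"')
print(unquoted)  # ['prog', '-4.5']
print(quoted)    # ['prog', '-4.5']

# Quotes only decide grouping: here they keep the spaces in one argument.
grouped = shlex.split('prog "-4 + 5"')
separate = shlex.split('prog -4 + 5')
print(grouped)   # ['prog', '-4 + 5']
print(separate)  # ['prog', '-4', '+', '5']
```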

>It seems that a CLI app may become very complex when dealing with different 
>kind of shell, and may not be possible to solve its problem.

It doesn't matter what shell is used. The shell just controls what punctuation the 
end user may need to use to invoke your programme. Your programme doesn't need 
to care (and can't because it doesn't get the quotes etc, only their result).

>> So why so much objection to explaining the need for "--"?
>Because of using " to enclose a space separated string is a common 
>convention, and adding a "--" is not:-)

They're unrelated. As others have mentioned, "--" is _extremely_ common; almost 
_all_ UNIX command line programmes which handle -* style options honour the 
"--" convention. _argparse_ itself honours that convention, as does getopt etc.

The "--" convention has nothing to do with the shell.
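
A minimal sketch of the convention with argparse. The parser below is an assumption (the real infix2postfix.py is not shown in this thread); it has a single positional argument named infix, and the argument list is passed in directly rather than read from sys.argv:

```python
import argparse

# Assumed parser: one positional argument, similar to the usage line
# quoted elsewhere in this thread.
parser = argparse.ArgumentParser(prog="infix2postfix.py")
parser.add_argument("infix", help="expression to convert")

# Everything after "--" is taken as a positional argument,
# even when it begins with "-".
args = parser.parse_args(["--", "-4^2+5.3*abs(-2-1)/2"])
print(args.infix)  # -4^2+5.3*abs(-2-1)/2
```

On the command line the user would type the "--" before the quoted expression.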

Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list



RE: How to make argparse accept "-4^2+5.3*abs(-2-1)/2" string argument?

2023-01-29 Thread avi.e.gross
Although today you could say POSIX is the reason for many things, including
the use of "--", I hesitate to mention that I and many others used that
convention long before, as a standard part of many UNIX utilities. Like many
other such things, you build things first and by the time you standardize, ...


-Original Message-
From: Python-list  On
Behalf Of 2qdxy4rzwzuui...@potatochowder.com
Sent: Sunday, January 29, 2023 7:12 AM
To: python-list@python.org
Subject: Re: How to make argparse accept "-4^2+5.3*abs(-2-1)/2" string
argument?

On 2023-01-29 at 16:51:20 +1100,
Cameron Simpson  wrote:

> They're unrelated. As others have mentioned, "--" is _extremely_ 
> common; almost _all_ UNIX command line programmes which handle -* 
> style options honour the "--" convention. _argparse_ itself honours 
> that convention, as does getopt etc.

And why do UNIX programs behave this way?

Because POSIX says they should:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html
--
https://mail.python.org/mailman/listinfo/python-list



RE: How to make argparse accept "-4^2+5.3*abs(-2-1)/2" string argument?

2023-01-28 Thread avi.e.gross
Jach,

I get uneasy when someone thinks a jackhammer is a handy dandy tool for pushing 
in a thumbtack that is sitting on my expensive table.

I agree it is quite easy to grab some code that does a lot of things and also 
does something truly minor, and use it for that purpose. Sometimes the cost is 
that it is far slower. I mean if I want to print the smaller of variables alpha 
and beta containing integers, I could combine them in something like a list or 
array and call a nifty sort() function someone wrote that takes a function 
argument that lets you compare whatever is needed and it likely will work. You 
get back the result and print the first item in the sorted output or maybe the 
last. You could have used a simpler function like min(alpha, beta) and skipped 
making a list of them. Heck, you could have used a fairly simple if statement. 
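
For what it is worth, the three approaches look like this. The values of alpha and beta are made up:

```python
alpha, beta = 7, 3

# The heavy way: build a list, sort it, take the first item.
smallest_by_sort = sorted([alpha, beta])[0]

# The simpler built-in.
smallest_by_min = min(alpha, beta)

# A plain conditional expression.
smallest_by_if = alpha if alpha < beta else beta

print(smallest_by_sort, smallest_by_min, smallest_by_if)  # 3 3 3
```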

But suppose you searched the internet and found a function that takes any number of 
data items and performs many statistical tests on them, starting with things like 
mean, median and mode and continuing to standard deviation, skew 
and standard error and also the min/max/range. Sounds good. It returns some 
structure/object and you dig into it and find the minimum and you are done!

It sounds like the jackhammer approach to me.

In your case, there is indeed nothing wrong with using the function to help 
parse the command line arguments and you are being told that it generally works 
EXCEPT when it is documented NOT TO WORK. It was designed to deal with UNIX 
conventions so you could call a program with a mix of optional flags and then 
optional arguments such as filenames or anything else or ALMOST anything else. 
The early design allowed you to list files using something like:

ls -la *.c

or

ls -l -a *.c

Or many more variations that can include newer ways that use longer options. 
People ran into the problem of having some options that included a word 
following them that is to be parsed as part of that option and other 
enhancements. The minus sign or hyphen was chosen and is not trivial to change. 
So the idea was to add a flag with two minus signs that has no meaning other 
than to say that anything after it is not to be treated as a 
flag. Thus it is safe for any further arguments to start with a minus sign. 

How does that line up with some of the advice you got?

One idea was to have your users told they should add a " -- " (without quotes) 
before the formula you want to evaluate. Another was asking if you had a more 
complex command line with many options that needed a swiss army knife approach. 
If not, since you have formulas starting with a minus sign that may be used, 
consider NOT using a tool you admit you did not understand.

And since python like many languages provides you with a data structure that 
already holds the info you need, use the thumbtack approach or find or write a 
different library function that extracts what you need in an environment where 
it is NOT looking for things with a minus sign.
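
A sketch of the thumbtack approach, assuming the program takes exactly one argument, the expression. The helper name get_expression is hypothetical; in the real script it would be called with sys.argv:

```python
def get_expression(argv):
    """Hypothetical helper: return the single expression argument unchanged."""
    if len(argv) != 2:
        # Exits with a usage message when run as a script.
        raise SystemExit(f"usage: {argv[0]} EXPRESSION")
    # Nothing here cares whether the text starts with a minus sign.
    return argv[1]

# In a real script: expression = get_expression(sys.argv)
expression = get_expression(["infix2postfix.py", "-4^2+5.3*abs(-2-1)/2"])
print(expression)  # -4^2+5.3*abs(-2-1)/2
```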

You know when you go to a doctor and ask why a part of your skin is bleeding 
and so on, and you admit you keep scratching yourself, rather than suggesting 
complex surgeries or putting your hands in a cast to keep you from scratching, 
they may first try telling you to NOT DO THAT and let it heal.

Do note a final point. There is nothing magical about using a minus sign in 
UNIX and they could have insisted say that command line arguments be placed in 
square brackets or some other method to identify them. It seemed harmless to 
use a minus sign at the time. But the programs they included started using so 
many allowed tweaks of many kinds that it got really hard to parse what was 
being asked and thus several libraries of functions have been built that manage 
all that. Your problem is far from unique as any program trying to deal with an 
argument for a filename that begins with a minus sign will have a similar 
problem. Programs like grep often take an argument that is a regular expression 
or other utilities take a "program" that also may for whatever reason start 
with a minus sign. 

Some people ask a question and you have asked many here, and got answers. 
Usually you accept what people tell you better. I am trying to say that the 
issue is not whether you are using the wrong tool in general. It is that NOW 
that you know there is sometimes a problem, you resist understanding it was 
never designed to do what you want.


-Original Message-
From: Python-list  On 
Behalf Of Jach Feng
Sent: Saturday, January 28, 2023 12:04 AM
To: python-list@python.org
Subject: Re: How to make argparse accept "-4^2+5.3*abs(-2-1)/2" string argument?

Jach Feng 在 2023年1月22日 星期日上午11:11:22 [UTC+8] 的信中寫道:
> Fail on command line,
> 
> e:\Works\Python>py infix2postfix.py "-4^2+5.3*abs(-2-1)/2" 
> usage: infix2postfix.py [-h] [infix]
> infix2postfix.py: error: unrecognized arguments: -4^2+5.3*abs(-2-1)/2
> 
> Also fail in REPL,

logically Boolean

2023-01-28 Thread avi.e.gross
The topic has drifted somewhat into asking what a BOOLEAN is.

 

The right answer is that the question is too simple. It is what YOU want it
to be in your situation. And it is also what the LANGUAGE designers and
implementers have chosen within their domain.

 

Mathematically, as part of classical logic, there is a concept of having two
sets with the intersection of the two being the empty set. The sets are disjoint and
in some cases, cover all possibilities and in other cases the possibilities
are narrowed so that just two distinct groupings remain. Using 0 and 1 makes
sense for a binary bit, albeit in either order for True versus False. No
other numbers can be stored in a single binary digit so this is a trivial
but useful use of Booleans in some languages. In hardware or storage
mediums, there are not literal zeroes or ones. You can have a high or low
value or a magnetic field or spin or whatever you want to implement it, and
sometimes it fails anyway and you get indeterminate values or a bit seems to
flip. That is an implementation detail but Boolean logic remains a
mathematical concept.

 

UNIX programs did not return a Boolean value on exit so that discussion is a
tad off the point. They returned a STATUS of sorts. Mostly the status was
zero for all forms of success and you could signal many forms of failure
using any other number. But I can imagine designing a program like grep to
return 0 if nothing was found but the program did not fail, and a positive
number up to some limit that specified how many lines matched and perhaps -1
or other numbers on failure. A return status need not be seen as Boolean,
albeit in my made-up example, any non-zero exit status up to the limit would
be in the set of SUCCESS and -1 or whatever would be in the set for FAILURE.

 

 

Consider how a Python function can return OR throw an error. Many different
errors can be thrown and some propagate up from deeper levels if not caught
and handled. Generally success simply means no errors BUT you can also
design a function that throws various success "errors" and indirectly
returns one of several kinds of values in the payload, as well as errors
meant to be treated as bad. Heck, you can have some errors be warnings and
the caller of the function may have to call it from a "try" and specify what
to catch and how to deal with it. I repeat, not everything must be Boolean
when it comes to status types of things.
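
A small sketch of a function with more than two outcomes. All the names here (NotFound, count_matches, describe) are invented for illustration:

```python
class NotFound(Exception):
    """Hypothetical 'soft' outcome: nothing matched, but nothing broke."""

def count_matches(lines, word):
    if not isinstance(word, str):
        raise TypeError("word must be a string")  # a real failure
    hits = sum(word in line for line in lines)
    if hits == 0:
        raise NotFound(word)                      # not quite a failure
    return hits                                   # ordinary success

def describe(lines, word):
    # The caller decides what to catch and how to treat each outcome.
    try:
        return f"found {count_matches(lines, word)}"
    except NotFound:
        return "found nothing"
    except TypeError as exc:
        return f"failed: {exc}"

print(describe(["a b", "b c"], "b"))  # found 2
print(describe(["a b", "b c"], "z"))  # found nothing
print(describe(["a b", "b c"], 5))    # failed: word must be a string
```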

 

As noted, many other schemes have been used to represent a Boolean variable.

And given some implementations, something emerges that can confuse some
people. Grant mentioned even/odd. On most machines this simply means you can
safely ignore a longer clustering of bits such as with a 32-bit integer and
simply examine the least significant bit which determines if it is even or
odd. Heck, you can even store other useful info in the remaining bits, if
you wish.
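
In Python terms, the even/odd trick amounts to something like this. The variable names and the sample value are made up:

```python
value = 0b10110111           # 183, an arbitrary example

# Only the least significant bit decides even versus odd.
is_odd = bool(value & 1)
print(is_odd)                # True

# The remaining bits are free to carry other information.
payload = value >> 1
print(payload)               # 91

# Packing a flag and a payload back into one integer.
repacked = (payload << 1) | int(is_odd)
print(repacked == value)     # True
```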

 

As has been said repeatedly, it does not matter what the OP expects, as many
implementations are possible including some that do not follow the above
suggestions.

 

Much of my work has included other possibilities often using data that is
not always able to be analyzed in a truly Boolean fashion. Modern
programming languages have to grapple with items that look like they can
have three or more values. I mean you can have a variable that can be TRUE
or FALSE but also one of many forms of Not Available or more. In a language
like R, their version of Boolean can have a value of NA but actually in some
contexts you have to specify which kind of NA such as NA_integer_ or
NA_character_ so obviously a vector of Booleans cannot be implemented using
a single bit for each. I have seen packages that can import data with many
kinds of missing values as there are other programs that allow you to tag
missing values to mean things like "missing: child did not show up to take
test" or "missing: test not completed, will re-test" or "missing: caught
cheating" or "missing: identity of test taker not verified" and so on. I
helped with one such package to create a type within R that does not discard
the other info so it supports multiple NA variants while at the same time it
can be processed as if every NA was the same.

 

Python has concepts along the same lines such as np.nan so there has to be a
way of storing data with perhaps a wrapper around things that can specify if
the contents are what is expected, or maybe something else. 
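
A minimal illustration using only the standard library. It uses math.nan rather than np.nan so that nothing needs installing; both are ordinary floating-point NaN values. The scores list is made up:

```python
import math

scores = [1.0, math.nan, 3.5]

# NaN marks a missing value but never compares equal, even to itself,
# so it has to be detected with a function rather than with ==.
print(math.nan == math.nan)                # False
missing = [math.isnan(x) for x in scores]
print(missing)                             # [False, True, False]

# Statistics over the values that are actually present.
present = [x for x in scores if not math.isnan(x)]
mean = sum(present) / len(present)
print(mean)                                # 2.25
```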

 

And I won't get into fuzzy logic. But just plain classical Boolean logic is
a mathematical concept that can be expressed many ways. Some of the ways in
python look compatible with integers but that was a design choice. If they
had chosen to store a character containing either "T" or "F" then perhaps
they would allow Booleans to be treated as characters and let them be
concatenated to strings and so on.
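
That design choice is easy to see at the prompt. The following is standard Python behaviour, shown only as an illustration:

```python
# bool is a subclass of int, so True and False act as 1 and 0.
print(issubclass(bool, int))      # True
total = True + True
print(total)                      # 2
count = sum([True, False, True])
print(count)                      # 2

# But a bool is not a string, so concatenation is refused.
try:
    "flag: " + True
    concatenated = True
except TypeError:
    concatenated = False
print(concatenated)               # False
```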

 

-Original Message-

From: Python-list mailto:python-l
