Re: Saying bye bye to Python 2

2020-01-11 Thread Marko Rauhamaa
tommy yama :
> As many know, python 2 was retired. 
> This means imminent migration to 3 will be a must ?

Python 2 will have a lively retirement. It won't be dead before RHEL 7
is dead. According to

   https://access.redhat.com/support/policy/updates/errata

the support dates for RHEL 7 are:

   End of Full Support: Aug 6, 2019
   End of Maintenance Support 1: Aug 6, 2020.
   End of Maintenance Support 2: June 30, 2024.
   End of Extended Life-cycle Support: TBD
   End of Extended Life Phase: ongoing
   Last Minor Release: TBD


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Coding technique: distinguish using type or abc?

2020-01-09 Thread Marko Rauhamaa
r...@zedat.fu-berlin.de (Stefan Ram):
> if type( object ) is list:

I would recommend isinstance() because:

   >>> isinstance(True, int)
   True
   >>> type(True) is int
   False


Marko


Re: Python, Be Bold!

2020-01-04 Thread Marko Rauhamaa
Greg Ewing :
> You can pass a zip file with a .pyz extension to the python
> interpreter and it will look for a __main__.py file and run
> it.

This discussion has been interesting, and I've now learned about zipapp.
This is really nice and will likely be a favorite distribution format
for Python applications.

My initial instinct would be to use zipapp as follows:

 * The starting point is a directory containing the Python source files,
   possibly with subdirectories. (A simple, two-file example:
   "myapp.py", "auxmod.py".)

   Be sure that "myapp.py" contains the boilerplate footer:

   if __name__ == '__main__':
       main()

 * Test that the program works:

   $ ./myapp.py
   hello

 * Generate the zipapp:

   $ python3 -m zipapp . -o ../myapp.pyz --main myapp:main \
       -p "/usr/bin/env python3"

 * Try it out:

   $ ../myapp.pyz
   hello
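
The steps above can also be driven from Python through the standard `zipapp.create_archive()` function. This is only a sketch: the two source files are generated on the fly in a temporary directory, and their contents (a `greeting()` helper in "auxmod.py") are assumptions made for the illustration.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp


def build_and_run():
    """Build a two-file zipapp in a temporary directory and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp, "src")
        src.mkdir()
        # Assumed contents of the two example modules.
        (src / "auxmod.py").write_text("def greeting():\n    return 'hello'\n")
        (src / "myapp.py").write_text(
            "import auxmod\n\n"
            "def main():\n"
            "    print(auxmod.greeting())\n\n"
            "if __name__ == '__main__':\n"
            "    main()\n"
        )
        target = pathlib.Path(tmp, "myapp.pyz")
        # Equivalent of: python3 -m zipapp src -o myapp.pyz --main myapp:main -p ...
        zipapp.create_archive(
            src, target, interpreter="/usr/bin/env python3", main="myapp:main"
        )
        # Run the archive with the current interpreter instead of relying on
        # the shebang line, so the sketch does not depend on PATH.
        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True, text=True, check=True,
        )
        return result.stdout


if __name__ == "__main__":
    print(build_and_run(), end="")
```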


Marko


Re: Use epoll but still lose packet

2019-11-20 Thread Marko Rauhamaa
Dennis Lee Bieber :
> (though as I don't really understand the use of this function, that
> may just mean that all the events will be in the one return structure,
> and not that there is only one event).

I use epoll almost every day. You've done a good job explaining it.

>   Given that your code only has one socket/file-descriptor I have
> to ponder why a simple select() isn't usable; nor do I see anything
> that detects who is sending the packet (there is no .accept() or
> .recv_from() which would provide the remote host information).

The main driver for epoll vs select is that every call to select causes
the kernel to parse the event specification from the arguments, which
can be slow if there are a lot of file descriptors to monitor. With
epoll, the event specification is loaded to the kernel only once.

To get the full benefit from epoll, you call it with the EPOLLET flag. One
nasty problem with epoll's predecessors (select and poll) is that when
output buffers are full, you have to turn on output monitoring
separately, and when the output buffers have room again, you need to
turn output monitoring off. That can create annoying code and silly
traffic between the process and the kernel.

The EPOLLET flag only gives you a notification when the situation
changes. Thus, you would monitor for EPOLLIN|EPOLLOUT|EPOLLET and forget
the file descriptor. epoll_wait will return whenever the input or output
status changes. Really economical. You must be careful, though: if you
forget to react to an event, you won't get another notification before
the status changes again.
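
A small Linux-only sketch of that behavior, using Python's `select.epoll` on a local socket pair. The socket pair and the function name are assumptions chosen for the demonstration; they are not taken from the code under discussion.

```python
import select
import socket


def edge_triggered_demo():
    """Show that EPOLLET reports a readiness change once, not repeatedly."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(a.fileno(),
                    select.EPOLLIN | select.EPOLLOUT | select.EPOLLET)
        first = ep.poll(0)       # initial state: the socket is writable
        quiet = ep.poll(0)       # nothing changed since, so nothing reported
        b.send(b"ping")          # the peer makes input available on "a"
        after_send = ep.poll(1)  # the input status changed: one notification
        ignored = ep.poll(0)     # we never read the data, yet no repeat
        return first, quiet, after_send, ignored
    finally:
        ep.close()
        a.close()
        b.close()


if __name__ == "__main__":
    for events in edge_triggered_demo():
        print(events)
```

The last poll is the trap mentioned above: the data is still unread, but no further event arrives until the status changes again.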


Marko


Re: Using Makefiles in Python projects

2019-11-12 Thread Marko Rauhamaa
Rhodri James :
> On 11/11/2019 19:05, Bill Deegan wrote:
>> You could use SCons (native python... )
>
> I could.  But I'd have to learn how to first, and particularly for
> complex cross-platform working that involves learning a lot of stuff I
> already know how to do in Make.  The time investment has never seemed
> that worthwhile.

SCons can be learned by reading its man page in a single afternoon.
That's how I learned it.

The toughest part is to decide *how* to use it as SCons entices you with
a plugin architecture. The temptation is to create a sentient monster
out of your build specification.

I have successfully applied a minimalistic style that is plain to a
random bystander who doesn't know either Python or SCons.

Here's a synopsis...

At the root of your repository, you write a (Python) file called
"SConstruct". In each source directory, you write a (Python) file called
"SConscript". The SConstruct uses SCons primitives to call the
individual SConscript files.

The SCons primitives in the SCons* files don't execute build commands.
Rather, they construct a dependency graph between all source and target
files across the repository. (I favor a non-globbing style where every
file is specified explicitly.)

Each SConscript file starts with the command:

Import('env')

where "env" is short for "environment". An "environment" does *not*
refer to environment variables but to a collection of build parameters
(include paths, libraries, compilation flags etc).

The SConscript file might contain this line:

env.Program("hello", [ "hello.c", "world.c" ])

meaning:

Using the build parameters stored in "env", compile the executable
program "hello" out of two C source code files.

SCons has builtin knowledge of some programming languages. So SCons
knows how to preprocess the source files and can deduce the
dependencies.

Note that the above "env.Program()" command does not yet execute
anything; it simply specifies a build node with associated explicit and
implicit dependencies.

Ad-hoc build rules are expressed using "env.Command()":

env.Command("productivity.txt", [ "hello.c", "world.c" ],
            r"""cat $SOURCES | wc -l >$TARGET""")


The tricky part is writing SConstruct. At its simplest, it could be
something like this:

def construct():
    env = Environment()
    SConscript("src/SConscript", exports="env")

if __name__ == "SCons.Script":
    construct()


In my experience, all kinds of cross-compilation and variant-directory
logic is placed in SConstruct.


Marko


Re: Using Makefiles in Python projects

2019-11-08 Thread Marko Rauhamaa
Skip Montanaro :

> On Thu, Nov 7, 2019 at 1:24 PM Vitaly Potyarkin  wrote:
>>
>> What do you think of using Makefiles for automating common chores in
>> Python projects? Like linting, type checking and testing?
>
> Kinda unsure why this needs to be asked (says the guy who's used Make
> longer than Python and nearly as long as Emacs). :-) That said, I will
> answer in the affirmative. Make is a great tool.

I can't agree that make is a great tool. It's a tool a slight step
better than unconditional build scripts, but it's really only suitable
for projects involving a single directory (although baffling heroics
have been achieved using GNU Make in particular).

Of the more modern build systems, I have found SCons to be the best. It
has a killer feature not found elsewhere: dependency checks are based on
content, not timestamps. That way going back in time (which is very
common in everyday development) won't confuse build dependencies. We
have considered other advanced competitors to SCons, but the
content-based dependency-checking feature is something I wouldn't give up.
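
To illustrate the idea only (this is not SCons's implementation, and the helper names are made up), a content-based check compares a recorded hash of the file instead of its modification time:

```python
import hashlib
import os
import pathlib
import tempfile


def content_signature(path):
    """Return a hash of the file contents; timestamps play no role."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()


def needs_rebuild(path, recorded_signature):
    """A target is stale only if the source content actually changed."""
    return content_signature(path) != recorded_signature


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        source = pathlib.Path(tmp, "hello.c")
        source.write_text("int main(void) { return 0; }\n")
        recorded = content_signature(source)

        # "Going back in time": the timestamp changes, the content does not.
        os.utime(source, (0, 0))
        after_touch = needs_rebuild(source, recorded)

        # A real edit changes the content and therefore the signature.
        source.write_text("int main(void) { return 1; }\n")
        after_edit = needs_rebuild(source, recorded)
        return after_touch, after_edit


if __name__ == "__main__":
    print(demo())
```

A timestamp-based tool would be confused by the `os.utime()` call; the content-based check is not.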

(If SCons developers are reading, thanks for your insight and efforts.
SCons has gotten so many things right with very few blind spots. I've
been a happy user since 2003.)


Marko


Re: Artifact repository?

2019-10-31 Thread Marko Rauhamaa
Dan Stromberg :
> Can anyone please recommend an opensource "Artifact Repository" suitable
> for use with CPython, including dozens of wheels, tar.gz's and .py's?
>
> By an Artifact Repository, I mean something that can version largish
> binaries that are mostly produced by a build process.
>
> It doesn't necessarily have to be written in Python, but it does need to
> cooperate with Python well.
>
> I'm thinking of something like Artifactory or Archiva or similar - but I
> have zero experience with these tools, so a recommendation would be really
> helpful.

This question doesn't seem to have much to do with Python.

Anyway, at work, we use an in-house component system (written using
Python) that uses Artifactory as an artifact store. The system works
well. Artifactory's role in the system is to just be a "dumb disk" with
a REST API. It supports a nice feature that you can allow developers to
read and write *but not delete* from it.


Marko


Re: Boolean comparison & PEP8

2019-07-28 Thread Marko Rauhamaa
Jonathan Moules :

> Lets say I want to know if the value of `x` is bool(True).
> My preferred way to do it is:
>
> if x is True:
> [...]
>
> But this appears to be explicitly called out as being "Worse" in PEP8:
>
> [...]
>
> Why?

It has primarily to do with the naturalness of expression. In English,
you say:

   If you have a moment, I'll show you.

   If you had a dog, you'd understand.

instead of:

   If your having a moment is true, I'll show you.

   If your having a dog were true, you'd understand.

In the same vein, in Python you say:

   if len(students) < 7:
       klass.cancel()

rather than:

   if (len(students) < 7) is True:
       klass.cancel()


Furthermore, while True and False are singleton objects, referring to
them through the "is" operator seems strikingly ontological in most
contexts. You are no longer interested in the message of the letter but
the fibers of the paper it was written on.

I *could* imagine a special case where a positional argument's semantics
would depend on the specific object. For example,

   >>> os.path.exists(False)
   True

is rather funky and the os.path.exists function would probably benefit
from a check such as:

   if path is True or path is False:
       raise Hell()

but even in such cases, it is more customary to say:

   if isinstance(path, bool):
       raise Hell()
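
A runnable version of that check might look like the sketch below. The wrapper name `checked_exists` and the choice of `TypeError` are assumptions; nothing like it exists in `os.path`.

```python
import os


def checked_exists(path):
    """Hypothetical wrapper around os.path.exists() that rejects booleans."""
    if isinstance(path, bool):
        # Without this check, False would be treated as file descriptor 0.
        raise TypeError("path must not be a bool: %r" % (path,))
    return os.path.exists(path)


if __name__ == "__main__":
    print(checked_exists(os.getcwd()))
    try:
        checked_exists(False)
    except TypeError as e:
        print("rejected:", e)
```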


Marko


Re: Definite or indefinite article for non-singletons?

2019-07-28 Thread Marko Rauhamaa
Ethan Furman :

> On 07/27/2019 02:10 PM, Chris Angelico wrote:
>> When talking about indistinguishable objects, is it correct to talk
>> about "the " or "an "?
>
> Multiple indistinguishable objects are still multiple, so "an".
>
> Implementation details should only enter the conversation when
> specifically discussing the implementation -- so CPython is an
> implementation detail while Python is the language.

Yes. If the API guarantees singleton-ness (reliable testability through
"is"), the correct article is "the", otherwise "an".

There is no guarantee that two empty strings are the same object so "an
empty string" is the right expression.

Now, does len(string) return "the length" of the string or "a length" of
the string?

   >>> s = "*"*2000
   >>> len(s)
   2000
   >>> len(s) is len(s)
   False

Here "the length" is the right answer and must be understood as a
contraction of the pedantic: "an integer representing the length".

So it depends on the context if the relevant equivalence is "is" or
"==". Maybe the rule of thumb is that if we are talking about strings,
integers and similar things, we should think about it from the point of
view of Python's data model (objects; "is"). But when we talk about
things like "length", "square root", "sum" or "name", the point of view
is the abstractions the objects are standing for ("==").


Marko


Re: PEP 594 cgi & cgitb removal

2019-05-25 Thread Marko Rauhamaa
Jon Ribbens :
> On 2019-05-25, Michael Torrie  wrote:
>> Not really. Serverless just means stateless web-based remote
>> procedure calls. This is by definition what CGI is.
>
> No, it isn't. CGI is a specific API and method of calling a program in
> order to serve a web request. It isn't a shorthand for "any web-based
> remote procedure call".

Both of you make relevant and insightful statements.


Marko


Re: PEP 594 cgi & cgitb removal

2019-05-24 Thread Marko Rauhamaa
Paul Rubin :
> Stéphane Wirtel  writes:
>> Not a massive effort, but we are limited with the resources.
>
> I keep hearing that but it makes it sound like Python itself is in
> decline. That is despite the reports that it is now the most popular
> language in the world. It also makes me ask why the Python team keeps
> adding new stuff if it can't even keep the old stuff running. I'd urge
> a more conservative approach to this stuff.

Generally, my feelings are the same as yours, and I'm saddened by the
steady decline of one of my all-time favorite programming languages.

However, the Python developers can do whatever they want with their free
time. Of course, it's much more exciting to add new bells and whistles
to a language than maintain some 1980's legacy. So I can't make any
demands for Python.

> People who want bleeding edge advances in language technology should
> use Haskell. People who want amorphous crap-laden ecosystems that keep
> changing and breaking should use Javascript/NPM. Those who want to be
> assimilated by the Borg and get aboard an entire micromanaged
> environment have Goland or (even worse) Java. Python for a while
> filled the niche of being a not too cumbersome, reasonably stable
> system for people trying to get real-world tasks done and wanted a
> language that worked and stayed out of the way.

There's a programming language arms race. Python wants to beat Java, C#
and Go in the everything-for-everybody game. Python developers seem to
take the popularity of the language as proof of success. Pride goes
before the fall.

> Please don't abandon that.

I'm afraid the damage is already done.


Marko


Re: Class Issue`

2019-03-06 Thread Marko Rauhamaa
Rhodri James :

> On 06/03/2019 14:15, Calvin Spealman wrote:
>>> C++ (a language I have no respect for)
>> This was uncalled for and inappropriate. Please keep discord civil.
>
> That was the civil version :-)

C++ is a programming language without feelings. It's nobody's ethnicity,
sexual orientation, race or religion, either. You can despise it freely.


Marko


Re: Lifetime of a local reference

2019-02-28 Thread Marko Rauhamaa
Roel Schroeven :
> In the absence of any other mention of bindings being removed, to me
> it seems clear that bindings are not automatically removed. Otherwise
> many things become ambiguous. Example: the documentation for dicts
> defines "d[key] = value" as "Set d[key] to value". Does that mean it
> can get unset later behind your back? Of course not.

Nothing is so obvious here.

A Python implementation is free to do anything at all that complies with
correct semantics. Sometimes correct behavior is unspecified.

In your case, "d[key] = value", the binding can disappear if the name d
becomes "stale". d[key] will be bound if anybody should look but if the
compiler or runtime environment can prove nobody will ever look, it
might be able to optimize it away.

If you accept that

   open("xyz")

can silently close the file immediately, sometime later or never, you
should have no problem believing that

   f = open("xyz")

could behave the same way.

> It's purely an implementation detail. As long as the visible behavior
> of the program complies with the language specification, the compiler
> can do as it wishes.

And that's the key: what is specified for Python? It would appear
nothing has been stated explicitly so it would be dangerous for a Python
application to rely on assumed semantics.


Marko


Re: Lifetime of a local reference

2019-02-28 Thread Marko Rauhamaa
Chris Angelico :
> What if an exception gets raised at some point before the function has
> returned? The exception object will give full access to the function's
> locals.

It wouldn't hurt for the Python gods to make an explicit ruling on the
matter.


Marko


Re: Lifetime of a local reference

2019-02-28 Thread Marko Rauhamaa
Alan Bawden :

> Gregory Ewing  writes:
>
>> Alan Bawden wrote:
>> > the Java Language
>> > Specification contains the following language:
>> >Optimizing transformations of a program can be designed that reduce
>> >the number of objects that are reachable to be less than those which
>> >would naively be considered reachable.  For example, a Java compiler
>> >or code generator may choose to set a variable or parameter that
>> >will no longer be used to null to cause the storage for such an
>> >object to be potentially reclaimable sooner.
>> 
>> However, it only makes sense to do that if the compiler can be
>> sure that reclaiming the object can't possibly have any side
>> effects. That's certainly not true of things like file objects
>> that reference resources outside of the program. I'd be pretty
>> upset if a Java implementation prematurely closed my files on
>> the basis of this clause.
>
> The Java compiler has no way to know whether a variable references an
> object with a finalize() method that has side effects, so that quote
> from the Specification licenses a Java implementation to do exactly
> the thing you say will make you upset.

And that's not only theoretical. Hotspot (as well as gcc on the C side)
has been very aggressive in taking liberties awarded by the standards.

What's trickier is that Hotspot's JIT optimizations kick in only when a
code snippet is executed numerous times. So often these effects don't
come up during functional testing but may make their way to production.


Marko


Re: Lifetime of a local reference

2019-02-27 Thread Marko Rauhamaa
Rhodri James :
> On 27/02/2019 06:56, Marko Rauhamaa wrote:
>> Then there's the question of a sufficient way to prevent premature
>> garbage collection:
>>
>>   def fun():
>>       f = open("lock")
>>       fcntl.flock(f, fcntl.LOCK_EX)
>>       do_stuff()
>>       f.close()
>>       sys.exit(0)
>>
>>   def fun():
>>       f = open("lock")
>>       fcntl.flock(f, fcntl.LOCK_EX)
>>       do_stuff()
>>       f.close
>>       sys.exit(0)
>>
>>   def fun():
>>       f = open("lock")
>>       fcntl.flock(f, fcntl.LOCK_EX)
>>       do_stuff()
>>       f
>>       sys.exit(0)
>>
>>   def fun():
>>       f = open("lock")
>>       fcntl.flock(f, fcntl.LOCK_EX)
>>       do_stuff()
>>       sys.exit(0)
>
> I would go with:
>
> def fun():
>     with open("lock") as f:
>         fcntl.flock(f, fcntl.LOCK_EX)
>         do_stuff()
>     sys.exit(0)
>
> The description of the with statement does explicitly say that the
> context manager's __exit__() method won't be called until the suite
> has been executed, so the reference to the open file must exist for at
> least that long.

Yeah, but the *true* answer, of course, is:

   def fun():
       f = os.open("lock", os.O_RDONLY)
       fcntl.flock(f, fcntl.LOCK_EX)
       do_stuff()
       sys.exit(0)

Collect that!

;-)
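
A Unix-only sketch of why the raw descriptor works: an `os.open()` result is a plain integer, so there is no file object whose finalizer could release the lock early. The `lock_is_held()` probe, the parameters of `fun()` and the explicit `os.close()` are additions made for the demonstration.

```python
import fcntl
import os
import tempfile


def lock_is_held(path):
    """Probe with a second descriptor: True if the exclusive lock is taken."""
    probe = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True
    finally:
        os.close(probe)  # also drops the lock if the probe acquired it
    return False


def fun(path, do_stuff):
    fd = os.open(path, os.O_RDONLY)  # an int: nothing for the GC to finalize
    fcntl.flock(fd, fcntl.LOCK_EX)
    try:
        return do_stuff()
    finally:
        os.close(fd)                 # closing the descriptor releases the lock


def demo():
    with tempfile.NamedTemporaryFile() as tmp:
        during = fun(tmp.name, lambda: lock_is_held(tmp.name))
        after = lock_is_held(tmp.name)
        return during, after


if __name__ == "__main__":
    print(demo())
```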


Marko


Re: Lifetime of a local reference

2019-02-26 Thread Marko Rauhamaa
Alan Bawden :

> Marko Rauhamaa  writes:
>> def fun():
>>     f = open("lock")
>>     fcntl.flock(f, fcntl.LOCK_EX)
>>     do_stuff()
>>     sys.exit(0)
>> 
>> Question: can a compliant Python implementation close f (and,
>> consequently, release the file lock) before/while do_stuff() is
>> executed?
>
> A correct-but-fails-to-answer-your-real-question answer would be: "If
> you _care_ about when f gets closed, then just call f.close()
> yourself." So if you want to be _sure_ the lock stays locked while you
> "do stuff", you should write:
>
>    def fun():
>        f = open("lock")
>        fcntl.flock(f, fcntl.LOCK_EX)
>        do_stuff()
>        f.close()
>        sys.exit(0)
>
> And now you don't have to worry about the details of variable
> lifetimes.

Yes, although the operating system closes all files (and releases the
associated file locks) on process exit.

> But I appreciate that that isn't the true question that you wanted to ask!
> You are wondering if a Python implementation is _permitted_ to treat the
> code you wrote _as if_ you had written:
>
>    def fun():
>        f = open("lock")
>        fcntl.flock(f, fcntl.LOCK_EX)
>        del f
>        do_stuff()
>        sys.exit(0)
>
> which deletes the variable f from the local environment at a point where it
> will never be used again.  (Which could cause the lock to be released
> before do_stuff is even called.)
>
> This is an interesting question, and one that any garbage collected
> language should probably address somehow.  For example, the Java Language
> Specification contains the following language:
>
>Optimizing transformations of a program can be designed that reduce the
>number of objects that are reachable to be less than those which would
>naively be considered reachable.  For example, a Java compiler or code
>generator may choose to set a variable or parameter that will no longer be
>used to null to cause the storage for such an object to be potentially
>reclaimable sooner.
>
> (This is from section 12.6.1 of the Java 8 version, which is what I had
> handy.)

C compilers do similar things, which is why Guile documentation mentions
a special mechanism to prevent premature garbage collection:

   <https://www.gnu.org/software/guile/docs/docs-2.0/guile-ref/Remembering-During-Operations.html>

> I suspect that given the history of Python, pretty much everybody has
> always assumed that a Python implementation will not delete local
> variables early. But I agree with you that the Python Language
> Reference does not appear to address this question anywhere!

Then there's the question of a sufficient way to prevent premature
garbage collection:

   def fun():
       f = open("lock")
       fcntl.flock(f, fcntl.LOCK_EX)
       do_stuff()
       f.close()
       sys.exit(0)

   def fun():
       f = open("lock")
       fcntl.flock(f, fcntl.LOCK_EX)
       do_stuff()
       f.close
       sys.exit(0)

   def fun():
       f = open("lock")
       fcntl.flock(f, fcntl.LOCK_EX)
       do_stuff()
       f
       sys.exit(0)

   def fun():
       f = open("lock")
       fcntl.flock(f, fcntl.LOCK_EX)
       do_stuff()
       sys.exit(0)


Marko


Lifetime of a local reference

2019-02-26 Thread Marko Rauhamaa


Consider this function:

   def fun():
       f = open("lock")
       fcntl.flock(f, fcntl.LOCK_EX)
       do_stuff()
       sys.exit(0)

Question: can a compliant Python implementation close f (and,
consequently, release the file lock) before/while do_stuff() is
executed?

I couldn't find an immediate answer in the documentation.


Marko


Re: Why float('Nan') == float('Nan') is False

2019-02-13 Thread Marko Rauhamaa
"Avi Gross" :

> A NaN is a bit like a black hole. Anything thrown in disappears and
> that is about all we know about it. No two black holes are the same
> even if they seem to have the same mass, spin and charge. All they
> share is that we don't know what is in them.

Then, how do you explain:

   >>> float("nan") != float("nan")
   True

Why's that not False?
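
For reference, the full set of comparisons can be checked directly. Under IEEE 754, every ordered comparison and `==` involving a NaN is false, and `!=` is the negation of `==`; `math.isnan()` is the usual way to test for NaN. A short sketch:

```python
import math

nan = float("nan")

results = {
    "nan == nan": nan == nan,   # False
    "nan != nan": nan != nan,   # True: the negation of ==
    "nan < nan": nan < nan,     # False
    "nan >= nan": nan >= nan,   # False
    "math.isnan(nan)": math.isnan(nan),
}

for expression, value in results.items():
    print(expression, "->", value)
```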


Marko


Re: with exceptions?

2018-12-19 Thread Marko Rauhamaa
r...@zedat.fu-berlin.de (Stefan Ram):

> try:
>     with open( 'file', 'r' ) as f:
>         use( f )
> except Exception as inst:
>     print( inst )
>
>   Is target code the correct way to use »with« together
>   with »except«?
>
>   Or is it recommended to continue to use »finally« in such
>   cases?

I'd advise against such ambiguous use of exceptions. I'd write:

  try:
      f = open('file')
  except WhatEverException as e:
      print(e)
  else:
      with f:
          try:
              use(f)
          except OtherException as e:
              print(e)
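
A concrete, runnable variant of the same structure, assuming `OSError` stands in for the open failure and `ValueError` for whatever `use()` may raise. The function name `process` and the returned strings are invented for the example.

```python
def process(path, use):
    """Keep the 'could not open' and 'could not use' failures separate."""
    try:
        f = open(path)
    except OSError as e:
        # Only errors from open() itself end up here.
        return "open failed: %s" % e.strerror
    else:
        with f:
            try:
                return use(f)
            except ValueError as e:
                # Only errors raised while using the file end up here.
                return "use failed: %s" % e


if __name__ == "__main__":
    print(process("/nonexistent/file", lambda f: int(f.read())))
```

The point of the split is that an `OSError` raised inside `use()` is no longer mistaken for a failure to open the file.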


Marko


Re: multiple JSON documents in one file, change proposal

2018-12-01 Thread Marko Rauhamaa
Chris Angelico :

> On Sat, Dec 1, 2018 at 10:16 PM Marko Rauhamaa  wrote:
>> and the framing format is HTTP. I will need to type something like this:
>>
>>    POST / HTTP/1.1^M
>>    Host: localhost^M
>>    Content-type: application/json^M
>>    Content-length: 54^M
>>    ^M
>>    {
>>        "msgtype": "echo-req",
>>        "opid": 3487547843
>>    }
>>
>> That's almost impossible to type without a syntax error.
>
> 1) Set your Enter key to send CR-LF, at least for this operation.
> That's half your problem solved.

That can be much harder than typing ctrl-SPC. It *is* supported by
netcat, for example, but then you have to carefully recompute the
content-length field.

> 2) Send the request like this:
>
> POST / HTTP/1.0
> Content-type: application/json
>
> {"msgtype": "echo-req", "opid": 3487547843}
>
> Then shut down your end of the connection, probably with Ctrl-D. I'm
> fairly sure I can type that without bugs, and any compliant HTTP
> server should be fine with it.

If I terminated the input, I wouldn't need any framing. The point is to
carry a number of JSON messages/documents over a single bytestream or in
a single file. That means the HTTP content-length would be mandatory.


Marko


Re: multiple JSON documents in one file, change proposal

2018-12-01 Thread Marko Rauhamaa
Chris Angelico :
> On Sat, Dec 1, 2018 at 9:16 PM Marko Rauhamaa  wrote:
>> The need for the format to be "typable" (and editable) is essential
>> for ad-hoc manual testing of components. That precludes all framing
>> formats that would necessitate a length prefix. HTTP would be
>> horrible to have to type even without the content-length problem, but
>> BEEP (RFC 3080) would suffer from the content-length (and CRLF!)
>> issue as well.
>
> I dunno, I type HTTP manually often enough that it can't be all *that*
> horrible.

Say I want to send this piece of JSON:

   {
       "msgtype": "echo-req",
       "opid": 3487547843
   }

and the framing format is HTTP. I will need to type something like this:

   POST / HTTP/1.1^M
   Host: localhost^M
   Content-type: application/json^M
   Content-length: 54^M
   ^M
   {
       "msgtype": "echo-req",
       "opid": 3487547843
   }

That's almost impossible to type without a syntax error.
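
The bookkeeping a human would have to do by hand can be spelled out in a few lines. This sketch assumes the body is the JSON text with four-space indentation and one trailing newline, which is the layout that yields the 54 bytes in the header above; `build_request` is a name made up for the example.

```python
import json


def build_request(message):
    """Frame one JSON message as an HTTP/1.1 POST with CRLF line endings."""
    body = (json.dumps(message, indent=4) + "\n").encode("utf-8")
    head = (
        "POST / HTTP/1.1\r\n"
        "Host: localhost\r\n"
        "Content-type: application/json\r\n"
        "Content-length: %d\r\n"
        "\r\n" % len(body)
    ).encode("ascii")
    return head + body


request = build_request({"msgtype": "echo-req", "opid": 3487547843})
print(request.decode("utf-8"))
```

Any edit to the body, even a single space, changes the length that has to be typed into the header.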

>> Finally, couldn't any whitespace character work as a terminator? Yes,
>> it could, but it would force you to use a special JSON parser that is
>> prepared to handle the self-delineation. A NUL gives you many more
>> degrees of freedom in choosing your JSON tools.
>
> Either non-delimited or newline-delimited JSON is supported in a lot
> of tools. I'm quite at a loss here as to how an unprintable character
> gives you more freedom.

As stated by Paul in another context, newline-delimited is a no-go
because it forces you to restrict JSON to a subset that doesn't contain
newlines (see the JSON example above).

Of course, you could say that the terminating newline is only
interpreted as a terminator after a complete JSON value, but that's not
the format "supported in a lot of tools".

If you use any legal JSON character as a terminator, you have to make it
contextual or add an escaping syntax.

> I get it: you have a bizarre set of tools and the normal solutions
> don't work for you. But you can't complain about the tools not
> supporting your use-cases. Just code up your own styles of doing
> things that are unique to you.

There are numerous tools that parse complete JSON documents fine.
Framing JSON values with NUL-termination is trivial to add in any
programming environment. For example:

   import json

   def json_docs(path):
       with open(path) as f:
           for doc in f.read().split("\0")[:-1]:
               yield json.loads(doc)
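
A round trip under the same convention, as a sketch: `dump_docs` and `load_docs` are hypothetical helper names. Because `json.dumps()` escapes control characters inside strings, a literal NUL never appears within a document, even a pretty-printed multi-line one.

```python
import json
import os
import tempfile


def dump_docs(path, docs):
    """Write each document followed by a NUL terminator."""
    with open(path, "w") as f:
        for doc in docs:
            f.write(json.dumps(doc, indent=4))  # newlines inside are fine
            f.write("\0")


def load_docs(path):
    """Yield the documents back, one per NUL-terminated chunk."""
    with open(path) as f:
        for chunk in f.read().split("\0")[:-1]:
            yield json.loads(chunk)


if __name__ == "__main__":
    docs = [{"msgtype": "echo-req", "opid": 3487547843}, [1, 2, 3], "a\0b"]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "messages.json0")
        dump_docs(path, docs)
        print(list(load_docs(path)) == docs)
```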


Marko


Re: multiple JSON documents in one file, change proposal

2018-12-01 Thread Marko Rauhamaa
Paul Rubin :

> Marko Rauhamaa  writes:
>> Having rejected different options
>> (<https://en.wikipedia.org/wiki/JSON_streaming>), I settled with
>> terminating each JSON value with an ASCII NUL character, which is
>> illegal in JSON proper.
>
> Thanks, that Wikipedia article is helpful.  I'd prefer to not use stuff
> like NUL or RS because I like keeping the file human readable.  I might
> use netstring format (http://cr.yp.to/proto/netstrings.txt) but I'm even
> more convinced now that adding a streaming feature to the existing json
> module is the right way to do it.

We all have our preferences.

In my case, I need an explicit terminator marker to know when a JSON
value is complete. For example, if I should read from a socket:

   123

I can't yet parse it because there might be another digit coming. On the
other hand, the peer might not see any reason to send any further bytes
because "123" is all they wanted to send at the moment.

As for NUL, a control character that is illegal in all JSON contexts is
practical so the JSON chunks don't need to be escaped. An ASCII-esque
solution would be to pick ETX (= end of text). Unfortunately, a human
operator typing ETX (= ctrl-C) to terminate a JSON value will cause a
KeyboardInterrupt in many modern command-line interfaces.

It happens NUL (= ctrl-SPC = ctrl-@) is pretty easy to generate and
manipulate in editors and the command line.

The need for the format to be "typable" (and editable) is essential for
ad-hoc manual testing of components. That precludes all framing formats
that would necessitate a length prefix. HTTP would be horrible to have
to type even without the content-length problem, but BEEP (RFC 3080)
would suffer from the content-length (and CRLF!) issue as well.

Finally, couldn't any whitespace character work as a terminator? Yes, it
could, but it would force you to use a special JSON parser that is
prepared to handle the self-delineation. A NUL gives you many more
degrees of freedom in choosing your JSON tools.


Marko


Re: multiple JSON documents in one file, change proposal

2018-11-30 Thread Marko Rauhamaa
Paul Rubin :
> Maybe someone can convince me I'm misusing JSON but I often want to
> write out a file containing multiple records, and it's convenient to
> use JSON to represent the record data.
>
> The obvious way to read a JSON doc from a file is with "json.load(f)"
> where f is a file handle. Unfortunately, this throws an exception

I have this "multi-JSON" need quite often. In particular, I exchange
JSON-encoded messages over byte stream connections. There are many ways
of doing it. Having rejected different options
(<https://en.wikipedia.org/wiki/JSON_streaming>), I settled with
terminating each JSON value with an ASCII NUL character, which is
illegal in JSON proper.

> I also recommend the following article to those not aware of how badly
> designed JSON is: http://seriot.ch/parsing_json.php

JSON is not ideal, but compared with XML, it's a godsend.

What would be ideal? I think S-expressions would come close, but people
can mess up even them: <https://www.ietf.org/rfc/rfc2693.txt>.


Marko


Re: Enums: making a single enum

2018-11-17 Thread Marko Rauhamaa
Ian Kelly :

> On Fri, May 25, 2018 at 11:00 PM, Chris Angelico  wrote:
>> On Sat, May 26, 2018 at 2:46 PM, Steven D'Aprano
>>> class State(Enum):
>>>     Maybe = 2
>>
>> # Tri-state logic
>> Maybe = object()
>
> The enum has a nice __str__ though.

That's why I usually use string sentinels:

  Maybe = "Maybe"


Marko


Re: @staticmethod or def function()?

2018-10-31 Thread Marko Rauhamaa
Peter Otten <__pete...@web.de>:

> Do not make changes to your code only to appease a linter.

Precisely. Don't let a tool run the show.

> Regarding the more general question, should I use an instance method,
> a class method, a static method, or a function? -- that is hard to
> answer without an idea what the specific task of the function/method
> is, and how strong the link to the class is.

I just always use regular instance methods. If I really want to
"disappear" self, I'll just take the function out of the class.


Marko


Re: Accessing clipboard through software built on Python

2018-10-27 Thread Marko Rauhamaa
Musatov :

> I work from a web database of users and I continually have to copy email
> address and user ID to two separate fields on a Salesforce.com page.
>
> I go to the webpage, highlight email address then copy.
> Then go to Salesforce page, and paste.
> Then go back to the webpage, then copy the User ID.
> Then go back to Salesforce page, and paste.
>
> I think it would be much more efficient to:
> On webpage, copy emailaddress and user ID.
> Then go to Salesforce and paste email address and user ID.

Theoretically possible. In practice, probably not so much because both
applications must be aware of a special user-id-plus-email-address
selection data type.

What should be possible, though, is:

 * select the email address with the left mouse button on the web page
 * middle click on the Salesforce page to copy it there
 * same for the user name

Similarly, drag-and-drop should work.


Marko


Re: Accessing clipboard through software built on Python

2018-10-27 Thread Marko Rauhamaa
Michael Torrie :
> As far as I know it's not possible for an application to directly yank
> highlighted text from another application.

That's an age-old pattern in X11. I don't know if Wayland supports it.

Application 1 holds a selection (usually highlighted) and Application 2
wants to copy the selection. No clipboard is needed. Application 2
simply asks for the selection. The request is relayed to Application 1,
which generates the response:

<https://en.wikipedia.org/wiki/X_Window_selection#Selections>


Marko


Re: UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 10442: character maps to <undefined>

2018-10-21 Thread Marko Rauhamaa
pjmcle...@gmail.com:

> not sure why utf-8 gives an error when that's the widest, most
> all-characters-inclusive encoding, right?

Not all sequences of bytes are legal in UTF-8. For example,

   >>> b'\x80'.decode("utf-8")
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

Not all sequences of bytes are legal in ASCII, either.

However, all sequences of bytes are legal in Latin-1 (among others). Of
course, decoding with Latin-1 gives you gibberish unless the data really
is Latin-1. But you'll never get a UnicodeDecodeError.
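
The difference can be checked directly. This is a sketch with a made-up
helper, decodes_cleanly; the cp1252 line assumes the 'charmap' codec in
the subject line was Windows-1252, which is only a guess based on the
byte 0x9d being undefined there.

```python
def decodes_cleanly(data, encoding):
    """Return True if data decodes in the given encoding without an error."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


every_byte = bytes(range(256))

print(decodes_cleanly(b"\x80", "utf-8"))        # lone continuation byte
print(decodes_cleanly(b"\x80", "ascii"))        # outside the 7-bit range
print(decodes_cleanly(b"\x9d", "cp1252"))       # unassigned in Python's cp1252
print(decodes_cleanly(every_byte, "latin-1"))   # all 256 values are assigned

# Latin-1 maps byte N to code point N, so decoding and re-encoding is lossless.
assert every_byte.decode("latin-1").encode("latin-1") == every_byte
```

Only the last check succeeds; the first three report a decoding error.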


Marko


Re: PEP 394

2018-10-19 Thread Marko Rauhamaa
Thomas Jollans :

> On 2018-10-19 12:06, Marko Rauhamaa wrote:
>> Anders Wegge Keller :
>>> * python3 will refer to some version of Python 3.x.
>> 
>> Untrue for macOS, ArchLinux, RHEL and CentOS.
>
> Sure it's true for *EL (it's just that python3 might not be installed by
> default even on EL7)

The newest versions of RHEL and CentOS don't provide Python3 at all.

Red Hat says the upcoming RHEL-8 won't have Python2 at all. I'm guessing
it *will* provide Python3.


Marko


Re: PEP 394

2018-10-19 Thread Marko Rauhamaa
Anders Wegge Keller :
> * python2 will refer to some version of Python 2.x.

Untrue for macOS.

> * python3 will refer to some version of Python 3.x.

Untrue for macOS, ArchLinux, RHEL and CentOS.

> * for the time being, all distributions should ensure that python, if
> installed, refers to the same target as python2, unless the user
> deliberately overrides this or a virtual environment is active.

Should, would, could.


Marko


Re: ESR "Waning of Python" post

2018-10-17 Thread Marko Rauhamaa
Paul Rubin :

> Marko Rauhamaa  writes:
>> Emacs occasionally hangs for about a minute to perform garbage
>> collection.
>
> I've never experienced that, especially with more recent versions that I
> think do a little bit of heap tidying in the background.  Even in the
> era of much slower computers I never saw an Emacs GC pause of more than
> a second or two unless something had run amuck and exhausted memory.
> It's always near imperceptible in my experience now.  Is your system
> swapping or something?

I can't be positive about swapping. I don't remember hearing thrashing.
However, I do admit running emacs for months on end and occasionally
with huge buffers so the resident size can be a couple of gigabytes.


Marko


Re: ESR "Waning of Python" post

2018-10-16 Thread Marko Rauhamaa
Paul Rubin :

> But it's possible to do parallel GC with bounded latency. Perry
> Cheng's 2001 PhD thesis says how to do it and is fairly readable:
>
> http://reports-archive.adm.cs.cmu.edu/anon/2001/CMU-CS-01-174.pdf

Thanks. On a quick glance, it is difficult to judge what the worst-case
time and space behavior are as the thesis mixes theory and practice and
leans heavily on practice. The thesis says in its introduction:

   A real-time collector comprises two important features: pauses are
   bounded by some reasonably small value and the mutator can make
   sufficient progress between pauses. Different collectors meet these
   conditions with varying degrees of success and their viability
   depends on application needs. It is important to note that a
   collector must also complete collection within a reasonable time. A
   "real-time" collector which merely stops collections whenever it
   runs out of time would be hard real-time but useless if it never
   finishes a collection. In such cases, memory is soon exhausted. As
   with other real-time applications, the most important distinction
   among real-time collectors is the strength of the guarantee.

> If you hang out with users of Lisp, Haskell, Ocaml, Java, Ruby, etc.,
> they (like Python users) have all kinds of complaints about their
> languages, but GC pauses aren't a frequent topic of those complaints.

I don't suffer from it, either.

> Most applications don't actually care about sub-millisecond realtime.
> They just want pauses to be small or infrequent enough to not interfere
> with interactively using a program.  If there's a millisecond pause
> every few seconds of operation and an 0.2 second pause a few times an
> hour, that's usually fine.

Emacs occasionally hangs for about a minute to perform garbage
collection.

Similarly, Firefox occasionally becomes unresponsive for a long time,
and I'm guessing it's due to GC.


Marko


Re: ESR "Waning of Python" post

2018-10-16 Thread Marko Rauhamaa
Paul Rubin :

> Marko Rauhamaa  writes:
>>> Right, if I need near realtime behaviour and must live
>>> with [C]Python's garbage collector.
>> Or any other GC ever invented.
>
> There are realtime ones, like the Azul GC for Java, that have bounded
> delay in the milliseconds or lower. The total overhead is higher
> though.

I'd be interested in a definitive, non-anecdotal analysis on the topic.
Do you happen to have a link?

One reference I found stated there was no upper bound for heap use:

  A second cost of concurrent garbage collection is unpredictable heap
  growth. The program can allocate arbitrary amounts of memory while the
  GC is running.

  <https://making.pusher.com/golangs-real-time-gc-in-theory-and-practice/>

If that worst-case behavior were tolerated, it would be trivial to
implement real-time GC: just let the objects pile up and never reclaim.


Marko


Re: Python indentation (3 spaces)

2018-10-15 Thread Marko Rauhamaa
Rhodri James :

> On 15/10/18 12:28, Marko Rauhamaa wrote:
>> Try running
>>
>>  emacs -q abc.c
>>
>> and observe the indentation depth.
>
> """User Option: c-basic-offset
>
> This style variable holds the basic offset between indentation
> levels. Its factory default is 4, but all the built-in styles set it
> themselves, to some value between 2 (for gnu style) and 8 (for bsd,
> linux, and python styles)."""

To realize why 2 is the factory default despite the above statement, we
observe that the factory setting of c-default-style specifies "gnu" for
C-like files other than Java and awk.


Marko


Re: Python indentation (3 spaces)

2018-10-15 Thread Marko Rauhamaa
Rhodri James :

> On 15/10/18 05:45, Marko Rauhamaa wrote:
>> The two-space indentation is the out-of-the-box default for emacs.
>
> Ahem.  It's the default for certain C styles.  It's not even the default
> for C-mode itself, which is 4.

You must be running a different version of emacs than all the versions
I've ever run.

Try running

    emacs -q abc.c

and observe the indentation depth.

> Those of us who believe that tabs are evil set indent-tabs-mode nil
> anyway to stop the annoying behaviour.

Yes.


Marko


Re: ESR "Waning of Python" post

2018-10-15 Thread Marko Rauhamaa
dieter :

> Marko Rauhamaa  writes:
>> Keeping the number of long-term objects low is key.
>
> Right, if I need near realtime behaviour and must live
> with [C]Python's garbage collector.

Or any other GC ever invented.

> But, a web application does usually not need near realtime behaviour.
> An occasional (maybe once in a few days) garbage collection and
> associated reduced response time is acceptable.
> A problem only arises if a badly designed component produces
> quite frequently hundreds of thousands of temporary objects
> likely triggering (frequent) garbage collections.

But I think you are barking up the wrong tree. You could rightly blame
GC itself as an unworkable paradigm and switch to, say, C++ or Rust.

Or you could blame the parts of the software that create too many
long-term objects.

You shouldn't blame the parts of the software that churn out zillions of
short-term objects.


Marko


Re: Python indentation (3 spaces)

2018-10-15 Thread Marko Rauhamaa
Chris Angelico :

> On Mon, Oct 15, 2018 at 3:51 PM Marko Rauhamaa  wrote:
> I don't understand your point here. It prints a letter, then some
> spaces, then a tab, then another letter. On my terminal, that displays
> the tab by advancing to the next tab position. If I highlight to
> select, it's obvious that the spaces have not been collapsed or
> converted in any way; it is indeed printing that many spaces, then a
> tab. Universal default? Not very.

The point is that your tab stops are at every 8th column even if some
other tab stops were used in your editor.

Unless you configured your terminal (emulator) with the same tab stops
as your editor.

Then, you'll need to configure your printer, browser and other visual
tools that have the every-8th-column-tab-stop default.


Marko


Re: Python indentation (3 spaces)

2018-10-14 Thread Marko Rauhamaa
Chris Angelico :
> What I'm saying I have never seen is this:
>
> On Mon, Oct 15, 2018 at 8:56 AM Marko Rauhamaa  wrote:
>> However, it is trumped by an older
>> convention whereby the indentation levels go as follows:
>>
>>    0:
>>    1: SPC SPC
>>    2: SPC SPC SPC SPC
>>    3: SPC SPC SPC SPC SPC SPC
>>    4: TAB
>>    5: TAB SPC SPC
>>    6: TAB SPC SPC SPC SPC
>>    7: TAB SPC SPC SPC SPC SPC SPC
>>    8: TAB TAB
>
> Specifically that two-space indents and tab-collapsing are a
> *convention*. I have never seen this used anywhere, much less seen it
> commonly enough to call it a convention.

The two-space indentation is the out-of-the-box default for emacs.
However, the tab collapsing principle is a universal default. If you go
against it, you will have to educate more tools than your editor. For
example, try running this Python snippet (in REPL or as a program):

    for i in range(32):
        print("x{}\ty".format(" " * i))
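
The same every-8th-column rule can be observed without eyeballing a
terminal, because str.expandtabs() applies it too. A small sketch (the
helper name column_of_y is made up for this example):

```python
def column_of_y(spaces, tabstop=8):
    """Column where "y" lands after "x", some spaces and one tab."""
    line = "x{}\ty".format(" " * spaces)
    return line.expandtabs(tabstop).index("y")


for spaces in (0, 3, 6, 7, 14, 15):
    print(spaces, column_of_y(spaces))
```

With up to 6 spaces the "y" lands in column 8; the extra spaces are
simply swallowed by the tab. From 7 spaces on, the tab jumps to the next
stop at column 16, and so on.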


Marko


Re: Python indentation (3 spaces)

2018-10-14 Thread Marko Rauhamaa
Chris Angelico :

> Tabs for indentation have semantic meaning. Top-level has zero tabs.
> One indentation level is represented by one tab. Two indentation
> levels? Two tabs. It's about as perfect a representation as you could
> hope for. If you like your indentation levels to be as wide as four
> spaces, you can have that. I could have them at eight, and it wouldn't
> make a difference. And if someone messes up their code by using tabs
> to align all their comments, reject that code at code review time.
> This ain't rocket science.

That *could* be the situation. However, it is trumped by an older
convention whereby the indentation levels go as follows:

   0:
   1: SPC SPC
   2: SPC SPC SPC SPC
   3: SPC SPC SPC SPC SPC SPC
   4: TAB
   5: TAB SPC SPC
   6: TAB SPC SPC SPC SPC
   7: TAB SPC SPC SPC SPC SPC SPC
   8: TAB TAB

That's how emacs indents source code files out of the box, BTW.

Moreover:

   SPC TAB = TAB
   SPC SPC SPC SPC SPC SPC SPC SPC = TAB

etc.

This older convention is honored by Python2 as well.

This older convention has been honored by many operating systems (at
least UNIX, CP/M and MS-DOS).
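
The table above can be generated mechanically. This is only a sketch of
the convention as described here, with a made-up function name and with
the two-column indent and eight-column tab stop as default parameters:

```python
def indent_prefix(level, width=2, tabstop=8):
    """Whitespace prefix for an indentation level under the older convention:
    as many TABs as fit, then SPCs for the remaining columns."""
    columns = level * width
    return "\t" * (columns // tabstop) + " " * (columns % tabstop)


for level in range(9):
    prefix = indent_prefix(level)
    names = " ".join("TAB" if ch == "\t" else "SPC" for ch in prefix)
    print("{}: {}".format(level, names))
```

Expanding the prefix with str.expandtabs(8) gives back level * 2
columns, which is why the mixed form renders identically to plain
spaces wherever the eight-column default holds.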

If I had to choose between your scheme and the older scheme, I'd choose
the older one.

Instead, I have chosen to banish HT as an unnecessary distraction.

Your scheme also is ad hoc in that it doesn't follow its logic to other
ASCII control characters. Why not use VT to separate methods? Why not
use US to separate operators from operands? Why not use RS to separate
the operands of optional arguments? Why not use GS to separate logical
blocks of code? After all, those schemes would allow people to
personalize the visual representation of more aspects of the source
code.


Marko


Re: ESR "Waning of Python" post

2018-10-13 Thread Marko Rauhamaa
Paul Rubin :
> Note that Java has a lot of [GC] options to choose from:
> https://docs.oracle.com/javase/9/gctuning/available-collectors.htm

I'm all for GC, but Java's GC tuning options are the strongest
counter-argument against it. The options just shift the blame from the
programming language to the operator of the software.

For GC to be acceptable, you shouldn't ever have to tune it. And I've
seen it in action. A customer complains about bad performance. The
system engineer makes a tailored GC recipe to address the issue, which
may help for a short while.

Here's my rule of thumb. Calculate how much memory you need for
long-term objects. Don't let the application exceed that amount.
Multiply the amount by 10 and allocate that much RAM for your
application.

> Another approach is Erlang's, where the application is split into a
> lot of small lightweight processes, each of which has its own GC and
> heap. So while some of them are GC'ing, the rest can keep running.

So the lightweight processes don't share any data. That may be a fine
approach.


Marko


Re: ESR "Waning of Python" post

2018-10-13 Thread Marko Rauhamaa
dieter :
> Marko Rauhamaa  writes:
>> However, I challenge the notion that creating hundreds of thousands of
>> temporary objects is stupid. I suspect that the root cause of the
>> lengthy pauses is that the program maintains millions of *nongarbage*
>> objects in RAM (a cache, maybe?).
>
> Definitely. The application concerned was a long running web application;
> caching was an important feature to speed up its typical use cases.

As an optimization technique, I suggest turning the cache into a "binary
blob" opaque to GC, or using some external component like SQLite.
Keeping the number of long-term objects low is key.
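
As a rough sketch of the SQLite variant: the class below (SqliteCache is
an invented name, not an existing package) keeps cached values pickled
inside an in-memory SQLite database, so they live in SQLite's own memory
rather than as a graph of Python objects for the collector to traverse.
Whether that actually helps a given application would have to be
measured; every lookup now pays for a query and an unpickle.

```python
import pickle
import sqlite3


class SqliteCache:
    """Hypothetical cache that stores pickled values in SQLite."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, value BLOB)")

    def put(self, key, value):
        # The value is flattened to bytes; no Python object graph remains.
        self._db.execute(
            "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
            (key, pickle.dumps(value)))
        self._db.commit()

    def get(self, key, default=None):
        row = self._db.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)).fetchone()
        return default if row is None else pickle.loads(row[0])
```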

Note that Python creates a temporary object every time you invoke a
method. CPython removes them quickly through reference counting, but
other Python implementations just let GC deal with them, and that's
generally ok.
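
The temporary object in question is the bound method. A short sketch
(class and names are made up) showing that each attribute lookup yields
a fresh one; as far as I know, newer CPython versions can skip the
allocation for a direct call like g.hello(), so take this as an
illustration of the object model rather than of the exact cost.

```python
class Greeter:
    def hello(self):
        return "hello"


g = Greeter()
first = g.hello     # looking up the method creates a bound-method object
second = g.hello    # a second lookup creates another one

print(first is second)   # two distinct objects...
print(first == second)   # ...that compare equal (same function, same self)
```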


Marko


Re: ESR "Waning of Python" post

2018-10-12 Thread Marko Rauhamaa
dieter :

> Every system you use has its advantages and its drawbacks.
> Depending on the specific context (problem, resources, knowledge, ...),
> you must choose an appropriate one.

Yep. I use Python for numerous tasks professionally and at home. Just
this past week I used it to plan a junior soccer winter tournament.
Python is used to verify various team and match constraints and
Sudoku-solver-type match order generation.

> Python uses the GIL mainly because it uses reference counting (with
> almost constant changes to potentially concurrently used objects) for
> memory management. Dropping the GIL would mean dropping reference
> counting likely in favour of garbage collection.

Reference counting was likely a bad idea to begin with.

> I work in the domain of web applications. And I made there a nasty
> experience with garbage collection: occasionally, the web application
> stopped to respond for about a minute. A (quite difficult) analysis
> revealed that some (stupid) component created in some situations (a
> search) hundreds of thousands of temporary objects and thereby
> triggered a complete garbage collection. The garbage collector started
> its mark and sweep phase to detect unreachable objects - traversing a
> graph of millions of objects.
>
> As garbage collection becomes drastically more complex if the object
> graph can change during this phase (and this was Python), a global
> lock prevented any other activity -- leading to the observed
> latencies.

Yes. The occasional global freeze is unavoidable in any
garbage-collected runtime environment regardless of the programming
language.

However, I challenge the notion that creating hundreds of thousands of
temporary objects is stupid. I suspect that the root cause of the
lengthy pauses is that the program maintains millions of *nongarbage*
objects in RAM (a cache, maybe?).

> When I remember right, there are garbage collection schemes that
> can operate safely without stopping other concurrent work.

There are heuristics, but I believe the worst case is the same.

> Nevertheless, even those garbage collectors have a significant impact
> on performance when they become active (at apparently
> non-deterministic times) and that may be unacceptable for some
> applications.

If performance is key, Python is probably not the answer. Python's
dynamism makes it necessarily much slower than, say, Java or Go.


Marko


Re: Python indentation (3 spaces)

2018-10-08 Thread Marko Rauhamaa
Thomas Jollans :

> On 08/10/2018 08:31, Marko Rauhamaa wrote:
>> Where I work (and at home), the only control character that is allowed
>> in source code is LF.
>
> No tolerance for form feeds?

None whatsoever.

CR is ok but only if immediately followed by BEL. That way typing source
code gives out the air of an old typewriter.

Highlighting keywords with ANSI escape sequences can also be rather
cute.


Marko


Re: Python indentation (3 spaces)

2018-10-08 Thread Marko Rauhamaa
Chris Angelico :
> How wide my indents are on my screen shouldn't influence your screen
> or your choices.

Where I work (and at home), the only control character that is allowed
in source code is LF.


Marko


Re: How to await multiple replies in arbitrary order (one coroutine per reply)?

2018-10-06 Thread Marko Rauhamaa
Russell Owen :
> I think what I'm looking for is a task-like thing I can create that I
> can end when *I* say it's time to end, and if I'm not quick enough
> then it will time out gracefully. But maybe there's a simpler way to
> do this. It doesn't seem like it should be difficult, but I'm stumped.
> Any advice would be appreciated.

I have experimented with similar questions in the past. FWIW, you can
see my small program here:

   <http://pacujo.net/~marko/philosophers.py>


Occasionally, people post questions here about asyncio, but there are
few answers. I believe the reason is that asyncio hasn't broken through
as a very popular programming model even with Python enthusiasts.

I do a lot of network and system programming where event multiplexing is
essential. There are different paradigms to manage concurrency and
parallelism:

  * multithreading

  * multiprocessing

  * coroutines

  * callbacks from an event loop

Out of these, I prefer callbacks and processes and steer clear of
threads and coroutines. The reason is that in my (long) experience,

   Callbacks and processes make simple problems hard but manageable.
   Callbacks and processes make complex problems hard but manageable.

   Threads and coroutines make simple problems simple to solve.
   Threads and coroutines make complex problems humanly intractable.

Why is this? Threads and coroutines follow the same line of thinking.
They assume that concurrency involves a number of linear state machines
("threads") that operate independently from each other. The "linear"
part means that in each blocking state, the thread is waiting for one
very definite event. When there are a number of possible events in each
state -- which there invariably are -- the multithreading model loses
its attractiveness.
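
To make the callback style concrete, here is a minimal sketch using the
standard selectors module. It is not the philosophers program linked
above; collect_replies and the socket pairs standing in for two peers
are invented for the example. The loop waits for any of the pending
replies at once and dispatches to whichever callback is ready, so the
order of arrival does not matter.

```python
import selectors
import socket


def collect_replies(timeout=1.0):
    """Wait for replies from two peers in whatever order they arrive."""
    selector = selectors.DefaultSelector()
    client_a, server_a = socket.socketpair()
    client_b, server_b = socket.socketpair()
    replies = []

    def make_callback(name, sock):
        def on_readable():
            replies.append((name, sock.recv(1024)))
            selector.unregister(sock)    # this reply is done
        return on_readable

    selector.register(client_a, selectors.EVENT_READ,
                      make_callback("a", client_a))
    selector.register(client_b, selectors.EVENT_READ,
                      make_callback("b", client_b))

    # The "servers" answer in the opposite order from registration.
    server_b.sendall(b"reply-b")
    server_a.sendall(b"reply-a")

    # Event loop: run until every reply is in or nothing happens in time.
    while selector.get_map():
        events = selector.select(timeout)
        if not events:
            break                        # timed out
        for key, _mask in events:
            key.data()                   # invoke the registered callback

    selector.close()
    for sock in (client_a, server_a, client_b, server_b):
        sock.close()
    return replies
```

Adding a timeout or a cancellation event is one more branch in the
loop, which is the kind of "several possible events per state" situation
described above.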


Marko


Re: [OT] master/slave debate in Python

2018-09-26 Thread Marko Rauhamaa
Ian Kelly :

> The terminology should be changed because it's offensive, full stop.
> It may be normalized to many who are accustomed to it, but that
> doesn't make it any less offensive.
>
> Imagine if the terminology were instead "dominant / submissive".
> Without meaning to assume too much, might the cultural context
> surrounding those terms make you feel uncomfortable when using them?
> Would you desire for something else to be used in their place? Well,
> there are plenty of people who feel exactly that way about "master /
> slave".

I'm not a great fan of word taboos.

In particular, you can't ban a word just because someone gets offended
by it.

> Honestly, it's absurd that this is even a debate. Let's just make the
> change and get it over with.

I agree that this debate sounds absurd, satirical even.


Marko


Re: Object-oriented philosophy

2018-09-11 Thread Marko Rauhamaa
Rick Johnson :

> Michael F. Stemper wrote:
>> Object-oriented philosophy
> [...] [...] [...]
>
> So, to make a long story short, you may want to do some
> googling...

Long story short, Michael, Rick is a masterful troll extraordinaire.
Highly amusing when you're in the mood.


Marko


Re: Any SML coders able to translate this to Python?

2018-09-07 Thread Marko Rauhamaa
Marko Rauhamaa :

> def f(n):
>     def auxf1(sum, m, i):
>         if i == n:
>             return sum
>         else:
>             def auxf2(sum, m, i):
>                 if sum % m == 0:
>                     return auxf1(sum, m + 1, i)
>                 else:
>                     return auxf1(sum, m, i)
>             return auxf2(sum + m * i, m, i + 1)
>     return auxf1(0, 1, 0)
>
> cheating slightly with locally named functions.
>
> If cheating is not allowed, you will need a Y combinator construct...

... and here's the same function without cheating:

    f = (lambda n:
         (lambda auxf1, auxf2: auxf1(auxf1, auxf2, 0, 1, 0))(
             lambda auxf1, auxf2, sum, m, i:
             sum
             if i == n else
             auxf2(auxf1, auxf2, sum + m * i, m, i + 1),
             lambda auxf1, auxf2, sum, m, i:
             auxf1(auxf1, auxf2, sum, m + 1, i)
             if sum % m == 0 else
             auxf1(auxf1, auxf2, sum, m, i)))

... or replacing the conditional with arrays:

    f = (lambda n:
         (lambda auxf1, auxf2: auxf1(auxf1, auxf2, 0, 1, 0))(
             lambda auxf1, auxf2, sum, m, i:
             [lambda m: auxf2(auxf1, auxf2, sum + m * i, m, i + 1),
              lambda m: sum][i == n](m),
             lambda auxf1, auxf2, sum, m, i:
             [lambda m: auxf1(auxf1, auxf2, sum, m, i),
              lambda m: auxf1(auxf1, auxf2, sum, m + 1, i)][sum % m == 0](m)))

It is possible to get rid of the arrays and numbers, too. Combinatory
logic would allow us to get rid of the local variables. At the pinnacle
of functional programming is iota:

   <https://en.wikipedia.org/wiki/Iota_and_Jot>

The iota programming language has only one primitive value, the
Swiss-army-knife function ι, which can express Life, Universe and
Everything:

    ι = lambda f: (f(lambda x: lambda y: lambda z: (x(z))(y(z))))(
        lambda q: lambda i: q)
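
As a sanity check on that definition, the sketch below names the S and K
combinators explicitly and rebuilds the identity, K and S from iota
alone. The ASCII name iota stands in for ι, and the derivations
I = ιι, K = ι(ι(ιι)) and S = ι(ι(ι(ιι))) are the standard ones for this
combinator; the other variable names are made up for the example.

```python
# The two classic combinators, curried.
S = lambda x: lambda y: lambda z: (x(z))(y(z))
K = lambda q: lambda i: q

# iota applies its argument to S and then to K.
iota = lambda f: (f(S))(K)

identity = iota(iota)                   # behaves like lambda x: x
k_from_iota = iota(iota(iota(iota)))    # behaves like K
s_from_iota = iota(iota(iota(iota(iota))))  # behaves like S

print(identity(42))
print(k_from_iota("first")("second"))
print(s_from_iota(K)(K)("x"))           # S K K is the identity again
```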


Marko


Re: Any SML coders able to translate this to Python?

2018-09-07 Thread Marko Rauhamaa
Paul Moore :

> On Fri, 7 Sep 2018 at 15:10, Paul Moore  wrote:
>>
>> On Fri, 7 Sep 2018 at 14:06, Steven D'Aprano
>>  wrote:
>> > However I have a follow up question. Why the "let" construct in the
>> > first place? Is this just a matter of principle, "put everything in
>> > its own scope as a matter of precautionary code hygiene"? Because I
>> > can't see any advantage to the inner function:
>>
>> My impression is that this is just functional programming "good
>> style". [...]
>
> It's also worth noting that functional languages don't typically have
> variables or assignments (more accurately, such things aren't
> fundamental to the programming model the way they are in imperative
> languages). So although technically let introduces a new scope, in
> practical terms it's basically just "how functional programmers do
> assignments".

To put it succinctly, SML does it because there's no other way to
introduce local variables.

IOW, in many functional programming languages, the only local variables
are function arguments.

And if you want to go really functional, you can't even alter bindings.
So to implement a for loop in Python under these constraints, you would
implement:

    def f(n):
        sum = 0
        m = 1
        for i in range(n):
            sum += m * i
            if sum % m == 0:
                m += 1
        return sum

like this:

    def f(n):
        def auxf1(sum, m, i):
            if i == n:
                return sum
            else:
                def auxf2(sum, m, i):
                    if sum % m == 0:
                        return auxf1(sum, m + 1, i)
                    else:
                        return auxf1(sum, m, i)
                return auxf2(sum + m * i, m, i + 1)
        return auxf1(0, 1, 0)

cheating slightly with locally named functions.

If cheating is not allowed, you will need a Y combinator construct...


Marko


Re: don't quite understand mailing list

2018-09-06 Thread Marko Rauhamaa
mm0fmf :
> On 06/09/2018 21:06, Ethan Furman wrote:
>> On 09/06/2018 12:42 PM, Reto Brunner wrote:
>>> What do you think the link, which is attached to every email you
>>> receive from the list, is for? Listinfo sounds very promising,
>>> doesn't it?
>>>
>>> And if you actually go to it you'll find: "To unsubscribe from
>>> Python-list, get a password reminder, or change your subscription
>>> options enter your subscription email address"
>>>
>>> So how about you try that?
>>
>> Reto,  your response is inappropriate.  If you can't be kind and/or
>> respectful, let someone else respond.
>
> Seriously if someone has a swanky signature advertising that they are
> a rocket scientist viz. "Software Contractor, Missiles and Fire
> Control" and yet doesn't know what a language runtime is or how
> mailing lists work then they are asking for that kind of reply.

I'm with Ethan on this one.

There was nothing in the original posting that merited ridicule.


Marko


Re: Object-oriented philosophy

2018-09-06 Thread Marko Rauhamaa
"Michael F. Stemper" :

> Since the three classes all had common methods (by design), I
> thought that maybe refactoring these three classes to inherit from
> a parent class would be beneficial. I went ahead and did so.
> (Outlines of before and after are at the end of the post.)
>
> Net net is that the only thing that ended up being common was the
> __init__ methods. Two of the classes have identical __init__
> methods; the third has a superset of that method. The other methods
> all have completely different implementations. This isn't due to
> poor coding, but due to the fact that what these model have
> different physical characteristics.
>
> [...]
>
> Is there really any benefit to this change? Yes, I've eliminated
> some (a few lines per class) duplicate code. On the other hand,
> I've added the parent class and the (probably small, but not
> non-existent) overhead of invoking super().
>
> How does one judge when it's worthwhile to do this and when it's
> not? What criteria would somebody seasoned in OO and python use
> to say "good idea" vs "don't waste your time"?

Ultimately, a seasoned OO programmer might prefer one structure in the
morning and another structure in the afternoon.

Python's ducktyping makes it possible to develop classes to an interface
that is not spelled out as a base class. My opinion is that you should
not define base classes only to show pedigree as you would need to in,
say, Java.

Then, is it worthwhile to define a base class just to avoid typing the
same constructor twice? That's a matter of opinion. You can go either
way. Just understand that the base class doesn't serve a philosophical
role but is simply an implementation utility. Be prepared to refactor
the code and get rid of the base class the moment the commonality
disappears.
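
A small sketch of what "developing to an interface that is not spelled
out" can look like. The classes and the power() method are invented for
illustration and have nothing to do with Michael's actual model:

```python
class Pump:
    def __init__(self, name, rating_kw):
        self.name = name
        self.rating_kw = rating_kw

    def power(self):
        return self.rating_kw


class Heater:
    # Same constructor shape plus one extra field; no shared base class.
    def __init__(self, name, rating_kw, efficiency):
        self.name = name
        self.rating_kw = rating_kw
        self.efficiency = efficiency

    def power(self):
        return self.rating_kw / self.efficiency


def total_power(devices):
    """Works with anything that has a power() method."""
    return sum(device.power() for device in devices)
```

total_power() never asks what the objects are, only that they answer
power(). A common base class could be introduced later to share the
duplicated part of __init__, and removed again, without touching the
callers.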

On the other hand, why is it that the two constructors happen to
coincide? Is it an indication that the two classes have something deeper
in common that the third one doesn't? If it is not a coincidence, go
ahead and give the base class a name that expresses the commonality.

Don't be afraid of "wasting your time" or worry too much about whether
something is a "good idea." Be prepared to massage the code over time as
your understanding and tastes evolve.


Marko


Re: Any SML coders able to translate this to Python?

2018-09-06 Thread Marko Rauhamaa
Chris Angelico :
> The request was to translate this into Python, not to slavishly
> imitate every possible semantic difference even if it won't actually
> affect behaviour.

I trust Steven to be able to refactor the code into something more
likable. His only tripping point was the meaning of the "let" construct.


Marko


Re: Any SML coders able to translate this to Python?

2018-09-06 Thread Marko Rauhamaa
Chris Angelico :

> On Thu, Sep 6, 2018 at 6:44 PM, Marko Rauhamaa  wrote:
> And even more idiomatically, Python doesn't require a new scope just
> for a new variable. So a much more idiomatic translation would be to
> simply ensure that the inner variable can't collide, and then ignore
> the function boundary. And I'm not sure if there even is a name
> collision. What's the issue with scoping at all here? What's the inner
> function actually achieving?

We can debate what the inner function is achieving. The main point is
that functional languages create an inner function for each "let"
statement. Steve was wondering what "let" meant, and the Python code--I
hope--helped illustrate it.

Typical Scheme code is riddled with inner functions. Local variables?
An inner function. A loop? An inner function. Exception handling? An
inner function.

Scheme can also be written procedurally through macros, but those macros
generate inner functions.


Marko


Re: Any SML coders able to translate this to Python?

2018-09-06 Thread Marko Rauhamaa
Chris Angelico :

> On Thu, Sep 6, 2018 at 2:29 PM, Marko Rauhamaa  wrote:
>> Marko Rauhamaa  (Marko Rauhamaa):
>>> Steven D'Aprano :
>>>> I have this snippet of SML code which I'm trying to translate to Python:
>>>>
>>>> fun isqrt n = if n=0 then 0
>>>>  else let val r = isqrt (n/4)
>>>>   in
>>>> if n < (2*r+1)^2 then 2*r
>>>> else 2*r+1
>>>>   end
>>> [...]
>>> You must make sure "r" doesn't leak outside its syntactic context so:
>>>
>>> def isqrt(n):
>>>     if n == 0:
>>>         return 0
>>>     else:
>>>         def f2398478957():
>>>             r = isqrt(n//4)
>>>             if n < (2*r+1)**2:
>>>                 return 2*r
>>>             else:
>>>                 return 2*r+1
>>>         return f2398478957()
>>
>> Actually, this is a more direct translation:
>>
>>    def isqrt(n):
>>        if n == 0:
>>            return 0
>>        else:
>>            def f2398478957(r):
>>                if n < (2*r+1)**2:
>>                    return 2*r
>>                else:
>>                    return 2*r+1
>>            return f2398478957(isqrt(n//4))
>>
>
> I don't understand why you created that nested function instead of
> something simple like renaming the variable. Is there a difference
> here?

Yes, in understanding the semantics of "let."

"let" is used to introduce local bindings in some functional programming
languages. I must admit I'm not fully versed in ML but it looks like the
analogue in Lisp variants. This is how the above function would be
written in Scheme:

   (define (isqrt n)
     (if (= n 0)
         0
         (let ((r (isqrt (quotient n 4))))
           (if (< n (expt (1+ (* 2 r)) 2))
               (* 2 r)
               (1+ (* 2 r))))))

Now, Lisp's "let" can be implemented/defined using "lambda":

   (let ((X A) (Y B) ...) . BODY)

   =>

   ((lambda (X Y ...) . BODY) A B ...)

which gives us:

   (define (isqrt n)
     (if (= n 0)
         0
         ((lambda (r)
            (if (< n (expt (1+ (* 2 r)) 2))
                (* 2 r)
                (1+ (* 2 r))))
          (isqrt (quotient n 4)))))

Python does have a limited form of "lambda" and even a conditional
expression so--as others have mentioned--this particular function could
be translated pretty directly into Python using its lambda.

More generally and idiomatically, though, Python's functions are named.
So that explains the version I give above.
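
For illustration, the lambda-based translation could look like the
sketch below. It is only one possible rendering; math.isqrt (Python 3.8
and later) is used purely as a cross-check and is not part of the
original SML.

```python
import math

def isqrt(n):
    # "let r = isqrt(n // 4) in ..." becomes an immediately applied lambda,
    # so r exists only inside the lambda body.
    return 0 if n == 0 else (
        lambda r: 2*r if n < (2*r + 1)**2 else 2*r + 1
    )(isqrt(n // 4))

# Cross-check against the standard library for small inputs.
assert all(isqrt(n) == math.isqrt(n) for n in range(10_000))
```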


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Any SML coders able to translate this to Python?

2018-09-05 Thread Marko Rauhamaa
Marko Rauhamaa  (Marko Rauhamaa):
> Steven D'Aprano :
>> I have this snippet of SML code which I'm trying to translate to Python:
>>
>> fun isqrt n = if n=0 then 0
>>  else let val r = isqrt (n/4)
>>   in
>> if n < (2*r+1)^2 then 2*r
>> else 2*r+1
>>   end
> [...]
> You must make sure "r" doesn't leak outside its syntactic context so:
>
> def isqrt(n):
>     if n == 0:
>         return 0
>     else:
>         def f2398478957():
>             r = isqrt(n//4)
>             if n < (2*r+1)**2:
>                 return 2*r
>             else:
>                 return 2*r+1
>         return f2398478957()

Actually, this is a more direct translation:

   def isqrt(n):
       if n == 0:
           return 0
       else:
           def f2398478957(r):
               if n < (2*r+1)**2:
                   return 2*r
               else:
                   return 2*r+1
           return f2398478957(isqrt(n//4))


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Any SML coders able to translate this to Python?

2018-09-05 Thread Marko Rauhamaa
Steven D'Aprano :

> I have this snippet of SML code which I'm trying to translate to Python:
>
> fun isqrt n = if n=0 then 0
>  else let val r = isqrt (n/4)
>   in
> if n < (2*r+1)^2 then 2*r
> else 2*r+1
>   end
>
>
> I've tried reading up on SML and can't make heads or tails of the
> "let...in...end" construct.
>
>
> The best I've come up with is this:
>
> def isqrt(n):
>     if n == 0:
>         return 0
>     else:
>         r = isqrt(n/4)
>         if n < (2*r+1)**2:
>             return 2*r
>         else:
>             return 2*r+1
>
> but I don't understand the let ... in part so I'm not sure if I'm doing
> it right.

You must make sure "r" doesn't leak outside its syntactic context so:

   def isqrt(n):
       if n == 0:
           return 0
       else:
           def f2398478957():
               r = isqrt(n//4)
               if n < (2*r+1)**2:
                   return 2*r
               else:
                   return 2*r+1
           return f2398478957()

(Also use // instead of /: isqrt = integer square root.)
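
A quick sketch of why the operator matters, assuming Python 3 division
semantics (the variable names are only illustrative):

```python
# In Python 3, / is true division and always yields a float,
# while // floors the quotient and keeps integers as integers.
true_quotient = 7 / 4
floor_quotient = 7 // 4

print(true_quotient, type(true_quotient).__name__)
print(floor_quotient, type(floor_quotient).__name__)
```

With /, the recursive argument stops being an integer after the first
call, so the n == 0 base case no longer behaves as the SML code intends.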


Marko

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Any SML coders able to translate this to Python?

2018-09-04 Thread Marko Rauhamaa
Steven D'Aprano :

> I have this snippet of SML code which I'm trying to translate to Python:
>
> fun isqrt n = if n=0 then 0
>  else let val r = isqrt (n/4)
>   in
> if n < (2*r+1)^2 then 2*r
> else 2*r+1
>   end
>
>
> I've tried reading up on SML and can't make heads or tails of the 
> "let...in...end" construct.
>
>
> The best I've come up with is this:
>
> def isqrt(n):
>     if n == 0:
>         return 0
>     else:
>         r = isqrt(n/4)
>         if n < (2*r+1)**2:
>             return 2*r
>         else:
>             return 2*r+1
>
> but I don't understand the let ... in part so I'm not sure if I'm doing 
> it right.

You must make sure "r" doesn't leak outside its syntactic context so:

   def isqrt(n):
       if n == 0:
           return 0
       else:
           def f2398478957():
               r = isqrt(n//4)
               if n < (2*r+1)**2:
                   return 2*r
               else:
                   return 2*r+1
           return f2398478957()

(Also use // instead of /: isqrt = integer square root.)


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: bytes() or .encode() for encoding str's as bytes?

2018-08-31 Thread Marko Rauhamaa
Malcolm Greene :
> Is there a benefit to using one of these techniques over the other?
> Is one approach more Pythonic and preferred over the other for
> style reasons?
> message = message.encode('UTF-8')
> message = bytes(message, 'UTF-8')

I always use the former. I wonder why that is. I guess the aesthetic
rule is something along the lines: use a dot if you can.
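
For what it's worth, the two spellings give the same result. A minimal
check, with a made-up sample string:

```python
message = "naïve café"   # sample text, chosen only for its non-ASCII characters

encoded_with_method = message.encode("UTF-8")
encoded_with_constructor = bytes(message, "UTF-8")

# Both spellings produce the same bytes object.
assert encoded_with_method == encoded_with_constructor
print(encoded_with_method)
```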


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-21 Thread Marko Rauhamaa
Gregory Ewing :

> Marko Rauhamaa wrote:
>> Lexically, there is special access:
>>
>>    class C:
>>        def __init__(self, some, arg):
>>            c = self
>>            class D:
>>                def method(self):
>>                    access(c)
>>                    access(some)
>>                    access(arg)
>
> [...]
>
> you can do that without creating a new class every time you want an
> instance. You just have to be *slightly* more explicit about the link
> between the inner and outer instances.

By "*slightly* more explicit," do you mean more syntactic clutter?

Because of course you replace inner classes and closures with top-level
classes and methods of top-level classes.

And of course, I would prefer not to create a class for a singleton
object:

   class C:
       def __init__(self, some, arg):
           c = self
           self.d = object:
               def method(self):
                   access(c)
                   access(some)
                   access(arg)

Unfortunately, there is no such syntax in Python.
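
The nearest working spelling is presumably to define a throwaway class
and instantiate it on the spot. A sketch, in which access() is a
hypothetical stand-in that merely records what it was given:

```python
seen = []

def access(value):
    # Hypothetical stand-in for "do something with the value".
    seen.append(value)

class C:
    def __init__(self, some, arg):
        c = self

        class D:
            def method(self):
                access(c)
                access(some)
                access(arg)

        # The class exists only to produce this one object.
        self.d = D()

outer = C("some", "arg")
outer.d.method()
assert seen[0] is outer and seen[1:] == ["some", "arg"]
```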


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-20 Thread Marko Rauhamaa
Dan Sommers :

> On Mon, 20 Aug 2018 14:39:38 +, Steven D'Aprano wrote:
>> I have often wished Python had proper namespaces, so I didn't have to
>> abuse classes as containers in this way :-(
>> 
>> (Not that I do this using "inner classes", but I do often want to use
>> a class as a container for functions, without caring about "self" or
>> wrapping everything in staticmethod.)
>
> Isn't that what modules are for?  (I suspect that I'm missing something,
> because I also suspect that you knew/know that.)

What's the syntax for creating an inner module...?


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-20 Thread Marko Rauhamaa
Steven D'Aprano :

> On Mon, 20 Aug 2018 11:40:16 +0300, Marko Rauhamaa wrote:
>
>>    class C:
>>        def __init__(self, some, arg):
>>            c = self
>>            class D:
>>                def method(self):
>>                    access(c)
>>                    access(some)
>>                    access(arg)
>> 
>> IOW, inner class D is a container for a group of interlinked closure
>> functions.
>
> If a class' methods don't use self, it probably shouldn't be a class.
>
> I have often wished Python had proper namespaces, so I didn't have to 
> abuse classes as containers in this way :-(
>
> (Not that I do this using "inner classes", but I do often want to use a 
> class as a container for functions, without caring about "self" or 
> wrapping everything in staticmethod.)

Yes, the reason to use a class is that there is no handier way to create
a method dispatch or a singleton object.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Writing bytes to stdout reverses the bytes

2018-08-20 Thread Marko Rauhamaa
Thomas Jollans :

> On 2018-08-20 04:22, Chris Angelico wrote:
>> On Mon, Aug 20, 2018 at 12:01 PM, Grant Edwards
>>  wrote:
>>> On 2018-08-20, Ben Bacarisse  wrote:
>>>> It is if you run it as hd.
>>> What do you mean "run it as hd"?
>>> I don't have an "hd" in my path.
>> Your system is different from mine, then.
> Wonderful. Now why don't we all forget about hexdump and use xxd? ;-)
>
> FWIW, I have an Ubuntu system with hd, and an SL7 system without.

Fedora:

   $ xxd
   bash: xxd: command not found
   $ hd
   bash: hd: command not found
   $ od -Ax -tx1z -v <

Re: Pylint false positives

2018-08-20 Thread Marko Rauhamaa
Gregory Ewing :
> Marko Rauhamaa wrote:
>> Some of these chores are heavier, some of them are lighter. But where
>> I have used Python, performance hasn't been a bottleneck. If it were,
>> I'd choose different approaches of implementation.
>
> The point is that creating a class object every time you want a
> closure is pointlessly wasteful. There is *no benefit whatsoever* in
> doing that. If you think there is, then it's probably because you're
> trying to write Java programs in Python.

The benefit, as in using closures in general, is in the idiom.

>> But now I'm thinking the original Java approach (anonymous inner
>> classes) is probably the most versatile of them all. A single
>> function rarely captures behavior. That's the job of an object with
>> its multiple methods. In order to create an ad-hoc object in
>> Python, you will need an ad-hoc class.
>
> An important difference between Python and Java here is that in Python
> the class statement is an *executable* statement, whereas in Java it's
> just a declaration. So putting a class statement inside a Python
> function incurs a large runtime overhead that you don't get with a
> Java inner class.

The same is true for inner def statements.

I don't see how creating a class would be fundamentally slower to
execute than, say, adding two integers. It may be that CPython has not
been optimized for the inner-class use case. And it may be that
Python's data model has painted CPython into a corner, which then would
call the data model into question.

Anyway, in practice on my laptop it takes 7 µs to execute a class
statement, which is clearly worse than executing a def statement (0.1
µs) or integer addition (0.05 µs). However, 7 microseconds is the least
of my programming concerns.
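
A rough way to reproduce that kind of measurement is sketched below. The
helper names are made up, the absolute numbers depend on the machine and
the Python version, and each figure includes the overhead of one function
call, so only the relative order is meaningful.

```python
import timeit

def make_class():
    # Executes a class statement on every call.
    class D:
        def method(self):
            return None
    return D

def make_function():
    # Executes a def statement on every call.
    def g():
        return None
    return g

def add(a=1, b=2):
    # Baseline: one integer addition plus call overhead.
    return a + b

N = 20_000
timings = {}
for label, fn in [("class", make_class), ("def", make_function), ("add", add)]:
    timings[label] = timeit.timeit(fn, number=N) / N
    print(f"{label:>5}: {timings[label] * 1e6:.3f} µs per call")
```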


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-20 Thread Marko Rauhamaa
Gregory Ewing :

> Marko Rauhamaa wrote:
>> At least some of the methods of inner classes are closures (or there
>> would be no point to an inner class).
>
> In Python there is no such thing as an "inner class" in the Java
> sense. You can nest class statements, but that just gives you
> a class that happens to be an attribute of another class.
> Nothing in the nested class has any special access to anything
> in the containing class.

Lexically, there is special access:

   class C:
       def __init__(self, some, arg):
           c = self
           class D:
               def method(self):
                   access(c)
                   access(some)
                   access(arg)

IOW, inner class D is a container for a group of interlinked closure
functions.
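
One way to see the closure is to inspect the method's code object. In
this sketch the method simply returns the captured values so the result
can be checked; that detail is an assumption made for the demonstration.

```python
class C:
    def __init__(self, some, arg):
        c = self

        class D:
            def method(self):
                # c, some and arg are free variables captured from __init__.
                return c, some, arg

        self.D = D

obj = C(1, 2)
free_names = obj.D.method.__code__.co_freevars
cells = obj.D.method.__closure__

print(sorted(free_names))
assert obj.D().method() == (obj, 1, 2)
```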


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-19 Thread Marko Rauhamaa
Chris Angelico :

> On Sun, Aug 19, 2018 at 11:54 PM, Marko Rauhamaa  wrote:
>>> 3) Every invocation of method() has to execute the class body, which
>>> takes time.
>>
>> That's what happens with every method invocation in Python regardless.
>
> No. You have to execute the *class body*. Every method invocation has
> to execute a function body. Yours includes a class definition.
> Remember: The 'class' statement is NOT a declaration. It is an
> executable statement.

Sorry, I was being imprecise.

Obviously, whenever you execute a def statement, you have to execute a
def statement. Similarly, whenever you execute a class statement, you
have to execute a class statement. And executing a class statement
involves certain chores. Instantiating an object involves a different set
of chores. And invoking a method involves yet another set of chores.

Some of these chores are heavier, some of them are lighter. But where I
have used Python, performance hasn't been a bottleneck. If it were, I'd
choose different approaches of implementation.

> Oh, even easier then. You don't need any of this locality. Which means
> the inner class is buying you a whole lot of nothing.

When I first ran into Java's anonymous inner classes, I wondered why
they hadn't simply introduced anonymous functions. In fact, the whole
debacle ended up in a schism where C# forked out of Java and introduced
delegates, which are a neat concept and should have been there in C++
from the get-go. In their arms race both C# and Java finally introduced
lambdas.

But now I'm thinking the original Java approach (anonymous inner
classes) is probably the most versatile of them all. A single function
rarely captures behavior. That's the job of an object with its multiple
methods. In order to create an ad-hoc object in Python, you will need
an ad-hoc class.

> Hmm... so it's fine to create a class at run time, but it's not okay
> to define its methods at run time.

Yes.

> I'm seriously confused here as to what you gain by that. How is it of
> value to create a new class at run time, but to require that all its
> methods and attributes be hand-written in the source code, and thus
> completely fixed?

Not completely fixed because the methods of the inner class can refer to
the variables of the outer scope.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-19 Thread Marko Rauhamaa
Chris Angelico :

> On Sun, Aug 19, 2018 at 10:28 PM, Marko Rauhamaa  wrote:
>> The most useful use of inner classes is something like this:
>>
>> class Outer:
>>     def method(self):
>>         outer = self
>>
>>         class Inner:
>>             def spam(self, a, b):
>>                 outer.quarantine(a, b)
>>
>>         return Inner()
>
> That's pretty inefficient.

Hasn't become an issue for me.

> I'm not sure what you gain by having Inner be local to that method,

It's a practical way of implementing the state pattern
(https://en.wikipedia.org/wiki/State_pattern). Your outer object
behaves differently in different states, and the inner object
encapsulates the behavior differences.
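
A minimal sketch of the idea follows. Connection, its states and its
methods are all invented for the example; the point is only that each
state object is built from an inner class whose methods close over the
outer object.

```python
class Connection:
    def __init__(self):
        self.log = []
        self.state = self._closed()

    def _closed(self):
        outer = self

        class Closed:
            def send(self, data):
                outer.log.append("dropped " + data)

            def open(self):
                # Switch the outer object to its other behavior.
                outer.state = outer._opened()

        return Closed()

    def _opened(self):
        outer = self

        class Opened:
            def send(self, data):
                outer.log.append("sent " + data)

            def open(self):
                pass  # already open

        return Opened()

conn = Connection()
conn.state.send("a")
conn.state.open()
conn.state.send("b")
assert conn.log == ["dropped a", "sent b"]
```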

Java has long used the pattern to implement event listeners:
http://www.fredosaurus.com/notes-java/GUI/events/anonymous_listener.html

> but here's what you lose:
>
> 1) Each instance of Inner carries with it a large class object.

Again, that hasn't been an issue for me in practice.

> 2) You can't identify these objects as being of the same type (since
> they're not).

That's a feature, not a bug. Type membership checking goes against
duck-typing.

> 3) Every invocation of method() has to execute the class body, which
> takes time.

That's what happens with every method invocation in Python regardless.

> At the price of a very small amount of encapsulation, you can make it
> actually an inner class:
>
> class Outer:
>     class Inner:
>         def __init__(self, outer):
>             self.spam = outer.quarantine
>
>     def method(self):
>         return self.Inner(self)

Sure, there are ways to avoid closures, but the expressive price is
usually higher than the supposed performance gain.

> Now all instances of Inner are actually instances of the same type.
> It's still local to the class, just not local to that method.

Locality to the class is usually not worth the trouble. It's good enough
to have names local to the module.

> None of this explains your aversion to creating functions in a loop at
> class scope, while still being perfectly happy doing so at function
> scope.

It had to do with populating a namespace programmatically using strings
as field/method names (which, generally, you shouldn't be doing).
Defining functions and classes dynamically during runtime is perfectly
ok.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-19 Thread Marko Rauhamaa
Steven D'Aprano :

> On Sun, 19 Aug 2018 11:43:44 +0300, Marko Rauhamaa wrote:
>> At least some of the methods of inner classes are closures (or there
>> would be no point to an inner class).
>
> [...]
>
> (2) Whether or not the methods of an inner class are closures depends on 
> the methods, not the fact that it is an inner class. There are no 
> closures here:
>
> class Outer:
>     class Inner:
>         ...
>
> no matter what methods Inner has. Nor is this a closure:
>
> class Outer:
>     def method(self):
>         class Inner:
>             def spam(self):
>                 return self.eggs
>         return Inner

The most useful use of inner classes is something like this:

   class Outer:
       def method(self):
           outer = self

           class Inner:
               def spam(self, a, b):
                   outer.quarantine(a, b)

           return Inner()

> You made a vague comment about inner classes being equivalent to
> closures in some unknown fashion, but inner classes are not themselves
> closures, and the methods of inner classes are not necessarily
> closures.

I hope the above outline removes the vagueness.

>>>> populating an object with fields (methods) in a loop is very rarely
>>>> a good idea.
>>>
>>> Of course it is *rarely* a good idea
>> 
>> So no dispute then.
>
> Isn't there? Then why are you disagreeing with me about the
> exceptional cases where it *is* a good idea?

I don't know which statement of mine you are referring to exactly now.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: printing to stdout

2018-08-19 Thread Marko Rauhamaa
richard lucassen :
> As I'm new to Python, just this question: are there any unPythony
> things in this code?

Your code looks neat.

>   except IOError:
> print ("[ALERT] I/O problem device 0x%x" % list_pcf[i])

Just double check that simply printing the alert is the correct recovery
from the exception. Should there be additional logic instead of the
fallthrough?


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-19 Thread Marko Rauhamaa
Chris Angelico :
> On Sun, Aug 19, 2018 at 9:03 AM, Marko Rauhamaa  wrote:
>> Chris Angelico :
>>
>>> *headscratch*
>>>
>>> So this is okay:
>>>
>>> def f():
>>>     for i in range(5):
>>>         def g(): ...
>>>
>>> But this isn't:
>>>
>>> class C:
>>>     for i in range(5):
>>>         def m(self): ...
>>>
>>> I've missed something here.
>>
>> No, you got it right.
>
> Then I've completely missed the problem. Why is one of them acceptable
> and the other not?

In the def-def case, you will do something mundane with g. For example,
you will register it as a callback.

In the class-def case, you are defining the method m five times in the
same namespace and overwriting all but one of the definitions, which
probably isn't what you are after.
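
The overwriting is easy to demonstrate. In this sketch the default
argument i=i is added only so that each definition remembers which loop
iteration produced it:

```python
class C:
    for i in range(5):
        def m(self, i=i):   # default argument freezes the current i
            return i

# Only the last definition survives; the loop variable also
# lingers as a class attribute.
assert C().m() == 4
assert C.i == 4
```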

In order to populate the class with methods of different names, you will
need to manipulate the namespace programmatically. If you find yourself
needing to do something like that, you need to take a couple of steps
back and ask yourself if there might be a more conventional way to solve
the problem at hand.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-19 Thread Marko Rauhamaa
Steven D'Aprano :

> On Sun, 19 Aug 2018 00:11:30 +0300, Marko Rauhamaa wrote:
>
>> In Python programming, I mostly run into closures through inner classes
>> (as in Java).
>
> Inner classes aren't closures.

At least some of the methods of inner classes are closures (or there
would be no point to an inner class).

> It's also quite expensive to be populating your application with lots
> of classes used only once each, which is a common pitfall when using
> inner classes. Memory is cheap, but it's not so cheap that we ought to
> just profligately waste it needlessly.

That is a completely separate question.

There is no a priori reason for inner classes to be wasteful; they
have been part and parcel of Java programming from its early days, and
Java is widely used for high-performance applications.

CPython does use memory quite liberally. I don't mind that as
expressivity beats performance in 99% of programming tasks.

>> populating an object with fields (methods) in a loop is very rarely a
>> good idea.
>
> Of course it is *rarely* a good idea

So no dispute then.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-18 Thread Marko Rauhamaa
Chris Angelico :

> *headscratch*
>
> So this is okay:
>
> def f():
>     for i in range(5):
>         def g(): ...
>
> But this isn't:
>
> class C:
>     for i in range(5):
>         def m(self): ...
>
> I've missed something here.

No, you got it right.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-18 Thread Marko Rauhamaa
Chris Angelico :
> Your acceptance of closures is a perfect proof of how magic stops
> looking like magic once you get accustomed to it.

Actually, that's a very good observation. You should stick with a
smallish kernel of primitives and derive the universe from them.

Anyway, functions as first-class objects are truly foundational in all
high-level programming. In Python programming, I mostly run into
closures through inner classes (as in Java).

> If you can accept closures because they just DTRT, why not accept a
> much simpler and more obvious operation like putting a 'def' statement
> in a loop?

Nothing wrong or extraordinary with putting a def statement in a loop,
but populating an object with fields (methods) in a loop is very rarely
a good idea.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-18 Thread Marko Rauhamaa
Steven D'Aprano :
>> In a word, steer clear of metaprogramming.
>
> [...]
> (2) if you mean what you say, that means no decorators,

Correct. I don't find decorators all that useful or tasteful.

> no closures,

Closures I consider ordinary programming. Nothing meta there.

> no introspection ("reflection" in Java terms),

Introspection is suspect in general.

> no metaclasses (other than type),

Correct.

> no use of descriptors (other than the built-in ones),

Haven't used (or at least defined) them.

> no template-based programming,

Please, no.

> no source-code generators.

Lisp-style macros (or scheme syntax rules) are rather a clean way to do
that, but even that mechanism should be used very sparingly and
tastefully.

> No namedtuples, Enums, or data-classes.

They don't seem all that meta to me, but coincidentally I never found
uses for them in my Python code.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Pylint false positives

2018-08-17 Thread Marko Rauhamaa
Chris Angelico :
> Programming is heavily about avoiding duplicated work.

That is one aspect, but overcondensing and overabstracting programming
logic usually makes code less obvious to its maintainer. It is essential
to find coding idioms that communicate ideas as clearly as possible. In
some situations boilerplate and redundancy can help make the code more
robust, as not every line of code becomes a clever brainteaser.

> Creating methods is work. Creating many identical (or similar) methods
> is duplicated work. What's wrong with using a loop to create
> functions?

I would guess such techniques could come in handy in some framework
development but virtually never in ordinary application development. In
a word, steer clear of metaprogramming.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: >< swap operator

2018-08-14 Thread Marko Rauhamaa
skybuck2...@hotmail.com:
> On Monday, August 13, 2018 at 10:01:37 PM UTC+2, Léo El Amri wrote:
>> On 13/08/2018 21:54, skybuck2...@hotmail.com wrote:
>> > I just had a funny idea how to implement a swap operator for types:
>> > 
>> > A >< B
>> > 
>> > would mean swap A and B.
>> 
>> I think that:
>> 
>> a, b = b, a
>> 
>> is pretty enough
>
> LOL.
>
> A >< B is shorter !

Don't forget the

   (A >:< B)

operator:

   def fib(n):
       curr = prev = 1
       for _ in range(n):
           (curr >:< prev) += prev
       return curr

Obviously, we see immediately the need for the Pythonic improvement:

   def fib(n):
       curr = prev = 1
       for _ in range(n):
           curr >+=< prev
       return curr


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Non-unicode file names

2018-08-09 Thread Marko Rauhamaa
INADA Naoki :

> For Python 3.6, I think best way to allow arbitrary bytes on stdout is
> using `PYTHONIOENCODING=utf-8:surrogateescape` environment variable.

Good info!
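
What the surrogateescape handler does can be sketched without touching
stdout at all. The file name below is made up; the codec calls are the
standard ones:

```python
raw_name = b"caf\xe9.txt"   # Latin-1 bytes, not valid UTF-8

# Undecodable bytes become lone surrogates instead of raising an error...
name = raw_name.decode("utf-8", "surrogateescape")

# ...and encoding with the same handler restores the original bytes.
round_tripped = name.encode("utf-8", "surrogateescape")
assert round_tripped == raw_name

# A strict encoder, by contrast, refuses the lone surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print(strict_ok)
```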


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: RFC -- custom operators

2018-08-07 Thread Marko Rauhamaa
Steven D'Aprano :

> (1) This proposal requires operators to be legal identifiers, 
> such as "XOR" or "spam", not punctuation like % and
> absolutely not Unicode symbols like ∉

Oh, that's a let-down. Operator symbols get their expressive value from
visual conciseness:

   life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}

   https://en.wikipedia.org/wiki/APL_(programming_language)#Game_of_Life

> (If there aren't any such use-cases, then there's no need for custom
> operators.)
>
> Thoughts?

I have never felt the need for custom operators in Python code. I
believe introducing them will make it harder, not easier, to read code.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Checking whether type is None

2018-07-26 Thread Marko Rauhamaa
Steven D'Aprano :
> On Wed, 25 Jul 2018 16:14:18 +, Schachner, Joseph wrote:
>> thing is None looks just as odd to me. Why not thing == None ? That
>> works.
>
> It is wrong (in other words, it doesn't work) because it allows
> non-None objects to masquerade as None and pretend to be what they are
> not.
>
> If that's your intent, then of course you may do so. But without a
> comment explaining your intent, don't be surprised if more experienced
> Python programmers correct your "mistake".

Also, while

   thing == None

would work perfectly in almost all cases in practice, it's unidiomatic
and suggests the writer isn't quite comfortable with the workings of the
language.
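
The masquerading Steven mentions can be shown with a contrived class
(Impostor is invented for the example):

```python
class Impostor:
    def __eq__(self, other):
        return other is None   # claims equality with None

thing = Impostor()
equals_none = (thing == None)   # the overridden __eq__ says yes
is_none = (thing is None)       # identity cannot be faked

assert equals_none
assert not is_none
```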

Somewhat similarly, this code works perfectly:

   while (x > 0):
       y = y * x
       x = x - 1
   # end of while
   return(y)

but it doesn't look like Python.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: coding style - where to declare variables

2018-07-23 Thread Marko Rauhamaa
Steven D'Aprano :

> Lambda calculus has the concept of a binding operator, which is
> effectively an assignment operator: it takes a variable and a value
> and binds the value to the variable, changing a free variable to a
> bound variable. In other words, it assigns the value to the variable,
> just like assignment does.

In traditional Lambda Calculus semantics, there are no values at all.
There are only well-formed formulas and syntactic transformation
rules. You could view it as a macro preprocessing system where you keep
transforming the formula until no transformation rule applies.

Yes, λ can be viewed as a binding operator although classically, it is
simply a dead symbol just like '(', '.' and 'z'.

> Especially in this case. Anyone who understands lambda calculus is
> unlikely to be confused by Python using the same terms to mean
> something *almost identical* to what they mean in lambda calculus.
> (The only difference I can see is that lambda calculus treats
> variables as abstract mathematical entities, while Python and other
> programming languages vivify them and give them a concrete
> implementation.)
>
> If one in ten thousand programmers are even aware of the existence of
> lambda calculus, I would be surprised. To give up using perfectly
> good, accurate terminology in favour of worse, less accurate
> terminology in order to avoid unlikely and transient confusion among a
> minuscule subset of programmers seems a poor tradeoff to me.

The lambda calculus comment is just an aside. The main point is that
you shouldn't lead people to believe that Python has variables that are
any different than, say, Pascal's variables (even if you, for whatever
reason, want to call them "names"). They are memory slots that hold
values until you assign new values to them.

It *is* true that Python has a more limited data model than Pascal (all
of Python's values are objects in the heap and only accessible through
pointers). Also, unlike Pascal, variables can hold (pointers to) values
of any type. IOW, Python has the data model of Lisp.

Lisp talks about binding and rebinding variables as well:

   https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node79.html

which might be Lambda Calculus legacy, but at least they are not shy to
talk about variables and assignment.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: coding style - where to declare variables

2018-07-23 Thread Marko Rauhamaa
Ben Finney :
> Gregory Ewing  writes:
>
>> Marko is asking us to stop using the word "binding" to refer to
>> assignment because of the potential confusion with this other meaning.
>
> That's about as reasonable as my request that we stop using the term
> “variable” for what is, in Python, an un-typed reference to an object.
>
> I expect both of these requests to meet with little satisfaction.

I'm actually not asking, only wishing.

People new to Python are unnecessarily confused by talking about names
and binding when it's really just ordinary variables and assignment. It
seems to be mostly some sort of marketing lingo that seeks to create an
air of mystique around Python.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: coding style - where to declare variables

2018-07-23 Thread Marko Rauhamaa
Dennis Lee Bieber :
> On Mon, 23 Jul 2018 00:08:00 +0300, Marko Rauhamaa 
> declaimed the following:
>
>>In Java terms, all Python values are boxed. That's a very usual pattern
>>in virtually all programming languages (apart from FORTRAN).
>
>   FORTRAN, C, COBOL, BASIC, Pascal, ALGOL, BCPL, REXX, VMS DCL, probably
> R, Matlab, APL.
>
>   I never encountered the term "boxed" until trying to read some of the
> O'Reilly books on Java.
>
>   In my world, Java and Python are the ones that are not "common".

Yes, "boxed" is a Java term. However, the programming pattern of using
dynamic memory and pointers is ubiquitous and ancient:

FILE *f = fopen("xyz", "r");

where f holds a pointer, fopen() returns a pointer, and "xyz" and "r"
evaluate to pointer values.

In Python, every expression evaluates to a pointer and every variable
holds a pointer.
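
A small illustration of that model, using only built-ins:

```python
a = [1, 2, 3]
b = a              # copies the reference, not the list
b.append(4)

# Both variables point at the same heap object.
assert a is b
assert id(a) == id(b)
assert a == [1, 2, 3, 4]
```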


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: coding style - where to declare variables

2018-07-22 Thread Marko Rauhamaa
Richard Damon :

>> On Jul 22, 2018, at 3:50 PM, Marko Rauhamaa  wrote:
>> I wish people stopped talking about "name binding" and "rebinding,"
>> which are simply posh synonyms for variable assignment. Properly, the
>> term "binding" comes from lambda calculus, whose semantics is defined
>> using "bound" and "free" variables. Lambda calculus doesn't have
>> assignment.
>
> Marko, I think the term binding makes sense in python due to how names
> work. In python and the following code:
>
> X = getit()
> Y = X
> X.changeit()
>
> In python, presuming getit() returns some form of object (so it has a
> changeit() member) then X and Y are bound to the same object, and
> changeit() will thus also affect the object that we see at Y.

Would you call it binding in this case:

   X[0]["z"] = getit()
   X[3]["q"] = X[0]["z"]
   X[0]["z"].changeit()

I think what you are talking about is more usually called "referencing."

> With a language with more ‘classical’ variable, the assignment of Y =
> X would normal make a copy of that object, so the value Y does not get
> changed by X.changeit().

In Java terms, all Python values are boxed. That's a very usual pattern
in virtually all programming languages (apart from FORTRAN).


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: coding style - where to declare variables

2018-07-22 Thread Marko Rauhamaa
Bart :
> If you did need one of those others to be variable, then you just assign
> it to a variable the rare times you need to do that. For example:
>
>   def fn1(): pass
>   def fn2(): pass
>
>   fn = fn1 if cond else fn2
>
> fn1, fn2 will always be functions. fn will always be a variable, but one
> that can change between referring to fn1, fn2 or anything else.

In high-level programming languages, functions are ordinary values. You
can perform similar operations on functions as on integers or strings.
You can give me two functions, and I can use those two to create a
third function:

   def compose(f1, f2):
       def composition(x):
           return f1(f2(x))
       return composition

Here "compose", "composition", "f1" and "f2" are variables:

 * "compose" gets assigned when the first, outer "def" statement is
   executed,

 * "f1" and "f2" get assigned when the function held by "compose" is
   called,

 * "composition" gets assigned when the inner "def" statement is
   executed.
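
A short usage sketch of the same idea. The helper functions `double` and `increment` are made up for the example, and the rebinding at the end is only there to show that "compose" behaves like any other variable:

```python
def compose(f1, f2):
    def composition(x):
        return f1(f2(x))
    return composition


def double(x):
    return 2 * x


def increment(x):
    return x + 1


double_after_increment = compose(double, increment)
first = double_after_increment(5)    # double(increment(5))

# "compose" is an ordinary variable: it can be copied and reassigned.
original = compose
compose = lambda f1, f2: f2          # rebind the name to a different function
second = compose(double, increment)(5)
compose = original                   # restore the original value
```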


While FORTRAN or C couldn't operate on functions like this, an assembly
language program could easily. Simply compose a CPU instruction sequence
on the fly, mark it executable and use the "CALL" opcode to transfer
control to your constructed function.

In the same vein, you could understand the "def" statement as a runtime
compiler that takes the function body, compiles it into machine language
and assigns the start address to the given variable. In fact, that would
be a perfectly working way to implement "def." Whether it would be a
smart thing to do is a different question. Key is, though, that "def"
always creates a new *data* object that can be called as *executable
code*.
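
That "def" creates a fresh object on every execution can be observed directly. This is a sketch with made-up names; the remark about the shared code object describes what CPython does and is not promised by the language:

```python
def make():
    def inner():               # this "def" runs each time make() is called
        return 42
    return inner


f = make()
g = make()

distinct_objects = f is not g            # two separate function objects
same_behaviour = f() == g() == 42

# In CPython both functions wrap the one compiled code object.
shares_code = f.__code__ is g.__code__

# Functions are data: they can carry attributes like other objects.
f.tag = "first"
g_has_tag = hasattr(g, "tag")
```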


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: coding style - where to declare variables

2018-07-22 Thread Marko Rauhamaa
r...@zedat.fu-berlin.de (Stefan Ram):
>>Rebinding names is near-universal in programming, but usually names
>>that are intended to be rebound, such as variables.
>
>   To someone like me who has grown up with a LISP 1
>   this is completely natural.
>
> |>( SETQ A ( LAMBDA () 'ALPHA ))
> |(LAMBDA () 'ALPHA)

"Setq" is the dirty little secret of LISP.

Scheme marks its shameful primitives with an exclamation mark. Thus, its
assignment primitive is "set!".

I wish people stopped talking about "name binding" and "rebinding,"
which are simply posh synonyms for variable assignment. Properly, the
term "binding" comes from lambda calculus, whose semantics is defined
using "bound" and "free" variables. Lambda calculus doesn't have
assignment.

More about variable binding here: https://en.wikipedia.org/wiki/Free_variables_and_bound_variables


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [OT] Bit twiddling homework

2018-07-20 Thread Marko Rauhamaa
Chris Angelico :

> On Sat, Jul 21, 2018 at 1:14 AM, Grant Edwards
>  wrote:
>> I refuse to believe there's an extant processor in common use where
>> an ADD is faster than an OR unless somebody shows me the processor
>> spec sheet.
>
> "Faster than"? I'd agree with you. But "as fast as"? I believe that's
> how most modern CPUs already operate. (Well, mostly.) There are
> sophisticated methods of daisy-chaining the carry bit that mean the
> overall addition can be performed remarkably quickly, and the end
> result is a one-clock ADD operation, same as OR. For most data, most
> code, and most situations, integer addition is exactly as fast as
> integer bit shift.

I'm guessing the clock speed is adjusted for the longest propagation
delays. According to

   https://en.wikipedia.org/wiki/Carry-lookahead_adder#Implementation_details

the maximal gate delay of a 16-bit carry-lookahead-adder is 8 gate
delays. A 64-bit addition results in some more delay:

   https://en.wikipedia.org/wiki/Lookahead_carry_unit#64-bit_adder
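
For readers who want to see the signals involved, here is a rough software sketch of the generate/propagate carry recurrence that a carry-lookahead adder is built on. The function name and the 16-bit default are assumptions for illustration. The loop evaluates the recurrence sequentially, whereas the hardware evaluates its expanded form in parallel, so this says nothing about actual gate delays:

```python
def add_with_carry_signals(a, b, width=16):
    """Add two unsigned integers using generate/propagate signals."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    g = a & b      # generate: both input bits set, a carry comes out regardless
    p = a ^ b      # propagate: exactly one input bit set, a carry passes through

    carry = 0
    carries = 0    # bit i holds the carry going into position i
    for i in range(width):
        carries |= carry << i
        gi = (g >> i) & 1
        pi = (p >> i) & 1
        carry = gi | (pi & carry)    # c[i+1] = g[i] OR (p[i] AND c[i])

    total = (p ^ carries) & mask     # sum bit = a XOR b XOR carry-in
    return total, carry
```

Calling `add_with_carry_signals(0xFFFF, 1)` exercises the worst case, where a carry travels through every bit position.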


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-18 Thread Marko Rauhamaa
Antoon Pardon :

> On 17-07-18 14:22, Marko Rauhamaa wrote:
>> If you assume that NFC normalizes every letter to a single codepoint
>> (and carefully use NFC everywhere), you are right. But equally likely
>> you may inadvertently be setting yourself up for a surprise.
>
> You are moving the goal post. I didn't claim there were no surprises.
> I only claim that in the end combining regular expressions and working
> with multiple languages ended up being far easier with python3 strings
> than with python2 strings.

Fair enough.

> Sure there were some surprises or gotcha's, but the result was still
> better than doing it in python2 and they were easier to deal with than
> in python2.

BTW, for those needs, even Python2 has Unicode strings and unicodedata at
your disposal.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
MRAB :
> "ch" usually represents 2 phonemes, basically the sounds of "t"
> followed by "sh";

Traditionally, that sound is considered a single phoneme:

   https://en.wikipedia.org/wiki/Affricate_consonant

Can you hear the difference in these expressions:

   high chairs

   height shares

   height chairs

Try them on an English-speaking person. In a restaurant, ask for a
"height share" and see if they bring you a high chair.

The English "tr" sound can also be considered a single affricate
phoneme:

   https://en.wikipedia.org/wiki/Voiceless_postalveolar_affricate

Is there a difference between these expressions:

   rye train

   right rain

   right train


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What "cult-like behavior" meant (was: Re: Glyphs and graphemes

2018-07-17 Thread Marko Rauhamaa
Chris Angelico :

> On Tue, Jul 17, 2018 at 6:41 PM, Marko Rauhamaa  wrote:
>> I can see that the bullying behavior comes from exasperation instead of
>> an outright meanness. They sincerely believe they understand the issues
>> better than their opponents and are at a loss to get the message across
>> without resorting to ad hominems.
>
> Have you considered the possibility that you're the one who doesn't
> understand the issues?

Of course. But I hope my argumentation has always been on the topic and
never perceived as a personal attack on other participants.

> Possible evidence to support this fact includes that many of us have
> ACTUAL REAL WORLD EXPERIENCE writing code for different languages'
> texts.

My actual real world experience is as valid as yours, and Python3's
Unicode support might be a better fit for yours than mine.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cult-like behaviour [was Re: Kindness]

2018-07-17 Thread Marko Rauhamaa
Rhodri James :
> On 17/07/18 02:17, Steven D'Aprano wrote:
>> Ah yes, the unfortunate design error that iterating over byte-strings
>> returns ints rather than single-byte strings.
>>
>> That decision seemed to make sense at the time it was made, but turned
>> out to be an annoyance. It's a wart on Python 3, but fortunately one
>> which is fairly easily dealt with by a helper function.
>
> I don't think I agree with you, but that may just be my heritage as a C
> programmer.  Every time I've iterated over a byte string, I've really
> meant bytes (as in integers).  Those bytes may correspond to ASCII
> characters, but that's just a detail.

The practical issue is how you refer to ASCII bytes. What I've resorted
to is:

  if nxt == b":"[0]:
      ...

Alternatively, I *could* write:

  if nxt in b":":
      ...

What's your favorite way of expressing character constants?
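
For comparison, a few more spellings that work in Python 3, sketched with made-up names (`COLON`, `data`, `split_at`); which one reads best is a matter of taste:

```python
COLON = ord(":")            # a named integer constant

data = b"key:value"
split_at = None
for i, nxt in enumerate(data):
    if nxt == COLON:        # iterating over bytes yields ints
        split_at = i
        break

# Indexing a bytes object gives an int, slicing gives bytes.
assert b":"[0] == COLON
assert data[split_at:split_at + 1] == b":"

# The "in" test accepts an int on the left-hand side.
assert COLON in b":"
```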


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Antoon Pardon :

> On 17-07-18 10:27, Marko Rauhamaa wrote:
>> Also, Python2's strings do as good a job at delivering codepoints as
>> Python3.
>
> No they don't. The programs that I work on, need to be able to treat
> at least german, french, dutch and english text. My experience is that
> in python3 it is way easier to do things right. Especially if you are
> working with regular expressions.

If you assume that NFC normalizes every letter to a single codepoint
(and carefully use NFC everywhere), you are right. But equally likely
you may inadvertently be setting yourself up for a surprise.
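
A small sketch of the kind of surprise meant here. It assumes that Unicode defines a precomposed form for "e" with an acute accent but none for "q" with an acute accent, which is the case as far as I know:

```python
import unicodedata

# "e" + COMBINING ACUTE ACCENT has a precomposed form, so NFC merges them.
composed = unicodedata.normalize("NFC", "e\u0301")

# "q" + COMBINING ACUTE ACCENT has none, so NFC leaves two codepoints.
uncomposed = unicodedata.normalize("NFC", "q\u0301")

lengths = (len(composed), len(uncomposed))
```

A regular expression such as `.` matches the whole letter in the first case and only the base letter in the second.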


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Chris Angelico :

> On Tue, Jul 17, 2018 at 6:27 PM, Marko Rauhamaa  wrote:
>> Of course, UTF-8 doesn't relieve you from Unicode problems. But it has
>> one big advantage: it can usually deal with non-Unicode data without any
>> extra considerations while Python3's strings make you have to take
>> elaborate measures to handle those special cases. Why, even print() must
>> be guarded against UnicodeEncodeError when the printed string is not in
>> the programmer's control.
>
> What is this "non-Unicode data" that UTF-8 can handle? Do you mean
> arbitrary byte sequences? Because no, it cannot; properly-formed UTF-8
> sequences MUST comply with the precise requirements of the format.

I was being imprecise: byte strings carrying UTF-8 can handle bad UTF-8
with equal ease. And that's a real, practical advantage.
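
A minimal sketch of the difference, using an invented input line. The `surrogateescape` error handler is the standard Python 3 way to carry undecodable bytes through a str, and it has to be requested explicitly:

```python
raw = b"caf\xe9 \xff\n"     # Latin-1 bytes plus a stray 0xFF: not valid UTF-8

# Byte-string operations work on the data as it is.
fields = raw.rstrip(b"\n").split(b" ")

# Strict decoding to str refuses the same data.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape the bad bytes survive a round trip through str.
text = raw.decode("utf-8", errors="surrogateescape")
round_trip = text.encode("utf-8", errors="surrogateescape")
```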

> Can you give an example of how Python 3's print function can raise
> UnicodeEncodeError when given a Python 3 string?

   >>> print("\ud810")
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   UnicodeEncodeError: 'utf-8' codec can't encode character '\ud810' \
   in position 0: surrogates not allowed
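
One possible guard, as a sketch: the helper name `printable` is made up, and falling back to UTF-8 when the stream reports no encoding is an assumption.

```python
import sys


def printable(s, encoding=None):
    """Return s with unencodable characters turned into backslash escapes."""
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return s.encode(encoding, errors="backslashreplace").decode(encoding)


print(printable("lone surrogate: \ud810"))
```

The result contains only characters the target encoding can represent, so print() no longer raises for this input.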


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Chris Angelico :

> On Tue, Jul 17, 2018 at 7:03 PM, Marko Rauhamaa  wrote:
>> What I'd need is for the tty to tell me what column the cursor is
>> visually. Or better yet, the tty would have to tell me where the column
>> would be *after* I emit the next grapheme cluster.
>
> Are you prepared for the possibility that emitting characters won't
> change what column you're in?

Absolutely.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Chris Angelico :

> On Tue, Jul 17, 2018 at 6:27 PM, Marko Rauhamaa  wrote:
>> For me, the issue is where do I produce a line break in my text output?
>> Currently, I'm just counting codepoints to estimate the width of the
>> output.
>
> Well, that's just flat out wrong, then. Counting graphemes isn't going
> to make it any better. Grab a well-known library like Pango and let it
> do your measurements for you, *in pixels*. Or better still, just poke
> your text to a dedicated text-display widget and let it display it
> correctly.

What I'd need is for the tty to tell me what column the cursor is
visually. Or better yet, the tty would have to tell me where the column
would be *after* I emit the next grapheme cluster.

The tty *does* know that but I don't know if there is an interface to
query it. This doesn't seem to be working properly:

sys.stdout.write("a\u0300\u001b[6n\n")

(and would be a tricky interface even if it did)
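
Short of asking the tty, the width can only be estimated. Here is a rough sketch using the standard unicodedata module; the function name is made up, and the rule (0 columns for combining marks, 2 for East Asian wide or fullwidth characters, 1 otherwise) ignores control characters, zero-width joiners and emoji sequences, so real terminals may disagree with it:

```python
import unicodedata


def estimated_width(text):
    """Rough column count for text on a character-cell terminal."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue                      # combining marks occupy no cell
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2                    # wide and fullwidth characters
        else:
            width += 1
    return width
```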


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What "cult-like behavior" meant (was: Re: Glyphs and graphemes

2018-07-17 Thread Marko Rauhamaa
INADA Naoki :

> On Tue, Jul 17, 2018 at 4:57 PM Marko Rauhamaa  wrote:
>>
>> Python3 is not a cult. It's a programming language. What is cult-like is
>> the manner in which Python3's honor is defended in a good many of the
>> discussions in this newsgroup: anger, condescension, ridicule,
>> name-calling.
>
> OK, I understand now.
>
> But I think it's true for all popular programming languages, not only
> Python. And it's not only for programming languages. I can see many
> too-defensive people on Twitter. Honestly speaking, I'm too defensive
> sometimes, too.

You are absolutely right. That behavior is a (lamentable) hereditary
trait in our species and apparently serves an important evolutionary
function (or it would have disappeared).

> Anyway, I feel "Cult-like behavior" in mail subject was misleading
> when discussing about byte-transparent string vs unicode string.

Yeah, discussions meander a lot.

> Such powerful words may make people more defensive, and heat non
> productive discussion.

Thing is, you need to stand up to bullying. Maybe you are not seeing it,
but quite many people have become victims of it here while the bullies
thrive and lead the pack.

I can see that the bullying behavior comes from exasperation instead of
an outright meanness. They sincerely believe they understand the issues
better than their opponents and are at a loss to get the message across
without resorting to ad hominems.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Steven D'Aprano :

> On Tue, 17 Jul 2018 09:52:13 +0300, Marko Rauhamaa wrote:
>
>> Both Python2 and Python3 provide two forms of string, one containing
>> 8-bit integers and another one containing 21-bit integers.
>
> Why do you insist on making counter-factual statements as facts? Don't 
> you have a Python REPL you can try these outrageous claims out before 
> making them?
>
> [...]
>
> Python strings are sequences of abstract characters.

which -- by your definition -- are codepoints -- which by any
definition -- are integers.
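
For what it's worth, the integer view is easy to check in a Python 3 REPL. A sketch with an arbitrary sample string:

```python
text = "a\u00e9\U0001F600"

codepoints = [ord(c) for c in text]      # one integer per str element
octets = list(text.encode("utf-8"))      # one integer per bytes element

# The largest valid codepoint needs 21 bits.
max_bits = (0x10FFFF).bit_length()

try:
    chr(0x110000)
    beyond_range_ok = True
except ValueError:
    beyond_range_ok = False
```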


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
Steven D'Aprano :
> On Mon, 16 Jul 2018 21:48:42 -0400, Richard Damon wrote:
>> Who says there needs to be one. A good engineer will use the
>> definition that is most appropriate to the task at hand. Some things
>> need very solid definitions, and some things don’t.
>
> The the problem is solved: we have a perfectly good de facto definition 
> of character: it is a synonym for "code point", and every single one of 
> Marko's objections disappears.

I admit it. Python3 is the perfect medium for your codepoint delivery
needs.

What you don't seem to understand about my objections is that no
programmer needs codepoints per se. Also, Python2's strings do as good a
job at delivering codepoints as Python3. Simultaneously, Python2's
strings are a better fit for the Unix system and network programming
model.

>> This goes back to my original point, where I said some people
>> consider UTF-32 as a variable width encoding. For very many things,
>> practically, the ‘codepoint’ isn’t the important thing,
>
> Ah, is this another one of those "let's pick a definition that nobody
> else uses, and state it as a fact" like UTF-32 being variable width?

   Each 32-bit value in UTF-32 represents one Unicode code point and is
   exactly equal to that code point's numerical value.

   https://en.wikipedia.org/wiki/UTF-32

That is called a bijection. Even more, it's a homomorphism. A homomorphism
is a very high degree of sameness.

It is essential for people to understand that the very same issues that
plague UTF-8 plague UTF-32 as well. Using UTF in both highlights that
fact.
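
A small sketch of both halves of that argument, using an arbitrary sample string: every UTF-32 unit equals its codepoint, and a letter written with a combining mark still spans more than one unit.

```python
import struct

text = "a\u0300b"                        # "a" + COMBINING GRAVE ACCENT, then "b"

encoded = text.encode("utf-32-le")       # explicit byte order, so no BOM
units = struct.unpack("<%dI" % (len(encoded) // 4), encoded)

one_to_one = list(units) == [ord(c) for c in text]
unit_count = len(units)                  # the reader sees two letters
```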

> If by "very many things", you mean "not very many things", I agree
> with you. In my experience, dealing with code points is "good enough",
> especially if you use Western European alphabets, and even more so if
> you're willing to do a normalization step before processing text.

Of course, UTF-8 doesn't relieve you from Unicode problems. But it has
one big advantage: it can usually deal with non-Unicode data without any
extra considerations while Python3's strings make you have to take
elaborate measures to handle those special cases. Why, even print() must
be guarded against UnicodeEncodeError when the printed string is not in
the programmer's control.

> But of course other people's experience may vary. I'm interested in 
> learning about the library you use to process graphemes in your software.

For me, the issue is where do I produce a line break in my text output?
Currently, I'm just counting codepoints to estimate the width of the
output.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
INADA Naoki :

>> I won't comment on Rust and Swift because I don't know them.
> ...
>> I won't comment on Go, either.
>
> Hmm, do you say Python 3 is "cult-like" without survey other popular,
> programming languages?

You can talk about Python3 independently of other programming languages.

Python3 is not a cult. It's a programming language. What is cult-like is
the manner in which Python3's honor is defended in a good many of the
discussions in this newsgroup: anger, condescension, ridicule,
name-calling.

> I can't agree that it's cult-like behavior.  I think it's practical
> design decision.

If Python3 works for you, I'm happy for you.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-17 Thread Marko Rauhamaa
INADA Naoki :

> On Tue, Jul 17, 2018 at 2:31 PM Marko Rauhamaa  wrote:
>> So I hope that by now you have understood my point and been able to
>> decide if you agree with it or not.
>
> I still don't understand what's your original point.
> I think UTF-8 vs UTF-32 is totally different from Python 2 vs 3.
>
> For example, string in Rust and Swift (2010s languages!) are *valid*
> UTF-8. There are strong separation between byte array and string, even
> they use UTF-8. They looks similar to Python 3, not Python 2.

I won't comment on Rust and Swift because I don't know them.

> And Python can use UTF-8 for internal encoding in the future. AFAIK,
> PyPy tries it now. After they succeeded, I want to try port it to
> CPython after we removed legacy Unicode APIs. (ref PEP 393)

How CPython3 implements str objects internally is not what I'm talking
about. It's the programmer's model in any compliant Python3
implementation.

Both Python2 and Python3 provide two forms of string, one containing
8-bit integers and another one containing 21-bit integers. Python3 made
the situation worse in a minor way and a major way. The minor way is the
uglification of the byte string notation. The major way is the wholesale
preference or mandating of Unicode strings in numerous standard-library
interfaces.

> So "UTF-8 is better than UTF-32" is totally different problem from
> "Python 2 is better than 3".

Unix programming is smoothest when the programmer can operate on bytes.
Bytes are the mother tongue of Unix, and programming languages should
not try to present a different model to the programmer.

> Is your point "accepting invalid UTF-8 implicitly by default is better
> than explicit 'surrogateescape' error handler" like Go?
> (It's 2010s languages with UTF-8 based string too, but accept invalid
> UTF-8).

I won't comment on Go, either.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Tim Chase :

> On 2018-07-16 23:59, Marko Rauhamaa wrote:
>> Tim Chase :
>> > While the python world has moved its efforts into improving
>> > Python3, Python2 hasn't suddenly stopped working.  
>> 
>> The sword of Damocles is hanging on its head. Unless a consortium is
>> erected to support Python2, no vendor will be able to use it in the
>> medium term.
>
> Wait, but now you're talking about vendors. Much of the crux of this
> discussion has been about personal scripts that don't need to
> marshal Unicode strings in and out of various functions/objects.

In both personal and professional settings, you face the same issues.
But you don't want to build on something that will disappear from the
Linux distros.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list

