Re: List replication operator

2018-05-25 Thread bartc

On 25/05/2018 17:58, Rob Gaddi wrote:

So, in the spirit of explicit being better than implicit, please assume 
that for actual implementation replicate would be a static method of 
actual list, rather than the conveniently executable hackjob below.


_list = list
_nodefault = object()

class list(_list):
   @staticmethod
   def replicate(*n, fill=_nodefault, call=list):


That seems to work, but the dimensions are created in reverse order to 
what I expected. Which is to have the order of indices corresponding to 
the order of dimensions. So:


 x=list.replicate(2,3,4)

 print (len(x))
 print (len(x[0]))
 print (len(x[0][0]))

Gives output of 4, 3, 2 rather than 2, 3, 4.

Which means that x[0][0][3] is a bounds error.

--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: List replication operator

2018-05-25 Thread bartc

On 25/05/2018 17:11, Alexandre Brault wrote:



On 2018-05-25 11:40 AM, bartc wrote:

On 25/05/2018 16:27, Chris Angelico wrote:

You're way WAY too late to debate the matrix multiplication operator.


/The/ matrix multiplication operator?

In which language? And what was wrong with "*"?


In Python, the language we're discussing right now. What was wrong with
* is described in detail in PEP 465


(I've implemented matrix multiply in a language (although for
specialised matrix types), and I used the same "*" symbol as was used
to multiply anything else.)

Anyway this is not matrix multiplication, but replication, and using
'@' seems more a consequence of there not being any better ones
available as they are already used for other things.


You're right, it's not matrix multiplication. And Pathlib's use of / is
not division, nor do C++'s streams use bitshifting.


There's no need to be sarcastic.

The context here for those symbols is programming source code for which 
binary or infix +, -, * and / symbols are VERY commonly used for add, 
subtract, multiply and divide operations.


While some of them can sometimes be interpreted as various kinds of 
markup control when programming code is not distinguished from normal 
text, it can be particularly striking with @.



But overloading the matmul operator would allow this feature to work
without changing the syntax of the language, nor breaking existing code
(since no built-in types implement __matmul__).


As I mentioned before, why does it need to be an operator?


Re: List replication operator

2018-05-25 Thread bartc

On 25/05/2018 16:40, Steven D'Aprano wrote:

On Fri, 25 May 2018 22:46:37 +1000, Chris Angelico wrote:


We've already had a suggestion for [[]]@5 and that should deal with that
issue. Steven is proposing "multiply by copying" as an alternative to
"multiply by referencing", so an alternative multiplication operator
should fit that correctly. Steve, I don't want to speak for you; can you
confirm or deny acceptance of the matmul operator as a better spelling?


I thought that ** would be less controversial than @ since that's much
newer. Silly me.

Personally, I don't care much either way. Probably a microscopic
preference for @ over ** since it is shorter, but not enough to matter.

The usefulness of * with lists is seriously compromised by the inability
to copy mutable items in the list. Consequently, it is an on-going gotcha
and pain point.
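The gotcha described above can be seen in a few lines. This is only a minimal sketch; the variable names are made up for illustration.

```python
# Replication with * copies references, not the inner lists.
shared = [[]] * 5
shared[0].append(777)
print(shared)       # every slot shows the same inner list

# A comprehension builds a fresh inner list for each slot.
separate = [[] for _ in range(5)]
separate[0].append(777)
print(separate)     # only the first inner list changed
```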

On the other hand, it is arguable that what we really need is a standard
function to return an N-dimensional array/list:

# return a 3x4x5x6 4-D list initialised to all zeroes
arr = list.dimensions(3, 4, 5, 6, initial=0.0)

What stops you just writing such a function?
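As a rough sketch of such a function (the name dimensions and the initial keyword are borrowed from the quoted example; making it a free function rather than a list method, and deep-copying the initial value, are assumptions made here):

```python
import copy

def dimensions(*dims, initial=None):
    """Build a nested list whose index order matches the order of dims."""
    if not dims:
        # Innermost level: each cell gets its own copy of the initial value.
        return copy.deepcopy(initial)
    first, *rest = dims
    return [dimensions(*rest, initial=initial) for _ in range(first)]

arr = dimensions(2, 3, 4, initial=0.0)
print(len(arr), len(arr[0]), len(arr[0][0]))
arr[0][0][3] = 1.5   # the last index runs over the last dimension
```

Because the outermost comprehension consumes the first dimension, x[i][j][k] is valid for i, j, k below 2, 3, 4 respectively.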



Re: List replication operator

2018-05-25 Thread bartc

On 25/05/2018 16:27, Chris Angelico wrote:

On Fri, May 25, 2018 at 10:58 PM, bartc <b...@freeuk.com> wrote:

I'm in general not in favour of piling in special symbols into a language
just to solve some obscure or rare problem.

As I went on to demonstrate, function-like syntax (or even actual functions)
could do that job better, by describing what the operation does and not
leaving people scratching their heads so much when they encounter that
funny-looking operator hidden in 20,000 lines of code.

As for '@', if a variable name can come before it /and/ after it, and either
or both can be dotted, wouldn't that cause it to be highlighted as an email
address in many circumstances? Such as in code posted here.

(OK, let's try it and see what happens. My Thunderbird doesn't do previews
so I will have to post it first:

abc@def
a...@def.ghi)

I would find that rather annoying.



You're way WAY too late to debate the matrix multiplication operator.


/The/ matrix multiplication operator?

In which language? And what was wrong with "*"?

(I've implemented matrix multiply in a language (although for 
specialised matrix types), and I used the same "*" symbol as was used to 
multiply anything else.)


Anyway this is not matrix multiplication, but replication, and using '@' 
seems more a consequence of there not being any better ones available as 
they are already used for other things.


--
bartc



Re: List replication operator

2018-05-25 Thread bartc

On 25/05/2018 13:46, Chris Angelico wrote:

On Fri, May 25, 2018 at 10:36 PM, bartc <b...@freeuk.com> wrote:

On 24/05/2018 19:17, Steven D'Aprano wrote:


But what do people think about proposing a new list replication with copy
operator?

  [[]]**5

would return a new list consisting of five shallow copies of the inner
list.

Thoughts?



Choice of ** doesn't seem right for a start, as it suggests it should mean
[]*[]*[]*[]*[], which it doesn't. (Apparently []*2 /is/ the same as []+[].)



We've already had a suggestion for [[]]@5 and that should deal with
that issue.


I'm in general not in favour of piling in special symbols into a 
language just to solve some obscure or rare problem.


As I went on to demonstrate, function-like syntax (or even actual 
functions) could do that job better, by describing what the operation 
does and not leaving people scratching their heads so much when they 
encounter that funny-looking operator hidden in 20,000 lines of code.


As for '@', if a variable name can come before it /and/ after it, and 
either or both can be dotted, wouldn't that cause it to be highlighted 
as an email address in many circumstances? Such as in code posted here.


(OK, let's try it and see what happens. My Thunderbird doesn't do 
previews so I will have to post it first:


   abc@def
   a...@def.ghi)

I would find that rather annoying.

--
bartc


Re: List replication operator

2018-05-25 Thread bartc

On 25/05/2018 13:36, bartc wrote:

Of course you have to implement dupllist(), but you'd have to implement 
** too, and that is harder. For this specific example, it can just be:


def dupllist(x,n):
     return [x[0].copy() for _ in range(n)]



On 25/05/2018 03:25, Steven D'Aprano wrote:
> You might be right: on further thought, I think I want deep copies, not
> shallow.

And my solution just becomes:

import copy

def dupllist(x,n):
    return [copy.deepcopy(x[0]) for i in range(n)]

(It needs to iterate repeatedly over the elements of x for a general 
list. Replacing [0] by [i%len(x)] might just do it.)
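A sketch of that [i%len(x)] variant, under the assumption that n counts the elements of the result rather than the number of repetitions (for a one-element x the two readings coincide):

```python
import copy

def dupllist(x, n):
    """Return n deep-copied items, cycling through the elements of x."""
    # An empty x raises ZeroDivisionError; this sketch does not guard against it.
    return [copy.deepcopy(x[i % len(x)]) for i in range(n)]

x = dupllist([[], [0]], 5)
x[0].append(777)
print(x)
```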



--
bartc


Re: List replication operator

2018-05-25 Thread bartc

On 24/05/2018 19:17, Steven D'Aprano wrote:


But what do people think about proposing a new list replication with copy
operator?

 [[]]**5

would return a new list consisting of five shallow copies of the inner
list.

Thoughts?


Choice of ** doesn't seem right for a start, as it suggests it should 
mean []*[]*[]*[]*[], which it doesn't. (Apparently []*2 /is/ the same as 
[]+[].)



How about just:

   x = dupllist([[]], 5)
   x[0].append(777)
   print (x)

which gives:

   [[777], [], [], [], []]

Of course you have to implement dupllist(), but you'd have to implement 
** too, and that is harder. For this specific example, it can just be:


def dupllist(x,n):
    return [x[0].copy() for _ in range(n)]

--
bartc




Re: why do I get syntax error on if : break

2018-05-25 Thread bartc

On 25/05/2018 11:08, asa32s...@gmail.com wrote:

On Thursday, May 24, 2018 at 10:12:46 PM UTC-4, asa3...@gmail.com wrote:

here is the code, i keep getting an error, "break outside loop". if it is false 
just exit function


def d(idx):
    if type(idx) != int:
        break

d('k')


thanks... I believe the compiler. So how do I exit or return nothing?



Use 'return' instead of 'break'.
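For illustration, the function from the question with that one change applied. The final return line is hypothetical, added only so the integer case does something visible.

```python
def d(idx):
    if type(idx) != int:
        return          # leave the function early; the caller receives None
    return idx * 2      # hypothetical work for the integer case

print(d('k'))
print(d(21))
```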




Re: Raw string statement (proposal)

2018-05-25 Thread bartc

On 25/05/2018 05:34, Mikhail V wrote:


Proposal
---

Current proposal suggests adding syntax for the "raw text" statement.
This should enable the possibility to define text pieces in source
code without the need for interpreted characters.
Thereby it should solve the mentioned issues.
Additionally it should solve some issues with visual appearance.



General rules:

- parsing is aware of the indent of containing
   block, i.e. no de-dention needed.
- single line assignment may be allowed with
   some restrictions.

Difficulties:

- change of core parsing rules
- backward compatibility broken
- syntax highlighting may not work


I had one big problem with your proposal, which is that I couldn't make 
head or tail of your syntax. Such a thing should be immediately obvious.


(In your first two examples, what IS the exact string that you're trying 
to incorporate? That is not clear at all.)


The aim is to allow arbitrary text in program source which is to be 
interpreted as a string literal, and to be able to see the text as much 
in its natural form as possible.


One problem here is how to deal with embedded non-printable characters: 
CR, LF and TAB might become part of the normal source text, but how 
about anything else? Or would you only allow text that might appear in a 
text file where those characters would also cause issues?


Another thing that might come up: suppose you do come up with a workable 
scheme, and have a source file PROG1.PY which contains such raw strings.


Would it then be possible to create a source file PROG2.PY which 
contains PROG1.PY as a raw string? That is, without changing the text 
from PROG1.PY at all.


Here's one scheme I use in another language:

   print strinclude "file.txt"

'strinclude "file.txt"' is interpreted as a string literal which 
contains the contents of file.txt, with escapes used as needed. In fact 
it can be used for binary files too.


This ticks some of the boxes, but not all: the text isn't shown inline 
in the program source code. If you send someone this source code, they 
will also need FILE.TXT.


And it won't pass my PROG2/PROG1 test above (because both strincludes 
need expanding to strings, but the compiler won't recognise the one 
inside PROG1, as that is after all just text, not program code).


As for a better proposal, I'm inclined not to make it part of the 
language at all, but to make it an editor feature: insert a block of 
arbitrary text, and give a command to turn it into a string literal. 
With perhaps another command to take a string literal within a program 
and view it as un-escaped text.
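The two editor commands could be sketched in Python roughly as follows. The function names to_literal and from_literal are invented here; repr() and ast.literal_eval() are the standard tools doing the work.

```python
import ast

def to_literal(text):
    """Turn arbitrary text into Python string-literal source."""
    return repr(text)

def from_literal(source):
    """Recover the un-escaped text from string-literal source."""
    return ast.literal_eval(source)

raw = 'C:\\temp\\new\n"quoted"\tand \'single\''
literal = to_literal(raw)
print(literal)                 # escaped form, as it would sit in source code
print(from_literal(literal))   # natural form, as an editor might display it
```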


--
bartc


Re: how to get INDEX count, or last number of Index

2018-05-24 Thread bartc

On 24/05/2018 03:37, Terry Reedy wrote:

On 5/23/2018 8:46 PM, bartc wrote:

On 24/05/2018 00:44, Terry Reedy wrote:

On 5/23/2018 5:56 PM, Rob Gaddi wrote:

On 05/23/2018 02:51 PM, asa32s...@gmail.com wrote:

s = "kitti"

0,1,2,3,4
k,i,t,t,i

how do i retrieve '4'. i know i can do a len(s)-1,


Use -1, which is the same as len(s)-1 but faster.


This illustrates one problem with having a example sequence of values 
being identical to the indices of those values.


I would have used 10,20,30,40,50 so there could be no such mix-up.

Because I assumed here that the OP wanted the index of the last value, 
the '4' they said they wanted, not the last value itself which would 
be 'i'.


And actually, the subject line seems to confirm that.

In that case, using a '-1' index won't work.


You snipped the code that shows that -1 does work to fetch the last 
character.




I'm sure it does.

But it's not clear whether the OP wants the last character, or the index 
of the last character. The subject line suggests the latter.
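The two readings of the question give different answers, as this small sketch shows (the variable names are illustrative):

```python
s = "kitti"

last_char = s[-1]          # the last value
last_index = len(s) - 1    # the index of the last value

print(last_char, last_index)
print(s[-1] == s[len(s) - 1])
```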


--
bart


Re: how to get INDEX count, or last number of Index

2018-05-23 Thread bartc

On 24/05/2018 01:46, bartc wrote:

On 24/05/2018 00:44, Terry Reedy wrote:

On 5/23/2018 5:56 PM, Rob Gaddi wrote:

On 05/23/2018 02:51 PM, asa32s...@gmail.com wrote:

s = "kitti"

0,1,2,3,4
k,i,t,t,i

how do i retrieve '4'. i know i can do a len(s)-1,


Use -1, which is the same as len(s)-1 but faster.


This illustrates one problem with having a example sequence of values 
being identical to the indices of those values.


I would have used 10,20,30,40,50 so there could be no such mix-up.


Hmm, although my observation is valid, that's not the case here, as the 
0,1,2,3,4 /are/ the indices! They are not a list of values from which the 
OP is trying to access the '4' (which /could/ be obtained with the -1 
index some people are talking about); that added to the confusion.




Re: how to get INDEX count, or last number of Index

2018-05-23 Thread bartc

On 24/05/2018 00:44, Terry Reedy wrote:

On 5/23/2018 5:56 PM, Rob Gaddi wrote:

On 05/23/2018 02:51 PM, asa32s...@gmail.com wrote:

s = "kitti"

0,1,2,3,4
k,i,t,t,i

how do i retrieve '4'. i know i can do a len(s)-1,


Use -1, which is the same as len(s)-1 but faster.


This illustrates one problem with having a example sequence of values 
being identical to the indices of those values.


I would have used 10,20,30,40,50 so there could be no such mix-up.

Because I assumed here that the OP wanted the index of the last value, 
the '4' they said they wanted, not the last value itself which would be 'i'.


And actually, the subject line seems to confirm that.

In that case, using a '-1' index won't work.

--
bartc


Re: how to get INDEX count, or last number of Index

2018-05-23 Thread bartc

On 23/05/2018 22:51, asa32s...@gmail.com wrote:

s = "kitti"

0,1,2,3,4
k,i,t,t,i

how do i retrieve '4'. i know i can do a len(s)-1, but i assume there is a 
function that gives me last number of index


Not for that trivial task. But you can create your own:

 def upb(x): return len(x)-1 # 'upb' means 'upper-bound'

 print (upb("kitti"))    # output: 4


Re: "Data blocks" syntax specification draft

2018-05-23 Thread bartc

On 23/05/2018 14:56, Chris Angelico wrote:


Perfect! Now let's try that with other types.

Tuple of three: 1, 2, 3 or 1, 2, 3,

Not requiring any bracketing is poor IMO.

If you wanted the tuple to co-exist with any other thing in an 
expression, rather than being the only thing the expression comprises, 
then you will run into problems. Try adding two tuples:


  1, 2, 3 + 4, 5, 6

This in fact gives you 1,2,7,5,6 rather than 5,7,9. (I don't know if 
tuples can actually be added like this, but the point is clear.)
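For reference, a short sketch of what Python does in each case: the unbracketed form is governed by precedence, + on bracketed tuples concatenates, and an element-wise sum has to be spelled out.

```python
a = 1, 2, 3 + 4, 5, 6          # + binds tighter than the commas
b = (1, 2, 3) + (4, 5, 6)      # + on tuples concatenates
c = tuple(x + y for x, y in zip((1, 2, 3), (4, 5, 6)))  # element-wise sum

print(a)
print(b)
print(c)
```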


--
bartc


Re: "Data blocks" syntax specification draft

2018-05-23 Thread bartc

On 23/05/2018 14:11, Steven D'Aprano wrote:

On Wed, 23 May 2018 11:10:33 +0100, bartc wrote:

(x,)  Tuple of one item


Incorrect. Yet again, you have failed to do enough testing. No special
form is required. Only a comma:

py> x = 1,
py> type(x)


It isn't enough to test examples which confirm a hypothesis. You need to
test examples which also refute it, and see if your hypothesis survives
the challenge.


I use this trailing comma scheme in my own languages too. The reason 
is simple, it is to distinguish between these two:


  (x)   Ordinary bracketed expression
  (x,)  A list (in my case) of one item.

That is not needed for one less item: (), or one more: (x,y). Like 
Python, that trailing comma is not needed for [x] (I don't have a {...} 
constructor).


In both languages, it's just a hack to get around a syntax clash in the 
language.


I don't say however, that the comma is the defining feature of a list.

Comma is used to separate items of /any/ kind of list, and that trailing 
comma is used when there is an ambiguity or a conflict.


--
bartc


Re: "Data blocks" syntax specification draft

2018-05-23 Thread bartc

On 23/05/2018 07:03, Chris Angelico wrote:

On Wed, May 23, 2018 at 3:32 PM, Christian Gollwitzer <aurio...@gmx.de> wrote:



I'd think that the definitive answer is in the grammar, because that is what
is used to build the Python parser:

 https://docs.python.org/3/reference/grammar.html

Actually, I'm a bit surprised that tuple, list etc. does not appear there as
a non-terminal. It is a bit hard to find, and it seems that "atom:" is the
starting point for parsing tuples, lists etc.



The grammar's a bit hard to read for this sort of thing, as the only
hint of semantic meaning is in the labels at the beginning. For
example, there's a "dictorsetmaker" entry that grammatically could be
a dict comp or a set comp; distinguishing them is the job of other
parts of the code.


Looking at all the instances of "','" (and there are plenty), none of 
them are tied to anything to do with tuples. Actually 'tuple' doesn't 
appear at all.


'dict' does, presumably because a dict-constructor is different 
syntactically in requiring key:value pairs.


--
bartc


Re: "Data blocks" syntax specification draft

2018-05-23 Thread bartc

On 23/05/2018 07:47, Steven D'Aprano wrote:

On Tue, 22 May 2018 18:51:30 +0100, bartc wrote:


On 22/05/2018 15:25, Chris Angelico wrote:

[...]

The tuple has nothing to do with the parentheses, except for the
special case of the empty tuple. It's the comma.


No? Take these:

   a = (10,20,30)
   a = [10,20,30]
   a = {10,20,30}

If you print type(a) after each, only one of them is a tuple - the one
with the round brackets.


You haven't done enough testing. All you have done is found that "round
brackets give a tuple, other brackets don't". But you need to test what
happens if you take away the brackets to be sure that it is the round
brackets which create the tuple:

 a = 10, 20, 30  # take away the ()

You still get a tuple. Taking away the [] and {} also give tuples.


If ... is a comma-separated sequence of (2 or more) expressions, then:

 Enclosed with:       It yields:

 {...} brackets       Set (or dict with key:value items)
 [...] brackets       List
 (...) brackets       Tuple
  ...  (no) brackets  Tuple

Each ... contains commas. But it is what surrounds ..., or that doesn't 
surround ..., that determines what the construction yields. The commas 
are not specific to tuples.
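One way to check the table is to ask Python's own parser what node it builds for each form. This is a sketch using the standard ast module; node_kind is a helper name invented here.

```python
import ast

def node_kind(source):
    """Name of the AST node Python builds for an expression."""
    return type(ast.parse(source, mode="eval").body).__name__

for source in ("10, 20, 30", "(10, 20, 30)", "[10, 20, 30]",
               "{10, 20, 30}", "f(10, 20, 30)", "(10)", "(10,)"):
    print(f"{source:15} {node_kind(source)}")
```

The same comma-separated items come back as Tuple, List, Set or the arguments of a Call, depending on what encloses them.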



What happens if you add extra brackets?

 a = ((10, 20, 30))  # tuple
 b = ([10, 20, 30])  # list
 c = ({10, 20, 30})  # set


0 items within the list:

()    Empty tuple
[]    Empty list
{}    Empty dict

1 item within the list

(x)   Yields x
[x]   List of one item
{x}   Set of one item

Because () is used for normal bracketing of expression terms, it 
requires this special form to denote a tuple:


(x,)  Tuple of one item

which can ALSO be used to form one element lists, sets and dicts.


What if we take away the commas but leave the brackets? If "brackets make
tuples", then the commas ought to be optional.

 a = (10 20 30)  # SyntaxError


This will be a syntax error with [] and {} as well. I never said the 
commas were optional, only that the resulting type depends on bracketing 
(see above)



The comma is just generally used to separate expressions, it's not
specific to tuples.


Nobody said it was specific to tuples. That would be an absurd thing to
say. What was said is that the comma is what makes tuples, not the
brackets.


Suppose you were parsing Python source, had just processed a term of an 
expression, and the next token was a comma. What came before the term 
may have been (, {, [ or nothing. What comes next, would you call:


  readtuplesequence()

or:

  readexprsequence()

or:

  readexprsequence(type_of_sequence) # ie. tuple, list etc

(assume recursive descent parser). Since they have to accomplish the 
same thing (read series of comma-separated terms), what would be the 
advantage in having a separate routine just for tuples?
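A toy version of the third option, to make the question concrete. This is a sketch only, not how CPython's parser is written: the token format (a flat list of strings), the names read_term, read_expr_sequence and parse, and the ("kind", items) result are all assumptions, and it does no error checking.

```python
CLOSERS = {"(": ")", "[": "]", "{": "}"}
KINDS = {"(": "tuple", "[": "list", "{": "set", None: "tuple"}

def read_term(tokens, pos):
    """Read one term: a bracketed sequence or a single token."""
    if tokens[pos] in CLOSERS:
        return read_expr_sequence(tokens, pos + 1, tokens[pos])
    return tokens[pos], pos + 1

def read_expr_sequence(tokens, pos, opener=None):
    """Read comma-separated terms up to the matching closer (or end of input)."""
    closer = CLOSERS.get(opener)
    items, saw_comma = [], False
    while pos < len(tokens) and tokens[pos] != closer:
        item, pos = read_term(tokens, pos)
        items.append(item)
        if pos < len(tokens) and tokens[pos] == ",":
            saw_comma = True
            pos += 1
    if closer is not None:
        pos += 1                      # step over the closing bracket
    if opener in ("(", None) and len(items) == 1 and not saw_comma:
        return items[0], pos          # (x) is plain bracketing, not a tuple
    return (KINDS[opener], items), pos

def parse(tokens):
    return read_expr_sequence(tokens, 0)[0]

print(parse(["10", ",", "20", ",", "30"]))
print(parse(["[", "10", ",", "20", "]"]))
print(parse(["(", "x", ",", ")"]))
```

One routine reads every kind of sequence; the enclosing bracket (or its absence) picks the result type, and the trailing comma only matters for the one-item round-bracket case. Unlike Python, this sketch labels an empty {} as a set.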



--
bartc


Re: "Data blocks" syntax specification draft

2018-05-22 Thread bartc

On 22/05/2018 16:57, Chris Angelico wrote:

On Wed, May 23, 2018 at 1:43 AM, Ian Kelly <ian.g.ke...@gmail.com> wrote:



In other words, the rule is not really as simple as "commas make
tuples". I stand by what I wrote.


Neither of us is wrong here.


Sorry, but I don't think you're right at all, unless the official 
references for the language specifically say that commas are primarily 
for constructing tuples, and all other uses are exceptions to that rule.


AFAICS, commas are used just like commas everywhere - used as 
separators. The context tells Python what the resulting sequence is.


"Commas make tuples" is a useful oversimplification in the same way
that "asterisk means multiplication" is. The asterisk has other
meanings in specific contexts (eg unpacking), but outside of those
contexts, it means multiplication.


I don't think that's quite right either. Asterisk is just an overloaded 
token, but you will know what it's for as soon as it's encountered.


Comma seems to be used only as a separator.


--
bartc


Re: "Data blocks" syntax specification draft

2018-05-22 Thread bartc

On 22/05/2018 15:25, Chris Angelico wrote:

On Tue, May 22, 2018 at 8:25 PM, bartc <b...@freeuk.com> wrote:

Note that Python tuples don't always need a start symbol:

a = 10,20,30

assigns a tuple to a.


The tuple has nothing to do with the parentheses, except for the
special case of the empty tuple. It's the comma.


No? Take these:

 a = (10,20,30)
 a = [10,20,30]
 a = {10,20,30}

If you print type(a) after each, only one of them is a tuple - the one 
with the round brackets.


The 10,20,30 in those other contexts doesn't create a tuple, nor does it 
here:


  f(10,20,30)

Or here:

  def g(a,b,c):

Or here in Python 2:

  print 10,20,30

and no doubt in a few other cases. It's just that special case I 
highlighted where an unbracketed sequence of expressions yields a tuple.


The comma is just generally used to separate expressions, it's not 
specific to tuples.


--
bart


Re: "Data blocks" syntax specification draft

2018-05-22 Thread bartc

On 22/05/2018 03:49, Mikhail V wrote:

On Mon, May 21, 2018 at 3:48 PM, bartc <b...@freeuk.com> wrote:



But I have to say it looks pretty terrible, and I can't see that it buys
much over normal syntax.




# t
# t
   11  22  33



Is this example complete? Presumably it means ((11,22,33),).


You get the point?
So basically all nice chars are already occupied.


You mean for introducing tuple, list and dict literals? Python already 
uses (, [ and { for those, with the advantage of having a closing ), ] 
and } to make it easier to see where each ends.


The only advantage of your proposal is that it resembles Python block 
syntax a little more, but I don't know if it follows the same rules of 
indentation and for inlining content.



Proposing Unicode symbols -- that will probably be
dead on arrival (just remembering some of such past proposals).
Leaving out symbols could be an option as well.
Still the structure needs a syntactical entry point.


Note that Python tuples don't always need a start symbol:

   a = 10,20,30

assigns a tuple to a.


E.g.

data = ///
t
t
   11  22  33

Hmm. not bad. But I must think about parsing as well.


Have you tried writing a parser for this? It can be stand-alone, not a 
full parser for Python code. That could help reveal any problems.


But think about when t could be the name of a variable, and you want to 
construct the tuple (t,t,t):


 ///t t t t

That already looks a little odd. And when the /// is omitted:

 t t t t

Is that one tuple of (t,t,t), or a tuple of (t,(t))?


Also, is ///t ///t ///t a b c allowed, or does it have to be split 
across lines? If it is allowed, then it's not clear to which tuple b and 
c belong, or even a, if an empty tuple is allowed.


I think this syntax is ambiguous; you need a more rigorous 
specification. (How will it parse ///.3 4 5 for example?)



So I can change types of all child nodes with one keystroke.


Suppose you only wanted to change the top one?





The ///d dictionary example is ambiguous: can you have more than one
key:value per line or not? If so, it would look like this:

   ///d "a" "b" "c" "d" "e" "f"


///d   "a" "b"   "c" "d"   "e" "f"

Now better? :-)


Not really. Suppose one got accidentally missed out, and there was some 
spurious name at the end, so that you had this (dispensing with quotes, 
these are variables):


   ///d a b c e f x

The pairing is a:b, c:e, f:x rather than the a:b, c:d, e:f that was 
intended, with the x being an error. Use of : and , adds useful 
redundancy. It's not clear whether:


  ///d a b
       c e
       f x

is allowed (I don't know what the terminating conditions are), but in a 
very long dict literal, it's easy to get confused.



I think this is an interesting first draft of an idea, but it doesn't 
seem rigorous. And people don't like that triple stroke prefix, or those 
single letter codes (why not just use 'tuple', 'list', 'dict')?


For example, here is a proposal I've just made up for a similar idea, 
but to make such constructors obey similar rules to Python blocks:


 tuple:
     10
     20
     30

 list:
     list:
         10
         tuple: 5,6,7
         30
     "forty"
     "fifty"

So, the keyword prefixes are followed by ":"; entities can follow on the 
same line, but using "," rather than ";", and the end of a sequence is 
just like the end of a 'suite'.


But even here there is ambiguity: the '5,6,7' forms a tuple of its own 
in normal syntax, so these could be a tuple of one tuple of 3, rather 
than a tuple of 3. (I may need ";" here rather than ",".)


--
bartc



Re: "Data blocks" syntax specification draft

2018-05-21 Thread bartc

On 20/05/2018 03:58, Mikhail V wrote:

I have made up a printable PDF with the current version
of the syntax suggestion.

https://github.com/Mikhail22/Documents/blob/master/data-blocks-v01.pdf

After some of your comments I've made some further
re-considerations, e.g. element separation should
be now much simpler.
A lot of examples with comparison included.


Comments, suggestions are welcome.


This is intended to be used inside actual Python programs?

In that case code is normally displayed in fixed pitch, as it would 
normally be viewed in a code editor, even if part of a document.


But I have to say it looks pretty terrible, and I can't see that it buys 
much over normal syntax.


The use of the funny /// symbol, and reserving identifiers t, L and d 
when following ///, is also a little naff.


(Note that lines starting // are interpreted as comment lines in C and 
C++ languages, and may be used by others too. Those used to seeing them 
as comments may get confused.)


It's not clear what ///. is for, or why it's necessary (presumably you 
have to use ///. /// instead of /// ///).


The ///d dictionary example is ambiguous: can you have more than one 
key:value per line or not? If so, it would look like this:


  ///d "a" "b" "c" "d" "e" "f"

so that the pairing is not clear.

You also seem to have more need of the "\" line continuation character 
in your syntax, because Python can do this:


   data = {

but you need:

   data = \
   ///

Or do you also allow: data = ///  with data following on the next line?


--
bartc


Re: "Data blocks" syntax specification draft

2018-05-21 Thread bartc

On 21/05/2018 05:05, Chris Angelico wrote:

On Mon, May 21, 2018 at 2:00 PM, Mikhail V <mikhail...@gmail.com> wrote:



heaps! oh come on, youre making up again.


No, I'm not making it up. Just because the PDF works perfectly for
you, you assume that it'll work perfectly for everyone. That is not
the case, and that isn't my problem - it's your problem.


Perhaps the problem could well be at your end.

I don't remember having much trouble viewing PDFs, it's just a bit of a 
pain to do so (and I prefer to read them properly downloaded via the 
Adobe reader so that I can scroll in page mode, some extra steps).


I've just viewed a PDF on an old, low-spec Linux machine and it seemed 
fine, so it's not a Windows thing. (Can't access the OP's link for 
reasons unconnected with PDF.)


PDF seems to be universally used for all sorts of things (I used to have 
to print boarding passes via PDF; I doubt airlines wanted to alienate 
Linux users).


--
bartc


Re: what does := means simply?

2018-05-20 Thread bartc

On 20/05/2018 16:37, Dennis Lee Bieber wrote:

On Sun, 20 May 2018 12:38:59 +0100, bartc <b...@freeuk.com> declaimed the



Just for giggles, I decided to write the start of a PPM reader (it only
handles P6 binary, doesn't have the code for the other styles, and doesn't
incorporate PPM writer functions but...)

It successfully processed the PPM file my prior writing code
generated...

-=-=-=-=-=-
import struct

class PPM(object):

...


-=-=-=-=-=-
Type: b'P6'   Width: 3   Height: 3   MaxVal: 255
[(0, 255, 0), (128, 128, 128), (255, 0, 0), (128, 128, 128), (255, 255,
255), (128, 128, 128), (0, 0, 255), (128, 128, 128), (0, 0, 0)]


Yes, that appears to work. (But I think it has a bug when there are two 
successive #-comment lines.)


Meanwhile I've given up my campaign to have only line-oriented headers, 
and spent the five minutes needed to allow for free-format headers, and 
actually it's now simpler:


  readln @f,sig
  width  := readnextint(f)  # a 6-line function returning 0 on error
  height := readnextint(f)
  maxval := readnextint(f)  # (for some file types)

However I don't think this works properly when a comment follows (on the 
same line) the last format item, as a well-formed header will have an 
extra white-space character to be skipped. I believe your program might 
have the same problem; it will read the header OK, but not start reading 
the data at the right place.


(A rather poor specification I think which could have been tightened up.)
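For comparison, a free-format header reader along the same lines might look like this in Python. It is a sketch under assumptions: read_token and read_next_int are names invented here (the latter mirroring readnextint above, returning 0 on error), and it does nothing to settle the trailing-comment case just described.

```python
import io

def read_token(f):
    """Return the next whitespace-delimited token, skipping '#' comments.

    Consumes exactly one whitespace byte (or one comment line) after the token.
    """
    token = b""
    while True:
        ch = f.read(1)
        if ch == b"":
            return token              # end of file
        if ch == b"#":
            f.readline()              # discard the rest of the comment line
            if token:
                return token
        elif ch.isspace():
            if token:
                return token
        else:
            token += ch

def read_next_int(f):
    """Return the next header integer, or 0 on error."""
    token = read_token(f)
    return int(token) if token.isdigit() else 0

header = b"P6\n# first comment\n# second comment\n3 3\n255\n"
pixels = bytes(range(27))             # 3 x 3 pixels, 3 bytes each
f = io.BytesIO(header + pixels)

sig = read_token(f)
width = read_next_int(f)
height = read_next_int(f)
maxval = read_next_int(f)
data = f.read()
print(sig, width, height, maxval, len(data))
```

Two successive comment lines are handled, and the raster starts directly after the single whitespace byte that follows maxval.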

--
bartc



Re: what does := means simply?

2018-05-20 Thread bartc

On 20/05/2018 10:19, Peter J. Holzer wrote:

On 2018-05-19 13:43:14 +0100, bartc wrote:



Text files, yes. Not 'text mode' which is something inflicted on us by the C
library.


I very much enjoy the fact that the programming languages I've used to
process text files in the last 15 years (i.e. Perl and Python) can
convert just about any character encoding and any newline convention to
a canonical representation with a single parameter to open.


And I quite like the fact that I have complete control over what is 
being read or written. And if there is some obscure bug, that the actual 
contents of a file aren't hidden behind some protocols in a library.



When the "textmode" in the C library was invented (it wasn't just C,
though: Wirth's Modula-2 book describes a remarkably similar mechanism)


I also like the way that the single '\n' character in a source file 
conveniently matches the single '\n' character used by Unix. So no 
translation of any kind is needed, and you can assume that if you write 
1000 empty lines the output file will be 1000 bytes long.


The complications are in all the other operating systems. And the bugs 
when someone assumes that one '\n' in a program equals one 10 byte in a 
file, will only occur on those other systems.
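
To make the point about translation concrete, here is a small Python sketch. The helper name size_after_writing is invented for this example. It writes 1000 newlines three ways and reports the resulting file sizes; the newline argument of open() is what controls whether '\n' is translated on output.

```python
import os
import tempfile


def size_after_writing(payload, **open_kwargs):
    """Write payload to a temporary file and return the file size in bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, **open_kwargs) as f:
            f.write(payload)
        return os.path.getsize(path)
    finally:
        os.remove(path)


# Binary mode: bytes go out untouched on every platform.
binary_size = size_after_writing(b"\n" * 1000, mode="wb")

# Text mode with newline="\n": no translation is applied.
unix_size = size_after_writing("\n" * 1000, mode="w",
                               newline="\n", encoding="ascii")

# Text mode with newline="\r\n": each '\n' is written as two bytes.
dos_size = size_after_writing("\n" * 1000, mode="w",
                              newline="\r\n", encoding="ascii")

print(binary_size, unix_size, dos_size)
```

With newline left at its default, text mode uses the platform's own line separator, which is where the one-byte-per-newline assumption can go wrong.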



The premise in your previous message was that you are on an EBCDIC based
system (and a quite limited one at that, which doesn't even have a character
set conversion library, so you would have to program the character set
conversion yourself). So you can't cop out with "I don't care about
EBCDIC based systems".


I think it was Steven D'Aprano who brought up EBCDIC. And not seriously.


(Your single readln wouldn't work reliably on ASCII systems either,
since - as others have already pointed out - the header format of a PPM
file isn't quite as simple as you imagine).


The line-reading has to be in a loop in order to skip comments, which I 
didn't show because the bigger obstacle in Python appeared to be reading 
the numbers (and is not needed for those specific Mandelbrot images). 
It's something like this: repeat ... until width.isint



A PPM reader for an EBCDIC system will certainly be able to read PPM
files on an EBCDIC system. Why would anybody write and release software
which doesn't work on the intended target system?


Then the /same software/ probably wouldn't work anywhere else. I mean 
taking source which doesn't know or care about what system it's on, and 
that operates on a ppm file downloaded from the internet.


--
bartc


Re: what does := means simply?

2018-05-20 Thread bartc

On 20/05/2018 02:58, Dennis Lee Bieber wrote:

On Sun, 20 May 2018 02:13:01 +0100, bartc <b...@freeuk.com> declaimed the
following:


I think if you are going to be generating ppm, then the best choice of
format, for the widest acceptance, is to separate the header groups with
a newline. (As I mentioned my downloaded viewer needs a new line after
the first group. My own viewer, which I only threw together the other
day to test that benchmark, also expects the newlines. Otherwise I might
need to do 5 minutes' coding to fix it.)


IrfanView had no problem opening my file, where the only <LF> was the
one after the maxval field. Between that as an example, and the
documentation of the format, one could decree that any reader that
/requires/ <LF>s at various points is erroneous or incomplete.


I think that's the wrong approach. You need to work to the lowest common 
denominator, not the highest. (Within reason anyway.)


If you tested half a dozen viewers, and two of them don't need any 
newlines between groups, and the rest need at least one, or between all 
groups, what does this tell you about how you should be generating your 
file?


You may not know what viewer or reader is to be used.

I have a suspicion (although I will have to test more) that newlines 
between groups is more universally accepted.


(I tried to get irfanview but it tells me I need Windows 10, which is an 
odd requirement for an image viewer. So I'll have to try it later.)


--
bartc


Re: what does := means simply?

2018-05-19 Thread bartc

On 20/05/2018 01:39, Dennis Lee Bieber wrote:

On Sat, 19 May 2018 23:14:08 +0100, bartc <b...@freeuk.com> declaimed the
following:



The comments and examples here:
https://en.wikipedia.org/wiki/Netpbm_format, and all actual ppm files
I've come across, suggest the 3 parts of the header (2 parts for P1/P4)
are on separate lines. That is, separated by newlines. The comments are
a small detail that is not hard to deal with.



Wikipedia is not a definitive document...

http://netpbm.sourceforge.net/doc/ppm.html has
"""
Each PPM image consists of the following:

 A "magic number" for identifying the file type. A ppm image's magic
number is the two characters "P6".
 Whitespace (blanks, TABs, CRs, LFs).
 A width, formatted as ASCII characters in decimal.
 Whitespace.
 A height, again in ASCII decimal.
 Whitespace.
 The maximum color value (Maxval), again in ASCII decimal. Must be less
than 65536 and more than zero.
 A single whitespace character (usually a newline).
"""


I think if you are going to be generating ppm, then the best choice of 
format, for the widest acceptance, is to separate the header groups with 
a newline. (As I mentioned my downloaded viewer needs a new line after 
the first group. My own viewer, which I only threw together the other 
day to test that benchmark, also expects the newlines. Otherwise I might 
need to do 5 minutes' coding to fix it.)


(Regarding those benchmarks 
(https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/mandelbrot.html), 
as far as I can tell every language generates the ppm file inline (no 
special ppm library), and they all generate the P4 signature on one line 
and width/height on the next line.


(Click on any source file and look for "P4". Most do it with less fuss 
than Python too.))



That all the ones you've seen have a certain layout may only mean that
the generating software used a common library implementation:
http://netpbm.sourceforge.net/doc/libnetpbm.html


Blimey, it makes a meal of it. I got the impression this was supposed to 
be a simple image format, and with the line-oriented all-text formats it 
was.


But it could be worse: they might have used XML.

--
bartc


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 20:47, Dennis Lee Bieber wrote:

On Sat, 19 May 2018 13:28:41 +0100, bartc <b...@freeuk.com> declaimed the
following:



Out of interest, how would Python handle the headers for binary file
formats P4, P5, P6? I'd have a go but I don't want to waste half the day
trying to get past the language.


Based upon http://netpbm.sourceforge.net/doc/ppm.html

P6 1024 768 255

and

P6
# random comment
1024
768
# another random comment
255

are both valid headers.


The comments and examples here: 
https://en.wikipedia.org/wiki/Netpbm_format, and all actual ppm files 
I've come across, suggest the 3 parts of the header (2 parts for P1/P4) 
are on separate lines. That is, separated by newlines. The comments are 
a small detail that is not hard to deal with.


I think if ppm readers expect the 2-3 line format then generators will 
be less tempted to either stick everything on one line or stretch it 
across half a dozen. The point of ppm is simplicity after all.


And actually, a ppm reader I've just downloaded, an image viewer that 
deals with dozens of formats, had problems when I tried to put 
everything on one line. (I think it needs the signature on its own line.)



Reading an arbitrary PPM thereby is going to be tedious.


PPM was intended to be simple to read and to write (try TIFF, or JPEG, 
for something that is going to be a lot more work).



>>> ppmfil = open("junk.ppm", "wb")


(ppmfil? We don't have 6-character limits any more.)


>>> header = struct.pack("3s27s",
...                      b"P6 ",
...                      bytes("%8s %8s %8s\n" %
...                            (width, height, maxval),
...                            "ASCII"))
>>> header
b'P6     1024      768      255\n'


Hmm, I'd write this elsewhere, if it's going to be one line, as just:

  println @f,"P6",width,height,"255"

I'm sure Python must be able to do something along these lines, even if 
it's:


  f.write("P6 "+str(width)+" "+str(height)+" 255\n")

with whatever is needed to make that string compatible with a binary 
file. I don't know what the struct.pack stuff is for; the header can 
clearly be free-format text.
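
For reference, the "whatever is needed" part can be quite small in Python 3. The two functions below are only a sketch with made-up names: the first builds the string exactly as in the line above and encodes it to bytes, the second uses %-formatting directly on a bytes literal (available from Python 3.5). Either result can be passed to f.write() on a file opened in "wb" mode.

```python
def ppm_header_line(width, height, maxval=255):
    """Build a one-line P6 header as text, then encode it for a binary file."""
    text = "P6 " + str(width) + " " + str(height) + " " + str(maxval) + "\n"
    return text.encode("ascii")


def ppm_header_line_bytes(width, height, maxval=255):
    """The same header built with bytes %-formatting, no str step involved."""
    return b"P6 %d %d %d\n" % (width, height, maxval)


print(ppm_header_line(1024, 768))
print(ppm_header_line_bytes(1024, 768))
```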



And how would that language handle Unicode text?


That's not relevant here. (Where it might be relevant, then Unicode must 
be encoded as UTF8 within an 8-bit string.)


--
bartc


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 12:33, Peter J. Holzer wrote:

On 2018-05-19 11:33:26 +0100, bartc wrote:



Not you understand why some of us don't bother with 'text mode' files.


"Not" or "Now"?


Now.


Yesterday you claimed that you worked with them for 40 years.


Text files, yes. Not 'text mode' which is something inflicted on us by 
the C library.


(All my current programs can deal with lf or cr/lf line endings. I 
dropped cr-only line endings as I hadn't seen such a file since the 90's.)



However if you have an actual EBCDIC system and would like to read .ppm files,
then you will have trouble reading the numeric parameters as they are
expressed using sequences of ASCII digits.



I think the simplest way would be perform the calculation by hand
(new_value = old_value * 10 + next_byte - 0x30). At least in a language
which lets me process individual bytes easily. That would even work on
both ASCII and EBCDIC based systems (and on every other platform, too).


/The/ simplest? Don't forget the numbers can be several digits each. 
Here's how I read them (NOT Python):


readln @f, width, height

Would it work in an EBCDIC based system? Probably not. But, who cares? 
(I can't say I've never used such a system, but that was some ancient 
mainframe from the 70s. But I'm pretty certain I won't ever again.)


(Perhaps someone who has access to an EBCDIC system can try a .PPM 
reader to see what happens. I suspect that won't work either.)
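
The by-hand calculation quoted above (new_value = old_value * 10 + next_byte - 0x30) does cope with several digits if it is put in a loop. A minimal Python sketch follows; the function name and the (value, position) return convention are assumptions made for this example. Because it compares raw byte values against 0x30..0x39 rather than characters, it does not depend on the character set of the machine it runs on.

```python
def parse_ascii_decimal(data, pos=0):
    """Read an unsigned decimal number stored as ASCII digits in data.

    Returns (value, new_pos), where new_pos is the index just past the
    last digit consumed.
    """
    start = pos
    value = 0
    while pos < len(data) and 0x30 <= data[pos] <= 0x39:
        # Indexing a bytes object yields an int, so this is plain arithmetic.
        value = value * 10 + data[pos] - 0x30
        pos += 1
    if pos == start:
        raise ValueError("no ASCII digit at position %d" % start)
    return value, pos


width, after_width = parse_ascii_decimal(b"1024 768")
height, after_height = parse_ascii_decimal(b"1024 768", after_width + 1)
print(width, height)
```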


--
bartc


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 12:38, Chris Angelico wrote:

On Sat, May 19, 2018 at 8:33 PM, bartc <b...@freeuk.com> wrote:



But then you are acknowledging the file is, in fact, ASCII.


Cool! So what happens if you acknowledge that a file is ASCII, and
then it starts with a byte value of E3 ?


It depends.

If this is a .ppm file I'm trying to read, and it starts with anything 
other than 'P' followed by one of '1','2','3','4','5','6' (by which I 
mean the ASCII codes for those), then it's a bad ppm file.


What are you really trying to say here?

Out of interest, how would Python handle the headers for binary file 
formats P4, P5, P6? I'd have a go but I don't want to waste half the day 
trying to get past the language.


It is quite possible to deal with files, including files which are 
completely or partially text, a byte at a time, without having silly 
restrictions put on them by the language.


Here's the palaver I had to go through last time I wrote such a file 
using Python, and this is just for the header:


s="P6\n%d %d\n255\n" % (hdr.width, hdr.height)
sbytes=array.array('B',list(map(ord,s)))
f.write(sbytes)

Was there a simpler way to do this? Most likely, but you 
have to find it first! Here's how I write it elsewhere:


  println @f, "P6"
  println @f, width,height
  println @f, 255

It's simpler because it doesn't get tied up in knots in trying to make 
text different from bytes, bytearrays or array.arrays.
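
As a point of comparison, the three println lines map fairly directly onto three writes of bytes in Python 3. This is a sketch only: write_ppm_header is a made-up name, and header_via_array reproduces the array.array route shown above purely to check that both produce the same bytes.

```python
import array
import io


def write_ppm_header(f, width, height, maxval=255):
    """Write a three-line P6 header to a file opened in binary mode."""
    f.write(b"P6\n")
    f.write(b"%d %d\n" % (width, height))
    f.write(b"%d\n" % maxval)


def header_via_array(width, height):
    """The header built via array.array, as in the code quoted above."""
    s = "P6\n%d %d\n255\n" % (width, height)
    return array.array("B", list(map(ord, s))).tobytes()


buf = io.BytesIO()          # stands in for open("out.ppm", "wb")
write_ppm_header(buf, 640, 480)
print(buf.getvalue() == header_via_array(640, 480))
```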


--
bartc


Re: what does := means simply?

2018-05-19 Thread bartc

On 19/05/2018 02:26, Chris Angelico wrote:

On Sat, May 19, 2018 at 11:10 AM, bartc <b...@freeuk.com> wrote:



The .ppm (really .pbm) file which was the subject of this sub-thread has its
header defined using ASCII. I don't think an EBCDIC 'P4' etc will work.



"Defined using ASCII" is a tricky concept. There are a number of file
formats that have certain parts defined because of ASCII mnemonics,
but are actually defined numerically. The PNG format begins with the
four bytes 89 50 4E 47, chosen because three of those bytes represent
the letters "PNG" in ASCII. But it's defined as those byte values. The
first three represent "i&+" in EBCDIC, and that would be just as
valid, because you get the correct bytes.

Your file contains bytes. Not text.


Not you understand why some of us don't bother with 'text mode' files.

However if you have an actual EBCDIC system and would like to read .ppm 
files, then you will have trouble reading the numeric parameters as they 
are expressed using sequences of ASCII digits.


The simplest way would be to pass each byte through an ASCII to EBCDIC 
lookup table (so that code 0x37 for ASCII '7', which is EOT in EBCDIC, 
is turned into 0xF7 which is EBCDIC '7').


But then you are acknowledging the file is, in fact, ASCII.
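
The lookup-table idea can be tried out in Python without an EBCDIC machine, since the standard library ships codecs for several EBCDIC code pages. The sketch below assumes code page 037 (codec name "cp037"); which code page a real system uses is a separate question, and the function names are invented for the example.

```python
def ascii_to_ebcdic(data):
    """Map ASCII-encoded bytes to their EBCDIC (code page 037) equivalents."""
    return data.decode("ascii").encode("cp037")


def ebcdic_to_ascii(data):
    """The reverse mapping, for bytes that are EBCDIC (code page 037) text."""
    return data.decode("cp037").encode("ascii")


print(ascii_to_ebcdic(b"7").hex())    # f7
print(ascii_to_ebcdic(b"P4").hex())   # d7f4
```

The decode("ascii") step is the acknowledgement in question: the conversion only makes sense once the input bytes are declared to be ASCII.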

--
bartc


Re: what does := means simply?

2018-05-18 Thread bartc

On 19/05/2018 02:00, Steven D'Aprano wrote:

On Fri, 18 May 2018 20:42:05 -0400, Dennis Lee Bieber wrote:


Unfortunately -- in the current era, "text" means "a defined

encoding",

Text has ALWAYS meant "a defined encoding". It is just that for a long
time, people could get away with assuming that the encoding they used was
the *only* possible encoding, and using it implicitly without even
thinking about it.

That One True Encoding is, of course, EBCDIC.

No, I kid, of course it is Mac-Roman.

Ha ha, no, just pulling your leg... of course it's ISO 8859-1 (not to be
confused with ISO-8859-1, yes the hyphen is significant). Except for web
browsers, which are required to interpret declarations of ISO 8859-1 as
CP-1252 instead.

Actually, I'm still kidding around. Everyone knows the One True Encoding
is ISCII. (That's not a typo.)


The .ppm (really .pbm) file which was the subject of this sub-thread has 
its header defined using ASCII. I don't think an EBCDIC 'P4' etc will work.


--
bartc


Re: what does := means simply?

2018-05-18 Thread bartc

On 19/05/2018 01:42, Dennis Lee Bieber wrote:

On Fri, 18 May 2018 22:53:06 +0100, bartc <b...@freeuk.com> declaimed the
following:



I've worked with text files for 40 years. Now Python is telling me I've
been doing it wrong all that time!

Look at the original code I posted from which this Python was based.
That creates a file - just a file - without worrying about whether it's
text or binary. Files are just collections of bytes, as far as the OS is
concerned.


And on Windows, there is a difference.

On Windows, sending a <LF> byte to a TEXT file will result in writing
<CR><LF>. On Windows a new-line is indicated by that combination: <CR><LF>.


Are you sure that's Windows itself, and not just the C library? (Which 
was presumably trying to be helpful by making programs work on Unix and 
Windows without changes, but is actually a nuisance.)


Some Windows programs may need cr,lf, but I doubt they will convert 
between lf and cr,lf unless they use C file functions.


--
bartc


Re: what does := means simply?

2018-05-18 Thread bartc

On 19/05/2018 01:00, Chris Angelico wrote:

On Sat, May 19, 2018 at 7:53 AM, bartc <b...@freeuk.com> wrote:

I've worked with text files for 40 years. Now Python is telling me I've been
doing it wrong all that time!

Look at the original code I posted from which this Python was based. That
creates a file - just a file - without worrying about whether it's text or
binary. Files are just collections of bytes, as far as the OS is concerned.

So what could be more natural than writing a byte to the end of a file?


So, you create a file without worrying about whether it's text or
binary, and you add a byte to the end of the file. That means you're
treating it as a binary file, not as a text file. Do you understand
that?


Well I /don't/ worry about it, but the reason is that long ago I switched 
to using binary mode for all files. (My 'createfile' function shown 
after my sig shows how it works. It's a thin wrapper around C's fopen, 
and it's C that has this thing about text and binary files.)


But in Python, even as a binary file I had some trouble writing to it, 
as it's fussy when dealing with strings and bytearrays and bytes and 
array.arrays all with their own rules.


--
bartc

global function createfile(name, options="wb") =
if not name.isstring or name="" then
return nil
fi

return fopen(name,options)
end





Re: what does := means simply?

2018-05-18 Thread bartc

On 18/05/2018 19:57, Chris Angelico wrote:

On Sat, May 19, 2018 at 4:48 AM, bartc <b...@freeuk.com> wrote:

The translation was straightforward, EXCEPT that I wasted an hour trying to
figure out how to write /a single byte/ to a file. The following eventually
worked, using a binary file as a text one had Unicode problems, but it's
still hacky.


You can't write a single byte to a text file, because text files don't
store bytes. I'm not sure which part of this took you an hour to
figure out.


I've worked with text files for 40 years. Now Python is telling me I've 
been doing it wrong all that time!


Look at the original code I posted from which this Python was based. 
That creates a file - just a file - without worrying about whether it's 
text or binary. Files are just collections of bytes, as far as the OS is 
concerned.


So what could be more natural than writing a byte to the end of a file?

(Note that this particular file format is a hybrid; it has a text header 
followed by binary data. This is not unusual; probably every binary 
format will contain text too.


A programming language - and one that's supposed to be easy - should 
take that in its stride.)



# For Python 3 (it'll work on Python 2 but give the wrong results)


What does "work" mean? If it gives the wrong results, how is it working?


It works in that Python 2 is not complaining about anything, and it 
finishes (very quickly too). But the output file is 3 times the size it 
should be, and contains the wrong data.
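
A plausible explanation for that, offered as a sketch rather than a diagnosis: in Python 2, bytes is just another name for str, so bytes([0]) is the three-character string "[0]" rather than one zero byte, and 2/n with integer operands is integer division. The snippet below runs on Python 3 and imitates the Python 2 results with str() and //; the variable names are made up.

```python
n = 200

# Python 3: true division gives the intended scale factor.
m_py3 = 2 / n

# Python 2 would treat 2/n as integer division, which is what // does here.
m_py2 = 2 // n

# Python 3: bytes([0]) is a single byte with value 0.
byte_py3 = bytes([0])

# Python 2: bytes([0]) is str([0]), i.e. the text "[0]".
byte_py2 = str([0]).encode("ascii")

print(m_py3, m_py2)
print(len(byte_py3), len(byte_py2))
```

With the scale factor stuck at zero every pixel would be computed for the same point, and each output "byte" would be written as three characters, which would fit both symptoms.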



end = 0                # lines containing 'end' can be removed


You're not writing Python code here.


Sorry but I'm lost without the block terminators. I needed them to match 
the logic to the original. After I used them, I decided it looked better.


--
bartc


Re: what does := means simply?

2018-05-18 Thread bartc

On 18/05/2018 20:15, Alexandre Brault wrote:

On 2018-05-18 02:48 PM, bartc wrote:



Note this version doesn't use any imports at all.

Except your version doesn't read its parameter from the command line
args and doesn't output to standard output, which all of the others do.
That's why the other Python versions of that code imported sys: Because
that's how you read from commandline args and write bytes to standard
output in Python. You don't need to know *exactly* how sys works to have
an idea of what sys.argv and sys.stdout do


My version wasn't an entry in the Shoot-out game.

The command line input was left out as, if someone wants to port the 
algorithm to their language, they will know how it's done. (And each one 
will do that messy bit of input a little differently.)


Capturing the output as I said was problematic, and I needed that in 
order to display the .ppm output so that I could see it. Maybe in a 
Linux environment it can be piped into a program to do that. But that's 
not what I used; I needed an actual .ppm file.


(Something went wrong with the header when trying to direct it to a 
file. But directing binary data to a text display, where an arbitrary 
byte sequence might form some escape code that does undesirable things, 
is a no-no for me.)
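
For anyone who does want the command-line and standard-output behaviour, one way to get both without committing to either is to pass the argument list and the output stream in as parameters. This is only a sketch: main and its default size of 200 are assumptions for the example, and only the PBM header is written here.

```python
def main(argv, out):
    """Write a P4 header for an n x n image to a binary stream.

    argv follows the sys.argv convention: argv[1], if present, is the size.
    """
    n = int(argv[1]) if len(argv) > 1 else 200
    out.write(b"P4\n%d %d\n" % (n, n))
    return n


# Typical calls (not executed here):
#   import sys
#   main(sys.argv, sys.stdout.buffer)          # bytes to standard output
#   with open("test.ppm", "wb") as f:
#       main(sys.argv, f)                      # bytes to an actual file
```

sys.stdout.buffer is the binary stream underneath Python 3's text-mode sys.stdout, which is why the benchmark programs need the sys import to emit raw bytes.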


--
bart


Re: what does := means simply?

2018-05-18 Thread bartc

On 18/05/2018 19:36, Chris Angelico wrote:

On Sat, May 19, 2018 at 3:27 AM, bartc <b...@freeuk.com> wrote:



Once again, you're confusing *porting* with *emulating*.


This is the point. Those libraries are specific to Python and cannot be 
ported.


And very often they don't just provide general support that can be 
found, in different forms, in any target language, but is also specific.


Then if you are still porting your application, rather than rewriting or 
heavily adapting, you will need to emulate what they do.



If you don't
understand the difference between those two concepts, I recommend
spending some time with Wikipedia.


And I recommend you do some actual porting from Python or any other 
big language.


--
bartc


Re: what does := means simply?

2018-05-18 Thread bartc

On 18/05/2018 18:27, bartc wrote:


(BTW here's a port of that benchmark based on the Lua code:

   https://pastebin.com/raw/ivDaKudX


And here's the payoff: I was able to use this version to port it to 
Python. One which works better than the originals, as they wrote output 
to the screen (/binary/ output) which I found difficult to capture into 
an actual ppm file in order to test it worked.


The translation was straightforward, EXCEPT that I wasted an hour trying 
to figure out how to write /a single byte/ to a file. The following 
eventually worked, using a binary file as a text one had Unicode 
problems, but it's still hacky.


Note this version doesn't use any imports at all.



# For Python 3 (it'll work on Python 2 but give the wrong results)

n = 200                # adjust this for output size: n * n pixels
outfile = "test.ppm"   # adjust for output file name
end = 0                # lines containing 'end' can be removed

def start():
    m = 2/n
    ba = (1<<(n%8+1))-1    # all-ones mask for a partial last byte in a row
    bb = 1<<(8-n%8)        # shifts that partial byte up to the top bits

    f = open(outfile,"wb")

    f.write(b"P4\n")
    f.write((str(n)+" "+str(n)+"\n").encode())

    for y in range(n):
        ci = y*m-1
        b = 1              # sentinel bit; reaches 256 after 8 pixels

        for x in range(n):
            cr = x*m-1.5
            zr = cr
            zi = ci
            zrq = cr*cr
            ziq = ci*ci
            b <<= 1
            for i in range(1,50):
                zi = zr*zi*2+ci
                zr = zrq-ziq+cr
                ziq = zi*zi
                zrq = zr*zr
                if zrq+ziq>4:
                    b +=1
                    break
                end
            end
            if b>=256:
                f.write(bytes([511-b]))
                b = 1
            end
        end
        if b != 1:
            f.write(bytes([(ba-b)*bb]))
        end
    end
    f.close()
end

start()
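
The bit-packing inside those loops can be looked at on its own. The following is a simplified sketch, not a drop-in replacement: pack_row is a made-up name, it packs the bits as given instead of inverting them (the program above writes 511-b), and it works out the padding with int.bit_length() rather than precomputed constants. The shared idea is the sentinel: the accumulator starts at 1, so once it reaches 256 exactly eight pixel bits have been shifted in.

```python
def pack_row(pixels):
    """Pack a sequence of 0/1 pixels into bytes, most significant bit first.

    A final partial byte is padded on the right with zero bits, as the
    PBM raw format expects at the end of each row.
    """
    out = bytearray()
    acc = 1                          # sentinel bit marks how many bits are in
    for p in pixels:
        acc = (acc << 1) | (p & 1)
        if acc >= 256:               # sentinel has reached bit 8: byte is full
            out.append(acc & 0xFF)
            acc = 1
    if acc != 1:                     # leftover bits from a width not divisible by 8
        nbits = acc.bit_length() - 1
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


print(pack_row([1, 0, 1]).hex())         # a0
print(pack_row([1] * 8 + [0]).hex())     # ff00
```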



Re: what does := means simply?

2018-05-18 Thread bartc

On 18/05/2018 15:47, Chris Angelico wrote:

On Sat, May 19, 2018 at 12:37 AM, bartc <b...@freeuk.com> wrote:

Have a look at some of the implementations here (to test some Mandelbrot
benchmark):

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/mandelbrot.html

The three Python examples all use 'import sys' and 'import multiprocessing',
none of which I can find any trace of as any sort of files let alone .py.
One of them also uses 'import array', which I can't find either.


I guess you didn't look very hard.


>>> import multiprocessing
>>> multiprocessing.__file__
'/usr/local/lib/python3.8/multiprocessing/__init__.py'


But you have to load it first to find out. (And if you follow nested 
imports, you will eventually get to those that don't have a .py file.)



>>> import array
>>> array.__file__
'/usr/local/lib/python3.8/lib-dynload/array.cpython-38m-x86_64-linux-gnu.so'


I get "module 'array' has no attribute '__file__'".

However, if I load sys, multiprocessing and array, then print 
sys.modules, I get a very long list of modules, at which point anyone 
thinking of emulating the behaviour of those modules would promptly give up.
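
The difference between the two results for array can be investigated from inside Python. The sketch below uses a made-up helper, module_origin, which reports "built-in" for modules compiled into the interpreter (these have no __file__) and otherwise the path that the import system would load. What it prints for array depends on how the particular interpreter was built, so no output is shown for that case.

```python
import importlib.util
import sys


def module_origin(name):
    """Return 'built-in' or the file a module would be loaded from.

    Returns None if the import system cannot find the module at all.
    """
    if name in sys.builtin_module_names:
        return "built-in"
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


for name in ("sys", "array", "multiprocessing"):
    print(name, "->", module_origin(name))
```

sys.builtin_module_names lists everything compiled into the interpreter itself, which is one way to answer "which file corresponds to import sys": none does.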




(BTW here's a port of that benchmark based on the Lua code:

  https://pastebin.com/raw/ivDaKudX

The actual language is not relevant, as this is clear enough that it 
could probably be re-implemented on anything. The 'files' module is only 
needed for openfile, closefile, and 'outbyte', which should be standard 
in any language. ('writebytes' might not be, so that was changed to a loop.)


The point of all this: writing clean simple code has a big advantage, 
and I think my version is one of the cleanest (even though it writes to 
a file, not the console).


Although it should be pointed out that these programs are pulling out 
all the stops in order to get the best performance; clarity wasn't a 
priority. But they should all use the same algorithm (to be within the 
rules), and it is that we're trying to extract.)


--
bartc


Re: what does := means simply?

2018-05-18 Thread bartc

On 18/05/2018 13:29, Steven D'Aprano wrote:

On Fri, 18 May 2018 12:09:02 +0100, bartc wrote:


On 18/05/2018 02:45, Steven D'Aprano wrote:

On Fri, 18 May 2018 02:17:39 +0100, bartc wrote:


Normally you'd use the source code as a start point. In the case of
Python, that means Python source code. But you will quickly run into
problems because you will often see 'import lib' and be unable to find
lib.py anywhere.


Seriously? There's a finite number of file extensions to look for:

.py .pyc .pyo .pyw .dll .so

pretty much is the lot, except on some obscure platforms which few
people use.


Which one corresponds to 'import sys'?


The functions in sys are built-in to the CPython interpreter. For other
interpreters, they could correspond to some file, or not, as the
implementer desires.

So just like built-in functions, your first stop when porting is to READ
THE DOCUMENTATION and learn what the semantics of the functions you care
about, not the source code.


But there is a huge amount of such functionality.

There are other ways of doing it which can make more use of the actual 
language (ie. Python) and make use of generally available libraries (eg. 
msvcrt.dll/libc.so.6), with fewer mysterious goings-on in the middle.



If I see:

 result = math.sin(x)

and I want to port it, what should I do?



 print("Hello World!")

is "writing from scratch, not porting" unless the Ruby print uses
precisely the same implementation as the Python print. All that matters
is that for *this* specific use, the two print commands behave the same.


And you've chosen two of the most common language features for your 
examples, which would probably be available on a 1970s BASIC (I know SQR 
was, can't remember about SIN).


Those are not the kinds of problem I mean.




Since every language has features that some other languages don't have,
is it your position that it is impossible to port code from any
language to any other?


I'm saying some languages make it more difficult, and Python is one of
them


Only if you are trying to port *down* the Blub hierarchy, to a less
powerful language.


You might be trying to go from one level of Blub to the same level of 
Blub, but the two Blubs are completely incompatible.



I'm just saying that in my experience, given the choice of porting the
same program from either Python or, say, Lua, I would choose Lua.


If they are *the same program* then surely they will be identical (modulo
syntax and naming of builtins) and there should be no difference in
difficulty in porting.

If they are *semantically the same* (they do the same thing) but have
different implementations, then you've just blown a hole in your own
argument.


Have a look at some of the implementations here (to test some Mandelbrot 
benchmark):


https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/mandelbrot.html

The three Python examples all use 'import sys' and 'import 
multiprocessing', none of which I can find any trace of as any sort of 
files let alone .py. One of them also uses 'import array', which I can't 
find either.


Now look at the Lua and Lua#2 examples. They don't appear to use 
anything that hairy, and could be good candidates for porting. As would 
be Free Pascal #3 (the other Pascals use threads).


(Although when I tried to port the Lua just now, I ran into trouble 
because I didn't know Lua well enough (it's also poorly written IMO, and 
I ran out of time). That's fair enough; you need to be familiar with the 
language you're porting from.


But I could be very familiar with Python and still wouldn't be able to 
directly translate code which depended heavily on library functions. 
Knowing the specification of those is not going to help if I don't 
already have something that does exactly the same job.)


Looking also at the amount of code, the Pythons don't appear to be that 
much shorter or simpler than many of the others.


--
bartc


Re: syntax oddities

2018-05-18 Thread bartc

On 17/05/2018 23:49, Chris Angelico wrote:

On Fri, May 18, 2018 at 8:44 AM, Paul wrote:



I've been using email for thirty years, including thousands of group emails
at many tech companies, and no one has ever suggested, let alone insisted
on, bottom posting.  If someone's late to a thread they can read from it
the bottom up.


Remind me which direction text is usually written in English?


Is this a trick question? It's usually written left to right.


Re: what does := means simply?

2018-05-18 Thread bartc

On 18/05/2018 02:45, Steven D'Aprano wrote:

On Fri, 18 May 2018 02:17:39 +0100, bartc wrote:


Normally you'd use the source code as a start point. In the case of
Python, that means Python source code. But you will quickly run into
problems because you will often see 'import lib' and be unable to find
lib.py anywhere.


Seriously? There's a finite number of file extensions to look for:

.py .pyc .pyo .pyw .dll .so

pretty much is the lot, except on some obscure platforms which few people
use.


Which one corresponds to 'import sys'?

If the source to such libraries is not available then it is necessary to 
emulate that functionality. You are writing from scratch, not porting, 
according to specifications. And those specifications may be 
inexplicably tied to the inner workings of the language.


That is a little bit harder, yes? Especially as Python is a scripting 
language and might rely more than most on this quite extensive built-in 
functionality, even on fairly short programs.


(When I once thought about implementing an interpreter for Python 
byte-code, I found all this out very quickly. Such an interpreter could 
work perfectly but it would not get past 'import sys'.)



To successfully port anything but the most trivial code, you actually have
to understand *both* languages -- including the syntax, semantics, built-
in language features, AND libraries.


Don't forget configuration and build systems. The code you want to port 
may not even exist, but is synthesised as part of the build process, and 
be specific to a particular build.


I'm talking about the seemingly rare case these days where you DO have 
the source code!



That's one problem. Others might involve how to deal with something like
__globals__ which doesn't have an equivalent in the target language. And
we haven't started on features that are specific to Python.


How about features which are specific to C


I'm quite familiar with C which has its own set of problems. But taking 
one aspect, if a C program relies on its standard library, then it is 
very often possible to directly call that standard library from another 
language, so you don't need to reimplement it, nor port it.



Since every language has features that some other languages don't have,
is it your position that it is impossible to port code from any language
to any other?


I'm saying some languages make it more difficult, and Python is one of 
them, especially written 'Pythonically', which seems to be code for 
'this only makes sense in Python', so that you can't understand it even 
if you have no intention of porting it.




If you want to *really* see code that is hard to port, you should try
porting an Inform 7 program to another language. Any other language.


You seem to be saying that because it is rarely completely impossible to 
port software, that we disregard any difficulties and consider all such 
porting as equally trivial.


I'm just saying that in my experience, given the choice of porting the 
same program from either Python or, say, Lua, I would choose Lua.


Same with choosing between 'full-on' C++ and, say, Pascal.

Both C++ and Python can be used to write software in a simple style (as 
I would use); typically they are not used that way. Given the rich set 
of esoteric features of both, programmers do like to pull out all the stops.


--
bartc


Re: what does := means simply?

2018-05-17 Thread bartc

On 17/05/2018 18:19, Steven D'Aprano wrote:

On Thu, 17 May 2018 15:50:17 +0100, bartc wrote:



Of course, full-on Python code is pretty much impossible to port
anywhere else anyway.


*rolls eyes*



Any pair of languages will have code that is hard to port from one to the
other without jumping through hoops. Try porting C code with lots of
dynamic memory allocations and pointer accesses to COBOL, or Scheme code
using continuations to Python, or Hyperscript text chunking code to
Fortran.

But hard does not mean "pretty much impossible".



Normally you'd use the source code as a start point. In the case of 
Python, that means Python source code. But you will quickly run into 
problems because you will often see 'import lib' and be unable to find 
lib.py anywhere.
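
One way to see why 'import lib' does not always lead to a lib.py is to ask the import system where a module comes from. This is only a sketch using the standard importlib module; the helper name module_origin is made up for illustration, and the 'built-in' result assumes CPython.

```python
import importlib.util

def module_origin(name):
    """Return where the import system would load `name` from, or None if unknown."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# On CPython, sys is compiled into the interpreter, so there is no sys.py to read.
print(module_origin("sys"))
# A pure-Python standard library package reports a file path instead.
print(module_origin("json"))
```

A module reported as built-in (or as a compiled extension) has no Python source to work from, which is the situation described above.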


That's one problem. Others might involve how to deal with something like 
__globals__ which doesn't have an equivalent in the target language. And 
we haven't started on features that are specific to Python.


But most non-esoteric languages will have functions, variables, 
assignments, expressions, loops and so on. The basics. Algorithms 
expressed using those will be simplest to port.


When I was looking at benchmarks involving multiple languages and wanted 
to port one to a new language, I usually ended up working from Pascal or 
Lua because they tended not to use advanced features that made the job 
harder.


--
bartc


Re: what does := means simply?

2018-05-17 Thread bartc

On 17/05/2018 15:03, Chris Angelico wrote:

On Thu, May 17, 2018 at 9:58 PM, bartc <b...@freeuk.com> wrote:

On 17/05/2018 04:54, Steven D'Aprano wrote:


On Thu, 17 May 2018 05:33:38 +0400, Abdur-Rahmaan Janhangeer wrote:


what does := proposes to do?




A simple example (not necessarily a GOOD example, but a SIMPLE one):

print(x := 100, x+1, x*2, x**3)



It's also not a good example because it assumes left-to-right evaluation
order of the arguments. Even if Python guarantees that, it might be a
problem if the code is ever ported anywhere else.



Python DOES guarantee it, and nobody cares about your personal toy
language other than you. :)


As I said, it's poor form.

Of course, full-on Python code is pretty much impossible to port 
anywhere else anyway.


--
bartc


Re: what does := means simply?

2018-05-17 Thread bartc

On 17/05/2018 14:32, Steven D'Aprano wrote:

On Thu, 17 May 2018 12:58:43 +0100, bartc wrote:


On 17/05/2018 04:54, Steven D'Aprano wrote:

On Thu, 17 May 2018 05:33:38 +0400, Abdur-Rahmaan Janhangeer wrote:


what does := proposes to do?



A simple example (not necessarily a GOOD example, but a SIMPLE one):

print(x := 100, x+1, x*2, x**3)


It's also not a good example because it assumes left-to-right evaluation
order of the arguments. Even if Python guarantees that, it might be a
problem if the code is ever ported anywhere else.


Seriously? You think we have a responsibility to write examples which
will work with arbitrary languages with arbitrarily different evaluation
order?

Okay, let's be clear:

- if the language has different evaluation order, it might not work;


That's right. The rest of your list either doesn't matter so much or is 
only remotely likely.


Doing a certain amount of restructuring of an algorithm expressed in one 
language in order to port it to another is expected. But relying on 
evaluation order is bad form. Suppose this bit of code was imported from 
elsewhere where evaluation was right to left?


Anyway, try this:

def showarg(x): print(x)

def dummy(*args,**kwargs): pass

dummy(a=showarg(1),*[showarg(2),showarg(3)])

This displays 2,3,1 showing that evaluation is not left to right.
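
The same experiment can be made checkable by recording the calls in a list rather than printing them. This is a sketch assuming CPython 3; the list name calls is an addition for illustration. The language reference does note that a *expression is processed before keyword arguments even when it is written after them.

```python
calls = []

def showarg(x):
    # Record the order in which the argument expressions are evaluated.
    calls.append(x)
    return x

def dummy(*args, **kwargs):
    return args, kwargs

args, kwargs = dummy(a=showarg(1), *[showarg(2), showarg(3)])
print(calls)   # the starred list is evaluated before the keyword argument
```

So plain positional and keyword arguments are evaluated left to right, but mixing in * unpacking after a keyword changes the observed order.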


--
bartc


Re: what does := means simply?

2018-05-17 Thread bartc

On 17/05/2018 04:54, Steven D'Aprano wrote:

On Thu, 17 May 2018 05:33:38 +0400, Abdur-Rahmaan Janhangeer wrote:


what does := proposes to do?



A simple example (not necessarily a GOOD example, but a SIMPLE one):

print(x := 100, x+1, x*2, x**3)


It's also not a good example because it assumes left-to-right evaluation 
order of the arguments. Even if Python guarantees that, it might be a 
problem if the code is ever ported anywhere else.
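
For reference, a sketch of what the quoted example evaluates to. It assumes Python 3.8 or later, where := was eventually accepted (PEP 572); the helper collect is hypothetical and only stands in for print so the values can be inspected.

```python
def collect(*values):
    # Stand-in for print(): returns its arguments so they can be checked.
    return values

# Arguments are evaluated left to right, so x is bound before x+1 is computed.
result = collect(x := 100, x + 1, x * 2, x ** 3)
print(result)
```

With right-to-left evaluation the later arguments would refer to x before it was bound, which is the portability concern raised above.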


--
bartc


Re: seeking deeper (language theory) reason behind Python design choice

2018-05-16 Thread bartc

On 16/05/2018 16:09, Ian Kelly wrote:

On Tue, May 15, 2018, 6:36 PM bartc <b...@freeuk.com> wrote:


On 16/05/2018 01:04, Steven D'Aprano wrote:


I'm not a C coder, but I think that specific example would be immune to
the bug we are discussing, since (I think) you can't chain assignments in
C. Am I right?


Assignments can be chained in C (with right-to-left precedence) as can
augmented assignments (+= and so on).



Yes, but not in the particular example that Steven was referring to, which
you elided from your quoting.


I was responding to the chained assignment bit:

 a = b = c = d = x;

is allowed, but (depending on implementation details), the first = might 
be a different kind of assignment from the other three.



open(...) is not a valid LHS for assignment.


The LHS needs to be an lvalue. A function result by itself won't be. 
open() would need to be a macro that expands to an lvalue, or used like 
this when open() returns a pointer:


   a = *open() = x;

So it only needs an extra * (subject to the correct types of everything 
involved) for both these "=" to be plausible.


--
bartc


Re: seeking deeper (language theory) reason behind Python design choice

2018-05-15 Thread bartc

On 16/05/2018 01:04, Steven D'Aprano wrote:


I'm not a C coder, but I think that specific example would be immune to
the bug we are discussing, since (I think) you can't chain assignments in
C. Am I right?


Assignments can be chained in C (with right-to-left precedence) as can 
augmented assignments (+= and so on).


--
bartc


Re: seeking deeper (language theory) reason behind Python design choice

2018-05-15 Thread bartc

On 15/05/2018 21:21, Peter J. Holzer wrote:


I have been programming in C since the mid-80's and in Perl since the
mid-90's (both languages allow assignment expressions). I accumulated my
fair share of bugs in that time, but AFAIR I made this particular error
very rarely (I cannot confidently claim that I never made it). Clearly
it is not “a total bug magnet” in my experience. There are much bigger
problems in C and Perl (and Python, too). But of course my experience is


All those languages use = for assignment and == for equality.

If like me you normally use a language where = means equality (and := is 
used for assignment), then you're going to get it wrong more frequently 
when using C or Python (I don't use Perl).


You might get it wrong anyway because = is used for equality in the real 
world too.


And it's an error that is awkward to detect (in C anyway, as it would be 
an error in Python) because usually both = and == are plausible in an 
expression.


--
bartc


Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-12 Thread bartc

On 12/05/2018 05:29, Steven D'Aprano wrote:

On Fri, 11 May 2018 16:56:09 +0100, bartc wrote:


0100, if not intended as octal, is
an undetectable error in C and Python 2.


How fortunate then that Python 2 is history (soon to be ancient history)
and people can use Python 3 where that error of judgement has been
rectified.


At least you're agreeing it was a mistake.

Although it does still mean that Python 3 has this funny quirk:

  a = 0             # ok
  a = 00123.        # ok
  a = int("00123")  # ok
  a = 00123         # error
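
The quirk can be checked without crashing the script by handing each literal to compile(). This is a sketch for Python 3; the helper name compiles is made up for illustration.

```python
def compiles(source):
    """Return True if `source` is a syntactically valid Python 3 expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Integer literals with leading zeros are rejected unless every digit is zero;
# floats and int() on strings accept them.
for literal in ("0", "000", "00123.", 'int("00123")', "00123", "0o123"):
    print(literal, compiles(literal))
```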


--
bartc


Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-11 Thread bartc

On 11/05/2018 14:24, Chris Angelico wrote:

On Fri, May 11, 2018 at 9:09 PM, bartc <b...@freeuk.com> wrote:



 when 101'11010'000'B then ...

Try /that/ in hex /or/ octal.)


I've no idea what this is supposed to mean, or why you have groups of
three, five, and three. Looks like a possible bug to me. I'm sure it
isn't, of course, since you're one of those perfect programmers who
simply doesn't _make_ errors, but if it were my code, I would be
worried that it isn't correct somewhere.


The data-sheet for the 8087 numeric co-processor displays instructions 
of the two-byte opcodes in formats like this (high to low bits):


  [escape 1 0 1] [1 1 0 1 0 ST(i)]

'escape' is the 5-bit sequence 11011. ST(i) is a 3-bit register code. So 
considered as one 16-bit value, it's divided into groups of 5:3:5:3. 
The escape sequence has already been detected, and the middle two groups 
have been isolated by masking with 111'11111'000B.


So it is checking for combinations of those middle 3:5 groups of bits in 
a way that exactly matches how it's presented in the data sheet. And 
this instruction encoding is still used in current AMD/Intel x64 processors.


The x-101-11010-xxx pattern corresponds to the FST ST(0) to ST(i) 
instruction:


when 101'11010'000B then
    genstr("fst ")
    genstr(strfreg(freg))

It's not a bug. Just a good example of the use of binary where hex or 
octal doesn't cut it because the grouping isn't purely in threes or fours.
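
For comparison, roughly the same test written with Python binary literals. This is a sketch assuming Python 3.6 or later (underscores in numeric literals); the names MIDDLE_MASK, FST_PATTERN and decode are invented for illustration, and the 16-bit value is assumed to hold the first opcode byte in the high 8 bits.

```python
MIDDLE_MASK = 0b111_11111_000   # keep only the 3-bit and 5-bit middle groups
FST_PATTERN = 0b101_11010_000   # the x-101-11010-xxx pattern

def decode(opcode16):
    """Return a mnemonic for FST ST(i), or None for anything else."""
    if opcode16 & MIDDLE_MASK == FST_PATTERN:
        return "fst st(%d)" % (opcode16 & 0b111)   # low 3 bits are the register
    return None

print(decode(0xDDD3))   # bytes DD D3
print(decode(0xDDDB))   # bytes DD DB: a different instruction, not matched
```

The grouping with underscores mirrors the 3:5:3 layout in the data sheet, which a hex or octal constant would hide.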


(I understand that binary literals were added to Python from version 
2.6. The question is why it took so long. They are not a heavyweight 
feature.)



Cool. So what's the native integer size for the real world? Use that
as your primary data type.

Oh, can't decide how many digits? That's a pity.


What's this got to do with octal? Because many languages impose a limit 
on the widths of numeric types, that somehow makes it OK to implement 
octal using leading zeros? Just to catch people out because octal is 
used so rarely.



Go get a time machine. Spend some time in the 1980s. See what kinds of
programming people were doing. HINT: It wasn't web app development.


I was doing lots of development in the 1980s. I just didn't use C.


Yeah, which is why your personal pet language has approximately one
user. The more things you change when you create a new language, the
more likely that it'll be utterly useless to anyone but yourself.

Consistency is a lot more important than many people give it credit for.


That's why 0100 is sixty four in Python 2, and an error in Python 3? 
Instead of being one hundred in both, as common sense would have dictated.


And, for that matter, one hundred in any of my languages, some of which 
did have more than one user.


BTW here is one C-ism that /didn't/ make it into Python 1:

 print (0xE-1)

This prints 13 (0xE is 14, minus 1). But it would be an error in 
conforming C compilers:


   printf("%d", 0xE-1);

"error: invalid suffix "-1" on integer constant"

Perhaps consistency isn't always as important as you say, not for 
something which is crazily stupid.


At least 0xE-1 generates an error (on some compilers; on others, only 
years later when you switch compiler). 0100, if not intended as octal, 
is an undetectable error in C and Python 2.


--
bartc


Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-11 Thread bartc

On 10/05/2018 21:18, bartc wrote:

On 10/05/2018 19:51, Chris Angelico wrote:

On Fri, May 11, 2018 at 4:31 AM, bartc <b...@freeuk.com> wrote:

   2x100  (4)   Binary
   3x100  (9)   Ternary
   4x100  (16)  Quaternary
   5x100  (25)  etc
   6x100  (36)
   7x100  (49)
   8x100  (64)  Octal
   9x100  (81)
   ...   (Not implemented 11x to 15x, nor 10x or 16x)
   0x100  (256) Hex


YAGNI much? How often do you need a base-9 literal in your code??


I've just found out these also work for floating point. So that:

  a := 8x100.5
  print a

gives 64.625 in decimal (not 64.5 as I expected, because .5 is 5/8 not 
5/10!). Exponent values are octal too, scaling by powers of 8.


I tried it in Python 3 (0o100.5 - I find that prefix fiddly to type 
actually as I have to stop and think), and it seems to be illegal.


Based floating point literals may be unusual, but bear in mind that in 
decimal, some values may not be represented exactly (eg 0.1). I believe 
that in base 2, 4, 8 or 16, any floating point literal can be 
represented exactly, at least up to the precision available.
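
Python has no based floating point literals apart from float.fromhex(), but the arithmetic is easy to sketch. The function parse_based below is hypothetical, handles only unsigned values without exponents, and uses Fraction so the digits are summed exactly before the final conversion.

```python
from fractions import Fraction

def parse_based(text, base):
    """Parse an unsigned literal such as '100.5' written in the given base."""
    whole, _, frac = text.partition(".")
    value = Fraction(int(whole, base)) if whole else Fraction(0)
    # Each fractional digit is worth digit / base**position.
    for position, digit in enumerate(frac, start=1):
        value += Fraction(int(digit, base), base ** position)
    return float(value)

print(parse_based("100.5", 8))     # 64 + 5/8
print(float.fromhex("0x100.8"))    # the built-in hexadecimal equivalent
```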



--
bartc




Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-11 Thread bartc

On 11/05/2018 01:11, Chris Angelico wrote:

On Fri, May 11, 2018 at 8:43 AM, bartc <b...@freeuk.com> wrote:

This is Wrong, and would have been just as obviously wrong in 1989.


Having spent many years programming in C and working on Unix, I
strongly disagree.


Using C is apt to give you a rather warped view of things. Such that 
everything in that language is superior to any other way of doing it.


(And actually, because C didn't have binary literals for a long time (I 
think it still doesn't, officially), there has been a lot of discussion 
in comp.lang.c about how they are not really necessary:


: A real programmer can auto-convert from hex
: It's so easy to knock up some macro that will do it
: They have /never/ needed binary in decades of programming

And so on. Meanwhile my own /recent/ code includes lines like this:

when 2x'101'11010'000 then ... # (x64/x87 disassembler)

although I think I still prefer a trailing B, with separator:

when 101'11010'000'B then ...

Try /that/ in hex /or/ octal.)


This was *not* obviously wrong. It's easy to say
"but look at the real world"; but in the 80s and 90s, nobody would
have said that it was "obviously wrong" to have the native integer
wrap when it goes past a certain number of bits. And in fact, your
description of the real world makes it equally obvious that numbers
should have a fixed width:


Much of the real world /does/ use fixed widths for numbers, like that 
odometer for a start, or most mechanical or electronic devices that need 
to display numbers. And with many such devices, they wrap as well 
(remember tape counters).


Even my tax return has a limit on how big a sum I can enter in the boxes 
on the paper form.


So the concept of a fixed upper width, sometimes with modulo arithmetic, 
isn't alien to the general public. But leading zeros that completely 
change the perceived value of a number IS.



Octal makes a lot of sense in the right contexts. Allowing octal
literals is a Good Thing. And sticking letters into the middle of a
number doesn't make that much sense, so the leading-zero notation is a
decent choice.


No it isn't. You need something that is much more explicit. I know your 
C loyalty is showing here, but just admit it was a terrible choice in 
that language, even in 1972. And just as bad in 1989.


(I've used one language where the default base (radix, as it called it) 
was octal. But even there, when overriding it, the override mechanism was 
more obvious than the presence of a leading zero.)


It's all very well to argue that it's a suboptimal
choice; but you have to remember that you're arguing that point from
2018, and we have about thirty years of experience using Python. The
choice was most definitely not fundamentally wrong. Ten years ago, the
point was revisited, and a different choice made. That's all.


This gives a few different ways of writing hex and octal:

  https://rosettacode.org/wiki/Literals/Integer

The leading zero method for octal seems to have permeated a few other 
languages. While F#, among others, uses 0o, which in my browser looks 
like Oo12. (If, for some reason, you did want leading zeros, then the 
number would look like OoO12.)


Why do language designers perpetuate bad ideas? The reason for designing 
a new language is just so you can get rid of some of them!


--
bartc




Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-10 Thread bartc

On 11/05/2018 01:25, Marko Rauhamaa wrote:

Chris Angelico <ros...@gmail.com>:


Octal makes a lot of sense in the right contexts.


I think octal is a historical relic from a time when people weren't yet
comfortable with hexadecimal.


It's a relic from when machines had word sizes that were multiples of 
three bits, or were divided up on 3-bit boundaries.


--
bartc


Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-10 Thread bartc

On 10/05/2018 18:58, Skip Montanaro wrote:

I wonder why someone would take a feature generally agreed to be a
poorly designed feature of C, and incorporate it into a new language.


I think you might be looking at a decision made in the late 1980s through a
pair of glasses made in 2018.

As a C programmer back then I never had a problem with C's octal number
notation. People coming from C, C++ or Java to Python at that time would
certainly have understood that syntax. It's only in the past 15 years or so
that we've seen tons of people coming to Python as a first language for
whom leading zero notation would be unfamiliar.


I'm pretty sure I would have had exactly the same opinion in 1989.

Decimal numbers in code should reflect their usage in everyday life as 
much as possible.


And in everyday life, leading zeros do not change the base of a number 
so that it becomes octal. If my car odometer says 075300, it means I've 
done 75,300 miles or km, not 31424:


   mileages = ( # python 2
       215820,
       121090,
       075300,
       005105)

   for m in mileages:
       print m,"miles"

Output:

   215820 miles
   121090 miles
   31424 miles
   2629 miles
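
The Python 2 figures can be reproduced in Python 3 by treating the readings as text. This is only a sketch; the helper python2_value is made up to mimic how Python 2 read such literals.

```python
def python2_value(text):
    """Value Python 2 gave the literal: octal when it has a leading zero."""
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text)

for reading in ("215820", "121090", "075300", "005105"):
    # int(reading) ignores leading zeros, as an odometer reader would.
    print(reading, int(reading), python2_value(reading))
```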

This is Wrong, and would have been just as obviously wrong in 1989.

--
bartc


Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-10 Thread bartc

On 10/05/2018 19:51, Chris Angelico wrote:

On Fri, May 11, 2018 at 4:31 AM, bartc <b...@freeuk.com> wrote:

   2x100  (4)   Binary
   3x100  (9)   Ternary
   4x100  (16)  Quaternary
   5x100  (25)  etc
   6x100  (36)
   7x100  (49)
   8x100  (64)  Octal
   9x100  (81)
   ...   (Not implemented 11x to 15x, nor 10x or 16x)
   0x100  (256) Hex


YAGNI much? How often do you need a base-9 literal in your code??


I've used base-4 a couple of times, but not base 9 yet, excepting when 
toying with stuff. But you need to be able to print numbers in those 
bases too [not Python, or C]:


a := 3x22   # base-3
println a:"x9"  # displays  in base-9

It's interesting to see the patterns that arise when doing arithmetic in 
mixed bases.
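
In Python, int() already reads any base from 2 to 36, but printing in an arbitrary base needs a few lines. A sketch, with to_base and DIGITS as illustrative names rather than anything from the standard library:

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base(n, base):
    """Return the integer n written in the given base (2..36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be in 2..36")
    if n == 0:
        return "0"
    sign = "-" if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        n, remainder = divmod(n, base)   # peel off the lowest digit
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))

a = int("22", 3)        # the base-3 value from the example above
print(to_base(a, 9))    # the same value shown in base 9
```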


Anyway, those extra bases were easier to leave in than to exclude.

--
bartc


Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-10 Thread bartc

On 10/05/2018 18:03, Ian Kelly wrote:

On Thu, May 10, 2018 at 10:36 AM, bartc <b...@freeuk.com> wrote:

What, 0O100 instead of 0100? Yeah that's a big improvement...

Fortunately octal doesn't get used much.


The PEP discusses this:

"""
Proposed syntaxes included things like arbitrary radix prefixes, such
as 16r100 (256 in hexadecimal), and radix suffixes, similar to the
100h assembler-style suffix. The debate on whether the letter "O"
could be used for octal was intense -- an uppercase "O" looks
suspiciously similar to a zero in some fonts. Suggestions were made to
use a "c" (the second letter of "oCtal"), or even to use a "t" for
"ocTal" and an "n" for "biNary" to go along with the "x" for
"heXadecimal".

For the string % operator, "o" was already being used to denote octal.
Binary formatting is not being added to the % operator because PEP
3101 (Advanced String Formatting) already supports binary, %
formatting will be deprecated in the future.

At the end of the day, since uppercase "O" can look like a zero and
uppercase "B" can look like an 8, it was decided that these prefixes
should be lowercase only, but, like 'r' for raw string, that can be a
preference or style-guide issue.
"""

Personally I would have preferred the "t".




In my own [syntax] designs, for a long time I used:

  100H   (256) Hex
  100B   (4)   Binary

in addition to decimal. Then when I switched to 0x for hex (so that I 
could write 0xABC instead of needing to write 0ABCH with a leading 
zero), it was easy to extend that scheme:


  2x100  (4)   Binary
  3x100  (9)   Ternary
  4x100  (16)  Quaternary
  5x100  (25)  etc
  6x100  (36)
  7x100  (49)
  8x100  (64)  Octal
  9x100  (81)
  ...   (Not implemented 11x to 15x, nor 10x or 16x)
  0x100  (256) Hex

I think Ada does something similar for example 2#100#.

However I wouldn't be bothered at having to use for example OCT(377), 
HEX(FF), BIN(_) or even OCTAL(377), although it's a bit clunky. 
At least it will be obvious; more so than 0o100.


--
bartc


Re: Leading 0's syntax error in datetime.date module (Python 3.6)

2018-05-10 Thread bartc

On 10/05/2018 12:28, Skip Montanaro wrote:


This gave the following error:

Syntax Error: invalid token: C:\Users\Virgil Stokes\Desktop\Important
Notes_Files\CheckProcessingDate_02.py, line 7, pos 17
d0 = date(2018,02,01)



Note that this is a Python syntax error. It actually has nothing to do with
the datetime module. In Python before version 3, leading zeroes were how
you specified octal (base 8) numbers.


I wonder why someone would take a feature generally agreed to be a 
poorly designed feature of C, and incorporate it into a new language.


Especially one with a very different syntax that doesn't need to be 
backwards compatible.


That changed in Python 3. If you skim
the start of PEP 3127, you'll learn the new notation.


What, 0O100 instead of 0100? Yeah that's a big improvement...

Fortunately octal doesn't get used much.

--
Bartc


Re: seeking deeper (language theory) reason behind Python design choice

2018-05-10 Thread bartc

On 10/05/2018 09:09, Marko Rauhamaa wrote:

bartc <b...@freeuk.com>:

On 09/05/2018 06:44, Steven D'Aprano wrote:

But by the time 1.4 came around, Guido had settled on a clean separation
between statements and expressions as part of Python's design.

That separation has gradually weakened over the years,


Presumably it's non-existent now, as it seems you can type any
expression as a statement in its own right:

   "stmt"
   a + b*c
   p == 0


When typing in code (in various languages), I have a habit of typing
"..." at places that need to be implemented. For example:

 if count:
     ...
 else:
     do_something_smart()
     break

the idea being that "..." will surely trigger a syntax error if I forget
to address it.

I was mildly amused when Python happily executed such code. "..." is a
valid expression and thus, a valid statement.


I wondered what it meant, so I typed in:

   print (...)

and it displayed:

   Ellipsis

which wasn't very enlightening.
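
For what it's worth, ... in Python 3 is just a literal for the built-in singleton Ellipsis, so a body consisting of it is an ordinary expression statement. A small sketch (the function name not_written_yet is made up):

```python
def not_written_yet():
    ...   # evaluates the Ellipsis object and discards it, much like 'pass'

print(... is Ellipsis)      # the literal and the name refer to one object
print(repr(...))
print(not_written_yet())    # falls off the end, so the result is None
```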

--
bartc




Re: seeking deeper (language theory) reason behind Python design choice

2018-05-09 Thread bartc

On 09/05/2018 06:44, Steven D'Aprano wrote:

On Tue, 08 May 2018 22:48:52 -0500, Python wrote:




But by the time 1.4 came around, Guido had settled on a clean separation
between statements and expressions as part of Python's design.

That separation has gradually weakened over the years,


Presumably it's non-existent now, as it seems you can type any 
expression as a statement in its own right:


  "stmt"
  a + b*c
  p == 0

with the addition of the ternary if operator


If you mean that this is a kind of statement that can be incorporated 
into an expression, then just class it as an operator not a statement. 
(They are different in any case, for example statement-if doesn't return 
a value.)




and comprehensions,


And that one, although it's a more complex example.


It might simply be that Guido (like many people) considered assignment to
fundamentally be an imperative command, not an expression.


You design a language, you get to choose how it works.

(I allow assignments in expressions in my languages, but the two are 
different constructs: statement-assignment returns no value; 
expression-assignment does, although both use ":=" syntax for the 
assignment operator.


That's OK: ":=" doesn't really get mixed up with "=" for equality the 
way "=" and "==" would. So one answer would be to use something other 
than "=" for assignments within expressions.


I don't however allow augmented assignments in expressions, as it's not 
that useful, and it would be too much (and I could never get the 
precedences right). But normal assignment is handy.)


--
bartc


Re: Minmax tictactoe :c Cannot understand why this does not work

2018-04-25 Thread bartc

On 24/04/2018 23:57, fifii.ge...@gmail.com wrote:


movimientos = []
for i in range (n):
for j in range (n):

 .

auxb = 0



return movimientos[auxb]


What do you mean by 'does not work'?

With input of fila=1, and columna=1, I get an error in the last line 
above. Because movimientos is an empty list, but you are accessing the 
first element as movimientos[0], because auxb is zero.


I don't understand your program (the layout is fine BTW), but you need 
to debug the logic of it. (Being recursive doesn't help.)


Is an empty movimientos allowed at this point or not? Is auxb==0 a legal 
index here or not?
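
One way to make that question answer itself while debugging is to fail loudly as soon as the list turns out to be empty. A sketch only: pick_move is a hypothetical helper, not part of the posted program, and whether an empty list is legitimate is still for the author to decide.

```python
def pick_move(movimientos, auxb):
    """Return movimientos[auxb], with a clearer error when no moves exist."""
    if not movimientos:
        # Without this check, movimientos[0] raises a bare IndexError.
        raise ValueError("no candidate moves were generated for this board")
    return movimientos[auxb]

print(pick_move([(0, 1), (2, 2)], 0))
```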


(Tip: I used these lines in the input loop to simplify testing rather than 
having to type them in (as 2 and 2) each time:


fila = 1
columna = 1
)

--
bartc


Re: Issue with python365.chm on window 7

2018-04-24 Thread bartc

On 24/04/2018 20:03, Ethan Furman wrote:

[redirecting to list]

On 04/24/2018 09:42 AM, Erik Martinson wrote:

CHM files have a long history of security issues. Windows blocks them 
by default. To fix, right-click on the file and go

to properties. In the security section, click the box that says Unblock.

- Erik




I've long had issues with .chm files not working including this one.

Your fix worked (on Windows 7, you have to look at the General 
Properties tab, and the Unblock option is at the bottom. Clicking 
Unblock greys it out, then it's Apply or OK).


But it seems unfriendly to say the least for a HELP file of all things 
to just flatly not work like that without a word of explanation (or even 
"This file is blocked; do you want to unblock it?")



--
bartc


Re: Finding a text in raw data(size nearly 10GB) and Printing its memory address using python

2018-04-23 Thread bartc

On 23/04/2018 21:45, Stefan Ram wrote:

MRAB <pyt...@mrabarnett.plus.com> writes:

offset += search_length


   Or, offset += 1, to also find overlaps of that kind:

file = "eee"
word = "ee"

   . The above word "ee" occurs at position 0 and 1 in the file.

   My attempt:

#include <stdio.h>
#include <string.h>
int main( void )
{ FILE * const haystack = fopen( "filename.dmp", "r" );
   if( !haystack )goto end;
   char const * const needle = "bd:mongo:";
   int offset = 0;
   int const l =( int )strlen( needle );
   { int o[ l ]; /* VLA */
 for( int i=0; i < l; ++i )o[ i ]= -1;
 o[ 0 ]= 0;
 next: ;
 int const x = fgetc( haystack );
 if( x == EOF )goto out;
 ++offset;
 for( int i=0; i < l; ++ i )
 if( o[ i ]>= 0 )
 { char const ch = needle[ o[ i ] ];
   if( ch == x )++o[ i ];
   if( o[ i ]==( int )strlen( needle ))
   { printf( "found at %d\n", offset -( int )strlen( needle )); }}
 for( int i = l; i; --i )o[ i ]= o[ i - 1 ];
 o[ 0 ]= 0;
 goto next;
 out: fclose( haystack ); }
   end: ; }



Did you say that you teach programming?
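
Since the original question was about Python, here is a sketch of the same overlapping search done in chunks, so a 10GB file never has to be held in memory. The name find_all and the chunk size are assumptions for illustration, and the file must be opened in binary mode.

```python
import io

def find_all(stream, needle, chunk_size=1 << 20):
    """Yield the byte offset of every (possibly overlapping) match of needle."""
    if not needle:
        raise ValueError("needle must be non-empty")
    keep = len(needle) - 1   # tail kept so matches spanning two chunks are seen
    buf = b""
    base = 0                 # file offset of buf[0]
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        pos = buf.find(needle)
        while pos != -1:
            yield base + pos
            pos = buf.find(needle, pos + 1)   # +1 so overlaps are found too
        drop = max(len(buf) - keep, 0)
        base += drop
        buf = buf[drop:]

# The "eee"/"ee" case from above, with a tiny chunk size to force a boundary.
print(list(find_all(io.BytesIO(b"eee"), b"ee", chunk_size=2)))
```

For a real file this would be used as find_all(open(path, "rb"), b"bd:mongo:"), with path being whatever dump is searched.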

--
bartc


Re: Can't install Python

2018-04-21 Thread bartc

On 21/04/2018 22:36, janlydi...@gmail.com wrote:

Hi!

I installed Python and anaconda by following the instructions of the site, but 
when I open Pysos it is written in the shell:

'c: \ users \ lyjan \ miniconda3 \ python.exe' is not recognized as an internal 
command
or external, an executable program or a batch file.

The process failed to start (invalid command?). (1 =


I do not know what to do ... Thanks in advance



Try clicking the Start button and typing python.exe into the search box.

(Unless this is Windows 10, then try right-clicking Start and choose 
Search.)


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: # of Months between two dates

2018-04-06 Thread bartc

On 06/04/2018 07:16, Steven D'Aprano wrote:


- instead of counting days, with all the difficulty that
   causes, we could just count how many times the month
   changes;

- in which case, Jan 31 to Feb 1 is one month.
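
Counting month changes as suggested is a one-liner with datetime.date. A sketch, with months_between as an illustrative name; it deliberately ignores the day of the month.

```python
from datetime import date

def months_between(start, end):
    """Number of times the calendar month changes going from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)

print(months_between(date(2018, 1, 31), date(2018, 2, 1)))   # one day apart
print(months_between(date(2018, 1, 1), date(2018, 1, 31)))   # thirty days apart
```

Under this rule two dates a day apart can count as a month while two dates thirty days apart count as none, which is the trade-off being discussed.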


If you book airport parking in the UK, the charge period runs from the 
midnight before the drop-off time to the midnight following the pick-up 
time (so that the first and last days are a full 24 hours).


So parking for 24 hours from 2pm one day to 2pm the next, you are 
charged for 48 hours.


That suggests you will also be charged for two full days for the two 
minutes' parking between 23:59 and 00:01, but I haven't tried it to see 
what happens.


Agree with the messiness of dealing with months, but I think it's one of 
those quirks that make life more interesting, like imperial measure. I 
don't think we really want metric applied to time and to dates. (10-day 
weeks? I don't think so.)



--
bartc


Re: check if bytes is all nulls

2018-04-01 Thread bartc

On 01/04/2018 18:55, Arkadiusz Bulski wrote:

What would be the most performance efficient way of checking if a bytes is
all zeros? Currently its `key == b'\x00' * len(key)` however, because its
Python 2/3 compatible:


That doesn't seem too efficient, if you first have to construct a 
compatible object of all zeros.




sum(key) == 0 is invalid
key == bytes(len(key)) is invalid

I already considered precomputing the rhs value.
Length of key is unknown,


(So how can you precompute the table?)

 could be few bytes, could be megabytes.




How likely would be a block of all zeros? Or one that is all zeros but 
with a sparse smattering of non-zeros?


If not very, then just check the bytes one by one. If non-zero, you will 
know as soon as you see the first non-zero byte, possibly the first one.


def allzeros(a):
    for x in a:
        if x: return 0
    return 1
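
Two alternatives that avoid building a second object of the same length and that should behave the same on Python 2 and 3, since they only use bytes methods. This is a sketch with made-up function names, and the relative speed is not measured here.

```python
def all_zeros(key):
    """True if every byte of key is zero (also true for empty input)."""
    # Stripping NUL bytes from both ends leaves nothing only if all are NUL.
    return not key.strip(b"\x00")

def all_zeros_by_count(key):
    """Same check, done by counting NUL bytes."""
    return key.count(b"\x00") == len(key)

print(all_zeros(b"\x00" * 8), all_zeros(b"\x00\x01\x00"))
```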


--
bartc


Re: Python Developer Survey: Python 3 usage overtakes Python 2 usage

2018-03-31 Thread bartc

On 31/03/2018 16:58, Etienne Robillard wrote:



Le 2018-03-31 à 11:40, Michael Torrie a écrit :

On 03/31/2018 08:58 AM, Etienne Robillard wrote:

I was just wondering, could the fact that the Python community is
willing to discontinue using and developing Python 2 softwares, does
  that mean we are stopping to support standard computers and laptops
as well?

I've tried several times but I can't make sense of that paragraph.
Please give me a break. That was just a simple question. Besides, I 
really don't understand why the Python development community is dropping 
support for Python 2 unless for stopping to support standard computers 
altogether...



Furthermore, does it bother you to develop code primarly oriented
towards mobile devices in Python 3 while most of the world still
cannot afford theses expensive products?

Or this one.  What are you talking about?


Are you trolling? Do you understand that a modern mobile device 
typically require a Internet subscription and an additional subscription 
for the smart phone?


AIUI, a smartphone or tablet will still work as a small computer without 
an internet or phone connection (ie. without any of WiFi/GSM/3G/4G).


But a temporary WiFi link (eg. a free one at McDonald's) can be useful 
to download extra free apps then they can be used off-line.


Of course it might not be very popular without access to social media if 
that's the main purpose of the device.


--
bartc

--
https://mail.python.org/mailman/listinfo/python-list


Re: How to fill in a dictionary with key and value from a string?

2018-03-31 Thread bartc

On 30/03/2018 21:13, C W wrote:

Hello all,

I want to create a dictionary.

The keys are 26 lowercase letters. The values are 26 uppercase letters.

The output should look like:
{'a': 'A', 'b': 'B',...,'z':'Z' }



I know I can use string.ascii_lowercase and string.ascii_uppercase, but how
do I use it exactly?
I have tried the following to create the keys:
myDict = {}
for e in string.ascii_lowercase:
    myDict[e]=0


If the input string S is "cat" and the desired output is {'c':'C', 
'a':'A', 't':'T'}, then the loop might look like this:


   D = {}
   for c in S:
       D[c] = c.upper()

   print (D)

Output:

{'c': 'C', 'a': 'A', 't': 'T'}


But, how to fill in the values? Can I do myDict[0]='A', myDict[1]='B', and
so on?


Yes, but the result will be {0:'A', 1:'B',...} for which you don't need 
a dict; a list will do.
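
For the full a-to-z mapping asked about in the original question, a short sketch using the two string constants mentioned there (the variable names are illustrative only):

```python
import string

# Pair each lowercase letter with the uppercase letter at the same position.
letter_map = dict(zip(string.ascii_lowercase, string.ascii_uppercase))

# Equivalent form that derives each value from its key.
letter_map_alt = {c: c.upper() for c in string.ascii_lowercase}
```

Both forms build the same 26-entry dict; the second does not need ascii_uppercase at all.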


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-30 Thread bartc

On 27/03/2018 04:49, Richard Damon wrote:

On 3/26/18 8:46 AM, bartc wrote:


Hence my testing with CPython 3.6, rather than on something like PyPy 
which can give results that are meaningless. Because, for example, 
real code doesn't repeatedly execute the same pointless fragment 
millions of times. But a real context is too complicated to set up.


The bigger issue is that these sort of micro-measurements aren't 
actually that good at measuring real quantitative performance costs. 
They can often give qualitative indications, but the way modern 
computers work, processing environment is extremely important in 
performance, so these sorts of isolated measure can often be misleading. 
The problem is that if you measure operation a, and then measure 
operation b, if you think that doing a then b in the loop that you will 
get a time of a+b, you will quite often be significantly wrong, as cache 
performance can drastically affect things. Thus you really need to do 
performance testing as part of a practical sized exercise, not a micro 
one, in order to get a real measurement.


That might apply to native code, where timing behaviour of a complicated 
 chip like x86 might be unintuitive.


But my comments were specifically about byte-code executed with CPython. 
Then the behaviour is a level or two removed from the hardware and with 
slightly different characteristics.


(Since the program you are actually executing is the interpreter, not 
the Python program, which is merely data. And whatever aggressive 
optimisations are done to the interpreter code, they are not affected by 
the Python program being run.)


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-30 Thread bartc

On 26/03/2018 16:31, Chris Angelico wrote:

On Mon, Mar 26, 2018 at 11:46 PM, bartc <b...@freeuk.com> wrote:

On 26/03/2018 13:30, Richard Damon wrote:


On 3/26/18 6:31 AM, bartc wrote:




The purpose was to establish how such int("...") conversions compare in
overheads with actual arithmetic with the resulting numbers.


Of course if this was done in C with a version that had builtin bignum
ints or an aggressive enough optimizer (or a Python that did a similar level
of optimizations) this function would just test the speed of starting the
program, as it actually does nothing and can be optimized away.



Which is a nuisance. /You/ are trying to measure how long it takes to
perform a task, the compiler is demonstrating how long it takes to /not/
perform it! So it can be very unhelpful.


Yeah. It's so annoying that compilers work so hard to make your code
fast, when all you want to do is measure exactly how slow it is.
Compiler authors are stupid.


In some ways, yes they are. If they were in charge of Formula 1 pre-race 
speed trials, all cars would complete the circuit in 0.00 seconds with 
an average speed of infinity mph.


Because they can see that they all start and end at the same point so 
there is no reason to actually go around the track.


And in actual computer benchmarks, the compilers are too stupid to 
realise that this is not real code doing a useful task that is being 
done, and the whole thing is optimised away as being pointless.


Optimisation is a very bad idea on microbenchmarks if the results are 
going to be misleading.



--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-26 Thread bartc

On 26/03/2018 13:30, Richard Damon wrote:

On 3/26/18 6:31 AM, bartc wrote:


The purpose was to establish how such int("...") conversions compare 
in overheads with actual arithmetic with the resulting numbers.


Of course if this was done in C with a version that had builtin bignum 
ints or an aggressive enough optimizer (or a Python that did a similar 
level of optimizations) this function would just test the speed of 
starting the program, as it actually does nothing and can be optimized 
away.


Which is a nuisance. /You/ are trying to measure how long it takes to 
perform a task, the compiler is demonstrating how long it takes to /not/ 
perform it! So it can be very unhelpful.


Hence my testing with CPython 3.6, rather than on something like PyPy 
which can give results that are meaningless. Because, for example, real 
code doesn't repeatedly execute the same pointless fragment millions of 
times. But a real context is too complicated to set up.


Yes, something like this can be used to measure the base time to do
something, but the real question should be is that time significant 
compared to the other things that the program is doing. Making a 200x 
improvement on code that takes 1% of the execution time saves you 
0.995%, not normally worth it unless your program is currently running 
at 100.004% of the allowed (or acceptable) timing, if acceptable timing 
can even be defined that precisely.


I'm usually concerned with optimisation in a more general sense than a 
specific program.


Such as with a library function (where you don't know how it's going to 
be used); or with a particular byte-code in an interpreter (you don't 
know how often it will be encountered); or a generated code sequence in 
a compiler.


But even 200x improvement on something that takes 1% of the time can be 
worthwhile if it is just one of dozens of such improvements. Sometimes 
these small, incremental changes in performance can add up.


And even if it was just 1%, the aggregate savings across one million 
users of the program can be substantial, even if the individuals won't 
appreciate it. 1% extra battery life might be a handy five minutes for 
example.
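
The arithmetic in this exchange can be sketched as follows. The function name is hypothetical, and "thirty" is just an arbitrary stand-in for "dozens of such improvements":

```python
def time_saved(fraction, speedup):
    """Share of total run time saved when `fraction` of it becomes `speedup` times faster."""
    return fraction * (1 - 1.0 / speedup)

# A single 200x improvement on code that accounts for 1% of the run time.
one_fix = time_saved(0.01, 200)

# Thirty independent hot spots of 1% each, all improved by 200x.
many_fixes = 30 * one_fix

print("one fix saves   %.3f%%" % (100 * one_fix))
print("thirty fixes    %.2f%%" % (100 * many_fixes))
```

The first figure reproduces the 0.995% quoted above; the second shows how such small savings accumulate when there are many of them.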


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-26 Thread bartc

On 26/03/2018 03:35, Richard Damon wrote:

On 3/25/18 9:37 PM, bartc wrote:


So the overhead /can/ be substantial, and /can/ be significant 
compared with doing bignum calculations.


Of course, once initialised, C might be used a hundred times, then the 
overhead is less significant. But it is not small enough to just dismiss.


And my point is that writing a program to just add or multiply once two 
FIXED big long numbers (since they are in the source code, that seems 
fixed), a million times seems  unlikely (and even then the cost isn't 
that bad, since that sounds like a run once program).


Similar overheads occur when you use string=>int even on small numbers:

This code:

C = int("12345")
D = C+C  # or C*C; about the same results

takes 5 times as long (using my CPython 3.6.x on Windows) as:

C = 12345
D = C+C

Your arguments that this doesn't really matter would equally apply here.

Yet you don't see Python code full of 'int("43")' instead of just '43' 
on the basis that the extra overhead is not significant, as the program 
might be run only once.
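
A rough way to repeat this comparison is the standard timeit module. This is only a sketch: the loop count is an arbitrary choice, and the ratio will vary with machine and Python version, so no particular figure is claimed here.

```python
import timeit

# Time "convert from string, then add" against "use a literal, then add".
N = 1000000  # arbitrary repetition count for illustration
t_convert = timeit.timeit('C = int("12345"); D = C + C', number=N)
t_literal = timeit.timeit('C = 12345; D = C + C', number=N)

print("int('12345'): %.3f s" % t_convert)
print("literal:      %.3f s" % t_literal)
print("ratio:        %.1fx" % (t_convert / t_literal))
```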


A slightly worrying attitude in a language that has issues with 
performance, but no longer a surprising one.


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-26 Thread bartc

On 26/03/2018 10:34, Steven D'Aprano wrote:

On Mon, 26 Mar 2018 02:37:44 +0100, bartc wrote:



If I instead initialise C using 'C = int("288712...")', then timings
increase as follows:


Given that the original number given had 397 digits and has a bit length
of 1318, I must admit to some curiosity as to how exactly you managed to
cast it to a C int (32 bits on most platforms).

It is too big for an int, a long (64 bits), a long-long (128 bits) or
even a long-long-long-long-long-long-long-long-long-long-long-long-long-
long-long-long (1024 bits), if such a thing even exists.


So what exactly did you do?


I'm not sure why you think the language C came into it. I did this:

def fn():
    C = int(
    "28871482380507712126714295971303939919776094592797"
    "22700926516024197432303799152733116328983144639225"
    "94197780311092934965557841894944174093380561511397"
    "4215424169339729054237110027510420801349667317"
    "5515285922696291677532547505856101949404200039"
    "90443211677661994962953925045269871932907037356403"
    "22737012784538991261203092448414947289768854060249"
    "76768122077071687938121709811322297802059565867")

    #   C = 2887148238050771212671429... [truncated for this post]

    D=C+C

for i in range(100):
    fn()

The purpose was to establish how such int("...") conversions compare in 
overheads with actual arithmetic with the resulting numbers.


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-25 Thread bartc

On 26/03/2018 00:27, Richard Damon wrote:

On 3/25/18 8:32 AM, bartc wrote:


Using CPython on my machine, doing a string to int conversion that 
specific number took 200 times as long as doing a normal assignment.


That conversion took 4 microseconds.

Not significant if it's only done once. But it might be executed a 
million times.




The other half of that thought is how does the 4 microseconds to create 
the constant compare to the operations USING that number. My guess is 
that for most things the usage will swamp the initialization, even if 
that is somewhat inefficient.


Calling a function that sets up C using 'C = 288714...' on one line, and 
that then calculates D=C+C, takes 0.12 seconds to call 100 times.


To do D=C*C, takes 2.2 seconds (I've subtracted the function call 
overhead of 0.25 seconds; there might not be any function call).


If I instead initialise C using 'C = int("288712...")', then timings 
increase as follows:


0.12  =>   3.7 seconds
2.2   =>   5.9 seconds

So the overhead /can/ be substantial, and /can/ be significant compared 
with doing bignum calculations.


Of course, once initialised, C might be used a hundred times, then the 
overhead is less significant. But it is not small enough to just dismiss.


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-25 Thread bartc

On 25/03/2018 16:47, Grant Edwards wrote:

On 2018-03-25, bartc <b...@freeuk.com> wrote:

On 25/03/2018 02:47, Steven D'Aprano wrote:


The Original Poster (OP) is concerned about saving, what, a tenth of a
microsecond in total? Hardly seems worth the effort, especially if you're
going to end up with something even slower.


Using CPython on my machine, doing a string to int conversion that
specific number took 200 times as long as doing a normal assignment.

That conversion took 4 microseconds.

Not significant if it's only done once. But it might be executed a
million times.


Which adds up to 4 seconds.

Still not worth spending hours (or even a few minutes) to optimize.


If this is a program that will only ever be used by one person, and that 
person will only ever run it once, and you know that bit is executed 1M 
times and not 50M, then you might be right.


Remember that 4 seconds might be on top of dozens of other things that 
don't appear to be worth optimising.


The chances are however that a program in development might be run 
hundreds or thousands of times.


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-25 Thread bartc

On 25/03/2018 15:53, Joe Pfeiffer wrote:

ast <n...@gmail.com> writes:



C = int(
"28871482380507712126714295971303939919776094592797"
"22700926516024197432303799152733116328983144639225"
"94197780311092934965557841894944174093380561511397"
"4215424169339729054237110027510420801349667317"
"5515285922696291677532547505856101949404200039"
"90443211677661994962953925045269871932907037356403"
"22737012784538991261203092448414947289768854060249"
"76768122077071687938121709811322297802059565867")



After following the thread for a while...  you will, of course, simply
have to do a string to int conversion no matter what approach you take
to writing it.


What, even when you write this:

  C = 12

?


The number is a string of digits; it has to be converted
to the internal representation.


Which is usually done by a compiler, and it will only do it once not 
each time the line is encountered in a running program. (Although a 
compiler will also do the conversion when the line is never executed!)



Even if you write

C = 
28871482380507712126714295971303939919776094592797227009265160241974323037991527331163289831446392259419778031109293496555784189494417409338056151139742154241693397290542371100275104208013496673175515285922696291677532547505856101949404200039904432116776619949629539250452698719329070373564032273701278453899126120309244841494728976885406024976768122077071687938121709811322297802059565867

the conversion happens.


But not at runtime.
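
One way to see the difference in CPython is to look at the constants stored in the compiled function. This is a sketch under the assumption that CPython is used (co_consts is an implementation detail); the function names are made up, and only the first 50 digits of the number are used to keep it short:

```python
def literal():
    C = 28871482380507712126714295971303939919776094592797
    return C

def converted():
    C = int("28871482380507712126714295971303939919776094592797")
    return C

# The compiled function already holds the finished int object...
print(literal.__code__.co_consts)
# ...whereas this one holds only the digit string, converted on every call.
print(converted.__code__.co_consts)
```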

--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-25 Thread bartc

On 25/03/2018 15:01, Christian Gollwitzer wrote:

Am 25.03.18 um 14:32 schrieb bartc:
Using CPython on my machine, doing a string to int conversion that 
specific number took 200 times as long as doing a normal assignment.


That conversion took 4 microseconds.

Not significant if it's only done once. But it might be executed a 
million times.


Honestly, why should it be executed a million times?


Because it's inside a function that is called a million times?

 Do you have a
million different 400 digit numbers as constants in your code? If so, I 
suggest to store them in a database file accompanied with the code.


If there are only a few different ones, then don't do the conversion a million 
times. Convert them at module initialization and assign them to a global 
variable.


That's just another workaround. You don't really want the global 
namespace polluted with names that belong inside functions. And you 
might not want to do all that initialisation on behalf of hundreds of 
functions that may or may not be called. Neither do you want the code at 
module level to be cluttered with all these giant constants.


The real problem is in writing a very long constant that doesn't 
comfortably fit on one screen line, and where the editor used doesn't 
offer any help (in displaying on multiple lines) and neither does the 
language.


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Entering a very large number

2018-03-25 Thread bartc

On 25/03/2018 02:47, Steven D'Aprano wrote:

On Sun, 25 Mar 2018 00:05:56 +0100, Peter J. Holzer wrote:

[...]

yes, good idea


Not if you want to avoid that string to int conversion (as you stated).

That is still there, but in addition you now split the string into a
list and then join the list into a different string.


I'm glad I wasn't the only one who spotted that.

There's something very curious about somebody worried about efficiency
choosing a *less* efficient solution than what they started with. To
quote W.A. Wulf:

"More computing sins are committed in the name of efficiency (without
necessarily achieving it) than for any other single reason — including
blind stupidity."

As Donald Knuth observed:

"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil."

The Original Poster (OP) is concerned about saving, what, a tenth of a
microsecond in total? Hardly seems worth the effort, especially if you're
going to end up with something even slower.


Using CPython on my machine, doing a string to int conversion that 
specific number took 200 times as long as doing a normal assignment.


That conversion took 4 microseconds.

Not significant if it's only done once. But it might be executed a 
million times.


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Thank you Python community!

2018-03-19 Thread bartc

On 19/03/2018 16:08, Etienne Robillard wrote:

You guys just made me realize something very obvious. :-)

I'm in the process right now of watching the excellent documentary named 
"Drugs Inc." on Netflix and I'm basically stunned and deeply concerned 
about the major opioid epidemic in the US.


I would like to thank you guys sincerely for helping a lot of people to 
stay clean, and focus on programming high-level stuff in Python instead 
of doing some really nasty drugs.


I'm also wondering, could we exploit this strategy even further to help 
people willing to stop doing drugs by teaching them some stuff in Python?


You either code in Python, or you're forced to take drugs? Got it.

The trick I think is to let people use programming as a way to socialize 
more in order to stimulate or distract their minds while keeping them 
away from drugs.


I've often wondered what the guys who invented C (around 1970) must have 
been smoking to have come up with some of those ideas.



--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Enumerating all 3-tuples

2018-03-10 Thread bartc

On 10/03/2018 20:06, Ben Bacarisse wrote:

bartc <b...@freeuk.com> writes:


[repost as original seems to have gone to email; my newsreader has
somehow acquired a 'Reply' button where 'Followup' normally goes.]


[I thought it was intended but my reply bounced.]


On 10/03/2018 14:23, Ben Bacarisse wrote:

Ben Bacarisse <ben.use...@bsb.me.uk> writes:



Off topic: I knocked up this Haskell version as a proof-of-concept:

import Data.List

pn n l = pn' n (map (:[]) l)
  where pn' n lists | n == 1 = lists
                    | otherwise = diag (pn' (n-1) lists) lists
        diag l1 l2 = zipWith (++) (concat (inits l1))
                                  (concat (map reverse (inits l2)))




What's the output? (And what's the input; how do you invoke pn, if
that's how it's done?)


You pass a number and a list which should probably be infinite like
[1..].  You'd better take only a few of the resulting elements then:

*Main> let triples = pn 3 [1..]
*Main> take 20 triples
[[1,1,1],[1,1,2],[1,2,1],[1,1,3],[1,2,2],[2,1,1],[1,1,4],[1,2,3],[2,1,2],[1,3,1],[1,1,5],[1,2,4],[2,1,3],[1,3,2],[2,2,1],[1,1,6],[1,2,5],[2,1,4],[1,3,3],[2,2,2]]

or you can index the list to look at particular elements:

*Main> triples !! 1000
[70,6,1628]

but, as I've said, the order of the results is not the usual one (except
for pairs).


OK. I ran it like this:

 main = print (take 20 (pn 3 [1..]))

But I'm trying to understand the patterns in the sequence. If I use:

 main = print (take 50 (pn 2 [1..]))

then group the results into sets of 1, 2, 3, etc pairs, showing each 
group on a new line, then this gives sequences which are equivalent to 
the diagonals of the OP's 2D grid. (Except they don't alternate in 
direction; is that what you mean?)


I'll have to study the pn 3 version some more. (pn 10 gives an 
interesting view of it too.)
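
For readers following along in Python, here is a rough port of the Haskell above. It is a sketch, not Ben's code: it assumes Python 3 (where zip is lazy), and `make_source` is a hypothetical callable that returns a fresh iterator each time, standing in for the lazy list [1..].

```python
from itertools import count, islice

def diag(l1, l2):
    """Pair up two (possibly infinite) iterables of lists along anti-diagonals."""
    seen1, seen2 = [], []
    for a, b in zip(l1, l2):
        seen1.append(a)
        seen2.append(b)
        k = len(seen1)
        # Walk the prefix of l1 forwards and the prefix of l2 backwards.
        for i in range(k):
            yield seen1[i] + seen2[k - 1 - i]

def pn(n, make_source):
    """Yield n-element lists drawn from the iterable returned by make_source()."""
    singletons = ([x] for x in make_source())
    if n == 1:
        return singletons
    return diag(pn(n - 1, make_source), singletons)

first_pairs = list(islice(pn(2, lambda: count(1)), 6))
first_triples = list(islice(pn(3, lambda: count(1)), 6))
print(first_pairs)    # [[1, 1], [1, 2], [2, 1], [1, 3], [2, 2], [3, 1]]
print(first_triples)  # [[1, 1, 1], [1, 1, 2], [1, 2, 1], [1, 1, 3], [1, 2, 2], [2, 1, 1]]
```

Each pass of the outer loop in diag emits one group of 1, 2, 3, ... results, which corresponds to the grouping into diagonals described above, always traversed in the same direction.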


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Enumerating all 3-tuples

2018-03-10 Thread bartc
[repost as original seems to have gone to email; my newsreader has 
somehow acquired a 'Reply' button where 'Followup' normally goes.]


On 10/03/2018 14:23, Ben Bacarisse wrote:

Ben Bacarisse <ben.use...@bsb.me.uk> writes:



Off topic: I knocked up this Haskell version as a proof-of-concept:

   import Data.List

   pn n l = pn' n (map (:[]) l)
     where pn' n lists | n == 1 = lists
                       | otherwise = diag (pn' (n-1) lists) lists
           diag l1 l2 = zipWith (++) (concat (inits l1))
                                     (concat (map reverse (inits l2)))

Notes:

map (:[]) l turns [1, 2, 3, ...] into [[1], [2], [3], ...]

inits gives the list of initial segments of l.  I.e. (inits "abc") is
["", "a", "ab", "abc"].

concat joins a list of lists into one list.

zipWith (++) l1 l2 makes a list by pair-wise appending the elements of
l1 and l2.



What's the output? (And what's the input; how do you invoke pn, if 
that's how it's done?)



--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Enumerating all 3-tuples

2018-03-09 Thread bartc

On 10/03/2018 01:13, Steven D'Aprano wrote:

I am trying to enumerate all the three-tuples (x, y, z) where each of x,
y, z can range from 1 to ∞ (infinity).

This is clearly unhelpful:

for x in itertools.count(1):
    for y in itertools.count(1):
        for z in itertools.count(1):
            print(x, y, z)

as it never advances beyond x=1, y=1 since the innermost loop never
finishes.

Georg Cantor to the rescue! (Well, almost...)

https://en.wikipedia.org/wiki/Pairing_function

The Russian mathematician Cantor came up with a *pairing function* that
encodes a pair of integers into a single one. For example, he maps the
coordinate pairs to integers as follows:

1,1  ->  1
2,1  ->  2
1,2  ->  3
3,1  ->  4
2,2  ->  5

and so forth. He does this by writing out the coordinates in a grid:

1,1  1,2  1,3  1,4  ...
2,1  2,2  2,3  2,4  ...
3,1  3,2  3,3  3,4  ...
4,1  4,2  4,3  4,4  ...
...

...

But I've stared at this for an hour and I can't see how to extend the
result to three coordinates. I can lay out a grid in the order I want:

1,1,1   1,1,2   1,1,3   1,1,4   ...
2,1,1   2,1,2   2,1,3   2,1,4   ...
1,2,1   1,2,2   1,2,3   1,2,4   ...
3,1,1   3,1,2   3,1,3   3,1,4   ...
2,2,1   2,2,2   2,2,3   2,2,4   ...
...



I can't see the patterns here that I can see in the 2-D grid (where the 
first number in each pair in the n'th row is n, and the second number in 
the n'th column is n).


Maybe it needs to be 3-D? (Eg if the 3rd number in the triple is the 
Plane number, then plane 1 looks like:


  1,1,1   1,2,1   1,3,1
  2,1,1   2,2,1   2,3,1
  3,1,1   3,2,1   3,3,1 ...
  ...

But whether that has an equivalent traversal path like the diagonals of 
the 2-D, I don't know. I'm just guessing.)
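
One possible 3-D analogue of the diagonals, offered only as a sketch: visit the triples in "shells" of constant x + y + z, the same way the 2-D diagonals are lines of constant x + y. This is not the ordering of the grid quoted above, just one traversal that reaches every triple after finitely many steps. The function name is made up for illustration.

```python
from itertools import count, islice

def triples():
    """Yield every (x, y, z) with x, y, z >= 1, ordered by the sum x + y + z."""
    for total in count(3):                 # smallest possible sum is 1 + 1 + 1
        for x in range(1, total - 1):      # leave at least 1 each for y and z
            for y in range(1, total - x):  # leave at least 1 for z
                yield (x, y, total - x - y)

first_ten = list(islice(triples(), 10))
print(first_ten)
```

Every shell is finite, so unlike the nested itertools.count loops the generator always moves on to larger values of x and y.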


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Console

2018-03-07 Thread bartc

On 07/03/2018 15:34, Wolfgang Maier wrote:

On 03/07/2018 03:41 PM, Jeremy Jamar St. Julien wrote:
I had a problem when trying to start the python GUI. It said there 
was a subprocess startup error. I was told to start IDLE in a console 
with idlelib and see what python binary I was running IDLE with. I'm 
using windows 10 and I guess console refers to the terminal window and 
I have no idea what they meant by "the binary it's running with"




The simplest way to open the console on Windows is to press Win+R (the 
Windows key together with the r key that is). In the run dialog that 
appears then, type cmd and the console window should appear.


With Windows 10 all the useful apps that are now hard to find are listed 
by right-clicking the Start button.


The console one is called Command Prompt.

--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: RFC: Proposal: Deterministic Object Destruction

2018-03-05 Thread bartc

On 05/03/2018 13:58, Ooomzay wrote:

On Monday, 5 March 2018 11:24:37 UTC, Chris Angelico  wrote:

On Mon, Mar 5, 2018 at 10:09 PM, Ooomzay wrote:

Here is an example of a composite resource using RAII:-

class RAIIFileAccess():
    def __init__(self, fname):
        print("%s Opened" % fname)
    def __del__(self):
        print("%s Closed" % fname)

class A():
    def __init__(self):
        self.res = RAIIFileAccess("a")

class B():
    def __init__(self):
        self.res = RAIIFileAccess("b")

class C():
    def __init__(self):
        self.a = A()
        self.b = B()

def main():
    c = C()

Under this PEP this is all that is needed to guarantee that the files "a"
and "b" are closed on exit from main or after any exception has been handled.


Okay. And all your PEP needs is for reference count semantics, right?
Okay. I'm going to run this in CPython, with reference semantics. You
guarantee that those files will be closed after an exception is
handled? Right.


>>> def main():
...     c = C()
...     c.do_stuff()
...
>>> main()
a Opened
b Opened
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in main
AttributeError: 'C' object has no attribute 'do_stuff'


  
Uhh I'm not seeing any messages about the files getting closed.


Then that is indeed a challenge. From CPython back in 2.6 days up to 
Python36-32 what I see is:-

a Opened
b Opened
Traceback (most recent call last):
...
AttributeError: 'C' object has no attribute 'dostuff'
a Closed
b Closed


Maybe exceptions aren't as easy to handle as you think?


Well there is a general issue with exceptions owing to the ease
with which one can create cycles that may catch out newbs. But
that is not the case here.


Or maybe you
just haven't tried any of this (which is obvious from the bug in your
code


Or maybe I just made a typo when simplifying my test case and failed to retest?

Here is my fixed case, if someone else could try it in CPython and report back 
that would be interesting:-

class RAIIFileAccess():
    def __init__(self, fname):
        print("%s Opened" % fname)
        self.fname = fname

    def __del__(self):
        print("%s Closed" % self.fname)

class A():
    def __init__(self):
        self.res = RAIIFileAccess("a")

class B():
    def __init__(self):
        self.res = RAIIFileAccess("b")

class C():
    def __init__(self):
        self.a = A()
        self.b = B()

def main():
    c = C()
    c.dostuff()

main()


I get A and B closed messages when running on CPython 2.7 and 3.6, with 
the code run from a .py file. (I never use interactive mode.)


But not when running on a PyPy version of 2.7 (however that is not CPython).
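
For comparison, here is a sketch of the "with" idiom mentioned elsewhere in this thread, which does not depend on reference counting and so should behave the same on CPython and PyPy. The class is a made-up stand-in that records events in a list instead of touching real files:

```python
class FileAccess(object):
    """Stand-in for a resource; it only records open/close events."""
    def __init__(self, fname, log):
        self.fname, self.log = fname, log

    def __enter__(self):
        self.log.append("%s Opened" % self.fname)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append("%s Closed" % self.fname)
        return False  # do not swallow the exception

log = []
try:
    with FileAccess("a", log) as a, FileAccess("b", log) as b:
        a.dostuff()  # deliberately missing method, as in the test case above
except AttributeError:
    log.append("error handled")

print(log)
```

The close events are recorded before the exception reaches the handler, in reverse order of opening.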

--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: RFC: Proposal: Deterministic Object Destruction

2018-03-04 Thread bartc

On 04/03/2018 14:11, Ooomzay wrote:


Well I see a lot of posts that indicate peeps here are more comfortable with
the "with" idiom than the RAII idiom but I have not yet seen a single
linguistic problem or breakage.

As it happens I have used RAII extensively with CPython to manage a debugging 
environment with complex external resources that need managing very efficiently.


I have a bit of a problem with features normally used for the housekeeping 
of a language's data structures being roped in to control external 
resources that it knows nothing about.


Which also means that if X is a variable with the sole reference to some 
external resource, then a mere:


   X = 0

will close, destroy, or discard that resource. If the resource is 
non-trivial, then it perhaps deserves a non-trivial piece of code to 
deal with it when it's no longer needed.


It further means that when you did want to discard an expensive 
resource, then X going out of scope, calling del X or whatever, will not 
work if a copy of X still exists somewhere.
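
Both points can be illustrated with a small sketch. It assumes CPython's reference counting (other implementations may run __del__ later or not at all), and the class is a hypothetical stand-in that just records when it is released:

```python
class Resource(object):
    """Stand-in for an external resource released from __del__."""
    def __init__(self, log):
        self.log = log

    def __del__(self):
        self.log.append("released")

log = []
X = Resource(log)
Y = X                     # a second reference to the same object
X = 0                     # rebinding X alone does not release the resource
after_first = list(log)
Y = 0                     # dropping the last reference does (on CPython)
after_second = list(log)

print(after_first, after_second)
```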


--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: RFC: Proposal: Deterministic Object Destruction

2018-03-02 Thread bartc

On 02/03/2018 08:15, Paul Rubin wrote:


If someone says "but
limited memory", consider that MicroPython runs on the BBC Micro:bit
board which has 16k of ram, and it uses gc.


The specs say it also has 256KB of flash memory (ie. 'ROM'), so I 
suspect much of the program code resides there.



--
bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: Functions unnecessarily called in Python/pylifecycle.c:_Py_InitializeCore() ?

2018-03-01 Thread bartc

On 01/03/2018 09:57, Thomas Nyberg wrote:

Hello,

I was playing around with cpython and noticed the following. The
`_PyFrame_Init()` and `PyByteArray_Init()` functions are called in these
two locations:


https://github.com/python/cpython/blob/master/Python/pylifecycle.c#L693-L694

https://github.com/python/cpython/blob/master/Python/pylifecycle.c#L699-L700

But here are their function definitions and they appear to do nothing:


https://github.com/python/cpython/blob/master/Objects/frameobject.c#L555-L561

https://github.com/python/cpython/blob/master/Objects/bytearrayobject.c#L24-L28

I can understand leaving the functions in the source for
backwards-compatibility, but why are they still being called in
`_Py_InitializeCore()`? Seems like it just adds noise for those new to
the cpython internals. Is there some consistency doc that requires this
or something?


If they're only called once, then it probably doesn't matter too much in 
terms of harming performance.


As for leaving them in, there might be a number of reasons. One, if one 
day some special initialisation does need to be done, then this gives a 
place to put it.


I quite often have an initialisation routine for a module, that 
sometimes ends up empty, but I keep it in anyway as often things can get 
added back.


(Old CPython source I have does do something in those functions. For 
example:


int _PyFrame_Init()
{
    builtin_object = PyUnicode_InternFromString("__builtins__");
    if (builtin_object == NULL)
        return 0;
    return 1;
}

)

--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: How to make Python run as fast (or faster) than Julia

2018-02-27 Thread bartc

On 27/02/2018 02:27, Chris Angelico wrote:

On Tue, Feb 27, 2018 at 12:57 PM, bartc <b...@freeuk.com> wrote:

On 27/02/2018 00:35, Chris Angelico wrote:



Anyway, even this pure Python version can deliver pseudo random numbers at
some 200,000 per second, while the built-in generator does 450,000 per
second, so it's not bad going.


The built-in generator is using a completely different algorithm
though, so rate of generation isn't the entire story. How long is the
period of the one you're using? (How long before it loops?)


I believe it's 5*2**1320480*(2**64-1) according to the author's comment.

I haven't tested that.

(By looping I understand that to mean, before the same sequence starts 
again. Because as the numbers are 64 bits, individual numbers will 
inevitably be repeated from time to time.)
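
The distinction between a repeated value and a repeated sequence can be shown with a toy generator. This is purely an illustration and has nothing to do with the generator under discussion: it is a small linear congruential generator with a 16-bit state that outputs only its top 4 bits, with constants picked just for the demo.

```python
def toy_lcg(seed):
    """Toy generator: 16-bit internal state, but only the top 4 bits are output."""
    state = seed
    while True:
        state = (state * 5 + 1) % 65536
        yield state, state >> 12

def period_and_first_repeat(seed=1):
    """Return (steps until the state sequence restarts, step of first repeated output)."""
    seen_outputs = set()
    first_repeat = None
    first_state = None
    for step, (state, out) in enumerate(toy_lcg(seed), 1):
        if first_repeat is None and out in seen_outputs:
            first_repeat = step
        seen_outputs.add(out)
        if first_state is None:
            first_state = state
        elif state == first_state:
            return step - 1, first_repeat

period, first_repeat = period_and_first_repeat()
print("period:", period, "first repeated output at step:", first_repeat)
```

An individual output value recurs almost immediately, while the sequence as a whole only starts over once the internal state has cycled.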



--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 27/02/2018 00:35, Chris Angelico wrote:

On Tue, Feb 27, 2018 at 11:17 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:

On Tue, 27 Feb 2018 02:09:53 +1100, Chris Angelico wrote:


You're still reimplementing the C code in Python, which is inefficient.
Have you considered going back to the *actual algorithm* and
implementing that idiomatically in Python? I think you'll find that (a)
the code is more readable, and (b) the run time is much lower.


Chris, I think this is incredibly optimistic, if not naive. We're talking
about a PRNG by Marsaglia, so my guess is that the "original algorithm"
*is* the C code. Or possibly Fortran.

Even if not, even if there is actually a language-neutral algorithm, it's
a PRNG which means it's going to be filled with bit-twiddling and number-
crunching operations. Pure Python without a JIT is never going to be
competitive with C or Fortran for code like that.



I may have been a little unclear. It's highly unlikely that the run
time of the properly-implemented Python code will be lower than the
original C or Fortran. But it most certainly CAN be more efficient
than the Python reimplementation of the C implementation, which would
narrow the gap considerably. Implementing Python idiotically instead
of idiomatically gives you suboptimal performance.


Nonsense. You shouldn't need to care about such things. An algorithm is 
an algorithm. And the better ones should be easily written in any language.


This particular one, which was of interest because the calculations 
tended to overflow into undesirable bignums, is just a bunch of 
calculations. 'Bit-twiddling'.


Anyway, even this pure Python version can deliver pseudo random numbers 
at some 200,000 per second, while the built-in generator does 450,000 
per second, so it's not bad going.


Of course, the C version will generate them at over 100 million per second.

--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 20:27, bartc wrote:

On 26/02/2018 19:50, Chris Angelico wrote:

On Tue, Feb 27, 2018 at 6:37 AM, Rick Johnson



So what? Latency is latency. And whether it occurs over the
course of one heavily recursive algorithm that constitutes
the depth and breadth of an entire program (a la fib()), or
it is the incremental cumulative consequence of the entire
program execution, the fact remains that function call
overhead contributes to a significant portion of the latency
inherent in some trivial, and *ALL* non-trivial, modern
software.


[Sorry, the bit of your (Chris') post I replied to got chopped by 
mistake. Here is my post again with the right context:]


CA:
> By saying "the fact remains", you're handwaving away the work of
> actually measuring that function call overhead is "significant". Can
> you show me those numbers? Steve's point is that it is NOT
> significant, because non-toy functions have non-trivial bodies. If you
> wish to disagree, you have to demonstrate that the function call is
> *itself* costly, even when there's a real body to it.
>

Take the last bit of Python I posted, which was that RNG test.

It uses this function:

  def i64(x): return x & 0xFFFFFFFFFFFFFFFF

This is a small function, but it can't be a toy one as it was suggested 
by Ned Batchelder. And the test program wasn't at all recursive.



Running the program with N=10_000_000 took 46 seconds.

Replacing the i64() call with '&m' (where m is a global set to 
0xFFFFFFFFFFFFFFFF), it took 38 seconds.


So putting that fix into a function, which is convenient for coding, 
maintenance and readability, cost 20% in runtime.


Going the other way, having i64() call another function i64a() which 
does the work, common when using wrapper functions, it took 60 seconds. 
A 30% slowdown, even though most of the work is doing numeric shifts and 
adds.


So function calls do seem to be expensive operations in CPython, and any 
benchmarks which highlight that fact should be welcomed.


(Note that none of this seems to affect PyPy, which ran at the same 
speed with i64(), without it, or with both i64() and i64a().)
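
For anyone wanting to repeat this kind of measurement, here is a minimal sketch using timeit. It is not the original test program: it times only the xcng update step from the posted code, and the names MASK, i64_wrapped, run_inline, run_call and run_wrapped are made up for illustration. Absolute timings will vary with machine and Python version.

```python
import timeit

MASK = 0xFFFFFFFFFFFFFFFF

def i64(x):
    return x & MASK

def i64a(x):
    return x & MASK

def i64_wrapped(x):
    # Wrapper that only forwards to i64a(), adding one more call per step.
    return i64a(x)

def run_inline(n):
    x = 12367890123456
    for _ in range(n):
        x = (6906969069 * x + 123) & MASK
    return x

def run_call(n):
    x = 12367890123456
    for _ in range(n):
        x = i64(6906969069 * x + 123)
    return x

def run_wrapped(n):
    x = 12367890123456
    for _ in range(n):
        x = i64_wrapped(6906969069 * x + 123)
    return x

n = 100_000
for name, fn in (("inline", run_inline), ("call", run_call), ("wrapped", run_wrapped)):
    # Take the best of three runs to reduce noise from other processes.
    seconds = min(timeit.repeat(lambda: fn(n), number=1, repeat=3))
    print(name, round(seconds, 4), "seconds")
```

All three variants compute the same value, so any difference in the timings comes from how the masking is invoked.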


--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 19:50, Chris Angelico wrote:

On Tue, Feb 27, 2018 at 6:37 AM, Rick Johnson



So what? Latency is latency. And whether it occurs over the
course of one heavily recursive algorithm that constitutes
the depth and breadth of an entire program (a la fib()), or
it is the incremental cumulative consequence of the entire
program execution, the fact remains that function call
overhead contributes to a significant portion of the latency
inherent in some trivial, and *ALL* non-trivial, modern
software.


Take the last bit of Python I posted, which was that RNG test.

It uses this function:

  def i64(x): return x & 0xFFFFFFFFFFFFFFFF

This is a small function, but it can't be a toy one as it was suggested 
by Ned Batchelder. And the test program wasn't at all recursive.



Running the program with N=10_000_000 took 46 seconds.

Replacing the i64() call with '&m' (where m is a global set to 
0xFFFFFFFFFFFFFFFF), it took 38 seconds.


So putting that fix into a function, which is convenient for coding, 
maintenance and readability, cost 20% in runtime.


Going the other way, having i64() call another function i64a() which 
does the work, common when using wrapper functions, it took 60 seconds. 
A 30% slowdown, even though most of the work is doing numeric shifts and 
adds.


So function calls do seem to be expensive operations in CPython, and any 
benchmarks which highlight that fact should be welcomed.


(Note that none of this seems to affect PyPy, which ran at the same 
speed with i64(), without it, or with both i64() and i64a().)


--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 17:05, Ben Bacarisse wrote:

bartc <b...@freeuk.com> writes:


A C version is given below. (One I may have messed around with, which
I'm not sure works properly. For an original, google for Marsaglia and
KISS64 or SUPRKISS64.)


The version I know uses unsigned integers.  Did you change them to signed?


Yeah, I don't know what I was doing with that version.


For a Python version, go back to the original C and work from there.


The original C makes confusing use of macros and comma operators.

A version without macros or comma expressions is here (tweaked from 
generated C):


  https://pastebin.com/raw/k4jFK5TN

This runs the 1-billion loop in 6 seconds.

--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 15:09, Chris Angelico wrote:

On Tue, Feb 27, 2018 at 2:02 AM, bartc <b...@freeuk.com> wrote:

On 26/02/2018 14:04, bartc wrote:


On 26/02/2018 13:42, Ned Batchelder wrote:




Well, once you notice that the Python code had N=1e5, and the C code had
N=1e9 :)   If you want to experiment, with N=1e5, the final number should
be 5255210926702073855.



OK, I'll try that.



I have that Python version working now. It's necessary to apply that masking
function to wherever numbers can get bigger.

I don't know how long a 1-billion loop will take, but a 10-million loop took
46 seconds on Python 3.6, and 21 seconds on PyPy 2.7 from a couple of years
ago. (And on Windows, which has a somewhat slower CPython than Linux.)


You're still reimplementing the C code in Python, which is
inefficient. Have you considered going back to the *actual algorithm*
and implementing that idiomatically in Python? I think you'll find
that (a) the code is more readable, and (b) the run time is much
lower.


Do you really think that?

The algorithm seems to require this sequence of calculations to be gone 
through. Otherwise anything more efficient can also be done in any language.


So how would idiomatic Python help, by seeing if such a routine already 
exists in some library? That wouldn't be playing the game.


If it helps, I remember playing with a version in Algol 68 Genie 
(interpreted). It is still buggy, as the output is different, but it does 
about the same amount of work.


A 10-million loop would take an estimated 1000 seconds, on the same 
machine that CPython took 46 seconds. So about four orders of magnitude 
slower than C.


--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 14:04, bartc wrote:

On 26/02/2018 13:42, Ned Batchelder wrote:



  Well, once you notice that the
Python code had N=1e5, and the C code had N=1e9 :)   If you want to 
experiment, with N=1e5, the final number should be 5255210926702073855.


OK, I'll try that.


I have that Python version working now. It's necessary to apply that 
masking function to wherever numbers can get bigger.


I don't know how long a 1-billion loop will take, but a 10-million loop 
took 46 seconds on Python 3.6, and 21 seconds on PyPy 2.7 from a couple 
of years ago. (And on Windows, which has a somewhat slower CPython than 
Linux.)


Result should be x=11240129907685265998.

By comparison, the C version compiled with -O3 took 0.11 seconds.

(The C version I posted will work, if adjusted to a 10000000 loop, but 
you have to change 'signed' to 'unsigned'. Apparently they weren't 
interchangeable after all. I've no idea why I used 'signed' there.


That version is rather cryptic, but it can be better written and without 
the macros, and it will run just as fast. (Marsaglia may have been hot 
with random number routines, but his C could have done with some work...)


My interpreter, using 64-bit numbers, managed 4.8 seconds. But unsigned 
arithmetic, which is uncommon, is not accelerated.)


---

Q=0
carry=36243678541
xcng=12367890123456
xs=521288629546311
indx=20632

def i64(x): return x & 0xFFFFFFFFFFFFFFFF

def refill():
    global Q, carry, indx
    for i in range(20632):
        h = carry & 1
        z = i64((  i64((Q[i]<<41))>>1)+(i64((Q[i]<<39))>>1)+(carry>>1))
        carry = i64((Q[i]>>23)+(Q[i]>>25)+(z>>63))
        Q[i] = i64(~(i64(i64(z<<1)+h)))

    indx=1
    return Q[0]

def start():
    global Q, carry, xcng, xs, indx
    Q=[0,]*20632

    for i in range(20632):

        xcng=i64(6906969069 * xcng + 123)

        xs = i64(xs ^ (xs<<13))
        xs = i64(xs ^ (xs>>17))
        xs = i64(xs ^ (xs<<43))

        Q[i] = i64(xcng + xs)

    N = 10000000
    for i in range(N):
        if indx<20632:
            s = Q[indx]
            indx+=1
        else:
            s = refill()
        xcng=i64(6906969069 * xcng + 123)
        xs = i64(xs ^ (xs<<13))
        xs = i64(xs ^ (xs>>17))
        xs = i64(xs ^ (xs<<43))
        x = i64(s+xcng+xs)
    print ("Does x= 4013566000157423768")
    print (" x=",x)

start()

--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 14:34, Chris Angelico wrote:


I'm glad _someone_ funded PyPy, anyhow. It's a great demonstration of
what can be done.


And it's good that /someone/ at least understands how PyPy works, as I 
don't think many people do.


Apparently it doesn't optimise 'hot paths' within a Python program, but 
optimises hot paths in the special Python interpreter. One written in 
[R]Python. Or something...


--
Bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 13:42, Ned Batchelder wrote:

On 2/26/18 7:13 AM, bartc wrote:


A C version is given below. (One I may have messed around with, which 
I'm not sure works properly. For an original, google for Marsaglia and 
KISS64 or SUPRKISS64.)


Most integers are unsigned, which have well-defined overflow in C 


With proper 64-bit masking (def only64(x): return x & 
0xFFFFFFFFFFFFFFFF), the Python version produces the correct answer 
using a reasonable amount of memory.


I did try something like that, but I must have missed something because I 
didn't get quite the same results as a working version.


And with interpreted code, you tend not to test using loops of a billion 
iterations.


 Well, once you notice that the
Python code had N=1e5, and the C code had N=1e9 :)   If you want to 
experiment, with N=1e5, the final number should be 5255210926702073855.


OK, I'll try that.

Also, I note that you said, "Most integers are unsigned", but the C code 
has them all declared as signed?  It doesn't seem to have mattered to 
your result, but I'm not an expert on C portability guarantees.


The C code I first pasted used 'unsigned', but the main program logic 
wasn't right, and I found another version that looked better. That one 
used 'signed' for some reason, which I completely missed.


Even if with C it works with either, the signed version might have 
'undefined behaviour'. As said, google for the original; the ones I can 
see have 'unsigned'. But I can also see a Fortran version that just uses 
'integer*8', which I believe is signed 64-bit.
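
A sketch of one place where the two can diverge, emulated in Python. The helper names to_signed64, shr_unsigned and shr_signed are made up for illustration, and it assumes the C compiler implements right shift of a negative signed value as an arithmetic (sign-filling) shift, which is common but implementation-defined. The generator's xs>>17 step is a right shift, so once the top bit is set the two interpretations give different bit patterns.

```python
MASK = 0xFFFFFFFFFFFFFFFF

def to_signed64(x):
    """Reinterpret the low 64 bits of x as a two's-complement signed value."""
    x &= MASK
    return x - (1 << 64) if x & (1 << 63) else x

def shr_unsigned(x, n):
    """Logical right shift of a 64-bit pattern (zero fill)."""
    return (x & MASK) >> n

def shr_signed(x, n):
    """Arithmetic right shift (sign fill), returned as a 64-bit pattern."""
    return (to_signed64(x) >> n) & MASK

pattern = 0x8000000000000000          # top bit set
print(hex(shr_unsigned(pattern, 17)))  # 0x400000000000
print(hex(shr_signed(pattern, 17)))    # 0xffffc00000000000
```

For values with the top bit clear the two shifts agree, which may be why the difference is easy to miss on short runs.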


--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 12:06, Antoon Pardon wrote:

On 23-02-18 02:27, Steven D'Aprano wrote:

Why do you care about the 50 million calls? That's crazy -- the important
thing is *calculating the Fibonacci numbers as efficiently as possible*.


Not necessarily.

David Beazley in his talks sometimes uses an inefficient algorithm for
calculating Fibonacci numbers because he needs something that uses the CPU
intensively. Calculating the Fibonacci numbers in that context as efficiently
as possible would defeat that purpose.

So in the context of a benchmark it is not unreasonable to assume those 50
million calls are the purpose, and not calculating the Fibonacci numbers as
efficiently as possible.


I don't think Steven is ever going to concede this point.

Because Python performs badly compared to Julia or C, and it's not 
possible to conveniently offload the task to some fast library because 
it only uses a handful of primitive byte-codes.


(I have the same trouble with my own interpreted language. Although 
somewhat brisker than CPython, it will always be much slower than a 
C-like language on such micro-benchmarks.


But I accept that; I don't have an army of people working on 
acceleration projects and tracing JIT compilers. To those people 
however, such a benchmark can be a useful yardstick of progress.)
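
As a small illustration of why this benchmark is really a function-call benchmark, here is a sketch of the doubly recursive version with a call counter added. The counter is an addition for illustration only; it is not part of any benchmark discussed in the thread.

```python
calls = 0

def fib(n):
    """Doubly recursive Fibonacci, counting every call made."""
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(20)
print(result, calls)   # 6765 21891
```

The number of calls grows in step with the Fibonacci numbers themselves, so almost all of the run time is call overhead plus a comparison and an addition.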


--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 11:40, Chris Angelico wrote:

On Mon, Feb 26, 2018 at 10:13 PM, bartc <b...@freeuk.com> wrote:

Below is the first draft of a Python port of a program to do with random
numbers. (Ported from my language, which was in turn ported from a C
program by George Marsaglia, the random number guy.)

However, running it quickly exhausts the memory in my machine. The reason is
that Python unhelpfully widens all results to bignums as needed. The code
relies on calculations being modulo 2^64.

Note that restricting integer ops to 64 bits probably still won't work, as I
believe the numbers need to be unsigned.


No, it's because the original implementation assumed integer
wrap-around (at least, I think that's what's happening; I haven't
analyzed the code in great detail). That means all your integer
operations are doing two jobs: the one they claim to, and then a
masking to 64 bits signed. That's two abstract operations that happen,
due to the nature of the CPU, to work efficiently together. If you
don't implement both halves of that in your Python port, you have
failed at porting. What if you were porting a program from a 72-bit
chip that assumed Binary Coded Decimal? Would you complain that C's
integers are badly behaved?

And that's without even asking whether a C program that assumes
integer wrap-around counts as portable. At least with Python, you have
a guarantee that integer operations are going to behave the same way
on all compliant implementations of the language.



A C version is given below. (One I may have messed around with, which 
I'm not sure works properly. For an original, google for Marsaglia and 
KISS64 or SUPRKISS64.)


Most integers are unsigned, which have well-defined overflow in C (they 
just wrap as expected). In C, a mixed signed/unsigned op is performed as 
unsigned.
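
As a rough illustration of what 'wrap as expected' looks like when emulated in Python (the helper name u64 is hypothetical, not from the original program), masking with 2^64-1 gives the same result as reducing modulo 2^64:

```python
MASK = (1 << 64) - 1   # 0xFFFFFFFFFFFFFFFF

def u64(x):
    """Keep only the low 64 bits, as C unsigned 64-bit arithmetic would."""
    return x & MASK

largest = MASK
print(u64(largest + 1))        # 0: the addition wraps round
print(u64(~0) == MASK)         # True: Python's ~0 is -1; masked, it is all ones
product = 6906969069 * largest + 123
print(u64(product) == product % 2**64)   # True
```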


-

/*   SUPRKISS64.c, period 5*2^1320480*(2^64-1)   */
#include <stdio.h>
#include <stdlib.h>
#include "timer.c"
 static signed long long Q[20632],
 carry=36243678541LL,
 xcng=12367890123456LL,
 xs=521288629546311LL,
 indx=20632;

 #define CNG ( xcng=6906969069LL*xcng+123 )
 #define XS  ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
 #define SUPR ( indx<20632 ? Q[indx++] : refill() )
 #define KISS SUPR+CNG+XS

 signed long long refill(void) {
  int i; signed long long z,h;
  for(i=0;i<20632;i++) {
h = (carry&1);
z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1);
carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63);
Q[i] = ~((z<<1)+h);
  }
  indx=1;
  return (Q[0]);
 }

 int main() {
  int i; signed long long x;

  for(i=0;i<20632;i++) Q[i]=CNG+XS;

  for(i=0;i<1000000000;i++) x=KISS;

  printf("Does x=4013566000157423768\n x=%llu.\n",x);
}

--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-26 Thread bartc

On 26/02/2018 09:22, Steven D'Aprano wrote:

On Sun, 25 Feb 2018 21:19:19 -0800, Rick Johnson wrote:



I agree with your sarcasm. And that's why these types of auto
conversions should be optional. I agree that most times it's more
practical to let python handle the dirty details. But in some cases,
where you need to squeeze out a few more drops of speed juice, you won't
mind the extra trouble.


And that's where you call out to a library like numpy, or write a C
extension, or use a tool like Numba or Cython to optimise your Python
code to use native ints. (Numba is still a work in progress, but Cython
is a mature working product.)

Or to put it another way... if you want machine ints in Python, the way
you get them is something like:

from numba import jit

@jit
def myfunction(): ...



The core language doesn't have to support these things when there is a
healthy language ecosystem that can do it.


Below is the first draft of a Python port of a program to do with random 
numbers. (Ported from my language, which was in turn ported from a C 
program by George Marsaglia, the random number guy.)


However, running it quickly exhausts the memory in my machine. The 
reason is that Python unhelpfully widens all results to bignums as 
needed. The code relies on calculations being modulo 2^64.


Note that restricting integer ops to 64 bits probably still won't work, 
as I believe the numbers need to be unsigned.


--

Q=0
carry=36243678541
xcng=12367890123456
xs=521288629546311
indx=20632

def refill():
    global Q, carry, indx
    for i in range(20632):
        h = carry & 1
        z = ((Q[i]<<41)>>1)+((Q[i]<<39)>>1)+(carry>>1)
        carry = (Q[i]>>23)+(Q[i]>>25)+(z>>63)
        Q[i] = ~((z<<1)+h)
    indx=1
    return Q[0]

def start():
    global Q, carry, xcng, xs, indx
    Q=[0,]*20632

    for i in range(20632):

        xcng=6906969069 * xcng + 123

        xs ^= (xs<<13)
        xs ^= (xs>>17)
        xs ^= (xs<<43)

        Q[i] = xcng + xs

    N = 100000
    for i in range(N):
        if indx<20632:
            s = Q[indx]
            indx+=1
        else:
            s = refill()
        xcng=6906969069 * xcng + 123
        xs ^= (xs<<13)
        xs ^= (xs>>17)
        xs ^= (xs<<43)
        x = s+xcng+xs
    print ("Does x= 4013566000157423768")
    print (" x=",x)

start()

--

(The code performs N iterations of a random number generator. You get 
the result expected, ie. x=401...768, when N is a billion.)
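
A minimal sketch of why the memory runs out, using just the xs shift-and-xor step from the code above. The function names and the MASK constant are additions for illustration. Without masking, each pass adds roughly 56 bits (13 + 43) to the number, so it never stops growing; with a 64-bit mask applied after the left shifts it stays within 64 bits.

```python
MASK = 0xFFFFFFFFFFFFFFFF

def xorshift_unmasked(xs):
    """The xs update as written in the draft: results keep widening."""
    xs ^= xs << 13
    xs ^= xs >> 17
    xs ^= xs << 43
    return xs

def xorshift_masked(xs):
    """The same update with results confined to 64 bits."""
    xs = (xs ^ (xs << 13)) & MASK
    xs = xs ^ (xs >> 17)
    xs = (xs ^ (xs << 43)) & MASK
    return xs

a = b = 521288629546311
for _ in range(100):
    a = xorshift_unmasked(a)
    b = xorshift_masked(b)

print("unmasked bits:", a.bit_length())
print("masked bits:  ", b.bit_length())
```

With 20632 table entries and then N further iterations, the unmasked values reach millions of bits each, which accounts for the memory use.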


--
Bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-24 Thread bartc

On 24/02/2018 02:05, Steven D'Aprano wrote:

On Fri, 23 Feb 2018 19:25:35 +0000, bartc wrote:



Python is 10 times slower than a competitor = doesn't matter
My language is 1.5 times slower than the big boys' = matters
a great deal



As for Python's order-of-magnitude speed difference, thank you for being
generous.


Actually that comparison was with a competitor, ie. another dynamic 
language, because I understand such languages work in different fields 
from the Cs and C++s.


I'm sure there must be some that are faster (years since I've looked at 
the field), but I vaguely had in mind mine. Although since then, CPython 
has gotten faster.


Note that there are JIT-based implementations now which can give very 
good results (other than PyPy) with dynamic languages.


My own efforts are still byte-code based so are unlikely to get any 
faster. But they are also very simple.



So it is quite possible to get practical work done and be a competitive,
useful language despite being (allegedly) a thousand or more times slower
than C.


Of course. I've been using a dynamic scripting language as an adjunct to 
my compiled applications since the mid 80s. Then they were crude and 
hopelessly slow (and machines were a lot slower too), but they could 
still be tremendously useful with the right balance.


But the faster they are, the more work they can take over.


--
bartc


Re: How to make Python run as fast (or faster) than Julia

2018-02-24 Thread bartc

On 24/02/2018 02:46, Steven D'Aprano wrote:


Take the Fibonacci double-recursion benchmark. Okay, it tests how well
your language does at making millions of function calls. Why? How often
do you make millions of function calls?


Very often. Unless you are writing code 1970s style with everything in 
one big main function.


I've done some tests with my interpreter [sorry, Ned], on one real task:

  Total number of byte-code calls: 3.2 million
  Total number of real x64 calls: 520 million

On this specific benchmark: 48 million and 580 million.

Changing the way those x64 calls were made (using a different call 
convention), made some byte-code programs take up to 50% longer to 
execute. [In this interpreter, each byte-code, no matter what it is, is 
dispatched via a function call.]
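
To show what 'dispatched via a function call' means, here is a toy sketch in Python. It is not my interpreter; the three opcodes, the handler names and the program are invented for illustration. Every byte-code executed costs one call to its handler, whatever the handler does.

```python
def op_push(stack, arg):
    stack.append(arg)

def op_add(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

def op_mul(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)

HANDLERS = {"push": op_push, "add": op_add, "mul": op_mul}

def run(program):
    """Execute (opcode, argument) pairs, one handler call per byte-code."""
    stack = []
    handler_calls = 0
    for opcode, arg in program:
        HANDLERS[opcode](stack, arg)
        handler_calls += 1
    return stack.pop(), handler_calls

# (2 + 3) * 4
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
value, handler_calls = run(program)
print(value, handler_calls)   # 20 5
```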


For most application code, executing the function is far more costly than
the overhead of calling it, and the call overhead is dwarfed by the rest of
the application.


Any actual figures?

In the case of interpreters, you want to optimise each byte-code, and 
one way of doing that is to write a small program which features that 
byte-code heavily. And then you start tweaking.


It is true that when working with heavy-duty data, or offloading work to 
external, non-byte-code functions, then the byte-code execution 
overheads are small. But Python's weakness is when it /is/ executing 
actual algorithms using actual byte-code.
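
For CPython, one way to see which byte-codes a small test program exercises is the standard dis module. A minimal sketch follows; the function masked_step is made up for illustration, and the exact opcode names printed differ between Python versions.

```python
import dis

def i64(x):
    return x & 0xFFFFFFFFFFFFFFFF

def masked_step(x):
    # One multiply, one add and one function call per invocation.
    return i64(6906969069 * x + 123)

for instruction in dis.get_instructions(masked_step):
    print(instruction.opname, instruction.argrepr)
```

A loop around a function like this spends its time in a handful of byte-codes, including the call, which is what makes it usable as a micro-benchmark.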


And actually, performance of CPython does seem to have improved 
tremendously over the years. So some people have their eye on the ball. 
Clearly not you.



If you have a language with tail recursion elimination, you can bet
that its benchmarks will include examples of tail recursion and tail
recursion will be a favoured idiom in that language. If it doesn't, it
won't.


Benchmarks need to be honest. But Fibonacci, I think, can't use that 
optimisation (although gcc seems to have found another way of not doing 
that much work).
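
A sketch of why, with function names made up for illustration: in the doubly recursive version the addition happens after both calls return, so neither call is in tail position. An accumulator version is tail recursive and reduces to a plain loop, but that is a different algorithm rather than an optimisation of the original calls. (CPython does not eliminate tail calls in any case.)

```python
def fib_naive(n):
    """Doubly recursive: the '+' runs after the calls, so no tail call."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

def fib_tail(n, a=0, b=1):
    """Accumulator form: the recursive call is the last thing done."""
    return a if n == 0 else fib_tail(n - 1, b, a + b)

def fib_loop(n):
    """What tail-call elimination would turn fib_tail into."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_naive(20), fib_tail(20), fib_loop(20))   # 6765 6765 6765
```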


--
bartc

