Accuracy of multiprocessing.Queue.qsize before any Queue.get invocations?

2022-05-12 Thread Tim Chase
The documentation says[1]

> Return the approximate size of the queue. Because of
> multithreading/multiprocessing semantics, this number is not
> reliable.

Are there any circumstances under which it *is* reliable?  Most
germane, if I've added a bunch of items to the Queue, but not yet
launched any processes removing those items from the Queue, does
Queue.qsize accurately (and reliably) reflect the number of items in
the queue?

  q = Queue()
  for fname in os.listdir():
    q.put(fname)
  file_count = q.qsize() # is this reliable?
  # since this hasn't yet started fiddling with it
  for _ in range(os.cpu_count()):
    Process(target=myfunc, args=(q, arg2, arg3)).start()

I'm currently tracking the count as I add them to my Queue,

  file_count = 0
  for fname in os.listdir():
    q.put(fname)
    file_count += 1

but if .qsize is reliably accurate before anything has a chance to
.get data from it, I'd prefer to tidy the code by removing the
redundant counting code if I can.
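One way to drop the manual counter without leaning on .qsize at all
(which the docs also note may raise NotImplementedError on platforms
like macOS where sem_getvalue() is not implemented) is to keep the
list that os.listdir() already returns and take its len().  A minimal
sketch; the helper name enqueue_names is made up for illustration and
q is assumed to be any object with a .put() method:

```python
import os


def enqueue_names(q, directory="."):
    """Put every directory entry on q and return how many were queued."""
    names = os.listdir(directory)  # listdir() already returns a list
    for name in names:
        q.put(name)
    # The count comes from the list, not from the queue's own bookkeeping.
    return len(names)
```

With that, the counting collapses to file_count = enqueue_names(q).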

I'm just not sure under what circumstances the "this number is not
reliable" caveat holds.  I get that things might be asynchronously
added/removed once processes are running, but is there anything that
would cause unreliability *before* other processes/consumers run?

Thanks,

-tkc

[1]
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Queue.qsize





-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Add a method to list the current named logging levels

2022-03-30 Thread Tim Chase
On 2022-03-30 16:37, Barry wrote:
> Is logging.getLevelNamesMapping() what you are looking for?

Is this in some version newer than the 3.8 that comes stock on my
machine?

  $ python3 -q
  >>> import logging
  >>> logging.getLevelNamesMapping()
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  AttributeError: module 'logging' has no attribute 'getLevelNamesMapping'
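
For reference, logging.getLevelNamesMapping() was added in Python
3.11, so a stock 3.8 will not have it.  A rough fallback sketch for
older versions, assuming only the six standard levels matter (custom
levels registered with addLevelName() are not picked up by the
fallback branch); the function name level_names_mapping is made up
for illustration:

```python
import logging


def level_names_mapping():
    """Return a {name: numeric level} dict, preferring the stdlib helper."""
    getter = getattr(logging, "getLevelNamesMapping", None)
    if getter is not None:  # Python 3.11 and later
        return getter()
    # Fallback: build the mapping from the documented level constants.
    standard = (
        logging.CRITICAL,
        logging.ERROR,
        logging.WARNING,
        logging.INFO,
        logging.DEBUG,
        logging.NOTSET,
    )
    return {logging.getLevelName(level): level for level in standard}
```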

-tkc


Re: Behavior of the for-else construct

2022-03-04 Thread Tim Chase
On 2022-03-04 11:55, Chris Angelico wrote:
> In MS-DOS, it was perfectly possible to have spaces in file names

DOS didn't allow space (0x20) in filenames unless you hacked it by
hex-editing your filesystem (which I may have done a couple times).
However it did allow you to use 0xFF in filenames which *appeared* as
a space in most character-sets.

I may have caused a mild bit of consternation in school computer labs
doing this. ;-)

> Windows forbade a bunch of characters in file names

Both DOS and Windows also had certain reserved filenames

https://www.howtogeek.com/fyi/windows-10-still-wont-let-you-use-these-file-names-reserved-in-1974/

that could cause issues if passed to programs.

To this day, if you poke around on microsoft.com and change random
bits of URLs to include one of those reserved filenames in the GET
path, you'll often trigger a 5xx error rather than a 404 that you
receive with random gibberish in the same place.

  https://microsoft.com/…/asdfjkl → 404
  https://microsoft.com/…/lpt1 → 5xx
  https://microsoft.com/…/asdfjkl/some/path → 404
  https://microsoft.com/…/lpt1/some/path → 5xx

Just in case you aspire to stir up some trouble.

-tkc






Re: Behavior of the for-else construct

2022-03-03 Thread Tim Chase
On 2022-03-04 02:02, Chris Angelico wrote:
>> I want to make a little survey here.
>>
>> Do you find the for-else construct useful? Have you used it in
>> practice? Do you even know how it works, or that there is such a
>> thing in Python?  
> 
> Yes, yes, and yes-yes. It's extremely useful.

Just adding another data-point, my answer is the same as above.

I use it regularly and frequently wish for it (or a similar "didn't
trigger a break-condition in the loop" functionality) when using
other programming languages.
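
For anyone who hasn't met it, the else clause of a for loop runs only
when the loop finishes without hitting a break.  A minimal sketch (the
function name and data are made up for illustration):

```python
def first_negative(numbers):
    """Return the first negative number, or None if the loop never breaks."""
    for n in numbers:
        if n < 0:
            found = n
            break  # skips the else clause below
    else:
        # Reached only when the loop ran to completion without a break.
        found = None
    return found
```

In languages without this construct, the same logic usually needs a
separate "did we find it?" flag variable.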

-tkc





Re: Timezone jokes (was: All permutations from 2 lists)

2022-03-03 Thread Tim Chase
On 2022-03-03 06:27, Grant Edwards wrote:
> On 2022-03-03, Chris Angelico  wrote:
> > Awww, I was going to make a really bad joke about timezones :)  
> 
> As opposed to all the really good jokes about timezones... ;)

And here I thought you were just Trolling with timezones...

https://en.wikipedia.org/wiki/Troll_(research_station)#cite_ref-1

;-)

-tkc






Re: Create a contact book

2021-10-26 Thread Tim Chase
On 2021-10-25 22:40, anders Limpan wrote:
> i would like to create a contact book were you can keep track of
> your friends. With this contact book you will both be able to add
> friends and view which friends that you have added. anyone
> interested in helping me out with this one ?=) --

Python provides the shelve module for just this sort of thing:

  import shelve

  class Contact:
    def __init__(self, name):
      self.name = name
      self.address = ""
      self.phone = ""

  with shelve.open("contacts") as db:
    dave = Contact("Dave Smith")
    dave.address = "123 Main St\nAnytown, NY 12345"
    dave.phone = "800-555-1212"
    db["dave"] = dave
    ellen = Contact("Ellen Waite")
    ellen.phone = "+1234567890"
    db["ellen"] = ellen

Then at some future point you can use

  with shelve.open("contacts") as db:
    dave = db["dave"]
    print(f"Dave lives at {dave.address}")
    ellen = db["ellen"]
    print(f"Ellen's phone number is {ellen.phone}")

I'll leave to you the details of implementing an actual address-book
out of these parts.  Be sure to read the docs for the shelve module

  https://docs.python.org/3/library/shelve.html 

including the various security warnings and caveats.

-tkc






Re: string storage [was: Re: imaplib: is this really so unwieldy?]

2021-05-26 Thread Tim Chase
On 2021-05-26 18:43, Alan Gauld via Python-list wrote:
> On 26/05/2021 14:09, Tim Chase wrote:
>>> If so, doesn't that introduce a pretty big storage overhead for
>>> large strings?  
>> 
>> Yes.  Though such large strings tend to be more rare, largely
>> because they become unwieldy for other reasons.  
> 
> I do have some scripts that work on large strings - mainly produced
> by reading an entire text file into a string using file.read().
> Some of these are several MB long so potentially now 4x bigger than
> I thought. But you are right, even a 100MB string should still be
> OK on a modern PC with 8GB+ RAM!...

If you don't decode it upon reading it in, it should still be 100MB
because it's a stream of encoded bytes.  It would only grow 2x or 4x in
size if you decoded that (either as a parameter of how you opened it,
or if you later took that byte-string and decoded it explicitly, though
now you have the original 100MB byte-string **plus** the 100/200/400MB
decoded unicode string).

You don't specify what you then do with this humongous string, but
for most of my large files like this, I end up iterating over them
piecewise rather than f.read()'ing them all in at once. Or even if
the whole file does end up in memory, it's usually chunked and split
into useful pieces.  That could mean that each line is its own
string, almost all of which are one-byte-per-char with a couple
strings at sporadic positions in the list-of-strings where they are
2/4 bytes per char.
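
A minimal sketch of that piecewise approach, assuming a UTF-8 text
file (the function name is made up for illustration).  Only one line
is held as a str at a time, and it reports how many lines would need
more than plain ASCII storage:

```python
def count_non_ascii_lines(path, encoding="utf-8"):
    """Iterate over a text file line by line instead of f.read()'ing it."""
    non_ascii = 0
    with open(path, encoding=encoding) as f:
        for line in f:  # each line is its own, independently sized str
            if not line.isascii():  # str.isascii() needs Python 3.7+
                non_ascii += 1
    return non_ascii
```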

-tkc




Re: string storage [was: Re: imaplib: is this really so unwieldy?]

2021-05-26 Thread Tim Chase
On 2021-05-26 08:18, Alan Gauld via Python-list wrote:
> Does that mean that if I give Python a UTF8 string that is mostly
> single byte characters but contains one 4-byte character that
> Python will store the string as all 4-byte characters?

As best I understand it, yes:  the cost of each "character" in a
string is the same for the entire string, so even one lone 4-byte
character in an otherwise 1-byte-character string is enough to push
the whole string to 4-byte characters.  Doesn't affect other strings
though (so if you had a pure 7-bit string and a unicode string, the
former would still be 1-byte-per-char…it's not a global aspect)

If you encode these to a UTF8 byte-string, you'll get the space
savings you seek, but at the cost of sensible O(1) indexing.

Both are a trade-off, and if your data consists mostly of 7-bit ASCII
characters, or lots of small strings, the overhead is less pronounced
than if you have one single large blob of text as a string.
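
A quick way to see the effect, as a sketch that assumes CPython 3.3 or
later (the exact byte counts from sys.getsizeof() vary by version, so
none are quoted here):

```python
import sys

ascii_only = "a" * 1000
one_wide = "a" * 999 + "\U0001F600"  # same length, one 4-byte code point

# The lone wide character pushes the whole string to 4 bytes per char.
print(sys.getsizeof(ascii_only))
print(sys.getsizeof(one_wide))

# Encoding back to UTF-8 regains the space, at the cost of O(1) indexing.
print(len(one_wide.encode("utf-8")))
```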

> If so, doesn't that introduce a pretty big storage overhead for
> large strings?

Yes.  Though such large strings tend to be more rare, largely because
they become unwieldy for other reasons.

-tkc




Re: name for a mutually inclusive relationship

2021-02-24 Thread Tim Chase
On 2021-02-24 08:12, Ethan Furman wrote:
> I'm looking for a name for a group of options that, when one is
> specified, all of them must be specified.
[snip]
> - ???: a group of options where, if one is specified, all must be
> specified (mutually inclusive)
[snip]
> Is there a name out there already to describe that concept?

For two items, the XNOR logical operator (effectively a "not (a xor
b)") has the desired truth-table.

  https://en.wikipedia.org/wiki/Xnor

sometimes called a "logical biconditional"

  https://en.wikipedia.org/wiki/Logical_biconditional

But in more human terms, the common phrase "all or nothing" seems to
cover the concept pretty well.
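
For more than two options, the same check can be sketched as below,
assuming an unspecified option is represented by None (that
convention and the function name are assumptions for illustration):

```python
def all_or_nothing(*options):
    """True when every option is specified or none of them is."""
    specified = [opt is not None for opt in options]
    return all(specified) or not any(specified)
```

With exactly two options this reduces to the XNOR truth-table.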

-tkc









Re: count consecutive elements

2021-01-15 Thread Tim Chase
On 2021-01-16 03:32, Bischoop wrote:
>> The OP didn't specify what should happen in that case, so it would
>> need some clarification.
> 
> In that case maybe good solution would be to return three of them?

That's the solution I chose in my initial reply: you get a tuple back
of ([list of longest matches], length_of_longest_match).

-tkc




Re: count consecutive elements

2021-01-13 Thread Tim Chase
On 2021-01-13 18:20, Dan Stromberg wrote:
> I'm kind of partial to:
> 
> import collections
> import typing
> 
> 
> def get_longest(string: str) -> typing.Tuple[int, str]:
>     """Get the longest run of a single consecutive character."""
>     dict_: typing.DefaultDict[str, int] = collections.defaultdict(int)
>     for left_ch, right_ch in zip(string, string[1:]):
>         if left_ch == right_ch:
>             dict_[left_ch] += 1
> 
>     maximum = max((value, key) for key, value in dict_.items())
> 
>     return maximum

That seems to only return one value, so it gives odd results if I
specify something like

  get_longest("aaabcccbbb")

which at least here tells me that "c" is the longest run, even though
aaa, bbb, and ccc are all runs of length 3.  The OP didn't specify
what should happen in that case, so it would need some clarification.

-tkc





Re: count consecutive elements

2021-01-13 Thread Tim Chase
On 2021-01-13 21:20, Bischoop wrote:
> I want to  to display a number or an alphabet which appears mostly
> consecutive in a given string or numbers or both
> Examples
> s= ' aabskaaabad'
> output: c
> # c appears 4 consecutive times
>  8bbakebaoa
> output: b
> #b appears 2 consecutive times

I'd break the problem into two parts:

1) iterate over your thing (in this case, a string) and emit the item
and its corresponding count of consecutive matches.

2) feed that into something that finds the longest run(s) output by
that.

So off the cuff, something like

  def consecutive_counter(seq):
    # I'm not sure if there's something
    # like this already in itertools
    cur = nada = object()
    count = 0
    for x in seq:
      if x == cur:
        count += 1
      else:
        if cur is not nada:
          yield cur, count
        cur = x
        count = 1
    if cur is not nada:
      yield cur, count

  def longest(seq):
    results = []
    biggest = 0
    for item, count in seq:
      if count > biggest:
        results = [item]
        biggest = count
      elif count == biggest:
        results.append(item)
    return results, biggest

  for s in (
      "",
      "a",
      "aaa",
      "aaabbb",
      "aabskaaabad",
      "aabskaaakbad",
      ):
    print("Testing %r" % s)
    print(repr(longest(consecutive_counter(s))))
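
As it turns out, itertools.groupby() does group consecutive equal
items, so the counting generator could probably be shortened to
something like this sketch (same name and same (item, count) output
as the hand-rolled version):

```python
from itertools import groupby


def consecutive_counter(seq):
    """Yield (item, run_length) for each run of consecutive equal items."""
    for item, run in groupby(seq):
        # run is an iterator over the consecutive copies of item
        yield item, sum(1 for _ in run)
```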

-tkc






Re: list() strange behaviour

2020-12-20 Thread Tim Chase
On 2020-12-20 21:00, danilob wrote:
> b = ((x[0] for x in a))

here you create a generator

> print(list(b))
> [1, 0, 7, 2, 0]

and then you consume all the things it generates here which means
that when you go to do this a second time

> print(list(b))

the generator is already empty/exhausted so there's nothing more to
yield, giving you the empty result that you got:

> []


If you want to do this, convert it to a list once:

 >>> c = list(b)

or use a list-comprehension instead of creating a generator

 >>> c = [x[0] for x in a]

and then print that resulting list:

 >>> print(c)
 >>> print(c)
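
Put together as a runnable sketch (the contents of `a` are made up
here, since the original data isn't shown in full):

```python
a = [(1, "x"), (0, "y"), (7, "z")]  # hypothetical sample data

b = (x[0] for x in a)    # generator: can be consumed only once
first = list(b)          # consumes everything the generator yields
second = list(b)         # generator is exhausted, so this is empty

c = [x[0] for x in a]    # list: can be iterated as often as needed
again = list(c)
```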

-tkc





Re: To check if number is in range(x,y)

2020-12-14 Thread Tim Chase
On 2020-12-14 21:21, Schachner, Joseph wrote:
> >>> r = range(10)  
> So r is a list containing 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

In Python 3.x, r is *not* a list.  It is a custom object/class.

>   >>> 2 in r  
>   True
> As expected.

I'm not sure what your replies are suggesting here.  I demonstrated
the OP's edge-cases, especially cases that one might experience coming
from other languages.

>   >>> r = range(1, 10, 2)
>   >>> 2 in r  
>   False
>   >>> list(r)  
>   [1, 3, 5, 7, 9]
> Well, yes, because you started the range at 1.  Start at 0 and
> you'd get 0, 2, 4, 6, 8.

Had I done this, for pedagogical value I would have checked for 3
then:

  >>> r = range(0, 10, 2)
  >>> 3 in r
  False

The goal was to demonstrate that the resulting range object, when
given a step-size of something other than the default 1, will have holes in
it, and as such, testing for membership in one of those holes would
fail.  Showing successful membership wouldn't add any value.

> "It also doesn't automatically convert from the string inputs
> you're getting from the input() function:
> 
>   >>> s = "5"
>   >>> s in r  
>   False
>   >>> int(s) in r  
>   True"
> You have just discovered that Python, although it is dynamically
> typed, is STRICTLY typed. 

No, I haven't just discovered it; it's something I've long known.  The
goal was to provide an example for the OP of this exact case since their
original code attempted to use the string returned from input() and
used it as-is (without converting to int) for this exact sort of
comparison.

> Another way to say this: you have discovered that Python isn't the
> same as BASIC.

Additionally, (many? all? some?) BASICs have similarly strict typing.
For example, reaching for the BASIC that I used in the 80s:

  ] S$ = "HELLO"
  ] I = 42
  ] PRINT S$ + I
  ?TYPE MISMATCH ERROR

> "Additionally, note that the endpoint of the range is exclusive so
>   >>> r = range(1, 10)
>   >>> 10 in r  
>   False"
> 
> I don't have to note that

My comment was directed at the OP.  Unless you are Bischoop, that's
not you.

> Now suppose that the end integer was not excluded. Each range call
> would produce 11 integers. 

The goal was to show the OP that while some languages (such as the
aforementioned BASIC) have *inclusive* ranges:
  
  ] FOR I = 1 to 3 : PRINT I : NEXT
  1
  2
  3

Python's ranges are exclusive.  Because a language could have either,
the example demonstrated Python's choice.

> I recommend you read Python 101 and when you've done that, read
> Python 201.   I think they are very good "learn Python" books. If
> you're surprised that the end point is not included in range, you
> need to read Python 101.

Your condescending replies bark up the wrong tree.

-tkc








Re: To check if number is in range(x,y)

2020-12-12 Thread Tim Chase
On 2020-12-12 15:12, Bischoop wrote:
> I need to check if input number is 1-5. Whatever I try it's not
> working. Here are my aproaches to the problem: https://bpa.st/H62A
> 
> What I'm doing wrong and how I should do it?

A range is similar to a list in that it contains just the numbers
listed:

  >>> r = range(10)
  >>> 2 in r
  True
  >>> 2.5 in r
  False
  >>> r = range(1, 10, 2)
  >>> 2 in r
  False
  >>> list(r)
  [1, 3, 5, 7, 9]

It also doesn't automatically convert from the string inputs you're
getting from the input() function:

  >>> s = "5"
  >>> s in r
  False
  >>> int(s) in r
  True

Additionally, note that the endpoint of the range is exclusive so

  >>> r = range(1, 10)
  >>> 10 in r
  False
  >>> list(r)
  [1, 2, 3, 4, 5, 6, 7, 8, 9]

If you want numeric-range checks, Python provides the lovely
double-comparison syntax:

  >>> x = 5
  >>> 2 < x < 10
  True
  >>> x = 5.5
  >>> 2 < x < 10
  True
  >>> s = "5"
  >>> 2 < s < 10
  Traceback…
  >>> 2 < int(s) < 10
  True
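
Pulling those pieces together, a check for "is the input a whole
number from 1 to 5" might look like this sketch (the function name
and the choice of returning None on bad input are assumptions, not
something from the OP's code):

```python
def parse_choice(text, low=1, high=5):
    """Return text as an int if it is within low..high inclusive, else None."""
    try:
        value = int(text)  # input() hands back a str, so convert first
    except ValueError:
        return None
    # Chained comparison; both endpoints are included here.
    return value if low <= value <= high else None
```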

Hopefully this gives you the hints that you need to troubleshoot.

-tkc






Re: Returning from a multiple stacked call at once

2020-12-12 Thread Tim Chase
On 2020-12-12 07:39, ast wrote:
> In case a function recursively calls itself many times,
> is there a way to return a data immediately without
> unstacking all functions ?

Not that I'm aware of.   If you use recursion (and AFAIK, Python
doesn't optimize tail-recursion), you pay for all the pushes & all the
pops.

If you convert it to an iterative algorithm, you can bail early with
items still in the work queue:

  while queue:
    item = queue.popleft()  # assuming queue is a collections.deque
    if test(item):
      print(f"Found it, bailing early with {len(queue)} item(s)")
      break
    more = process(item)
    if more:
      queue.extend(more)
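
A self-contained version of the same idea, as a sketch: the work
queue is a collections.deque, and is_match/children are hypothetical
callables standing in for test() and process() above.

```python
from collections import deque


def find_first(root, is_match, children):
    """Breadth-first search that returns as soon as a match turns up.

    Returns (item, number_of_items_left_unprocessed), or (None, 0).
    """
    queue = deque([root])
    while queue:
        item = queue.popleft()
        if is_match(item):
            # Bail early; nothing needs unwinding, the queue is just dropped.
            return item, len(queue)
        queue.extend(children(item))
    return None, 0
```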

-tkc






Re: Any better way for this removal? [typo correction]

2020-11-07 Thread Tim Chase
On 2020-11-07 10:51, Tim Chase wrote:
>   from string import ascii_lowercase
>   text = ",".join(ascii_lowercase)
>   to_throw_away = 5

[derp]

For obvious reasons, these should be s/\<n\>/to_throw_away/g

To throw away the trailing N delimited portions:

>   new_string = text.rsplit(',', n)[0]

  new_string = text.rsplit(',', to_throw_away)[0]

and to remove the leading N fields

>   new_string = text.split(',', n)[-1]

  new_string = text.split(',', to_throw_away)[-1]
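
Run end to end, the corrected snippets behave like this (the variable
names without_tail and without_head are made up to keep both results
around):

```python
from string import ascii_lowercase

text = ",".join(ascii_lowercase)  # "a,b,c,...,z"
to_throw_away = 5

# Drop the trailing 5 comma-delimited fields (v through z).
without_tail = text.rsplit(",", to_throw_away)[0]

# Drop the leading 5 comma-delimited fields (a through e).
without_head = text.split(",", to_throw_away)[-1]
```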

-tkc





Re: Any better way for this removal?

2020-11-07 Thread Tim Chase
On 2020-11-07 13:46, Bischoop wrote:
> text = "This is string, remove text after second comma, to be
> removed."
> 
> k=  (text.find(",")) #find "," in a string
> m = (text.find(",", k+1)) #Find second "," in a string
> new_string = text[:m]
> 
> print(new_string)

How about:

  new_string = text.rsplit(',', 1)[0]

This also expands to throwing away more than one right-hand bit if
you want:

  from string import ascii_lowercase
  text = ",".join(ascii_lowercase)
  to_throw_away = 5
  new_string = text.rsplit(',', n)[0]

or throwing away `n` left-hand bits too using .split()

  left_portion = text.split(',', n)[-1]

-tkc





Re: Best way to determine user's screensize?

2020-10-31 Thread Tim Chase
On 2020-10-31 15:22, Grant Edwards wrote:
> > A MUA may have to display hundreds of mailboxes, and maybe tens of
> > thousands of mails in a single mailbox.  
> 
> No. It doesn't. It has to display a tree widget that shows N items
> and holds tens of thousands of items, or a scrolling list widget
> than shows M items and holds tens of thousands of items.  Pick
> reasonable initial default values for N,M and then let the window
> manager and user do the right thing.

But that's exactly the issue.  On my phone, a "reasonable default N"
might be 7 items and consume the whole screen; whereas on my netbook,
a "reasonable default N" might be 15 in one layout or 25 in another;
and on my daily driver, a "reasonable default N" might well be 50 or
100 depending on layout or monitor orientation.

How does the application determine "reasonable"?  It probes the
system for screen dimensions (hopefully with multi-monitor smarts)
and then makes an attempt to paint the display.

This doesn't free it from being subject to a window manager's
subsequent constraints, but at least allows the application to make
some sensible choices for initial defaults.  Some window-managers are
dumb (glares at Windows), some are accommodating (I like how Fluxbox
behaves), some are dictatorial (e.g. tiling window managers that give
far less wiggle-room for applications' dimensions), and some largely
ignore the user ("maximize" on MacOS annoys me to no end, maximizing
only enough to display all the content; when what I want is to obscure
everything else for single-app focus; requiring me to manually resize
the window with the mouse so it fills the screen).

-tkc






Re: Pythonic style

2020-09-21 Thread Tim Chase
On 2020-09-21 09:48, Stavros Macrakis wrote:
>>   def fn(iterable):
>>     x, = iterable
>>     return x
>
> Thanks, Tim! I didn't realize that you could write (x,) on the LHS!
> Very nice, very Pythonic!

It also expands nicely for other cases, so you want the 3-and-only-3
first values with errors for too many or too few?

  x, y, z = iterable
  x, y, z = (1, 2, 3)

The (x,) version is just the single case.  And it's fast—a single
Python UNPACK_SEQUENCE opcode

  >>> dis.dis(fn)
    2           0 LOAD_FAST                0 (i)
                2 UNPACK_SEQUENCE          1
                4 STORE_FAST               1 (x)

    3           6 LOAD_FAST                1 (x)
                8 RETURN_VALUE

Though now I'm wondering if there's a way to skip the
STORE_FAST/LOAD_FAST instructions and create a function that
generates the opcode sequence

  UNPACK_SEQUENCE 1
  RETURN_VALUE

:-)

(totally tangential ramblings)

-tkc









Re: Pythonic style

2020-09-21 Thread Tim Chase
On 2020-09-20 18:34, Stavros Macrakis wrote:
> Consider a simple function which returns the first element of an
> iterable if it has exactly one element, and throws an exception
> otherwise. It should work even if the iterable doesn't terminate.
> I've written this function in multiple ways, all of which feel a
> bit clumsy.
> 
> I'd be interested to hear thoughts on which of these solutions is
> most Pythonic in style. And of course if there is a more elegant
> way to solve this, I'm all ears! I'm probably missing something
> obvious!

You can use tuple unpacking assignment and Python will take care of
the rest for you:

  >>> x, = tuple() # no elements
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ValueError: not enough values to unpack (expected 1, got 0)
  >>> x, = (1, )  # one element
  >>> x, = itertools.repeat("hello") # 2 to infinite elements
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ValueError: too many values to unpack (expected 1)

so you can do

  def fn(iterable):
    x, = iterable
    return x

The trailing comma can be hard to spot, so I usually draw a little
extra attention to it with either

  (x, ) = iterable

or

  x, = iterable # unpack one value

I'm not sure it qualifies as Pythonic, but it uses Pythonic features
like tuple unpacking and the code is a lot more concise.
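
As a runnable sketch (the name only is just for illustration), this
is the whole thing.  Unpacking pulls at most two items from the
iterable before deciding, which is why it also copes with iterables
that never terminate:

```python
def only(iterable):
    """Return the sole element of iterable, else raise ValueError."""
    (x,) = iterable  # unpack exactly one value
    return x
```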

-tim







Re: Regex to change multiple lines

2020-09-03 Thread Tim Chase
Derp, sorry about the noise.  I mistook this message for a similar
dialog over on the Vim mailing list.

For Python, you want

  re.sub(r"%%(.*?)%%", r"<del>\1</del>", s, flags=re.S)

or put the flag inline

  re.sub(r"(?s)%%(.*?)%%", r"<del>\1</del>", s)
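
A runnable sketch against the sample text from the original question,
assuming the intended replacement wraps each match in <del>...</del>
tags (as the `<\/del>` in the Vim command suggests):

```python
import re

s = (
    "This is the %%text that i must modify%%, on a line, %%but also\n"
    "on the others%% that are following"
)

# re.S lets "." match newlines, so a %%...%% pair may span lines;
# the non-greedy .*? keeps each match to a single pair of markers.
result = re.sub(r"%%(.*?)%%", r"<del>\1</del>", s, flags=re.S)
print(result)
```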

-tim

On 2020-09-03 09:27, Tim Chase wrote:
> On 2020-09-03 16:10, Termoregolato wrote:
> > -- original
> > This is the %%text that i must modify%%, on a line, %%but also
> > on the others%% that are following
> > 
> > I need to change to
> > 
> > -- desidered
> > This is the <del>text that i must modify</del>, on a line,
> > <del>but also on the others</del> that are following  
> 
> Should be able to use
> 
>  :%s/%%\(\_.\{-}\)%%/<del>\1<\/del>/g
> 
> It simplifies slightly if you use a different delimiter
> 
>  :%s@%%\(\_.\{-}\)%%@<del>\1</del>@g
> 
> -tim
> 
> 
> 


Re: Regex to change multiple lines

2020-09-03 Thread Tim Chase
On 2020-09-03 16:10, Termoregolato wrote:
> -- original
> This is the %%text that i must modify%%, on a line, %%but also
> on the others%% that are following
> 
> I need to change to
> 
> -- desidered
> This is the <del>text that i must modify</del>, on a line, <del>but
> also on the others</del> that are following

Should be able to use

 :%s/%%\(\_.\{-}\)%%/<del>\1<\/del>/g

It simplifies slightly if you use a different delimiter

 :%s@%%\(\_.\{-}\)%%@<del>\1</del>@g

-tim





Re: True is True / False is False?

2020-07-22 Thread Tim Chase
On 2020-07-22 11:54, Oscar Benjamin wrote:
>> On Wed, Jul 22, 2020 at 11:04 AM Tim Chase wrote:  
>>> reading through the language specs and didn't encounter
>>> anything about booleans returned from comparisons-operators,
>>> guaranteeing that they always return The One True and The One
>>> False.  
>>
>> That said, though, a comparison isn't required to return a bool.
>> If it *does* return a bool, it has to be one of those exact two,
>> but it could return anything it chooses. But for built-in types
>> and most user-defined types, you will indeed get a bool.  
> 
> I'm not sure if this is relevant to the question but thought I'd
> mention concrete examples. A numpy array will return non-bool for
> both of the mentioned operators

that is indeed a helpful data-point.  Do you know of any similar
example in the standard library of things where comparison-operators
return something other than True or False (or None)?

Thanks,

-tkc




True is True / False is False?

2020-07-21 Thread Tim Chase
I know for ints, CPython caches small values (-5 through 256) where `is`
works by happenstance based on the implementation but not the spec
(so I don't use `is` for comparison there because it's not
guaranteed by the language spec). On the other hand, I know that None
is a single object that can (and often *should*) be compared using
`is`. However I spent some time reading through the language specs and
didn't encounter anything about booleans returned from
comparisons-operators, guaranteeing that they always return The One
True and The One False.

  x = 3 == 3 # some boolean expression evaluating to True/False
  y = 4 > 0 # another boolean expression
  if x is y:
    print("cool, same as always")
  else:
    print("is this ever possible?")

Is there some theoretical world in which that `else` branch could ever
be hit because the True referenced by x is not the same True
referenced by y? (assuming non-pathological overriding of dunder
methods to return true-ish or false-ish values; or at least assuming
any dunder-overriding is pure standard-library)

In the big picture, this is likely irrelevant and I should just use
"==" instead, but I got the question stuck in my head and ended up
bothered that I couldn't disinter an answer from docs.
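
For what it's worth, the data-model section of the language reference
says the two objects representing False and True are the only Boolean
objects, so for built-in comparisons the identity check should hold.
A small sketch of both sides of the question; the Elementwise class
is purely hypothetical, showing how a user-defined __eq__ can return
something that isn't a bool at all:

```python
x = 3 == 3
y = 4 > 0
same_object = x is y  # both names refer to the single True object


class Elementwise:
    """Hypothetical container whose == returns a list instead of a bool."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Compare each element separately, loosely in the numpy style.
        return [v == other for v in self.values]


result = Elementwise([1, 2, 1]) == 1
```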

Thanks,

-tkc






Re: New to python - Just a question

2020-07-03 Thread Tim Chase
On 2020-07-03 10:09, Daley Okuwa via Python-list wrote:
> Write an algorithm (choose the language you prefer) that given a
> character string, for instance {‘c’,’a’,’i’,’o’,’p’,’a’}, will
> print out the list of characters appearing at least 2 times. In
> this specific example, it would return {‘a’}. Afterwards, comment
> out the cost in terms of space and time.

What have you tried already?  Where are you having trouble?

Have you written a program that accepts a character string?  Is the
string coming in as a command-line argument or on standard-input?

The example string you give looks more like some sort of
serialization format rather than a string.

Are you having difficulty counting the letters?  Python provides a
"dict()" type that would work well.

Should uppercase letters be counted the same as lowercase letters?
I.e., should "Pop" report that there are 2 "p"s?

If you've counted the duplicates, 

Have you studied space/time complexity and do you know how to
evaluate code for these characteristics?  The problem should be
solvable in roughly O(k) per word.

> Write a bash/python script that takes a directory as an argument
> and output the total lines of code in *.cpp files recursively.

Again, what have you tried?

Have you been able to iterate over a directory? See find(1) or ls(1)
or grep(1) in a shell-script or os.listdir()/os.scandir()/glob.glob()
in Python

Have you been able to open those files?

Can you iterate over the lines in each file?

Do you need to filter out any lines (such as blank lines or comments)?

If you provide what you've tried, folks here on the list are pretty
happy to help.  But most won't do your homework for you.

-tkc






Re: Friday Finking: Imports, Namespaces, Poisoning, Readability

2020-06-04 Thread Tim Chase
On 2020-06-05 12:15, DL Neil via Python-list wrote:
> Finking/discussion:
> 
> - how do you like to balance these three (and any other criteria)?

For most of what I do, I only ever have one such module so I'm not
trying to keep multiple short-names in my head concurrently.  For me,
it's usually tkinter which I import as either "tk" or just "t".  It's
usually cases where the whole content would dump a lot of junk in my
module's namespace (like tkinter or pygame), so I like to keep the
namespacing; but I also don't want to clutter my code with long
references to it.

> - is your preference (or selection) influenced by the facilities
> offered by your favorite editor/IDE?

Not really.  I'm a vi/vim/ed guy.  While vim has some nice "go to
the definition of the thing under the cursor" (gd/gD/tags/etc),
regardless of my $EDITOR, it's not overly taxing to see "tk.thing"
and if I had any question about it, jump to the top of the file where
all the import-statements are, and see that tkinter is being imported
as tk.  It's there explicitly, unlike those abominable "from tkinter
import *" tutorials.

> - does your decision differ according to whether the 'target
> module' is one of yours, from the PSL, from some third party,
> corporate, ...?

I've not given it much thought, but I tend to use the above method
with PSL/3rd-party libraries where there's a lot of things in that
namespace; but for my own code, they're usually pretty svelte, so I
stick to "from mymodule import a, b, c" if there are lots or just an
"import mymodule" without aliasing it shorter.  If the imported names
were to grow to an unwieldy size, I'd go the same route as above
("import myutils as u").

My largely irrelevant $0.02 USD, adjusted for tax and inflation,

-tkc








Re: Strings: double versus single quotes

2020-05-23 Thread Tim Chase
On 2020-05-23 14:46, Dennis Lee Bieber wrote:
> On Sat, 23 May 2020 11:03:09 -0500, Tim Chase
> >But when a string contains both, it biases towards single quotes:
> >  
> >   >>> "You said \"No it doesn't\""  
> >   'You said "No it doesn\'t"'  
> 
>   This is where using triple quotes (or triple apostrophes)
> around the entire thing simplifies it all... (except for a need to
> separate the four ending quotes)

Unless you're pathological. ;-)

>>> """I said "This contains every type of \"""Python\""" '''triple-quoted''' string, doesn't it?\""""
'I said "This contains every type of """Python""" \'\'\'triple-quoted\'\'\' string, doesn\'t it?"'

And-you-can-quote-me-on-that'ly yers,

-tkc





Re: Strings: double versus single quotes

2020-05-23 Thread Tim Chase
On 2020-05-24 01:40, Chris Angelico wrote:
> On Sat, May 23, 2020 at 10:52 PM Abdur-Rahmaan Janhangeer
>  wrote:
> >
> > The interpreter prefers single-quotes
> >  
> > >>> "single or double"  
> > 'single or double'
> >  
> >>> 'not all that strongly, it doesn\'t'  
> "not all that strongly, it doesn't"

But when a string contains both, it biases towards single quotes:

   >>> "You said \"No it doesn't\""
   'You said "No it doesn\'t"'

;-)
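
A quick sketch of that rule as I understand CPython's repr(): it uses
single quotes unless the string contains a single quote and no double
quotes, and once both kinds are present it goes back to single quotes
and escapes the apostrophes.  The sample strings are my own, picked
only to hit each case.

```python
# One sample per case: no quotes, only an apostrophe, only double
# quotes, and both kinds together.
samples = [
    "single or double",
    "not all that strongly, it doesn't",
    'You said "No"',
    "You said \"No it doesn't\"",
]

for text in samples:
    print(repr(text))

# Keep the results around so they can be inspected afterwards.
reprs = [repr(text) for text in samples]
```

Only the second sample should come back wrapped in double quotes.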

-tkc





Re: Strings: double versus single quotes

2020-05-19 Thread Tim Chase
On 2020-05-19 20:10, Manfred Lotz wrote:
> Hi there,
> I am asking myself if I should preferably use single or double
> quotes for strings?

I'd say your consistency matters more than which one you choose.

According to a recent observation by Raymond H.

"""
  Over time, the #python world has shown increasing preference
  for double quotes:  "hello" versus 'hello'.

  Perhaps, this is due to the persistent influence of JSON,
  PyCharm, Black, and plain English.

  In contrast, the interpreter itself prefers single quotes:

  >>> "hello"
  'hello'
"""

https://twitter.com/raymondh/status/1259209765072154624

I think the worst choice is to be haphazard in your usage with a
hodgepodge of single/double quotes.

I personally use habits from my C days:  double-quotes for everything
except single characters for which I use a single-quote:

  if 'e' in "hello":

as an indicator that I'm using it as a single character rather than
as a string.

I don't have a firm rule for myself if a string contains
double-quotes.  It doesn't happen often for me in a case where I
couldn't use a triple-quoted string or that I am referring to it as a
single character.

-tkc




Creating a curses app that interacts with the terminal, yet reads from stdin?

2020-04-18 Thread Tim Chase
I know that vim lets me do things like

 $ ls | vim -

where it will read the data from stdin, but then take over the screen
TUI curses-style, and interact directly with the keyboard input
without being limited to input from stdin.

I've played around with something like

import sys
import curses
d = list(sys.stdin) # read/process/store it
def main(stdscr):
    stdscr.clear()
    for i, s in enumerate(d):
        stdscr.addstr(i, 1, s.strip())
    stdscr.refresh()
    stdscr.getkey()
curses.wrapper(main)

but it complains when I run it with

  $ seq 10 | python myprog.py
  ...
  _curses.error: no input

I've tried adding (after the imports)

tty = open('/dev/tty', 'r+b')
curses.setupterm(fd=tty.fileno())

to coerce it, but it doesn't seem to do anything fruitful.

Any advice for how to go about replicating this vim-like behavior in
Python?

Thanks!

-tkc






Re: Python3 module with financial accounts?

2020-04-01 Thread Tim Chase
On 2020-04-01 19:27, Peter Wiehe wrote:
> Is there a Python3 module with financial accounts?

You'd have to be more specific.  For interacting with online accounts
with financial institutions?  For tracking financial data locally?

There's beancount (http://furius.ca/beancount/ and written in Python)
which does plaintext accounting (https://plaintextaccounting.org/)
for wrangling financial account information using double-entry
bookkeeping methods.

-tkc





Re: Idiom for partial failures

2020-02-20 Thread Tim Chase
On 2020-02-20 13:30, David Wihl wrote:
> I believe that it would be more idiomatic in Python (and other
> languages like Ruby) to throw an exception when one of these
> partial errors occur. That way there would be the same control flow
> if a major or minor error occurred. 

There are a variety of ways to do it -- I like Ethan's suggestion
about tacking the failures onto the exception and raising it at the
end.  But you can also return a tuple of success/failure lists,
something like this pseudocode:

  def process_all(input_iter):
      successes = []
      failures = []
      for thing in input_iter:
          try:
              result = process_one(thing)
          except ValueError as e: # catch appropriate exception(s) here
              failures.append((e, thing))
          else:
              successes.append((result, thing))
      return successes, failures

  def process_one(item):
      if item % 3:
          raise ValueError("Must be divisible by 3!")
      else:
          print(item)
          return item // 3

  successes, failures = process_all(range(10))

  for reason, thing in failures:
      print(f"Unable to process {thing} because {reason}")
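
For comparison, here is a rough sketch of the other approach mentioned
above: collect the failures as you go and raise a single exception at
the end with them attached.  The names PartialFailure, handle_one and
handle_all are made up for illustration; I haven't seen the code in
question, so treat this as one possible shape rather than the way to
do it.

```python
class PartialFailure(Exception):
    """Raised after the whole batch has run if any item failed."""

    def __init__(self, successes, failures):
        total = len(successes) + len(failures)
        super().__init__(f"{len(failures)} of {total} items failed")
        self.successes = successes  # list of (item, result)
        self.failures = failures    # list of (item, exception)


def handle_one(item):
    # Stand-in for the real per-item work.
    if item % 3:
        raise ValueError(f"{item} is not divisible by 3")
    return item // 3


def handle_all(items):
    successes, failures = [], []
    for item in items:
        try:
            successes.append((item, handle_one(item)))
        except ValueError as exc:
            failures.append((item, exc))
    if failures:
        # Every item has been attempted; now report the partial failure.
        raise PartialFailure(successes, failures)
    return successes


try:
    handle_all(range(10))
except PartialFailure as exc:
    print(exc)
    for item, reason in exc.failures:
        print(f"Unable to process {item} because {reason}")
```

The caller gets the same control flow for total and partial failures,
but can still dig the successes out of the exception if it cares.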

-tkc





Re: Help on dictionaries...

2020-01-29 Thread Tim Chase
On 2020-01-30 06:44, Souvik Dutta wrote:
> Hey I was thinking how I can save a dictionary in python(obviously)
> so that the script is rerun it automatically loads the dictionary.

This is almost exactly what the "dbm" (nee "anydbm") module does, but
persisting the dictionary out to the disk:

  import dbm
  from sys import argv
  with dbm.open("my_cache", "c") as db:
      if len(argv) > 1:
          key = argv[1]
          if key in db:
              print("Found it:", db[key])
          else:
              print("Not found. Adding")
              if len(argv) > 2:
                  value = argv[2]
              else:
                  value = key
              db[key] = value
      else:
          print("There are %i items in the cache" % len(db))

The resulting "db" acts like a dictionary, but persists.

If you really must have the results as a "real" dict, you can do the
conversion:

  real_dict = dict(db)
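
One thing to watch for: dbm stores keys and values as bytes, so what
comes back out is bytes even if you put str in.  A minimal sketch of a
round trip that decodes on the way out, assuming everything you stored
was UTF-8 text (the helper names and the temporary path are mine, just
for the example):

```python
import dbm
import os
import tempfile


def remember(path, key, value):
    # "c" creates the database if it doesn't exist yet.
    with dbm.open(path, "c") as db:
        db[key] = value  # str is encoded to bytes for you


def load_as_dict(path):
    # Keys and values come back as bytes; decode them into a plain dict.
    with dbm.open(path, "r") as db:
        return {k.decode("utf-8"): db[k].decode("utf-8") for k in db.keys()}


cache_path = os.path.join(tempfile.mkdtemp(), "my_cache")
remember(cache_path, "greeting", "hello")
remember(cache_path, "farewell", "goodbye")
print(load_as_dict(cache_path))
```

Each call to remember() opens and closes the file, which is the same
thing that happens between separate runs of your script.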

-tkc





Re: Friday Finking: Source code organisation

2019-12-28 Thread Tim Chase
On 2019-12-29 12:52, Greg Ewing wrote:
> On 29/12/19 11:49 am, Chris Angelico wrote:
> > "Define before use" is a broad principle that I try to follow,
> > even when the code itself doesn't mandate this.  
> 
> I tend to do this too, although it's probably just a habit
> carried over from languages such as Pascal and C where you
> have to go out of your way to get things in a different
> order.

Apparently I'm not alone in my Pascal/C-derived habits of
define-before-use.

Inside a class, I tend to roughly follow

  __new__ (if present)
  __init__
  other dunder methods
  subsequent methods alphabetically

-tkc





Re: Most elegant way to do something N times

2019-12-22 Thread Tim Chase
On 2019-12-22 23:34, Batuhan Taskaya wrote:
> I encounter with cases like doing a function 6 time with no
> argument, or same arguments over and over or doing some structral
> thing N times and I dont know how elegant I can express that to the
> code. I dont know why but I dont like this
>
> for _ in range(n): do()

The best way to improve it would be to clarify readability.  Why 6
times instead of 5 or 7?  Are they retries?

  MAX_RETRIES = 6 # defined in RFC-3141592
  for _ in range(MAX_RETRIES):
      if successful(do_thing()):
          break
  else:
      error_condition()

or because someone likes the number 6?

  MY_FAVORITE_NUMBER = 6
  for _ in range(MY_FAVORITE_NUMBER):
      print(f"I love the number {MY_FAVORITE_NUMBER}!")

or days of the week that aren't Sunday?
  
  import calendar
  for day in calendar.Calendar().iterweekdays():
      if day == calendar.SUNDAY: continue
      do_thing()

or is it because it's the number of whole columns that fit?

  screen_cols = 80 # get this from curses?
  chars_per_col = 13
  for _ in range(screen_cols // chars_per_col):
      do_thing()

or because that's how many people there are in a particular grouping?

  family = "Dad Mom Sis Brother Me Baby".split()
  for _ in family:
      do_thing()

Note how each of those conveys what "6" means, not just some
arbitrary number.
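
To make the retry case concrete, here is a small runnable sketch of the
for/else pattern.  The function name run_with_retries and the idea of
returning the attempt number are my own choices for the example, and
the limit of 6 is just carried over from the question.

```python
MAX_RETRIES = 6  # illustrative limit, carried over from the question


def run_with_retries(action, max_retries=MAX_RETRIES):
    """Call action() until it returns something truthy.

    Returns the 1-based attempt number that succeeded, or None if
    every attempt failed.
    """
    for attempt in range(1, max_retries + 1):
        if action():
            break
    else:
        # Only reached when the loop ran out without hitting "break".
        return None
    return attempt


# A fake action that fails twice and then succeeds.
outcomes = iter([False, False, True])
print(run_with_retries(lambda: next(outcomes)))
```

The loop variable is actually used here, so it gets a real name instead
of "_".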

-tkc




Re: More efficient/elegant branching

2019-12-09 Thread Tim Chase
On 2019-12-09 12:27, Musbur wrote:
> def branch1(a, b, z):
>  """Inelegant, unwieldy, and pylint complains
>  about too many branches"""
>  if a > 4 and b == 0:
>  result = "first"
>  elif len(z) < 2:
>  result = "second"
>  elif b + a == 10:
>  result = "third"
>  return result

Because of the diversity of test-types, I tend to agree with ChrisA
and am strongly tempted to ignore the linting warning.  I have some ETL
code that has umpty-gazillion odd if-conditions like this and they
all update some global-to-the-if/local-to-the-loop state, and if I
split each out somehow, I'd either be passing an object around that I
mutate, or I would do a lot of copy-the-data-updating-one-field per
function.

Alternatively, ...

> def branch2(a, b, z):
>  """Elegant but inefficient because all expressions
>  are pre-computed althogh the first one is most likely
>  to hit"""
> def branch3(a, b, z):
>  """Elegant but inefficient because expressions
>  need to be parsed each time"""

If you really want to do lazy evaluation, you can create a
function for each, which might (or might not) make it easier to read:

  def something_descriptive(a, b, z):
      return a > 4 and b == 0

  def z_is_short(a, b, z):
      return len(z) < 2

  def proper_total(a, b, z):
      return b + a == 10

  def branch4(a, b, z):
      for test, result in [
              (something_descriptive, "first"),
              (z_is_short, "second"),
              (proper_total, "third"),
              ]:
          if test(a, b, z):
              return result
      return "some default"

or possibly

  def something_descriptive(a, b):
      return a > 4 and b == 0

  def z_is_short(z):
      return len(z) < 2

  def proper_total(a, b):
      return b + a == 10

  def branch5(a, b, z):
      for test, params, result in [
              (something_descriptive, (a, b), "first"),
              (z_is_short, (z,), "second"),
              (proper_total, (a, b), "third"),
              ]:
          if test(*params):
              return result
      return "some default"

I'm not sure either of those is necessarily *better*, but they're at
least options that you can try and see if it improves readability in
your particular case.
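
A third variation on the same idea, if naming each test feels like
overkill: wrap each condition in a lambda so it closes over a, b and z
and is only evaluated when the loop reaches it.  The name branch6 is
mine, and the "some default" fallback is the same placeholder as in
the versions above.

```python
def branch6(a, b, z):
    # Each condition is deferred: nothing runs until test() is called,
    # and the loop stops at the first one that passes.
    tests = [
        (lambda: a > 4 and b == 0, "first"),
        (lambda: len(z) < 2, "second"),
        (lambda: b + a == 10, "third"),
    ]
    for test, result in tests:
        if test():
            return result
    return "some default"


print(branch6(5, 0, "abc"))
print(branch6(1, 1, "abc"))
```

Whether the lambdas read better than small named functions is a matter
of taste; the named functions at least show up in tracebacks.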

-tkc






Re: increasing the page size of a dbm store?

2019-12-03 Thread Tim Chase
On 2019-12-02 16:49, Michael Torrie wrote:
> On 12/1/19 7:50 PM, Tim Chase wrote:
> > After sparring with it a while, I tweaked the existing job so
> > that it chunked things into dbm-appropriate sizes to limp
> > through; for the subsequent job (where I would have used dbm
> > again) I went ahead and switched to sqlite and had no further
> > issues.  
> 
> How did you replace a key/value store with a relational database?
> Is a SQLite database fast enough at this sort of thing that it
> wasn't really designed for?

It was certainly slower, though it wasn't so bad once I had proper
indexing and submitted queries that pulled back multiple results in
one query.

But even with the slightly slower run-time aspects, it was still
faster than starting a job (expecting it to run to completion
overnight), having it crash, manually deleting my cache, and manually
resuming from where it left off, all multiple times.

And all said, since it was network I/O bound, once I had the
populated cache (resulting cache.db file was about 1TB...thank
goodness for transparent compression with ZFS), turnarounds took more
like 30min rather than 3 days.  More the "go work on something else
and come back" than the "let it run overnight".
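
For the curious, a minimal sketch of what a key/value cache on top of
sqlite3 can look like, including the "multiple results in one query"
part.  This is not my production code; the table layout and the helper
names (open_cache, put_many, get_many) are invented for the example.

```python
import sqlite3


def open_cache(path=":memory:"):
    conn = sqlite3.connect(path)
    # The PRIMARY KEY gives us the index that makes lookups fast.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        " key BLOB PRIMARY KEY,"
        " data BLOB NOT NULL)"
    )
    return conn


def put_many(conn, items):
    # One transaction for the whole batch of (key, data) pairs.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO cache (key, data) VALUES (?, ?)", items
        )


def get_many(conn, keys):
    # Fetch several keys with a single query instead of one per key.
    keys = list(keys)
    if not keys:
        return {}
    placeholders = ",".join("?" * len(keys))
    rows = conn.execute(
        f"SELECT key, data FROM cache WHERE key IN ({placeholders})", keys
    )
    return dict(rows)


conn = open_cache()
put_many(conn, [(b"a", b"alpha"), (b"b", b"beta"), (b"c", b"gamma")])
print(get_many(conn, [b"a", b"c", b"missing"]))
```

SQLite caps the number of bound parameters per statement, so for big
batches you'd want to feed get_many() the keys in chunks.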

-tkc




Re: increasing the page size of a dbm store?

2019-12-01 Thread Tim Chase
> Maybe port to SQLite? I would not choose dbm these days.

After sparring with it a while, I tweaked the existing job so that it
chunked things into dbm-appropriate sizes to limp through; for the
subsequent job (where I would have used dbm again) I went ahead and
switched to sqlite and had no further issues.

I'm not sure if it's worth mentioning the issue in the docs for the
dbm module so others don't bump against it.  I'm not sure if the
limit is on sum(size(key) for key in db) or the number of keys total.
Just not the sort of thing I'd want someone to be depending on,
unaware of the potential pitfalls.

Thanks,

-tkc





increasing the page size of a dbm store?

2019-11-26 Thread Tim Chase
Working with the dbm module (using it as a cache), I've gotten the
following error at least twice now:

  HASH: Out of overflow pages.  Increase page size
  Traceback (most recent call last):
  [snip]
  File ".py", line 83, in get_data
    db[key] = data
  _dbm.error: cannot add item to database

I've read over the py3 docs on dbm

https://docs.python.org/3/library/dbm.html

but don't see anything about either "page" or "size" contained
therein.

There's nothing particularly complex as far as I can tell.  Nothing
more than a straightforward

  import dbm
  with dbm.open("cache", "c") as db:
      for thing in source:
          key = extract_key_as_bytes(thing)
          if key in db:
              data = db[key]
          else:
              data = long_process(thing)
              db[key] = data

The keys can get a bit large (think roughly book-title length), but
not huge. I have 11k records so it seems like it shouldn't be
overwhelming, but this is the second batch where I've had to nuke the
cache and start afresh.  Fortunately I've tooled the code so it can
work incrementally and no more than a hundred or so requests have to
be re-performed.

How does one increase the page-size in a dbm mapping?  Or are there
limits that I should be aware of?

Thanks,

-tkc

PS: FWIW, this is Python 3.6 on FreeBSD in case that exposes any
germane implementation details.









Re: Python Resources related with web security

2019-11-25 Thread Tim Chase
On 2019-11-25 21:25, Pycode wrote:
> On Sun, 24 Nov 2019 10:41:29 +1300, DL Neil wrote:
>> Are such email addresses 'open' and honest?
> 
> you are not being helpful or answer the question..

What DL Neil seems to be getting at is that there's been an uptick in
questions

1) where we don't know who you (and several other recent posters) are:

  - The pyc.ode domain-name of your email address isn't a
real/registered domain

  - there doesn't seem to be much evidence of you being part of the
Python community with a history of other messages

  Neither factor inspires much confidence.

2) you (and others) are asking to be spoonfed example code that could
cause problems on the internet.

>>> can anyone post links for python resources that contain tools and
>>> scripts related with security and pentesting?

They're the sorts of tools that, if the community deems you a
non-threatening-actor, they might point you in the right direction.
But not knowing who you are (see point #1 above), I imagine folks here
are hesitant.  And almost certainly not going to spoon-feed example
code that could then end up attacking sites on the web.

So I suspect DL Neil was raising awareness to make sure that anybody
who *did* answer your questions might take the time to think about
the potential consequences of the actions.  So DL *is* being helpful,
but rather to the community, even if not necessarily to you in
particular.

> can someone answer? maybe should i ask on the mailing list?

You did.  The usenet & mailing lists are mirrored.  Though perhaps if
you post from a legit mail identity/address (whether to the mailing
list or usenet), it might help folks evaluate whether you're a "white
hat" or a "black hat" (or somewhere in between).


As to your questions, all the basics are available:  materials on
security & pentesting are a web-search away, and Python provides
libraries for both socket-level interfaces & application-specific
protocols.  How you choose to combine them is up to you.  How the
community chooses to assist you in combining them largely depends on
how much they trust you.

-tkc








Re: itertools cycle() docs question

2019-08-21 Thread Tim Chase
On 2019-08-21 11:27, Tobiah wrote:
> In the docs for itertools.cycle() there is
> a bit of equivalent code given:
> 
>  def cycle(iterable):
>  # cycle('ABCD') --> A B C D A B C D A B C D ...
>  saved = []
>  for element in iterable:
>  yield element
>  saved.append(element)
>  while saved:
>  for element in saved:
>  yield element
> 
> 
> Is that really how it works?  Why make
> the copy of the elements?  This seems
> to be equivalent:
> 
> 
>  def cycle(iterable):
>  while iterable:
>  for thing in iterable:
>  yield thing

Compare the results of

>>> import itertools as i
>>> def tobiahcycle(iterable):
...     while iterable:
...         for thing in iterable:
...             yield thing
... 
>>> def testiter():
...     yield input()
...     yield input()
... 

Now, see how many times input() gets called for itertools.cycle()

>>> for v in i.islice(i.cycle(testiter()), 6): print(v)

Note that you only provide input twice, once for each yield statement.

Compare that to your tobiahcycle() method:

>>> for v in i.islice(tobiahcycle(testiter()), 6): print(v)

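The generator is exhausted after its first pass, so there's nothing
left to replay: you get your two values and then the loop spins
forever without yielding, rather than repeating them the way
itertools.cycle() does.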
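
If you'd rather not type at input() prompts, here's a non-interactive
sketch of the same comparison.  It uses a re-iterable object that counts
how many values it has had to produce; CountingSource and naive_cycle
are names I made up for the demo.  Because this source can be iterated
more than once, the no-copy version re-runs the producer on every pass
(with a one-shot generator it would have nothing to re-run at all).

```python
import itertools


def naive_cycle(iterable):
    # The "no saved copy" version from the question.
    while iterable:
        for thing in iterable:
            yield thing


class CountingSource:
    """Re-iterable source that counts how many values it produced."""

    def __init__(self):
        self.produced = 0

    def __iter__(self):
        for name in ("A", "B"):
            self.produced += 1  # pretend this is expensive
            yield name


saved = CountingSource()
from_cycle = list(itertools.islice(itertools.cycle(saved), 6))

rerun = CountingSource()
from_naive = list(itertools.islice(naive_cycle(rerun), 6))

print(from_cycle, saved.produced)
print(from_naive, rerun.produced)
```

Both give the same six values here, but itertools.cycle() only asked
the source for each value once and replayed its saved copies after
that.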

-tkc





Re: PEP 594 cgi & cgitb removal

2019-05-22 Thread Tim Chase
On 2019-05-22 08:51, Robin Becker wrote:
> In PEP 594 t has been proposed that cgi & cgitb should be removed.
> I suspect I am not the only person in the world that likes using
> cgi and cgitb.

/me waves from the back row as another cgi/cgitb user...

-tkc




Re: Your IDE's?

2019-03-26 Thread Tim Chase
On 2019-03-25 21:38, John Doe wrote:
> What is your favorite Python IDE?

Unix.

https://sanctum.geek.nz/arabesque/series/unix-as-ide/

Namely $EDITOR (for some value of ed/vi/vim), a shell (usually
bash, ksh, or /bin/sh), a VCS (usually git, subversion, rcs, or
fossil, though sometimes CVS or Mercurial), and a whole suite of
other battle-tested tools that work together.

-tkc




Re: ConfigParser: use newline in INI file

2019-03-07 Thread Tim Chase
On 2019-03-07 17:19, tony wrote:
> Python 3.5.3 (default, Sep 27 2018, 17:25:39)
> >>> "a\\nb".decode("string-escape")  
> Traceback (most recent call last):
>   File "", line 1, in 
> AttributeError: 'str' object has no attribute 'decode'

Looks like bytestring.decode('unicode_escape') does what you're
describing:

  >>> b"hello\\nworld".decode('unicode_escape')
  'hello\nworld'
  >>> print(b"hello\\nworld".decode('unicode_escape'))
  hello
  world
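
Since a value read by ConfigParser is a str rather than bytes, you'd
need to encode it first.  A small sketch, assuming the value is plain
ASCII (non-ASCII text needs more care with this codec); the section and
option names are invented for the example:

```python
import configparser

# The INI text contains a literal backslash followed by "n".
INI_TEXT = "[section]\nmessage = hello\\nworld\n"

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

raw = parser["section"]["message"]
# str -> bytes -> str, interpreting backslash escapes along the way.
decoded = raw.encode("ascii").decode("unicode_escape")

print(repr(raw))
print(repr(decoded))
```

The first repr shows the backslash still in the string; the second
shows a real newline.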

-tkc




Re: What is your experience porting Python 2.7.x scripts to Python 3.x?

2019-01-24 Thread Tim Chase
On 2019-01-22 19:20, Grant Edwards wrote:
> > For anyone who has moved a substantial bunch of Python 2 to Python
> > 3, can you please reply with your experience?  
> 
> If you used bytes (or raw binary strings) at all (e.g. for doing
> things like network or serial protocols) you're in for a lot of
> pain.

This was my biggest pain point, but it was a good pain.  At $DAYJOB we
had files coming from customers and telecom providers where the
encoding had never been specified.  By going through the conversion
process, we were able to formalize the encoding of various files
meaning fewer crashes when some unexpected character would slip in
and fail to convert.

A painful process, but the end result was better in a multitude of
ways.

-tkc




Re: Are all items in list the same?

2019-01-07 Thread Tim Chase
On 2019-01-07 17:14, Bob van der Poel wrote:
> I need to see if all the items in my list are the same. I was using
> set() for this, but that doesn't work if items are themselves
> lists. So, assuming that a is a list of some things, the best I've
> been able to come up with it:
> 
> if a.count( targ ) == len(a):
> 
> I'm somewhat afraid that this won't scale all that well. Am I
> missing something?

Since python2.5 you've had any() and all() functions that make this
pretty tidy and they bail early if proven to not be the case (so if
you have hundreds of thousands of items in the list and you know by
the 2nd one that they're not equal, you don't have to touch hundreds
of thousands of items; just the first two).  So I'd do something like

  def all_equal(iterable):
      i = iter(iterable)
      first = next(i)
      return all(x == first for x in i)

It's undefined for an empty list (well, it throws a StopIteration
but you can special-case that), but should handle the cases with
1 element and 2+ elements (both matching and where any item is not
the same). It should work on an iterator as well but will consume the
items in the process.
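
If you want the empty case handled, one way to special-case it is to
catch the StopIteration and return whatever you decide "all equal"
means for nothing at all.  The "empty" parameter is my own addition,
and defaulting it to True (matching what all() does for an empty
sequence) is just one reasonable choice.

```python
def all_equal(iterable, empty=True):
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        # Nothing to compare; let the caller pick the answer.
        return empty
    # all() stops at the first mismatch, so this bails early.
    return all(x == first for x in it)


print(all_equal([[1, 2], [1, 2], [1, 2]]))
print(all_equal([[1, 2], [1, 3], [1, 2]]))
print(all_equal([]))
```

Since it only uses ==, it works for unhashable items like lists, which
is where the set() approach fell over.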

And I even like how nicely it reads :-)

-tkc






Re: graded randomness

2018-12-28 Thread Tim Chase
On 2018-12-28 17:31, Abdur-Rahmaan Janhangeer wrote:
> do you have something like
> 
> choice(balls)
> 
> >>> red  

Don't specify the "k=" (which defaults to 1 if you omit it) and use
the first element of the results:

>>> from random import choices
>>> distribution = {"green":2, "red": 2, "blue": 1}
>>> data, weights = zip(*distribution.items())
>>> choices(data, weights)[0]
'red'


> and subsequent repetitions for long enough yield approximately 2/5
> times r 2/5 times g and 1/5 b

You can sample it yourself:

>>> from collections import defaultdict
>>> a = defaultdict(int)
>>> for i in range(10000):
...     a[choices(data, weights=weights)[0]] += 1
... 
>>> dict(a)
{'green': 3979, 'red': 4008, 'blue': 2013}

though if you plan to, then it might be better/faster to use
cum_weights instead, calculating it once and then reusing it rather
than having choices() re-calculate the cumulative-weights on every
call.
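
A sketch of what that might look like, computing the cumulative weights
once with itertools.accumulate and passing them as cum_weights.  The
seeded Random instance is only there so the example is repeatable; the
counts you get will still vary with the seed.

```python
from collections import Counter
from itertools import accumulate
from random import Random

distribution = {"green": 2, "red": 2, "blue": 1}
data = list(distribution)
# Running totals of the weights, computed once and reused.
cum_weights = list(accumulate(distribution.values()))

rng = Random(0)  # seeded only to make the demo repeatable
counts = Counter(rng.choices(data, cum_weights=cum_weights, k=10_000))

for color in data:
    print(color, counts[color] / 10_000)
```

Asking for all the samples in one call with k= also avoids the
per-call overhead of the loop version.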


> like one without random choice over list/tuples

Not sure what you mean by this.

-tkc




Re: graded randomness

2018-12-28 Thread Tim Chase
On 2018-12-28 16:15, Abdur-Rahmaan Janhangeer wrote:
> greetings,
> 
> let us say that i have a box of 5 balls,
> 
> green balls - 2 with probability 2/5
> red balls 2 - with probability 2/5
> blue balls 1 - with probability 1/5
> 
> how to program the selection so that the random choices reflect the
> probabilities?

You're looking for what are called "weighted choices" which the
random.choices() function provides as of Py3.6

https://docs.python.org/3/library/random.html#random.choices

>>> from random import choices
>>> distribution = {"green":2, "red": 2, "blue": 1}
>>> data, weights = zip(*distribution.items())
>>> sum(weights)
5
>>> sorted(choices(data, weights=weights, k=20))
['blue', 'blue', 'blue', 'blue', 'green', 'green', 'green', 'green',
'green', 'green', 'green', 'green', 'red', 'red', 'red', 'red',
'red', 'red', 'red', 'red']

-tim






Re: Program to keep track of success percentage

2018-12-08 Thread Tim Chase
On 2018-12-08 17:54, Avi Gross wrote:
> This may be a bit awkward.

ICWYDT. "awk"ward. :wide-eyed_gaping_grin_with_finger-guns:

You seem to have your knickers in a knot.

> Your solution in AWK assumes  lots of things. You assume the data
> is either on stdin or comes from automatically opening file names
> on the command line to generate a stream of lines. You assume a win
> is any line containing one or more lower or upper case instances of
> the letter W. You let AWK count lines and store that in NR and
> assume anything without a W is a loss. You print a running
> commentary and  only at the end of input state if they exceeded 31%.

1) yes the problem was underdefined.  If all they want is to tally
wins ("the only user input is win/loss". Note: not blank. Not ties.
Not garbage. Not "victorias/derrotas" nor "νίκες/απώλειες"), the awk
one-liner does exactly that.  I'll grant, that the OP didn't specify
whether this was on Windows, Linux, Mac, BSD, DOS, ULTRIX, or
anything at all about the operating system.  The concepts for the
solution remain, even if awk is unavailable.  If the input isn't from
stdin it's not standard and should be specified in the problem-space
(there's a reason it's called *standard input*).

2) the problem sounded an awful lot like homework.  I'm not going to
answer a *Python* homework problem in *Python*.  I'm willing to give
the solution in another language (done) so the OP can translate
those ideas into Python.  I'm also willing to take what the OP has
already written (nothing provided in the original email) and help the
OP iterate with that.  The original request, while posted to the
Python mailing list, didn't even specify that it had to be in
Python.  If it's not homework, then the one-liner solves their
problem on any modern platform with a CLI that isn't Windows (and
even on Windows if one installs a version of awk there.)

Yes.  It could also have had a sqlite/mysql/postgres database
back-end, and command-line interface for tracking wins/losses, and a
web front-end for displaying statistics and reporting wins/losses,
and a tkinter/wx/whatever GUI for team management, and export PDFs,
and use TensorFlow for AI analysis of the results.  But that's not
what the OP asked for.

> Yours would read much better if spaced out, but you might have
> written it this way when you were 

While I was not in any chemically-altered state of mind, while I
penned it as one line, it would certainly be more readable as a
script.

#!/usr/bin/awk -f
/[wW]/{
 w += 1;
}
{
 printf("W: %i L: %i %i%%\n", w, NR-w, w * 100/NR);
}
END {
 if (w * 100/NR > 31)
  print "More than 31% winning"
}


There.  Happy?

> Would you care to do the same thing as a brief program in that
> language.

I can (and for the record, did), but I don't provide other people
with the prefab answers to their homework.

-tkc










Re: Program to keep track of success percentage

2018-12-08 Thread Tim Chase
On 2018-12-08 10:02, Musatov wrote:
> I am thinking about a program where the only user input is
> win/loss. The program let's you know if you have won more than 31%
> of the time or not. Any suggestions about how to approach authoring
> such a program? Thanks. --

Can be done with an awk one-liner:

awk '/[wW]/{w+=1}{printf("W: %i L: %i %i%%\n", w, NR-w, w * 100/NR)}END{if (w * 100/NR > 31) print "More than 31% winning"}'

-tkc





Re: Unicode [was Re: Cult-like behaviour]

2018-07-17 Thread Tim Chase
On 2018-07-17 08:37, Marko Rauhamaa wrote:
> Tim Chase :
> > Wait, but now you're talking about vendors. Much of the crux of
> > this discussion has been about personal scripts that don't need to
> > marshal Unicode strings in and out of various functions/objects.  
> 
> In both personal and professional settings, you face the same
> issues. But you don't want to build on something that will
> disappear from the Linux distros.

Right.  Distros are moving away from ASCII-only to proper Unicode
(however it is encoded) support.  Certainly wouldn't want to build on
something that's disappearing from distros, so best to build on
Py3 and Unicode strings.  ;-)

-tkc




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-17 01:21, Steven D'Aprano wrote:
> > This doesn’t mean that UTF-32 is an awful system, just that it
> > isn’t the magical cure that some were hoping for.  
> 
> Nobody ever claimed it was, except for the people railing that
> since it isn't a magically system we ought to go back to the Good
> Old Days of code page hell, or even further back when everyone just
> used ASCII.

But even ed(1) on most systems is 8-bit clean so even there you're not
limited to ASCII.  I can't say I miss code-pages in the least.

-tkc





Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-17 01:08, Steven D'Aprano wrote:
> In English, I think most people would prefer to use a different
> term for whatever "sh" and "ch" represent than "character".

The term you may be reaching for is "consonant cluster"?

https://en.wikipedia.org/wiki/Consonant_cluster

-tkc





Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-16 23:59, Marko Rauhamaa wrote:
> Tim Chase :
> > While the python world has moved its efforts into improving
> > Python3, Python2 hasn't suddenly stopped working.  
> 
> The sword of Damocles is hanging on its head. Unless a consortium is
> erected to support Python2, no vendor will be able to use it in the
> medium term.

Wait, but now you're talking about vendors. Much of the crux of this
discussion has been about personal scripts that don't need to
marshal Unicode strings in and out of various functions/objects.

If you have a py2 script that works with py2 and breaks with py3, and
you don't want to update to py3 unicode-strings-by-default, then
stick with py2.  They even coexist nicely on the same machine.

It doesn't have a self-destruct clause.  As long as py2 continues to
build, it will continue to run which is a long lifetime.  To point,
I still have the "joy" of maintaining some py2.4 code that's in
production.  Would I rather upgrade it to 3.x?  You bet.  But the
powers in place are willing to forego python updates in order to not
rock the boat.

-tkc




Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-16 18:31, Steven D'Aprano wrote:
> You say that all you want is a switch to turn off Unicode (and
> replace it with what? Kanji strings? Cyrillic? Shift_JS? no of
> course not, I'm being absurd -- replace it with ASCII, what else
> could any right-thinking person want, right?).

But we already have this.  If I want to turn off Unicode strings, I
type "python2", and if I want to enable Unicode strings, I type
"python3".

While the python world has moved its efforts into improving Python3,
Python2 hasn't suddenly stopped working.  It just stopped receiving
improvements.  If the "old-man shakes-fist at progress" crowd
doesn't like unicode strings in Py3, just keep on using Py2.  You
(generic) won't get arrested.  There are no church^WPython police.

-tkc




Re: Generate a static back-of-a-book style index?

2018-07-08 Thread Tim Chase
On 2018-07-08 13:34, Cameron Simpson wrote:
> On 07Jul2018 21:57, Tim Chase  wrote:
> >On 2018-07-08 12:12, Cameron Simpson wrote:  
> >> On 07Jul2018 20:11, Skip Montanaro 
> >> wrote:  
> >> >> Have you looked at the ptx command? Might be called "gptx"  
> 
> It's associated with the troff stuff generally. Maybe that's not
> installed?

On my OpenBSD (6.3) boxes, there's no nroff/troff/groff nor any
ptx/gptx.

On my FreeBSD (11.2) boxes, I have nroff/troff/groff available but no
ptx/gptx available in base.  One of the machines has coreutils
installed which provides /usr/local/bin/gptx (and its man-pages).

So I can find it and read about it.  Just curious that it's fallen
out of base on both Free & OpenBSD (don't have a NetBSD machine at
hand to test that).

-tkc




Re: Generate a static back-of-a-book style index?

2018-07-07 Thread Tim Chase
On 2018-07-08 12:12, Cameron Simpson wrote:
> On 07Jul2018 20:11, Skip Montanaro  wrote:
> >> Have you looked at the ptx command? Might be called "gptx"  
> >
> >Thanks, Cameron. I was unaware of it. Will check it out.  
> 
> BTW, it well predates the GNU coreutils; I used it on V7 UNIX.

Interesting.  Despite your V7-provenance claim, it doesn't seem to
have persisted into either the FreeBSD boxes or the OpenBSD boxes I
have at hand.  Bringing in anything that involves GNU coreutils adds
it to the machine in question, but just kinda odd that those with a
tighter tie to "real" Unix would have dropped it.

Off to go read `man gptx` now...

-tkc





Re: Quick survey: locals in comprehensions (Python 3 only)

2018-06-26 Thread Tim Chase
On 2018-06-23 23:08, Jim Lee wrote:
>>> On 06/23/2018 10:03 PM, Steven D'Aprano wrote:
>>>> def test():
>>>>   a = 1
>>>>   b = 2
>>>>   result = [value for key, value in locals().items()]
>>>>   return result
>>>>
>>>> what would you expect the result of calling test() to be?
>>>
>>> I would *expect* [1, 2, None], though I haven't actually tried
>>> running it.
>> Interesting. Where do you get the None from?
>
> There are three locals:Γ  a, b, and result.Γ

However at the time locals() is called/evaluated, "result" hasn't yet been
created/defined, so I wouldn't expect to see any representation of "result" in
the return value.  If it existed before the locals() call, I would expect to
see the value it had before the call:

  def test()
a = 1
b = 2
result = "Steven"
result = [value for key, value in locals().items()]
return result
  test() # return [1, 2, "Steven"]


-tkc

--- BBBS/Li6 v4.10 Toy-3
 * Origin: Prism bbs (1:261/38)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Quick survey: locals in comprehensions (Python 3 only)

2018-06-26 Thread Tim Chase
From: Tim Chase 

On 2018-06-24 05:03, Steven D'Aprano wrote:
> I'd like to run a quick survey. There is no right or wrong answer,
> since this is about your EXPECTATIONS, not what Python actually
> does.
>
> Given this function:
>
> def test():
> a = 1
> b = 2
> result = [value for key, value in locals().items()]
> return result
>
> what would you expect the result of calling test() to be?

I'd expect either [1,2] or [2,1] depending on whether it's py2 (where dict
iteration order isn't guaranteed, so could be either) or py3 (where dict order
is more stable/guaranteed)

> Is that the result you think is most useful?

While I have difficulty imagining a case in which I'd find this useful, if I
were writing this code, it's the "useful" result I'd expect.

> In your opinion, is this a useful feature, a misfeature, a bug, or
> "whatever"?

I'd file it in the "whatever" category, possibly useful to someone other than
me.

-tkc

--- BBBS/Li6 v4.10 Toy-3
 * Origin: Prism bbs (1:261/38)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Quick survey: locals in comprehensions (Python 3 only)

2018-06-25 Thread Tim Chase
On 2018-06-23 23:08, Jim Lee wrote:
>>> On 06/23/2018 10:03 PM, Steven D'Aprano wrote:  
>>>> def test():
>>>>   a = 1
>>>>   b = 2
>>>>   result = [value for key, value in locals().items()]
>>>>   return result
>>>>
>>>> what would you expect the result of calling test() to be?
>>>  
>>> I would *expect* [1, 2, None], though I haven't actually tried
>>> running it. 
>> Interesting. Where do you get the None from?
>
> There are three locals:  a, b, and result. 

However at the time locals() is called/evaluated, "result" hasn't yet
been created/defined, so I wouldn't expect to see any representation
of "result" in the return value.  If it existed before the locals()
call, I would expect to see the value it had before the call:

  def test():
    a = 1
    b = 2
    result = "Steven"
    result = [value for key, value in locals().items()]
    return result
  test() # returns [1, 2, "Steven"]
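
For anyone who wants to check the expectation on whatever interpreter
is at hand, here is a small runnable sketch.  The function names
test_without_result and test_with_result are made up for illustration,
and the outcome may differ between Python versions since comprehension
scoping has changed over time:

```python
def test_without_result():
    a = 1
    b = 2
    # the outermost iterable, locals().items(), is evaluated before
    # "result" is bound
    result = [value for key, value in locals().items()]
    return result


def test_with_result():
    a = 1
    b = 2
    result = "Steven"
    # here "result" already exists when locals() is called
    result = [value for key, value in locals().items()]
    return result


print(test_without_result())
print(test_with_result())
```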


-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Quick survey: locals in comprehensions (Python 3 only)

2018-06-25 Thread Tim Chase
On 2018-06-24 05:03, Steven D'Aprano wrote:
> I'd like to run a quick survey. There is no right or wrong answer,
> since this is about your EXPECTATIONS, not what Python actually
> does.
> 
> Given this function:
> 
> def test():
> a = 1
> b = 2
> result = [value for key, value in locals().items()]
> return result
> 
> what would you expect the result of calling test() to be?

I'd expect either [1,2] or [2,1] depending on whether it's py2 (where
dict iteration order isn't guaranteed, so could be either) or py3
(where dict order is more stable/guaranteed)

> Is that the result you think is most useful?

While I have difficulty imagining a case in which I'd find this
useful, if I were writing this code, it's the "useful" result I'd
expect.

> In your opinion, is this a useful feature, a misfeature, a bug, or
> "whatever"?

I'd file it in the "whatever" category, possibly useful to someone
other than me.

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python web server weirdness

2018-06-07 Thread Tim Chase
On 2018-06-07 13:32, Steven D'Aprano wrote:
> I'm following the instructions here:
> 
> https://docs.python.org/3/library/http.server.html
> 
> and running this from the command line as a regular unprivileged
> user:
> 
> python3.5 -m http.server 8000
> 
> What I expected was a directory listing of my current directory.
> 
> What I got was Livejournal's front page.

A couple things to check:

1) you don't mention which URL you pointed your browser at.  I
*presume* it was http://localhost:8000 but without confirmation, it's
hard to tell.  Also, you don't mention if you had anything in the
{path} portion of the URL such as
"http://localhost:8000/livejournal_homepage.html"

2) you don't mention whether your command succeeded with "Serving
HTTP on 0.0.0.0 port 8000" or if it failed because perhaps something
else was listening on that port ("OSError: [Errno 98] Address already
in use").

3) when your browser made the request to that localhost URL, did that
command produce output logging the incoming requests?

4) do you have any funky redirection for localhost in your /etc/hosts
file (or corresponding file location on Windows)?
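
Check 2) above can be roughly scripted.  This is only a sketch: the
helper name port_in_use is made up here, and it merely tests whether
*something* accepts TCP connections on the port, not what that
something is:

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex() returns 0 on success instead of raising
        return s.connect_ex((host, port)) == 0
```

Calling port_in_use(8000) before starting http.server would hint at
whether another process already owns the port, and
socket.gethostbyname("localhost") shows what "localhost" resolves to
on the machine in question (relevant to check 4).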

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why exception from os.path.exists()?

2018-06-07 Thread Tim Chase
On 2018-06-07 22:46, Chris Angelico wrote:
> On Thu, Jun 7, 2018 at 10:18 PM, Steven D'Aprano
>    3. http://localhost:8000/te%00st.html
> >>> Actually, I couldn't even get Chrome to make that request, so it
> >>> obviously was considered by the browser to be invalid.  

It doesn't matter whether Chrome or Firefox can make the request if
it can be made by opening the socket yourself with something as
simple as

  $ telnet example.com 80
  GET /te%00st.html HTTP/1.1
  Host: example.com

If that crashes the server, it's a problem, even if browsers try to
prevent it from happening by accident.

>> It works in Firefox, but Apache truncates the URL:
>>
>> Not Found
>> The requested URL /te was not found on this server.
>>
>> instead of te%00st.html

This is a sensible result, left up to each server to decide what to
do.

>> I wonder how many publicly facing web servers can be induced to
>> either crash, or serve the wrong content, this way?

I'm sure there are plenty. I mean, I discovered this a while back

https://mail.python.org/pipermail/python-list/2016-August/713373.html

and that's Microsoft running their own stack.  They seem to have
fixed that issue at that particular set of URLs, but a little probing
has turned it up elsewhere at microsoft.com since (for the record,
the first set of non-existent URLs return 404-not-found errors while
the second set of reserved filename URLs return
500-Server-Internal-Error pages).  Filename processing is full of
sharp edge-cases.

> Define "serve the wrong content". You could get the exact same
> content by asking for "te" instead of "te%00st.html"; what you've
> done is not significantly different from this:
> 
> http://localhost:8000/te?st.html
> 
> Is that a security problem too?

Depending on the server, it might allow injection for something like

 http://example.com/page%00cat+/etc/passwd

Or it might allow the request to be processed in an attack, but leave
the log files without the details:

 GET /innocent%00malicious_payload
 (where only the "/innocent" gets logged)

Or false data could get injected in log files

 
http://example.com/innocent%00%0a23.200.89.180+-+-+%5b07/Jun/2018%3a13%3a55%3a36+-0700%5d+%22GET+/nasty_porn.mov+HTTP/1.0%22+200+2326

(`host whitehouse.gov` = 23.200.89.180)

It all depends on the server and how the request is handled.
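
As a rough illustration of the sort of check a server might apply
before touching the filesystem or writing a log line, here is a
minimal sketch.  The function name is_suspicious_path is hypothetical
and this is not how any particular server does it:

```python
from urllib.parse import unquote


def is_suspicious_path(raw_path):
    """Return True if the percent-decoded path contains control characters."""
    decoded = unquote(raw_path)
    # NUL can truncate paths in C-level APIs; CR/LF can forge log lines
    return any(ch in decoded for ch in ("\x00", "\r", "\n"))


print(is_suspicious_path("/te%00st.html"))   # True
print(is_suspicious_path("/index.html"))     # False
```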

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Why exception from os.path.exists()?

2018-06-02 Thread Tim Chase
On 2018-06-02 00:14, Steven D'Aprano wrote:
> Since /wibble doesn't exist, neither does /wibble/a\0b
> 
> 
> py> os.path.exists("/wibble")  
> False
> py> os.path.exists("/wibble/a\0b")  
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/storage/torrents/torrents/python/Python-3.6.4/Lib/
> genericpath.py", line 19, in exists
> os.stat(path)
> ValueError: embedded null byte
> 
> Oops.

Existence is a sketchy sort of thing.  For example, sometimes the OS
hides certain directory entries from some syscalls while allowing
visibility from others.  The following comes from the FreeBSD system
I have at hand:

  >>> import os
  >>> '.bashrc' in os.listdir() # yes, it sees hidden dot-files
  True
  >>> '.zfs' in os.listdir() # but the OS hides .zfs/ from listings
  False
  >>> os.path.exists('.zfs') # yet it exists
  True
  >>> os.chdir('.zfs') # and you can chdir into it
  >>> os.listdir() # but you can't listdir in it
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  OSError: [Errno 22] Invalid argument
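
As for the ValueError in the quoted traceback, one way to get a plain
True/False answer is a small wrapper.  This is a sketch with a made-up
name (safe_exists); as far as I know, Python 3.8 and later already
return False for such paths instead of raising, per the os.path docs:

```python
import os


def safe_exists(path):
    """Like os.path.exists(), but treat unrepresentable paths as missing."""
    try:
        return os.path.exists(path)
    except ValueError:
        # e.g. "embedded null byte" on the 3.6 in the quoted traceback
        return False


print(safe_exists("/wibble/a\0b"))
```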
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: The perils of multiple Pythons

2018-04-30 Thread Tim Chase
On 2018-05-01 06:40, Chris Angelico wrote:
>>> >> https://xkcd.com/1987/
>>
>> I feel like this problem is pretty handily solved by virtual
>> environments. Also, if a single project requires all of this,
>> perhaps Python isn't the best choice for the project.  
> 
> Some of it is definitely solved by venvs. But which Python binary do
> you use? 

Pretty sure that all venvs I've used know their corresponding binary.
But...

> And is venv installed? Do you need to install virtualenv
> first? How do you... 

Is it virtualenv?  Or are you using virtualenvwrapper which, last I
checked, doesn't work in ksh (OpenBSD's default shell) without jumping
through hoops and getting external ksh-specific files?  Or are you
using pipenv?  Or `python -m venv`?  So many different flavors. :-(
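
Whichever flavor is in play, the running interpreter can at least
report which binary it is and whether it believes it lives in a
virtual environment.  A small sketch (the helper name in_virtualenv is
made up for illustration):

```python
import sys


def in_virtualenv():
    """Return True when running inside a venv/virtualenv."""
    # venv sets sys.base_prefix; old virtualenv releases set sys.real_prefix
    base = getattr(sys, "real_prefix", None)
    if base is None:
        base = getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


print(sys.executable)   # the binary actually running this code
print(in_virtualenv())
```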

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Converting list of tuple to list

2018-03-29 Thread Tim Chase
On 2018-03-29 20:42, Ganesh Pal wrote:
> I have a list of tuple say   [(1, 2, 1412734464L, 280), (2, 5,
> 1582956032L, 351), (3, 4, 969216L, 425)] .  I need to convert the
> above as ['1,2,1412734464:280',
> '2,5,1582956032:351', '3,4,969216:425']
> 
> Here is my Solution , Any  suggestion or optimizations are welcome .
> 
> Solution 1:
> 
> >>> list_tuple = [(1, 2, 1412734464L, 280), (2, 5, 1582956032L,
> >>> 351), (3, 4, 969216L, 425)]
> >>> expected_list = []  
> >>> for elements in list_tuple:  
> ... element = "%s,%s,%s:%s" % (elements)
> ... expected_list.append(element)

First, I'd do this but as a list comprehension:

  expected_list = [
    "%s,%s,%s:%s" % elements
    for elements in list_tuple
    ]

Second, it might add greater clarity (and allow for easier
reformatting) if you give names to them:

  expected_list = [
    "%i,%i,%i:%i" % (
      index,
      count,
      timestamp,
      weight,
      )
    for index, count, timestamp, weight in list_tuple
    ]

That way, if you wanted to remove some of the items from formatting
or change their order, it's much easier.

Since your elements seem to be in the order you want to assemble them
and you are using *all* of the elements, I'd go with the first one.
But if you need to change things up (either omitting fields or
changing the order), I'd switch to the second version.

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 2.7 EOL = 2020 January 1

2018-03-13 Thread Tim Chase
On 2018-03-13 10:58, Terry Reedy wrote:
> Two days later, Benjamin Peterson, the 2.7 release manager, replied 
> "Sounds good to me. I've updated the PEP to say 2.7 is completely
> dead on Jan 1 2020." adding "The final release may not literally be
> on January 1st".

Am I the only one saddened by this announcement?  I mean, it could
have been something like

"""
"VOOM"?!?  Mate, 2.7 wouldn't "voom" if you put four million volts
through it!  It's bleedin' demised!  It's not pinin'!  It's passed on!
This 2.x series is no more!  It has ceased to be!  It's expired and
gone to meet its maker!  It's a stiff!  Bereft of life, it rests in
peace!  If you hadn't nailed it to your dependencies it'd be pushing
up the daisies!  Its metabolic processes are now 'istory!  It's off
the twig!  It's kicked the bucket, it's shuffled off its mortal coil,
run down the curtain and joined the bleedin' choir invisible!!
THIS IS AN EX-VERSION!!
"""

Pythonically-yers,

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: csv module and NULL data byte

2018-03-01 Thread Tim Chase
On 2018-03-01 23:57, John Pote wrote:
> On 01/03/2018 01:35, Tim Chase wrote:
> > While inelegant, I've "solved" this with a wrapper/generator
> >
> >f = file(fname, …)
> >g = (line.replace('\0', '') for line in f)  
> I wondered about something like this but thought if there's a way
> of avoiding the extra step it would keep the execution speed up.

There shouldn't be noticeable performance issues with using a
generator.  It's also lazy so it's not like it's pulling the entire
file into memory; no more than one line at a time.

> My next thought was to pass a custom encoder to the open() that 
> translates NULLs to, say, 0x01. It won't make any difference to
> change one corrupt value to a different corrupt value.
> >reader = csv.reader(g, …)
> >for row in reader:
> >  process(row)  

...which is pretty much exactly what my generator solution does:
putting a translating encoder between the open() and the
csv.reader() call.

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: csv module and NULL data byte

2018-02-28 Thread Tim Chase
On 2018-02-28 21:38, Dennis Lee Bieber wrote:
> >     with open( fname, 'rt', encoding='iso-8859-1' ) as csvfile:  
> 
>   Pardon? Has the CSV module changed in the last year or so?
> 
>   Last time I read the documentation, it was recommended that
> the file be opened in BINARY mode ("rb").

It recommends binary mode, but seems to largely work fine with
text/ascii mode or even arbitrary iterables.  I've not seen the
rationale behind the binary recommendation, but in 10+ years of using
the csv module, I've not found any issues in using text/ascii mode
that were solved by switching to using binary mode.
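
For Python 3 readers: as I read the current csv docs, the suggestion
there is text mode with newline='' rather than binary mode, so the csv
module handles line endings (including newlines embedded in quoted
fields) itself.  A minimal round-trip sketch, using a throwaway temp
file purely for the demo:

```python
import csv
import os
import tempfile

rows = [["id", "note"], ["1", "line one\nline two"]]

fd, fname = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    # newline="" lets the csv module handle line endings itself
    with open(fname, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    with open(fname, newline="") as f:
        read_back = list(csv.reader(f))
finally:
    os.remove(fname)

print(read_back == rows)
```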

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: csv module and NULL data byte

2018-02-28 Thread Tim Chase
While inelegant, I've "solved" this with a wrapper/generator

  f = file(fname, …)
  g = (line.replace('\0', '') for line in f)
  reader = csv.reader(g, …)
  for row in reader:
    process(row)

My actual use at $DAYJOB cleans out a few other things
too, particularly non-breaking spaces coming from client data
that .strip() doesn't catch in Py2.x ("hello\xa0".strip())
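
A self-contained Python 3 sketch of the same wrapper idea.  The name
strip_nuls is made up, and io.StringIO merely stands in for the
corrupted file:

```python
import csv
import io


def strip_nuls(lines):
    """Yield each line with NUL characters removed."""
    for line in lines:
        yield line.replace("\0", "")


# stand-in for a corrupted file; any iterable of lines works
f = io.StringIO("a,b\nc\0,d\n")
rows = list(csv.reader(strip_nuls(f)))
print(rows)
```

Since csv.reader accepts any iterable of strings, the generator is
consumed one line at a time and nothing gets slurped into memory up
front.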

-tkc




On 2018-02-28 23:40, John Pote wrote:
> I have a csv data file that may become corrupted (already happened) 
> resulting in a NULL byte appearing in the file. The NULL byte
> causes an _csv.Error exception.
> 
> I'd rather like the csv reader to return csv lines as best it can
> and subsequent processing of each comma separated field deal with
> illegal bytes. That way as many lines from the file may be
> processed and the corrupted ones simply dumped.
> 
> Is there a way of getting the csv reader to accept all 256 possible 
> bytes. (with \r,\n and ',' bytes delimiting lines and fields).
> 
> My test code is,
> 
>      with open( fname, 'rt', encoding='iso-8859-1' ) as csvfile:
>          csvreader = csv.reader(csvfile, delimiter=',', 
> quoting=csv.QUOTE_NONE, strict=False )
>              data = list( csvreader )
>              for ln in data:
>                  print( ln )
> 
> Result
> 
>  >>python36 csvTest.py  
> Traceback (most recent call last):
>    File "csvTest.py", line 22, in 
>      data = list( csvreader )
> _csv.Error: line contains NULL byte
> 
> strict=False or True makes no difference.
> 
> Help appreciated,
> 
> John
> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is there are good DRY fix for this painful design pattern?

2018-02-27 Thread Tim Chase
Something like the following might do the trick. As an added benefit,
it's easy to set all the defaults automatically in __init__ as well
without hand-adding "self.dopey = dopey".  On the down side, in the
non-__init__ functions, you have to use kwargs["dopey"] and the like.
It also involves tacking an "__init_args" onto your object so that the
decorator knows what was passed to the __init__ function.

-tkc

from functools import wraps
from inspect import getargspec
def template(original_init_fn):
    args, varargs, keywords, defaults = getargspec(original_init_fn)
    assert varargs is keywords is None
    arg_dict = dict(zip(args[-len(defaults):], defaults))
    @wraps(original_init_fn)
    def new_init_fn(self, *args, **kwargs):
        self.__init_args = arg_dict.copy()
        self.__init_args.update(kwargs)
        # if you don't want to automatically set attributes
        # remove these next two lines
        for k, v in self.__init_args.items():
            setattr(self, k, v)
        return original_init_fn(self, *args, **kwargs)
    def templatify(fn):
        @wraps(fn)
        def new_templated_fn(self, *args, **kwargs):
            for k, v in self.__init_args.items():
                if k not in kwargs:
                    kwargs[k] = v
            return fn(self, *args, **kwargs)
        return new_templated_fn
    new_init_fn.templatify = templatify
    return new_init_fn

class Foo:
    @template
    def __init__(self,
            bashful=None,
            dopey=None,
            doc="On definition",
            ):
        pass # look, ma, no manual assignment!

    @__init__.templatify
    def myfunc(self, **kwargs):
        print(kwargs)

f1 = Foo()
f2 = Foo(bashful="on init", dopey="on init")

for fn in (f1, f2):
    fn.myfunc()
    fn.myfunc(bashful="on myfunc")







On 2018-02-26 14:41, Steven D'Aprano wrote:
> I have a class with a large number of parameters (about ten)
> assigned in `__init__`. The class then has a number of methods
> which accept *optional* arguments with the same names as the
> constructor/initialiser parameters. If those arguments are None,
> the defaults are taken from the instance attributes.
> 
> An example might be something like this:
> 
> 
> class Foo:
> def __init__(self, bashful, doc, dopey, grumpy, 
>happy, sleepy, sneezy):
> self.bashful = bashful  # etc
> 
> def spam(self, bashful=None, doc=None, dopey=None, 
>grumpy=None, happy=None, sleepy=None,
>sneezy=None):
> if bashful is None:
> bashful = self.bashful
> if doc is None:
> doc = self.doc
> if dopey is None:
> dopey = self.dopey
> if grumpy is None:
> grumpy = self.grumpy
> if happy is None:
> happy = self.happy
> if sleepy is None:
> sleepy = self.sleepy
> if sneezy is None:
> sneezy = self.sneezy
> # now do the real work...
> 
> def eggs(self, bashful=None, # etc... 
>):
> if bashful is None:
> bashful = self.bashful
> # and so on
>  
> 
> There's a lot of tedious boilerplate repetition in this, and to add 
> insult to injury the class is still under active development with
> an unstable API, so every time I change one of the parameters, or
> add a new one, I have to change it in over a dozen places.
> 
> Is there a good fix for this to reduce the amount of boilerplate?
> 
> 
> Thanks,
> 
> 
> 
> -- 
> Steve
> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Fw: [issue22167] iglob() has misleading documentation (does indeed store names internally)

2018-02-17 Thread Tim Chase
Has anybody else been getting unexpected/unsolicited emails from the
Python bug-tracker?

I'm not associated with (didn't submit/lurk/follow/sign-up-for) this
bug or its notifications but somehow I'm getting messages on this
particular issue.  I've now received two notifications (both on this
same bug) coming to my custom Python Bugs email address.  When I
check the issue's page, I'm not on the Nosy list so I don't really
have a good way to unfollow because it doesn't think I'm following it.

I *am* in the Python Bugs DB associated with other issues, I'm just
confused how I ended up attached to this bug.  Anybody else either
receiving unsolicited bug notifications or happen to know how/why I'm
getting these?

Thanks,

-tkc


Begin forwarded message:

Date: Sat, 17 Feb 2018 08:09:23 +
From: Serhiy Storchaka 
To: python.bugs@[redacted]
Subject: [issue22167] iglob() has misleading documentation (does
indeed store names internally)


Serhiy Storchaka  added the comment:

Unfortunately issue25596 didn't change anything about this issue.
iglob() still stores names (actually DirEntry objects) of all files
in a directory before starting yielding the first of them. Otherwise
we could exceed the limit of open file descriptors.

--
nosy: +serhiy.storchaka

___
Python tracker 

___
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: virtualenvwrapper under OpenBSD's ksh & FreeBSD's /bin/sh

2018-02-08 Thread Tim Chase
Giving a nudge here.  I've tried the below process with both
OpenBSD's stock ksh and FreeBSD's stock /bin/sh as my shell and both
seem to have similar errors (the FreeBSD error is less precise about
the line location or the actual error:

/home/tim/.local/bin/virtualenvwrapper.sh: ${}: Bad substitution

but I suspect it's the same).

Is there a way to get `pip install --user virtualenvwrapper` to pull
in a version that supports either ksh or traditional /bin/sh instead
of being yoked to bash/zsh?  Based on that bitbucket.org/dhellmann
link, it sounds like it should at least support ksh.

Thanks,

-tkc


On 2018-02-02 13:10, Tim Chase wrote:
> Under a new user account with ksh (the default) as the shell I
> issued the following:
> 
>   $ pip3 install --user virtualenvwrapper
>   Successfully installed pbr-3.1.1 six-1.11.0 stevedore-1.28.0
>   virtualenv-clone-0.2.6 virtualenvwrapper-4.8.2
>   $ export WORKON_HOME=~/code/virtualenvs
>   $ mkdir -p $WORKON_HOME
> 
> Good so far.  Based on
> 
>   https://bitbucket.org/dhellmann/virtualenvwrapper-hg
> 
> it sounds like ksh should be supported.  However when I try to
> enable it, I get:
> 
>   $ . ~/.local/bin/virtualenvwrapper.sh
>   ksh: /home/tim/.local/bin/virtualenvwrapper.sh[97]: ${.sh.file}":
> bad substitution
> 
> The line in question reads
> 
>   virtualenvwrapper.sh: export
> VIRTUALENVWRAPPER_SCRIPT="${.sh.file}"  
> 
> though it's not present in the latest tip version of the source.  I
> tried pulling in that one virtualenvwrapper.sh file from the tip to
> see if that would remedy the issue but it complains
> 
>   $ . ~/tmp/virtualenvwrapper.sh
>   ksh: /home/tim/tmp/virtualenvwrapper.sh[247]: syntax error: `('
> unexpected
> 
> on this line
> 
>  COMPREPLY=( $(compgen -W "`virtualenvwrapper_show_workon_options`"
> -- ${cur}) )
> 
> Is there something I'm missing or need to do to get pip (pip3.6) to
> pull in a working version of virtualenvwrapper for ksh?
> 
> Thanks,
> 
> -tkc
> 
> 
> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


virtualenvwrapper under OpenBSD's ksh

2018-02-02 Thread Tim Chase
Under a new user account with ksh (the default) as the shell I issued
the following:

  $ pip3 install --user virtualenvwrapper
  Successfully installed pbr-3.1.1 six-1.11.0 stevedore-1.28.0
  virtualenv-clone-0.2.6 virtualenvwrapper-4.8.2
  $ export WORKON_HOME=~/code/virtualenvs
  $ mkdir -p $WORKON_HOME

Good so far.  Based on

  https://bitbucket.org/dhellmann/virtualenvwrapper-hg

it sounds like ksh should be supported.  However when I try to enable
it, I get:

  $ . ~/.local/bin/virtualenvwrapper.sh
  ksh: /home/tim/.local/bin/virtualenvwrapper.sh[97]: ${.sh.file}": bad 
substitution

The line in question reads

  virtualenvwrapper.sh: export VIRTUALENVWRAPPER_SCRIPT="${.sh.file}"  

though it's not present in the latest tip version of the source.  I
tried pulling in that one virtualenvwrapper.sh file from the tip to
see if that would remedy the issue but it complains

  $ . ~/tmp/virtualenvwrapper.sh
  ksh: /home/tim/tmp/virtualenvwrapper.sh[247]: syntax error: `(' unexpected

on this line

 COMPREPLY=( $(compgen -W "`virtualenvwrapper_show_workon_options`" -- ${cur}) )

Is there something I'm missing or need to do to get pip (pip3.6) to
pull in a working version of virtualenvwrapper for ksh?

Thanks,

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Goto

2017-12-28 Thread Tim Chase
On 2017-12-29 08:42, Ben Finney wrote:
> Duram  writes:
> 
> > How to use goto in python?  
> 
> Step 0: what is goto in Python?
> 
> Step 1: that's not something that exists in Python. So why are you
> asking how to use something that doesn't exist?

so quick to shoot down a poor soul.

http://entrian.com/goto/

Gives you both GOTO and COMEFROM ;-)

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


__contains__ classmethod?

2017-12-18 Thread Tim Chase
Playing around, I had this (happens to be Py2, but gets the same
result in Py3) code

class X(object):
  ONE = "one"
  TWO = "two"
  _ALL = frozenset(v for k,v in locals().items() if k.isupper())
  @classmethod
  def __contains__(cls, v):
    return v in cls._ALL
print(dir(X))
print(X._ALL)

Running this gives

  ['ONE', 'TWO', '_ALL', '__class__', '__contains__', '__delattr__',
  '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__',
  '__getattribute__', '__gt__', '__hash__', '__init__', '__le__',
  '__lt__', '__module__', '__ne__', '__new__', '__reduce__',
  '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
  '__str__', '__subclasshook__', '__weakref__']

And then, depending on whether it's Py2 or Py3, I get either

  frozenset({'one', 'two'})
  frozenset(['two', 'one'])

Which I expect.  Hey, look. There's a __contains__ method. And it
has been specified as a @classmethod.  So I want to test it:

  print("one" in X)

However that fails with

  Traceback (most recent call last):
    File "x.py", line 10, in <module>
      print("one" in X)
  TypeError: argument of type 'type' is not iterable

My understanding was that "in" makes use of an available __contains__
but something seems to be preventing Python from finding that.

What's going on here?

Thanks,

-tkc
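
(A sketch of one possible explanation, for the archive: special
methods such as __contains__ are looked up on the *type* of the
right-hand operand, and the type of the class X is its metaclass, so a
classmethod on X itself isn't consulted.  The metaclass name XMeta
below is made up, the syntax is Python 3 only, and _ALL is spelled out
by hand to keep the example short.)

```python
class XMeta(type):
    # "v in X" looks up __contains__ on type(X), i.e. on the metaclass
    def __contains__(cls, v):
        return v in cls._ALL


class X(metaclass=XMeta):
    ONE = "one"
    TWO = "two"
    _ALL = frozenset([ONE, TWO])


print("one" in X)
print("three" in X)
```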








-- 
https://mail.python.org/mailman/listinfo/python-list


Re: What use is of this 'cast=float ,'?

2017-10-27 Thread Tim Chase
[rearranging for easier responding]

On 2017-10-27 13:35, Robert wrote:
> self.freqslider=forms.slider(
>  parent=self.GetWin( ),
>  sizer=freqsizer,
>  value=self.freq,
>  callback= self.setfreq,
>  minimum=-samprate/2,
>  maximum=samprate/2,
>  num_steps=100,
>  style=wx.SL_HORIZONTAL,
>  cast=float ,
>  proportion=1,
> )
> I am interested in the second of the last line.
> 
>  cast=float ,

The space doesn't do anything.  You have a parameter list, so the
comma just separates "cast=float" from the next parameter,
"proportion=1". The "cast=float" passes the "float()" function as a
way of casting the data.  In this case, it likely expects a function
that takes a number or string, and returns a number that can be used
to render the slider's value/position.

You could create one that works backwards:

  def backwards_float(input):
    # note the "-" inverting the interpreted value
    return -float(input)

  forms.slider(
    …
    cast=backwards_float,
    …
    )

> I've tried it in Python. Even simply with 
> 
> float

This is just the float() function.

> it has no error, but what use is it?

It's good for passing to something like the above that wants a
function to call.  The body of the function likely has something like

   resulting_value = cast(value)

which, in this case is the same as

   resulting_value = float(value)
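
To make the "pass a function as a parameter" idea concrete without
needing the GUI toolkit, here is a tiny stand-alone sketch.  The
function read_setting is purely hypothetical and is not part of
forms.slider; it only mimics the "resulting_value = cast(value)" step:

```python
def read_setting(raw_value, cast=float):
    """Convert raw_value using whatever callable was passed as cast."""
    # same idea as "resulting_value = cast(value)"
    return cast(raw_value)


def backwards_float(value):
    return -float(value)


print(read_setting("2.5"))                        # 2.5
print(read_setting("2.5", cast=backwards_float))  # -2.5
print(read_setting("7", cast=int))                # 7
```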

> I do see a space before the comma ','. Is it a typo or not?

I think it's unintended.


The whole question started off peculiar because outside of a
function-invocation

   thing = other,

with the comma creates a one-element tuple and assigns the resulting
tuple to "thing"

  >>> x = 42
  >>> x
  42
  >>> y = 42,
  >>> y
  (42,)

Usually people will be more explicit because that comma is easy to
miss, so they'll write

  >>> z = (42,)
  >>> z
  (42,)

so that later people reading the code know that it's intended to be a
one-element tuple.

-tkc






-- 
https://mail.python.org/mailman/listinfo/python-list


Re: choice of web-framework

2017-10-22 Thread Tim Chase
On 2017-10-22 15:26, Patrick Vrijlandt wrote:
> The version control I was referring to, is indeed users' data. I
> plan to use Mercurial for the source code. The questionnaires being
> developed will go through many revisions. The questionnaires being
> filled in, are enough work to have a provision for mistakes. The
> idea is much like the "revert" option that MoinMoin and other wikis
> provide.

Depends on how much version-control'y'ness you want.  If having a
"current version plus previous versions" model with a "resurrect
version $CURRENT-$N as the most recent version" option is sufficient,
then it's pretty straightforward.  If you also want to implement diffing,
merging diffs between various versions, diffing more than a single
text-field blob (a model spread across multiple normalized tables,
versioning changes there), etc, you're looking at a whole different
game.
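
The simple end of that spectrum might look something like this sketch
(plain Python, no framework; the class and method names are invented
for illustration, and a real app would keep the revisions in the
database):

```python
class VersionedText:
    """Append-only history of a text blob; reverting adds a new revision."""

    def __init__(self, initial):
        self._revisions = [initial]

    @property
    def current(self):
        return self._revisions[-1]

    def edit(self, new_text):
        self._revisions.append(new_text)

    def resurrect(self, n):
        """Make the version n steps back the most recent version."""
        if not 0 < n < len(self._revisions):
            raise ValueError("no such revision")
        self._revisions.append(self._revisions[-1 - n])


doc = VersionedText("draft 1")
doc.edit("draft 2")
doc.edit("draft 3")
doc.resurrect(2)       # bring back "draft 1" without losing history
print(doc.current)
```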

Additionally, one needs to consider how responses get tied to a
questionnaire.  If I make a questionnaire that reads

 "Do you like ice cream?"
 [ ] Yes
 [ ] No

and you answer "Yes", but then I edit that question so that it reads

 "Do you like to kick puppies?"

you have a problem if it keeps your "Yes" answer and thinks it's
linked to the "same" question.
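
One way around that, sketched here with invented record names rather
than any framework's models, is to have each answer remember which
revision of the question it was given against:

```python
from collections import namedtuple

# hypothetical records; a real app would use ORM models instead
QuestionRevision = namedtuple("QuestionRevision", "question_id revision text")
Answer = namedtuple("Answer", "question_id revision value")

revisions = {
    (1, 1): QuestionRevision(1, 1, "Do you like ice cream?"),
    (1, 2): QuestionRevision(1, 2, "Do you like to kick puppies?"),
}

# the answer remembers which wording it was given against
answer = Answer(question_id=1, revision=1, value="Yes")


def question_as_answered(answer):
    """Return the question text the respondent actually saw."""
    return revisions[(answer.question_id, answer.revision)].text


print(question_as_answered(answer))
```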

All that to say that version-control is often domain-specific and
non-trivial.

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: choice of web-framework

2017-10-22 Thread Tim Chase
On 2017-10-22 12:24, Patrick Vrijlandt wrote:
> I would like your recommendation on the choice of a web framework.

Might depend on what skills you already bring to the table.  If you
already know an ORM like SQLAlchemy or a template language like
Jinja, you might want to take the "bring the pieces I know/like
together" approach.  For this, Bottle & Flask are the top contenders
last I saw (I did some CherryPy contract work but the docs were
pretty horrible at the time).

If you are genuinely coming to this greenfield, Django's
docs/tutorials make it really easy to get up to speed with all of the
parts involved as it has its own ORM and templating language.  They
can be swapped out if you later need to, but for the average project,
they're sufficient and well documented.

I happen to be in the Django camp, but based on my experiments with
Bottle/Flask, can also recommend them without much hesitation.

> The project is completely new, there are no histories to take into 
> account (current solutions are paper-based). The website involves 
> questionnaires that will be developed, filled out and stored. Users
> are not programmers or developers. They should be authenticated.
> Version control is required.

I'm not sure what "version control is required" means in this
context.  Is this version-control of the users' answers? Or
version-control of the source code.  If it's the source code, the web
framework won't help you there, but git, mercurial, or subversion are
all good/reasonable choices.  If you want to version your user's
answers or other aspects of your application, you'll need to design
it into your app.  There might be plugins/modules to facilitate this
on either side of the Django / Flask/Bottle/SQLAlchemy divide.

> I'm targeting a single deployment (maybe a second on a
> development machine). I usually work on Windows, but Linux can be
> considered.

While both *can* be deployed on Windows, a Unix-like OS (whether
Linux, a BSD, or even a Mac) will likely give you a better deployment
experience and better docs.

> I'm not afraid to learn a (=one) new framework (that would actually
> be fun) but trying out a lot of them is not feasible. My current
> goal is a demonstration version of the project as a proof of
> concept.

I personally find that Django excels at these fast proof-of-concept
projects, as you have less concern about integrating disparate pieces
together at that stage.

> I'm an experienced python programmer but not an experienced web 
> developer.

Both sides offer a "Pythonic" feel to them (some feel less Pythonic)
so it's easy to come up to speed on either.

> A few years ago I read some books about Zope and Plone,

Hah, Zope was my first introduction to Python and I ran screaming.  I
eventually came back around, but it was a jarring first experience.

> The problem seems too complicated for micro frameworks like bottle
> or Flask. Django could be the next alternative.

Django is the top contender, so if you only have time to investigate
one, I'd urge you in that direction.  But Flask or Bottle can also
certainly handle a project like the one you describe.

> Finally, for a new project, I would not like to be confined to
> Python 2.7.

Flask/Bottle and Django are both Python 3 ready.  Django, since v2.0,
is Python 3 only.

-tkc






-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Grumpy-pants spoil-sport [was Re: [Tutor] beginning to code]

2017-09-26 Thread Tim Chase
On 2017-09-26 18:25, alister via Python-list wrote:
>>> We've been asked nicely by the list mod to stop :)  
>> 
>> Perhaps we could agree on a subject line tag to be used in all
>> threads arguing about what to call the Python argument passing
>> scheme?  That way the other 99% of us could pre-emptively plonk
>> it?  
> 
> so you are suggesting a system where we could reject by
> reference :-)

I think the suggestion is to bind a name to the thread and reject by
name-binding.

Grinning-ducking-and-running'ly yers...

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Grumpy-pants spoil-sport [was Re: [Tutor] beginning to code]

2017-09-25 Thread Tim Chase
On 2017-09-26 02:29, Steve D'Aprano wrote:
> x = Parrot(name="Polly")
> 
> (using Python syntax for simplicity) and somebody tries to tell me
> that the value of x is anything but a Parrot instance named "Polly",

So this is a Polly-morphic constructor?

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Change project licence?

2017-09-23 Thread Tim Chase
On 2017-09-23 19:14, Chris Angelico wrote:
> On Sat, Sep 23, 2017 at 7:07 PM, Kryptxy 
> wrote:
> > Thank you all! I opened a ticket about the same (on github).
> > I got response from most of them, and all are agreeing to the
> > change. However, one contributor did not respond at all. I tried
> > e-mailing, but no response.
> > Can I still proceed changing the licence? It has been more than a
> > week since the ticket was opened.  
> 
> Nope. Contributions made under the GPL have a guarantee that they
> will only and forever be used in open source projects. You're
> trying to weaken that guarantee, so you have to get clear
> permission from everyone involved.
> 
> Unless you can show that the contributions in question are so
> trivial that there's no code that can be pinpointed as that
> person's, or you replace all that person's code, you can't proceed
> to relicense it without permission.

Alternatively, you can rip out that contributor's code and re-code it
from scratch in a clean-room without consulting their code.  Then
their code is under their license while your re-implementation code
is under whatever license you like.

If their contributions were minor, this might be a nice route to go.
If they were a major contributor, you could be looking at a LOT of
work.

But those are your options:

- keep the project as GPL

- get *ALL* contributors to formally agree to the license change,

- confirm that the contributions of recalcitrant contributor(s)
  are limited to trivial changes, or

- recreate all GPL code in a clean-room under your own license


-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Old Man Yells At Cloud

2017-09-17 Thread Tim Chase
On 2017-09-18 01:41, INADA Naoki wrote:
> > > That said, I'm neither here nor there when it comes to
> > > using print-as-a-statement vs print-as-a-function.  I like
> > > the consistency it brings to the language, but miss the
> > > simplicity that Py2 had for new users.  I'd almost want to
> > > get it back as a feature of the REPL, even if it wasn't
> > > part of the language itself,  
> >
> > Agreed on that point. At least bring it back in the REPL.
> > The removal of the finger friendly print statement has
> > caused more tension in this community than any feature of
> > such small stature should ever cause.
> >
> >  
> >>> x = 42
> >>> x  
> 42
> 
> x (1 keystroke) is easier to type than `print x` (7 keystrokes).
> While sometimes print(x) is different from x (which uses repr), it's
> enough for most cases.
> So I can't agree it's REPL unfriendly.

Compare PDB for things that are PDB commands.

  (Pdb) list
  1 j = 42
  2 x = 31
  3 import pdb; pdb.set_trace()
  4  -> j = 15
  5 x = 99
  [EOF]
  (Pdb) x
  31
  (Pdb) j
  *** The 'jump' command requires a line number
  (Pdb) print j
  *** SyntaxError: Missing parentheses in call to 'print'
  (Pdb) j 1
  > /home/tkc/test.py(1)<module>()
  -> j = 42
  (Pdb) j(2)
  *** The 'jump' command requires a line number

You (obstinate interpreter) know what I want, know what I mean, can't
possibly interpret it as some other command. Yet insist on pedantic
parens when you (interpreter) know full well the intended parsing,
accepting "j 1" to jump to line one instead of making me type "j(1)".

:grumble:

old-man-shaking-his-fist-at-the-sky'ly yers,

-tkc














-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Old Man Yells At Cloud

2017-09-17 Thread Tim Chase
On 2017-09-17 16:15, Rick Johnson wrote:
> > I've wanted to do all those things, and more. I love the
> > new print function. For the cost of one extra character,
> > the closing bracket,  
> 
> Oops, _two_ characters! What about the opening "bracket"?

  >>> print(len('print "hello"'))
  13
  >>> print(len('print("hello")'))
  14

Plus two parens, minus one space = net +1 character.

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Old Man Yells At Cloud

2017-09-17 Thread Tim Chase
On 2017-09-17 14:16, bartc wrote:
> print() is used for its side-effects; what relevant value does it
> return?

Depending on the sink, errors can be returned (at least for the
printf(3) C function).  The biggest one I've encountered is writing
to a full disk.  The return value is how many characters were
written.  In an ideal world,

  data = "Hello" * 1000
  written = print(data)
  assert written == len(data)

but if your disk is nearly full and you're redirecting data to it:

   $ python myprog.py > output.txt

it's feasible that you instruct print() to send 5000 characters but
only 4000 of them get written to the disk.  You might want to check
that condition and handle it in some more graceful way.
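
As things stand in Python, print() returns None, so the check above is
hypothetical.  A minimal sketch of getting a count back anyway, assuming
you write through the stream's write() method (which does return the
number of characters written) and flush explicitly so that a full-disk
condition has a chance to surface as an OSError.  The checked_write()
helper name is made up for illustration:

```python
import io


def checked_write(stream, text):
    """Write text, flush, and return the count reported by write()."""
    written = stream.write(text)
    stream.flush()  # buffered I/O errors (e.g. a full disk) tend to show up here
    if written != len(text):
        raise IOError("short write: %d of %d characters" % (written, len(text)))
    return written


data = "Hello" * 1000
buf = io.StringIO()                  # stand-in for sys.stdout or a real file
result = print(data, file=buf)       # print() itself hands back None
count = checked_write(buf, data)     # write() reports the character count
```

With a real file or a redirected sys.stdout, wrapping the call in
try/except OSError is where the "more graceful" handling would go.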

-tkc





-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Standard for dict-contants with duplicate keys?

2017-09-15 Thread Tim Chase
On 2017-09-15 17:45, Terry Reedy wrote:
> On 9/15/2017 3:36 PM, Tim Chase wrote:
> >d = {
> >   "a": 0,
> >   "a": 1,
> >   "a": 2,
> >}
> > 
> > In my limited testing, it appears to always take the last one,
> > resulting in
> > 
> >{"a": 2}
> > 
> > Is this guaranteed by the language spec
> 
> https://docs.python.org/3/reference/expressions.html#dictionary-displays
> If a comma-separated sequence of key/datum pairs is given, they are 
> evaluated from left to right to define the entries of the
> dictionary: each key object is used as a key into the dictionary to
> store the corresponding datum. This means that you can specify the
> same key multiple times in the key/datum list, and the final
> dictionary’s value for that key will be the last one given.

Ah, I'd checked the "Data Structures" and "Built-in types" pages, but
missed the "expressions" page.  At least that means that the botched
data in our system is at least *consistently* botched which eases my
work a bit.
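
For sizing up that sort of damage, here is a rough sketch (not what I
actually ran) that walks a module's syntax tree and reports repeated
literal keys inside dict displays.  The duplicate_dict_keys() name and
the sample source are made up for illustration, and it assumes Python
3.8+ where literals parse as ast.Constant:

```python
import ast


def duplicate_dict_keys(source):
    """Yield (line_number, key) for each repeated literal key in a dict display."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # key is None for ``**mapping`` unpacking; non-literal keys are skipped
            if isinstance(key, ast.Constant):
                if key.value in seen:
                    yield key.lineno, key.value
                seen.add(key.value)


sample = 'd = {\n    "a": 0,\n    "a": 1,\n    "a": 2,\n}\n'
dups = list(duplicate_dict_keys(sample))

# The language reference says the last datum wins:
last_wins = eval('{"a": 0, "a": 1, "a": 2}')
```

It only catches keys written as literals; keys computed at runtime
would need a different approach.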

Many thanks,

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Standard for dict-contants with duplicate keys?

2017-09-15 Thread Tim Chase
Looking through docs, I was unable to tease out whether there's a
prescribed behavior for the results of defining a dictionary with the
same keys multiple times

  d = {
 "a": 0,
 "a": 1,
 "a": 2,
  }

In my limited testing, it appears to always take the last one,
resulting in

  {"a": 2}

as if it iterated over the items, adding them to the dict, tromping
atop any previous matching keys in code-order.

Is this guaranteed by the language spec, or do I have a long weekend
of data-cleaning ahead of me?  (this comes from an unwitting coworker
creating such dicts that mung customer data, and I am trying to
determine the extent of the damage...whether it's a consistent issue
or is at the arbitrary whims of the parser)

Thanks,

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Case-insensitive string equality

2017-08-31 Thread Tim Chase
On 2017-09-01 00:53, MRAB wrote:
> What would you expect the result would be for:
> 
>>> "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("F")

0

>>> "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("I")

0.5

>>> "\N{LATIN SMALL LIGATURE FFI}".case_insensitive_find("I")

0.6

;-)

(mostly joking, but those are good additional tests to consider)

-tkc





-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Case-insensitive string equality

2017-08-31 Thread Tim Chase
On 2017-08-31 07:10, Steven D'Aprano wrote:
> So I'd like to propose some additions to 3.7 or 3.8.

Adding my "yes, a case-insensitive equality-check would be useful"
with the following concerns:

I'd want to have an optional parameter to take locale into
consideration.  E.g.

  "i".case_insensitive_equals("I") # depends on Locale
  "i".case_insensitive_equals("I", Locale("TR")) == False
  "i".case_insensitive_equals("I", Locale("US")) == True

and other oddities like

  "ß".case_insensitive_equals("SS") == True

(though casefold() takes care of that latter one).  Then you get
things like

  "III".case_insensitive_equals("\N{ROMAN NUMERAL THREE}")
  "iii".case_insensitive_equals("\N{ROMAN NUMERAL THREE}")
  "FI".case_insensitive_equals("\N{LATIN SMALL LIGATURE FI}")

where the decomposition might need to be considered.  There are just
a lot of odd edge-cases to consider when discussing fuzzy equality.
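
To make the decomposition point concrete, a minimal sketch using only
what str and unicodedata already provide.  The caseless_key() and
caseless_equal() names are made up for illustration, the recipe is a
simplified take on Unicode compatibility caseless matching, and it does
nothing about locale (so the Turkish dotted/dotless "i" case is not
handled):

```python
import unicodedata


def caseless_key(text):
    """Decompose (NFKD), casefold, then decompose again for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    return unicodedata.normalize("NFKD", decomposed.casefold())


def caseless_equal(a, b):
    return caseless_key(a) == caseless_key(b)


roman_three = "\N{ROMAN NUMERAL THREE}"
ligature_fi = "\N{LATIN SMALL LIGATURE FI}"

# casefold() alone does not unpack the Roman numeral into three letters
plain_casefold_matches = "III".casefold() == roman_three.casefold()
with_decomposition = caseless_equal("III", roman_three)
```

Whether "III" *should* equal the Roman numeral is exactly the sort of
policy question a built-in method would have to answer.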

> (1) Add a new string method,

This is my preferred avenue.

> Alternatively: how about a === triple-equals operator to do the
> same thing?

No.  A strong -1 for new operators.  This peeves me in other
languages (looking at you, PHP & JavaScript)

> (2) Add keyword-only arguments to str.find and str.index:
> 
> casefold=False
> 
> which does nothing if false (the default), and switches to a
> case- insensitive search if true.

I'm okay with some means of conveying the insensitivity to
str.find/str.index but have no interest in list.find/list.index
growing similar functionality.  I'm meh on the "casefold=False"
syntax, especially in light of my hope it would take a locale for the
comparisons.

> Unsolved problems:
> 
> This proposal doesn't help with sets and dicts, list.index and the
> `in` operator either.

I'd be less concerned about these.  If you plan to index a set/dict
by the key, normalize it before you put it in.  Or perhaps create a
CaseInsensitiveDict/CaseInsensitiveSet class.  For lists and 'in'
operator usage, it's not too hard to make up a helper function based
on the newly-grown method:

  def case_insensitive_in(itr, target, locale=None):
    return any(
      target.case_insensitive_equals(x, locale)
      for x in itr
      )

  def case_insensitive_index(itr, target, locale=None):
    for i, x in enumerate(itr):
      if target.case_insensitive_equals(x, locale):
        return i
    raise ValueError("Could not find %s" % target)
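
Since case_insensitive_equals() doesn't exist (yet), here is a sketch
of the same two helpers that runs today, assuming plain str.casefold()
is an acceptable stand-in and dropping the locale parameter entirely:

```python
def case_insensitive_in(items, target):
    """True if any item matches target after casefolding."""
    wanted = target.casefold()
    return any(item.casefold() == wanted for item in items)


def case_insensitive_index(items, target):
    """Index of the first item matching target after casefolding."""
    wanted = target.casefold()
    for i, item in enumerate(items):
        if item.casefold() == wanted:
            return i
    raise ValueError("Could not find %r" % target)


words = ["Apple", "STRASSE", "banana"]
found = case_insensitive_in(words, "straße")
position = case_insensitive_index(words, "BANANA")
```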

-tkc








-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Case-insensitive string equality

2017-08-31 Thread Tim Chase
On 2017-08-31 18:17, Peter Otten wrote:
> A quick and dirty fix would be a naming convention:
> 
> upcase_a = something().upper()

I tend to use a "_u" suffix as my convention:

  something_u = something.upper()

which keeps the semantics of the original variable-name while hinting
at the normalization.

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Case-insensitive string equality

2017-08-31 Thread Tim Chase
On 2017-08-31 23:30, Chris Angelico wrote:
> The method you proposed seems a little odd - it steps through the
> strings character by character and casefolds them separately. How is
> it superior to the two-line function? And it still doesn't solve any
> of your other cases.

It also breaks when casefold() returns multiple characters:

>>> s1 = 'ss'
>>> s2 = 'SS'
>>> s3 = 'ß'
>>> equal(s1,s2) # using Steve's equal() function
True
>>> equal(s1,s3)
False
>>> equal(s2,s3)
False
>>> s1.casefold() == s2.casefold()
True
>>> s1.casefold() == s3.casefold()
True
>>> s2.casefold() == s3.casefold()
True
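
For anyone following along without the earlier message, a sketch of
why a character-by-character comparison goes wrong.  Note that
equal_per_char() below is my own hypothetical stand-in, not necessarily
the equal() function from upthread:

```python
def equal_per_char(a, b):
    """Compare strings one character at a time, casefolding each separately."""
    if len(a) != len(b):
        return False
    return all(x.casefold() == y.casefold() for x, y in zip(a, b))


def equal_whole(a, b):
    """Casefold each whole string first, then compare."""
    return a.casefold() == b.casefold()


s1, s3 = "ss", "ß"
per_char = equal_per_char(s1, s3)   # lengths differ before folding
whole = equal_whole(s1, s3)         # "ß" folds to the two characters "ss"
```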


-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: python list name in subject

2017-08-22 Thread Tim Chase
On 2017-08-22 08:21, Rick Johnson wrote:
> Grant Edwards wrote:
> > Abdur-Rahmaan Janhangeer wrote:  
> > > 
> > > Hi all,  i am subscribed to different python lists and
> > > they put their names in the subject [name] subject  hence
> > > i can at a glance tell which mail belongs to which list.
> > > A requests to admins to implement if possible  
> > 
> > Please don't. It wastes space which is better used on the
> > subject.  If you want the mailing list prepended, then
> > configure procmail (or whatever) to do it for you.  
> 
> Although, considering that the BDFL has now made type-hints
> an official part of the language, a "forum-of-origin" type-
> hint, may be more Pythonic than we care to realize. 

Checking mailing list headers...yep, the "forum-of-origin" type hint
is already present in standards-compliant fashion defined by
RFC4021[1]:

List-Id: General discussion list for the Python programming language
 
List-Unsubscribe: ,
 
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
 

Just need a mail client that knows about standards and isn't
fettered. ;-)
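
If your client can't filter on it, a rough sketch of pulling the list
identifier out of a message with the stdlib email package.  The message
text, the example-list.example.org identifier, and the list_id() helper
name are all made up for illustration:

```python
import re
from email import message_from_string

RAW = (
    "From: someone@example.com\n"
    "List-Id: Example discussion list <example-list.example.org>\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)


def list_id(raw_message):
    """Return the identifier between <...> in the List-Id header, or None."""
    msg = message_from_string(raw_message)
    header = msg.get("List-Id", "")
    match = re.search(r"<([^>]+)>", header)
    return match.group(1) if match else None


identifier = list_id(RAW)
```

A procmail recipe or mail-client filter matching the same header does
the equivalent job without any Python.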

-tkc


[1] https://tools.ietf.org/html/rfc4021#section-2.1.31
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Proposed new syntax

2017-08-13 Thread Tim Chase
On 2017-08-14 00:36, Steve D'Aprano wrote:
> Python's comprehensions are inspired by Haskell's, but we made
> different choices than they did: we make the fact that a
> comprehension is a loop over values explicit, rather than implicit,
> and we use words instead of cryptic symbols.

I keep wanting to like Haskell for some of the concepts &
optimizations it can do, but the whole "words instead of cryptic
symbols" readability thing keeps me coming back to Python.

-tkc





-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Proposed new syntax

2017-08-12 Thread Tim Chase
On 2017-08-11 00:28, Steve D'Aprano wrote:
> What would you expect this syntax to return?
> 
> [x + 1 for x in (0, 1, 2, 999, 3, 4) while x < 5]

[1, 2, 3]

I would see this "while-in-a-comprehension" as an itertools.takewhile()
sort of syntactic sugar:

>>> [x + 1 for x in takewhile(lambda m: m < 5, (0,1,2,999,3,4))]
[1, 2, 3]
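
To spell out how that differs from the existing "if" clause, a small
runnable comparison (variable names are just for illustration):
takewhile() stops at the first failing item, while "if" merely skips it
and carries on.

```python
from itertools import takewhile

values = (0, 1, 2, 999, 3, 4)

# stops consuming as soon as 999 fails the test
stopped = [x + 1 for x in takewhile(lambda m: m < 5, values)]

# skips 999 but keeps going through 3 and 4
filtered = [x + 1 for x in values if x < 5]
```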



> For comparison, what would you expect this to return?
[snip]
> [x + y for x in (0, 1, 2, 999, 3, 4) while x < 5 for y in (100,
> 200)]

This one could make sense as either

[100, 200, 101, 201, 102, 202]

or

[100, 101, 102]

(I think the default evaluation order of nested "for"s in a
comprehension would produce the former rather than the latter)

Thus it would be good to define behavior for both of these cases:

[x + y for x in (0, 1, 2, 999, 3, 4) while x < 5 for y in (100, 200)]

vs.

[x + y for x in (0, 1, 2, 999, 3, 4) for y in (100, 200) while x < 5]
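
Purely as a sketch of two *possible* readings (my guesses, not anything
the proposal specifies), both can be spelled today with takewhile().
In the first the "while" ends the outer loop; in the second it only
cuts short the inner loop for the current x:

```python
from itertools import takewhile

xs = (0, 1, 2, 999, 3, 4)
ys = (100, 200)

# Reading 1: "while" attached to the outer loop ends everything at 999
outer = [x + y for x in takewhile(lambda v: v < 5, xs) for y in ys]

# Reading 2: "while" attached to the inner loop only empties that
# inner pass, so the outer loop carries on to 3 and 4
inner = [x + y for x in xs for y in takewhile(lambda _: x < 5, ys)]
```

The two disagree on this data, which is rather the point.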

Things would get even weirder when you have nested loopings like
that and one of the sources is an iterator.

-tkc




-- 
https://mail.python.org/mailman/listinfo/python-list


Re: how to get partition information of a hard disk with python

2017-07-07 Thread Tim Chase
Strange.  The OP's message didn't make it here, but I'm seeing
multiple replies

> On Wednesday, September 22, 2010 at 4:01:04 AM UTC+5:30, Hellmut
> Weber wrote:
> > Hi list,
> > I'm looking for a possibility to access the partition information
> > of a hard disk of my computer from within a python program.

You don't specify whether your disk has MBR, GPT, or some other
partitioning scheme.  However, at least for MBR, I threw together
this code a while back:

https://mail.python.org/pipermail/python-list/2009-November/559546.html

I imagine something similar could be done in the case of a GPT.

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: School Management System in Python

2017-07-05 Thread Tim Chase
On 2017-07-06 11:47, Gregory Ewing wrote:
> The only reason I can think of to want to use tsv instead
> of csv is that you can sometimes get away without having
> to quote things that would need quoting in csv. But that's
> not an issue in Python, since the csv module takes care of
> all of that for you.

I work with thousands of CSV/TSV data files from dozens-to-hundreds
of sources (clients and service providers) and have never encountered
a 0x09-as-data needing to be escaped.  So my big reason for
preference is that people say "TSV" and I can work with it without a
second thought.

On the other hand, with "CSV", sometimes it's comma-delimited as it
says on the tin.  But sometimes it's pipe or semi-colon delimited
while still carrying the ".csv" extension.  And sometimes a
subset of values are quoted. Sometimes all the values are quoted.
Sometimes numeric values are quoted to distinguish between
numeric-looking-string and numeric-value.  Sometimes escaping is done
with backslashes before the quote-as-value character. Sometimes
escaping is done with doubling-up the quoting-character.  Sometimes
CR(0x0D) and/or NL(0x0A) characters are allowed within quoted values;
sometimes they're invalid.  Usually fields are quoted with
double-quotes; but sometimes they're single-quoted values.  Or
sometimes they're either, depending on the data (much like Python's
REPL prints string representations).

And while, yes, Python's csv module handles most of these with no
issues thanks to the "dialects" concept, I still have to determine
the dialect—sometimes by sniffing, sometimes by customer/vendor
specification—but it's not nearly as trivial as

  with open("file.txt", newline="") as fp:  # text mode for Python 3's csv
    for row in csv.DictReader(fp, delimiter='\t'):
      process(row)

because there's the intermediate muddling of dialect determination or
specification.
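
To give a flavor of that intermediate muddling, a minimal sketch of
sniffing a dialect before parsing.  The sample data is made up, and
csv.Sniffer is heuristic, so treat its answer as a guess to verify
rather than gospel:

```python
import csv
import io

# a ".csv" file that is really semicolon-delimited
sample = "name;age;city\nalice;30;paris\nbob;25;rome\n"

# restrict the candidates to the delimiters seen in the wild
dialect = csv.Sniffer().sniff(sample, delimiters=",;|\t")

rows = list(csv.DictReader(io.StringIO(sample), dialect=dialect))
```

On real files I'd sniff the first few KB, and fall back to a
per-vendor dialect when the sniffer raises csv.Error.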

And that said, I have a particular longing for a world in which
people actually used the US/RS/GS/FS (Unit/Record/Group/File
separators; AKA 0x1f-0x1c) as defined in ASCII for exactly this
purpose.  Sigh.  :-)
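
For the curious, a toy sketch of what that world could look like,
using only the unit and record separators.  The dumps()/loads() names
are made up, and it assumes the data itself never contains 0x1f or
0x1e:

```python
US = "\x1f"  # unit separator: between fields
RS = "\x1e"  # record separator: between rows


def dumps(rows):
    """Join fields with US and records with RS; no quoting needed."""
    return RS.join(US.join(fields) for fields in rows)


def loads(text):
    """Split back into a list of rows, each a list of fields."""
    return [record.split(US) for record in text.split(RS)]


rows = [
    ["name", "note"],
    ["alice", "likes, commas\tand tabs"],
    ["bob", 'says "hi"\nthen leaves'],
]
round_trip = loads(dumps(rows))
```

Commas, tabs, quotes and newlines all pass through untouched, with no
dialect to guess.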

-tkc










-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Customise the virtualenv `activate` script

2017-06-16 Thread Tim Chase
On 2017-06-16 15:53, Ben Finney wrote:
> > I must admit my initial preference would be the differently named
> > wrapper. Surely users of the codebase will be invoking stuff via
> > something opaque which sources the requisite things?  
> 
> That “something opaque” is the ‘$VENV/bin/activate’ script; many
> people who join the team will already know that, and I'm trying to
> make use of that existing convention.
> 
> > Actually, on trying to write something simple and flexible, since
> > once made the venv is basically state WRT the activate script, I'm
> > leaning towards hacking the activate script, probably by keeping
> > a distinct file off to the side and modifying activate to source
> > it.  
> 
> Yeah, I'd much prefer to be told there's a hook to use, so that
> someone who creates a standard Python virtualenv the conventional
> way will not need to then hack that virtualenv.

At least within virtualenvwrapper (I'm not sure whether they come
with virtualenv proper), in my $WORKON_HOME and
$WORKON_HOME/$VIRTUALENV/bin directories, I have a bunch of pre* and
post* templates including preactivate and postactivate hooks in which
I can put various bash scripting.  I've only used them once or twice
and don't have an example readily at hand, but it seems they would give you
what you're looking for.

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Check for regular expression in a list

2017-05-26 Thread Tim Chase
On 2017-05-26 13:29, Cecil Westerhof wrote:
> To check if Firefox is running I use:
> if not 'firefox' in [i.name() for i in list(process_iter())]:
> 
> It probably could be made more efficient, because it can stop when
> it finds the first instance.
> 
> But now I switched to Debian and there firefox is called
> firefox-esr. So I should use:
> re.search('^firefox', 'firefox-esr')
> 
> Is there a way to rewrite
> [i.name() for i in list(process_iter())]
> 
> so that it returns True when there is a i.name() that matches and
> False otherwise?
> And is it possible to stop processing the list when it found a
> match?

This sounds like an ideal use-case for any():

  if any("firefox" in p.name() for p in process_iter()):
    do_stuff()

or

  if any(p.name().startswith("firefox") for p in process_iter()):
    do_stuff()

The any() call stops after the True-ish match (and the all() call
stops after the first False-ish match).
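
A small stdlib-only demonstration of that short-circuiting, with a
made-up list of process names standing in for process_iter() and a
"seen" list recording how far the generator actually got:

```python
seen = []


def names():
    """Yield fake process names, recording each one as it is consumed."""
    for name in ("bash", "firefox-esr", "sshd", "vim"):
        seen.append(name)
        yield name


found = any(n.startswith("firefox") for n in names())
```

After this runs, "seen" holds only the names up to and including the
first match; "sshd" and "vim" were never looked at.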

-tkc



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Concatenating files in order

2017-05-23 Thread Tim Chase
On 2017-05-23 13:38, woo...@gmail.com wrote:
> It is very straight forward; split on "_", create a new list of
> lists that contains a sublist of [file ending as an integer, file
> name], and sort
> 
> fnames=["XXX_chunk_0",
> "XXX_chunk_10",
> "XXX_chunk_1",
> "XXX_chunk_20",
> "XXX_chunk_2"]
> sorted_list=[[int(name.split("_")[-1]), name] for name in fnames]
> print "before sorting", sorted_list
> sorted_list.sort()
> print "after sorting ", sorted_list

Which is great until you have a file named "XXX_chunk_header" and
your int() call falls over ;-)
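
One way to keep it from falling over, as a sketch: a sort key that
puts anything with a non-numeric suffix first and orders the rest
numerically.  The chunk_key() name and the choice to sort the header
ahead of the numbered chunks are my own assumptions:

```python
def chunk_key(name):
    """Sort key: non-numeric suffixes first, then numeric suffixes by value."""
    suffix = name.rsplit("_", 1)[-1]
    try:
        return (1, int(suffix), name)
    except ValueError:
        return (0, 0, name)


fnames = [
    "XXX_chunk_0",
    "XXX_chunk_10",
    "XXX_chunk_1",
    "XXX_chunk_header",
    "XXX_chunk_20",
    "XXX_chunk_2",
]
ordered = sorted(fnames, key=chunk_key)
```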

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list

